On Mon, May 28, 2012 at 10:43 PM, Aaron Meurer <[email protected]> wrote:
> Hi.
>
> I've been working on fixing the Python 3.3 issues, most of which stem
> from hash randomization. I was thinking, instead of trying to fix
> every algorithm that recurses through .args, it would be easier to
> just make .args canonical. Are there any suggestions on a key that we
> could use to sort the args for Add and Mul that would be just as fast
> as hash, but be the same everywhere?
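To make the problem concrete, here is a minimal sketch (standard library only, nothing SymPy-specific) that orders the same three strings by hash and by value in fresh interpreters started with different PYTHONHASHSEED values. The helper name orders_for_seed is made up for illustration. On Python 3.3+, I'd expect the hash-based order to be able to change with the seed, while the value-based order cannot.

```python
import os
import subprocess
import sys

# Child script: print the same three names ordered by hash, then by value.
CHILD = (
    "names = ['x', 'y', 'z']\n"
    "print(sorted(names, key=hash))\n"
    "print(sorted(names))\n"
)


def orders_for_seed(seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output([sys.executable, '-c', CHILD], env=env)
    by_hash, by_name = out.decode().splitlines()
    return by_hash, by_name


results = [orders_for_seed(seed) for seed in (1, 2, 3)]
for by_hash, by_name in results:
    print(by_hash, by_name)
```

Any algorithm that depends on the first column of that output is what breaks under hash randomization; a canonical key needs to behave like the second column.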
>
> I was thinking of using _hashable_content. This should be identical
> to sorting by hash, in the sense that the hash is computed from the
> _hashable_content, but unlike the hash, it will always compare the
> same everywhere. There are a few issues with this idea:
>
> 1. The hash is not exactly computed from _hashable_content. Rather,
> the args are in _hashable_content, and the name of the object is
> injected manually by __hash__. Presumably this can be fixed, though.
>
> 2. This would put a restriction on _hashable_content that it must not
> only be hashable, but also comparable. This is not an issue now, as
> far as I can tell, because it consists only of tuples, ints, and
> strings, but if we ever wanted to use frozenset, for example, we
> couldn't (we would have to sort it into a tuple instead, which
> could be a performance issue). I haven't studied _hashable_content
> in depth yet. I don't think this is currently an issue, but I'm
> wondering if anyone can foresee it becoming one.
Ah, I guess this is not true. _hashable_content generally contains
other Basic objects, not only tuples, ints, and strings. So what I
really want is the "fully denested"
_hashable_content, where each Basic object is recursively replaced
with its _hashable_content. I've no idea how expensive that would be
to compute.
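
A rough sketch of what "fully denested" could look like, using toy stand-in classes rather than the real Basic. The names Node, Sym, Plus, and denested_key are hypothetical and exist only for this illustration. The class name is put first in the key, on the assumption that the name gets folded into the content as item 1 suggests.

```python
class Node(object):
    """Toy stand-in for Basic; not SymPy's actual class."""

    def __init__(self, *args):
        self.args = args

    def _hashable_content(self):
        return self.args


class Sym(Node):
    """Toy leaf whose only content is its name string."""


class Plus(Node):
    """Toy container whose content is other Node objects."""


def denested_key(obj):
    """Recursively replace each Node with (class name, denested content...)."""
    if isinstance(obj, Node):
        return (type(obj).__name__,) + tuple(
            denested_key(item) for item in obj._hashable_content())
    # Tag plain leaves with their type name so an int is never compared to a str.
    return (type(obj).__name__, obj)


x, y = Sym('x'), Sym('y')
terms = [Plus(y, x), y, Plus(x, y), x]
ordered = sorted(terms, key=denested_key)
```

The key is built only from strings and tuples, so it compares the same way in every process. The obvious cost is that each call walks the whole tree, so without caching it is proportional to the size of the expression.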
I'm also open to other suggestions for a suitable sort key.
Aaron Meurer
>
> 3. I made a change to use this, and from what I can tell, it isn't
> slower, but I'm not 100% convinced by my benchmarking. Can anyone
> suggest any benchmarks that would be good to test this? So far, I've
> been testing
>
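> import random
> from sympy import Symbol, Add, Mul
>
> # Build 10000 distinct symbols in random order, then time each constructor.
> a = [Symbol('x%d' % i) for i in range(10000)]
> random.shuffle(a)
> Mul(*a)
> Add(*a)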
>
> using IPython's %timeit magic, and there seems to be no significant
> difference. If anyone can suggest a better way to test it, let me
> know. Or better yet, test it yourself. I'll post a PR that
> implements it as soon as I change _hashable_content to include the
> name (hopefully that won't be too difficult).
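
For a comparison that isolates just the sorting step, something like the following could be tried outside IPython with the standard timeit module. This is only a sketch: Sym is a toy stand-in for Symbol and name_key is a hypothetical deterministic key, so it measures the relative cost of the two sort keys and not the cost of constructing a real Mul or Add.

```python
import random
import timeit


class Sym(object):
    """Toy stand-in for Symbol; not SymPy's actual class."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash((type(self).__name__, self.name))


def name_key(sym):
    """Hypothetical deterministic key: class name plus the symbol name."""
    return (type(sym).__name__, sym.name)


syms = [Sym('x%d' % i) for i in range(10000)]
random.shuffle(syms)

# Take the best of several repeats to reduce noise from other processes.
t_hash = min(timeit.repeat(lambda: sorted(syms, key=hash), number=10, repeat=5))
t_name = min(timeit.repeat(lambda: sorted(syms, key=name_key), number=10, repeat=5))
print('hash key: %.4fs  name key: %.4fs' % (t_hash, t_name))
```

Taking the minimum over repeats is the usual way to get a stable number out of timeit; a single %timeit run on a 10000-element list can easily be dominated by noise.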
>
> Aaron Meurer