On 04/03/2013 04:22 AM, Neil Hodgson wrote:
rusi:

Can you please try one more experiment Neil?
Knock off all non-ASCII strings (paths) from your dataset and try
again.

    Results are the same 0.40 (well, 0.001 less but I don't think the
timer is that accurate) for Python 3.2 and 0.78 for Python 3.3.

    Neil

That would seem to imply that the speed regression on your data is NOT caused by the differing-size encodings. Perhaps it is the difference in MSC compiler version, or other changes made between 3.2 and 3.3.

Of course, I can't then explain why Steven didn't get the same results. Perhaps it's the difference between 32-bit and 64-bit Python on Windows? Or perhaps you have significantly more (or significantly fewer) "collisions" than Steven did.


Before I saw this message, I was thinking of suggesting that you supply a key= parameter to sort, specifying as the key the string with each Unicode character replaced by the one 65536 higher. That way every key to be sorted would use 32 bits per character. If this made the timings change noticeably, it could be a big clue.
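
For what it's worth, here is a rough sketch of the kind of key function I had in mind. The name widen and the sample paths are made up for illustration, and it assumes every character in the input is below U+10000 (otherwise adding 65536 could run past the top of the Unicode range). Adding the same constant to every code point leaves the sort order unchanged, so only the storage width of the keys differs. The keys are built ahead of time so that the timing covers just the sort, not the key construction.

```python
import timeit

def widen(s, offset=0x10000):
    # Hypothetical helper: shift each code point up by 65536 so that
    # every character lies outside the BMP (4 bytes each in Python 3.3).
    # Assumes all input characters are below U+10000.
    return "".join(chr(ord(c) + offset) for c in s)

# Made-up sample data; substitute the real list of paths.
paths = ["b/file.txt", "a/file.txt", "a/caf\u00e9.txt", "a/dir/x", ""]

# A constant shift preserves the relative order of the strings.
assert sorted(paths, key=widen) == sorted(paths)

# Precompute the widened keys so only the sort itself is timed.
wide_keys = [widen(p) for p in paths]

t_plain = timeit.timeit(lambda: sorted(paths), number=1000)
t_wide = timeit.timeit(lambda: sorted(wide_keys), number=1000)

print("plain keys: %.4f s" % t_plain)
print("wide keys:  %.4f s" % t_wide)
```

On a list this small the numbers mean nothing; the comparison would only be interesting on the full dataset.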

--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
