On Friday, December 28, 2012 00:17:53 UTC+1, Ian wrote:
> On Thu, Dec 27, 2012 at 3:17 PM, Terry Reedy <tjre...@udel.edu> wrote:
> >> PS Py 3.3 warranty: ~30% slower than Py 3.2
> >
> > Do you have any actual timing data to back up that claim?
> > If so, please give specifics, including build, os, system, timing code, and
> > result.
>
> There was another thread about this one a while back. Using IDLE on Windows
> XP:
>
> >>> import timeit, locale
> >>> li = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']
> >>> locale.setlocale(locale.LC_ALL, 'French_France')
> 'French_France.1252'
>
> >>> # Python 3.2
> >>> min(timeit.repeat("sorted(li, key=locale.strxfrm)", "import locale; from
> >>> __main__ import li", number=100000))
> 1.1581226105552531
>
> >>> # Python 3.3.0
> >>> min(timeit.repeat("sorted(li, key=locale.strxfrm)", "import locale; from
> >>> __main__ import li", number=100000))
> 1.4595282361305697
>
> 1.460 / 1.158 = 1.261
>
> >>> li = li * 100
> >>> import random
> >>> random.shuffle(li)
>
> >>> # Python 3.2
> >>> min(timeit.repeat("sorted(li, key=locale.strxfrm)", "import locale; from
> >>> __main__ import li", number=1000))
> 1.233450899485831
>
> >>> # Python 3.3.0
> >>> min(timeit.repeat("sorted(li, key=locale.strxfrm)", "import locale; from
> >>> __main__ import li", number=1000))
> 1.5793845307155152
>
> 1.579 / 1.233 = 1.281
>
> So about 26% slower for sorting a short list of French words and about
> 28% slower for a longer list. Replacing the strings with ASCII and
> removing the 'key' argument gives a comparable result for the long
> list but more like a 40% slowdown for the short list.
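For reference, the quoted interactive session can be collapsed into a standalone script. The locale name 'French_France' is Windows-specific; 'fr_FR.UTF-8' is an assumed equivalent on most Linux and macOS systems, so the sketch below tries both and falls back to the default locale:

```python
import locale
import timeit

# Reproduction sketch of the quoted benchmark. 'French_France' only
# exists on Windows; try common alternatives and fall back to the
# process default ('') if none is installed.
for name in ("French_France", "fr_FR.UTF-8", "fr_FR", ""):
    try:
        locale.setlocale(locale.LC_ALL, name)
        break
    except locale.Error:
        pass

li = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']

# Best of three runs of 100000 sorts, as in the quoted session.
best = min(timeit.repeat(
    "sorted(li, key=locale.strxfrm)",
    repeat=3,
    number=100000,
    globals={"locale": locale, "li": li},
))
print("best of 3: %.3f s" % best)
```

Running the same script under 3.2 and 3.3 and dividing the two results reproduces the ratios quoted above; absolute times will of course vary by machine.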
----

Not related to this thread, just for information: my sorting algorithm does a little more than a `locale.strxfrm`. locale.strxfrm happens to work with the list I gave as an example, but it fails in many other cases. One of the bottlenecks is "œ", which must collate as "oe". This is not the place to discuss such linguistic details. My algorithm does not use unicodedata or Unicode normalization; it is mainly a lot of character / substring substitutions used to build the primary keys.

jmf
--
http://mail.python.org/mailman/listinfo/python-list
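As an illustration of the substring-substitution idea, a minimal sketch is shown below. The substitution table is hypothetical, not jmf's actual algorithm; it only shows how expanding "œ" to "oe" (and stripping a few accents) before comparison changes the sort order:

```python
# Hypothetical substitution table: expand ligatures and strip accents
# to build a primary collation key. A real French collation needs a
# much larger table and secondary/tertiary keys.
SUBSTITUTIONS = [
    ("œ", "oe"), ("Œ", "OE"),
    ("æ", "ae"), ("Æ", "AE"),
    ("é", "e"), ("è", "e"), ("ê", "e"), ("ë", "e"),
    ("à", "a"), ("â", "a"),
    ("î", "i"), ("ï", "i"),
    ("ô", "o"),
    ("ù", "u"), ("û", "u"),
    ("ç", "c"),
]

def primary_key(word):
    """Build a primary sort key by plain substring substitution."""
    for src, dst in SUBSTITUTIONS:
        word = word.replace(src, dst)
    return word

words = ["noël", "noir", "nœud", "noduleux"]
print(sorted(words, key=primary_key))
# → ['noduleux', 'noël', 'nœud', 'noir']
```

With a raw code-point sort, 'nœud' would land after 'noir' because U+0153 (œ) compares greater than 'i'; the substitution key places it between 'noël' and 'noir', where French dictionaries put it.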