On Sun, Sep 2, 2012 at 1:36 AM, <wxjmfa...@gmail.com> wrote:
> I still remember my thoughts when I read the PEP 393
> discussion: "this is not logical", "they do no understand
> typography", "atomic character ???", ...

That would indicate one of two possibilities. Either:

1) Everybody in the PEP 393 discussion except for you is clueless
about how to implement a Unicode type; or

2) You are clueless about how to implement a Unicode type.

Taking into account Occam's razor, and also that you seem to be unable
or unwilling to offer a solid rationale for those thoughts, I have to
say that I'm currently leaning toward the second possibility.

> Real world exemples.
>
>>>> import libfrancais
>>>> li = ['noël', 'noir', 'nœud', 'noduleux', \
> ...     'noétique', 'noèse', 'noirâtre']
>>>> r = libfrancais.sortfr(li)
>>>> r
> ['noduleux', 'noël', 'noèse', 'noétique', 'nœud', 'noir',
> 'noirâtre']

libfrancais does not appear to be publicly available. It's not listed
in PyPI, and googling for "python libfrancais" turns up nothing
relevant.

Rewriting the example to use locale.strcoll instead:

>>> li = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'French_France')
'French_France.1252'
>>> import functools
>>> sorted(li, key=functools.cmp_to_key(locale.strcoll))
['noduleux', 'noël', 'noèse', 'noétique', 'nœud', 'noir', 'noirâtre']

# Python 3.2
>>> import timeit
>>> timeit.repeat("sorted(li, key=functools.cmp_to_key(locale.strcoll))",
...     "import functools; import locale; li = ['noël', 'noir', 'nœud', "
...     "'noduleux', 'noétique', 'noèse', 'noirâtre']", number=10000)
[0.5544277025009592, 0.5370117249557325, 0.5551836677925053]

# Python 3.3
>>> import timeit
>>> timeit.repeat("sorted(li, key=functools.cmp_to_key(locale.strcoll))",
...     "import functools; import locale; li = ['noël', 'noir', 'nœud', "
...     "'noduleux', 'noétique', 'noèse', 'noirâtre']", number=10000)
[0.1421166788364303, 0.12389078130001963, 0.13184190553613462]

As you can see, on this example Python 3.3 takes about 77% less time
than Python 3.2, which makes it roughly four times as fast.

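For anyone who wants to rerun the comparison on another platform, here
is a minimal sketch of the same benchmark as a script. The helper names
(set_french_collation, sort_strcoll, sort_strxfrm, bench) and the list
of candidate locale names are my own assumptions, not part of any
library; 'French_France' is the Windows spelling used above, and the
fr_FR variants are guesses at what a POSIX system might have installed.
The locale.strxfrm variant is included only as a commonly suggested
alternative to cmp_to_key(locale.strcoll); it was not part of the
timings above.

```python
import functools
import locale
import timeit

WORDS = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']

# Assumed candidate names for a French locale; which one (if any) exists
# depends on the operating system and on which locales are installed.
CANDIDATES = ('French_France', 'fr_FR.UTF-8', 'fr_FR.utf8', 'fr_FR')


def set_french_collation():
    """Try each candidate locale; return the name that worked, or None."""
    for name in CANDIDATES:
        try:
            locale.setlocale(locale.LC_ALL, name)
            return name
        except locale.Error:
            continue
    return None  # the process keeps whatever locale it already had


def sort_strcoll(words):
    """Sort with a comparison function, as in the interactive session."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


def sort_strxfrm(words):
    """Sort with a precomputed collation key for each word."""
    return sorted(words, key=locale.strxfrm)


def bench(func, number=10000, repeat=3):
    """Time `number` calls of func(WORDS), repeated `repeat` times."""
    return timeit.repeat(lambda: func(WORDS), number=number, repeat=repeat)


if __name__ == "__main__":
    print("locale in use:", set_french_collation())
    print("strcoll order:", sort_strcoll(WORDS))
    print("strcoll times:", bench(sort_strcoll))
    print("strxfrm times:", bench(sort_strxfrm))
```

If no French locale is available the script still runs, but the
resulting order then reflects the process's current locale, so compare
timings only between runs that report the same locale.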
If this was intended to show that the Python 3.3 Unicode
representation is a regression over the Python 3.2 implementation,
then it's a complete failure as an example.

--
http://mail.python.org/mailman/listinfo/python-list