On 1 February 2012 20:27, Ed Hagen <eha...@gmail.com> wrote:
> Hi,
>
> Let me preface this request by saying that when it comes to django,
> I'm an advanced beginner (so this might be a dumb request).
>
> The motivation for my request involved users of a django-based
> database of international scholars who wanted their names sorted
> "correctly." I explained that different languages sorted characters
> differently, and therefore there was no single correct sort order, but
> I promised to see if I could easily implement language-specific
> orderings. What I found was that django seems to rely on the database
> for this feature:
>
> https://docs.djangoproject.com/en/dev/ref/databases/#collation-settings
>
> which (if I've understood things correctly) makes sense for
> performance reasons, but makes it more difficult to change things on
> the fly, e.g., to provide language-specific ordering.

Performance is the main concern here. Any query that orders on a text field would have to fetch all of its results and sort them on the application side, which would be terrible.

> Using suggestions on this page:
>
> http://stackoverflow.com/questions/1097908/how-do-i-sort-unicode-strings-alphabetically-in-python
>

Weirdly enough, I was looking at this thread recently while trying to explain to a beginner why Python doesn't provide an easy way to do this that actually works :(. Summary of options:

1) Use PyICU - this would solve a lot of problems (some of which Django already solves by itself). But it's quite a big dependency on an external package (written in C++, so I guess it won't run on PyPy, Jython or App Engine). Django currently has no external dependencies and that's good :)

2) Use the "locale" module: it will work... if you have all the possible locales compiled on your OS... and you're not running on Windows... or using threads. AFAIK, switching locale is also quite slow.

3) Use one of the other libraries listed there: none of them looks like it is still maintained by its authors.

4) Write UCA ourselves from scratch. This involves including a 1.6MB collation table in Django.

All of these solutions, of course, still have the problem of needing all the data on the application side.

> I fixed things well-enough for my present purposes, but I thought it
> would be useful to abstract this capability away from the database,
> with django itself providing some version of the Unicode collation
> algorithm:
>
> http://unicode.org/reports/tr10/
>
> This might hook into django's internationalization and localization
> features, and/or be accessible at a lower level, e.g., with a keyword
> argument to, or variant of, "order_by".

Could you describe your current solution in more detail?

--
Łukasz Rekucki
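
P.S. To make option 2 and the "all the data on the application side" problem a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the helper names (codepoint_sorted, crude_key, locale_sorted) are made up for this example and are not part of Django or the standard library, and crude_key is a rough accent-stripping stand-in, not the UCA.

```python
import locale
import unicodedata


def codepoint_sorted(names):
    # Plain sorted() compares code points, so every uppercase ASCII letter
    # sorts before lowercase, and accented letters sort after both.
    return sorted(names)


def crude_key(name):
    # Rough stand-in for a collation key (NOT the UCA): decompose, drop
    # combining marks, casefold. Letters without a decomposition, such as
    # "Ł", are left untouched, and no language-specific rule is applied.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def locale_sorted(names, locale_name):
    # Option 2: switch LC_COLLATE and sort with locale.strxfrm.
    # setlocale() changes process-wide state, so this is not thread-safe,
    # and locale_name must actually be installed on the OS.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # requested locale is not available on this machine
    try:
        return sorted(names, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


names = ["Zoe", "Émile", "adam", "Ärger"]
print(codepoint_sorted(names))
print(sorted(names, key=crude_key))
print(locale_sorted(names, "de_DE.UTF-8"))  # None if the locale is missing
```

Note that every variant needs the full list of names in memory first, which is exactly the cost of moving ordering out of the database.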