#8340: MySQL case sensitivity and utf8_bin / 2710
------------------------------------+---------------------------------------
          Reporter:  hrauch         |         Owner:  nobody
            Status:  new            |     Milestone:        
         Component:  Uncategorized  |       Version:  SVN   
        Resolution:                 |      Keywords:        
             Stage:  Unreviewed     |     Has_patch:  0     
        Needs_docs:  0              |   Needs_tests:  0     
Needs_better_patch:  0              |  
------------------------------------+---------------------------------------
Comment (by Karen Tracey <[EMAIL PROTECTED]>):

 Replying to [comment:6 hrauch]:
 > When I changed most of our varchar fields to utf8_bin on my development
 system (SuSE 11), everything works fine. But if I do the same on our
 production server (Ubuntu 8.04), I sometimes get UnicodeDecodeErrors,
 since some of the varchar fields return normal strings instead of
 unicode strings. Both systems use the same Django version; both systems
 use the same database, copied by dumping and restoring it.
 >

 Two things could cause the !UnicodeDecodeErrors to occur on only one
 machine.  First, only MySQLdb 1.2.2 returns character fields with binary
 collation as bytestrings; MySQLdb 1.2.1p2 returns them as unicode.  So if
 your SuSE 11 system is using the older MySQLdb then you would not see this
 problem there.  (The problem with the behavior of the older level is that
 it will either throw an exception or corrupt truly binary data when it
 tries to convert it to unicode assuming a utf8 encoding; the fix for that
 bug introduced the behavior you see now.)  Second, you'll only get the
 !UnicodeDecodeErrors when you actually access data with non-ASCII chars,
 so if you did not happen to access problematic data on your development
 system, it would appear to work even if it too is running the latest
 MySQLdb level.
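
 To illustrate the second point, here is a minimal sketch (not MySQLdb or
 Django code) of why the error depends on the data.  It assumes the failure
 comes from a UTF-8 bytestring being decoded with the ASCII codec, which is
 what Python 2 does when it implicitly mixes a bytestring with unicode.  The
 `implicit_decode` helper and the sample values are made up for this
 illustration.

```python
# Hypothetical stand-in for Python 2's implicit str -> unicode coercion,
# which decodes with the ASCII codec.
def implicit_decode(raw):
    return raw.decode("ascii")

# Sample values as a utf8_bin column might return them: UTF-8 bytestrings.
ascii_only = u"Mueller".encode("utf-8")
non_ascii = u"M\u00fcller".encode("utf-8")

# Pure-ASCII data decodes fine, so the problem stays hidden.
decoded = implicit_decode(ascii_only)

# Non-ASCII data is what actually triggers the error.
try:
    implicit_decode(non_ascii)
    failed = False
except UnicodeDecodeError:
    failed = True

print(decoded, failed)
```

 So a system that only ever touches ASCII-only rows can look healthy even
 though it returns bytestrings just like the failing one.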

 [[BR]]
 > If I turn these fields back from utf8_bin to utf8_unicode_ci, the
 production system works well again.

 Alternatively you could keep the utf8_bin collation and use
 django.utils.encoding.force_unicode() on the values of your binary-
 collated character fields to properly transform them to unicode.
 See this thread on django-users:
 http://groups.google.com/group/django-users/browse_thread/thread/d7dd21493ab5f1fa/eafc9959bb3302f6
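
 As a rough sketch of what that conversion amounts to, assuming the
 bytestrings coming back from MySQLdb are UTF-8 encoded: the `to_unicode`
 helper below is a hypothetical stand-in written for this illustration, not
 Django's implementation.  In a Django project you would call
 django.utils.encoding.force_unicode() itself.

```python
def to_unicode(value, encoding="utf-8"):
    """Decode a bytestring to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Sample value as a utf8_bin column might return it (assumed UTF-8 bytes).
raw = u"M\u00fcller".encode("utf-8")

name = to_unicode(raw)      # bytestring -> unicode text
same = to_unicode(name)     # already text, returned as-is
```

 Applying the conversion where the value is read keeps the rest of the code
 working with unicode only.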


 [[BR]]
 >
 > Another problem: If I use utf8_bin for a field and use the query
 operator !__iexact, it does work correctly.
 >

 I think you mean it does not work correctly?  As it stands today if you
 use MySQL, depending on how you have your collation set, either `exact` or
 `iexact` will work correctly, not both.  Django does not specify a
 collation for either comparison, so both use the column's default
 collation, which will only be correct for one of them.  To get them both
 to work would require, I believe, some changes in the way Django interacts
 with the backend to construct the query; see some of the comments I made
 on #8102.
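
 The effect can be sketched in plain Python without a database.  This is
 only a simulation under stated assumptions: `equal_under` is a hypothetical
 helper that approximates utf8_bin as an exact comparison and a `_ci`
 collation as a case-insensitive one, and both lookups are modeled as going
 through the column's collation, as described above.

```python
def equal_under(collation, stored, wanted):
    """Approximate how a MySQL column collation compares two strings."""
    if collation.endswith("_bin"):
        return stored == wanted
    # Crude approximation of a case-insensitive (_ci) collation.
    return stored.lower() == wanted.lower()

def lookup(kind, collation, stored, wanted):
    # Neither lookup overrides the collation, so `kind` does not change
    # the comparison; that is the limitation being described.
    return equal_under(collation, stored, wanted)

results = {}
for collation in ("utf8_bin", "utf8_unicode_ci"):
    for kind in ("exact", "iexact"):
        results[(collation, kind)] = lookup(kind, collation, "Django", "django")

for key in sorted(results):
    print(key, results[key])
```

 With utf8_bin both lookups behave case-sensitively, so `iexact` misses the
 row; with utf8_unicode_ci both behave case-insensitively, so `exact`
 matches a row it should not.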

 [[BR]]
 > So I must say, that case sensitivity with mysql and utf8_bin doesn't
 work well.

 I don't think anyone sees the current behavior as ideal, just the best
 that could be achieved for the 1.0 timeframe.

-- 
Ticket URL: <http://code.djangoproject.com/ticket/8340#comment:7>
Django Code <http://code.djangoproject.com/>
The web framework for perfectionists with deadlines