#8340: MySQL case sensitivity and utf8_bin / 2710
------------------------------------+---------------------------------------
Reporter: hrauch | Owner: nobody
Status: new | Milestone:
Component: Uncategorized | Version: SVN
Resolution: | Keywords:
Stage: Unreviewed | Has_patch: 0
Needs_docs: 0 | Needs_tests: 0
Needs_better_patch: 0 |
------------------------------------+---------------------------------------
Comment (by Karen Tracey <[EMAIL PROTECTED]>):
Replying to [comment:6 hrauch]:
> When I changed most of our varchar fields to utf8_bin on my development
> system (SuSE 11), everything works fine. But if I do the same with our
> production server (Ubuntu 8.04), I'll sometimes get UnicodeDecodeErrors,
> since some of the varchar fields return normal strings instead of
> unicode strings. Both systems use the same Django version; both systems
> use the same database by dumping and restoring the database.
>
Two things could cause the !UnicodeDecodeErrors to occur on only one
machine. First, only MySQLdb 1.2.2 returns character fields with binary
collation as bytestrings; MySQLdb 1.2.1p2 returns them as unicode. So if
your SuSE 11 system is using the older MySQLdb then you would not see this
problem there. (The problem with the behavior of the older level is that
it will either throw an exception or corrupt truly binary data when it
tries to convert it to unicode assuming a utf8 encoding; the fix for that
bug introduced the behavior you see now.) Second, you'll only get the
!UnicodeDecodeErrors when you actually access data with non-ASCII chars,
so if you did not happen to access problematic data on your development
system, it would appear to work even if it too is running the latest
MySQLdb level.
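
The second point can be illustrated with a minimal sketch. It assumes that
a bytestring gets decoded with the ASCII codec when it is mixed with a
unicode string (which is what Python 2 does implicitly); the helper name
`implicit_decode` and the sample values are hypothetical, and the explicit
decode merely stands in for that implicit conversion.

```python
def implicit_decode(raw):
    """Stand-in for the implicit ASCII decode applied to a bytestring."""
    return raw.decode("ascii")


# Two values as a binary-collated column might hand them back: raw UTF-8 bytes.
ascii_value = "Karen".encode("utf-8")
non_ascii_value = "Müller".encode("utf-8")

# Pure-ASCII data decodes cleanly, so the problem stays hidden.
decoded = implicit_decode(ascii_value)

# Data containing a non-ASCII character triggers the error.
try:
    implicit_decode(non_ascii_value)
    failed = False
except UnicodeDecodeError:
    failed = True
```

A machine whose test data happens to be all ASCII would therefore never
show the error, whatever MySQLdb level it runs.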
[[BR]]
> If I turn back these fields from utf8_bin to utf8_unicode_ci, the
> production system works well again.
Alternatively you could keep the utf8_bin collation and use
django.utils.encoding.force_unicode() on the values of your
binary-collated character fields to properly transform them to unicode.
See this thread on django-users:
http://groups.google.com/group/django-users/browse_thread/thread/d7dd21493ab5f1fa/eafc9959bb3302f6
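
As a rough sketch of what that conversion amounts to, the snippet below
decodes bytestring values and leaves everything else alone. The helper
`to_unicode` is a hypothetical stand-in written so the example runs without
Django, and it assumes the stored data is UTF-8; the row contents are made
up for illustration.

```python
def to_unicode(value, encoding="utf-8"):
    """Decode bytestrings; pass values that are already text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A row where one column (binary collation) came back as bytes.
row = {"name": "Müller".encode("utf-8"), "city": "Graz"}

# Normalize every value before the rest of the code touches it.
clean = {key: to_unicode(val) for key, val in row.items()}
```

In a Django project of that era you would presumably call
force_unicode(value) in place of the stand-in helper.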
[[BR]]
>
> Another problem: If I use utf8_bin for a field and use the query
> operator !__iexact, it does work correctly.
>
I think you mean it does not work correctly? As it stands today, if you
use MySQL, then depending on how you have your collation set, either
`exact` or `iexact` will work correctly, but not both. Django does not
specify a collation for either comparison, so both use the column's
default collation, which will only be correct for one of them. Getting
them both to work would, I believe, require some changes in the way Django
interacts with the backend to construct the query; see some of the
comments I made on #8102.
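
The "one or the other, not both" effect can be simulated in a few lines.
This is only an approximation, not Django or MySQL code: `column_compare`
and `expected` are hypothetical helpers, utf8_bin is modelled as a
byte-for-byte comparison, and utf8_unicode_ci as a case-folded one (the
real collation also ignores other differences, which the sketch leaves
out).

```python
def column_compare(stored, wanted, collation):
    """Compare the way the column's own collation would (simplified)."""
    if collation == "utf8_bin":
        return stored.encode("utf-8") == wanted.encode("utf-8")
    if collation == "utf8_unicode_ci":
        return stored.casefold() == wanted.casefold()
    raise ValueError("unknown collation: %s" % collation)


def expected(stored, wanted, lookup):
    """What the lookup is supposed to mean, independent of the database."""
    if lookup == "exact":
        return stored == wanted
    return stored.casefold() == wanted.casefold()


# Neither lookup overrides the collation, so both get the same comparison.
results = {}
for collation in ("utf8_bin", "utf8_unicode_ci"):
    for lookup in ("exact", "iexact"):
        got = column_compare("Django", "django", collation)
        results[(collation, lookup)] = got == expected("Django", "django", lookup)
```

With these assumptions, each collation gives the intended answer for
exactly one of the two lookups.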
[[BR]]
> So I must say, that case sensitivity with mysql and utf8_bin doesn't
> work well.
I don't think anyone sees the current behavior as ideal, just the best
that could be achieved for the 1.0 timeframe.
--
Ticket URL: <http://code.djangoproject.com/ticket/8340#comment:7>
Django Code <http://code.djangoproject.com/>
The web framework for perfectionists with deadlines