On Wed, Mar 26, 2008 at 10:10 AM, sabrina.miller <[EMAIL PROTECTED]> wrote:
> I have a little problem. My data base is a mysql ENGINE=MyISAM
> CHARSET=utf8. I'm using Django revision: 6411
>
> mysql> select id,name from core_category;
> +----+----------------------+
> | id | name                 |
> +----+----------------------+
> |  1 | a�os ha que no te v� |
> |  2 | ámigo que onda!!!    |
> |  3 | La lúna!!!           |
> +----+----------------------+
> 3 rows in set (0.04 sec)

Note that what you see here when you run mysql depends on a couple of things:

- the character encoding of your terminal (I'm guessing yours is utf-8)
- the mysql connection characteristics, which default to latin1. That is, mysql defaults to assuming incoming data from the client is in latin1, and defaults to sending back latin1 results.

So even if your table has charset utf8, the mysql command is going to translate it to latin1 for display here unless you issue a command like 'set names utf8'. For full details on this you may want to check out:

http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

Assuming that your terminal character encoding is utf-8 and you have not issued a 'set names utf8' from mysql, the entries that are actually wrong above are the last two, not the first one. I believe it is the methods by which the second two were inserted that are causing the problem here, because you are winding up with mysql supplying supposedly 'latin1' data that looks correct when it is assumed to be utf8.

> The first value was insert by the django admin console and is is
> latin1, The bd see it wrong.

Django doesn't have any latin1 defaults, so it seems unlikely Django turned utf8 data into latin1. It is more likely MySQL turned the utf8 supplied by Django into latin1 for storage in a table with CHARSET latin1, or for sending over a connection that it thinks is expecting latin1 (the defaults). As an aside, I don't know what you mean by "the bd" here?

> The second value was insert by a simple direct insert in the bd.

Which means I also don't know what this means, exactly.
But it's easy enough to directly insert utf8 data into a latin1 charset table via the mysql command, especially since MySQL defaults to thinking data supplied by the client is in latin1 charset. You need to issue the command 'set names utf8' if you want to supply utf-8 encoded data.

> And the third was insert by the django shell (python manage.py shell)
> encoding in utf8 and is OK!!!

More details on how this one was done would be helpful in understanding what happened here. Django should have issued the 'set names utf8' for you, so I am surprised this one is coming out the same as the 'direct insert' case.

> See it yourself:
>
> >>> a=pymy.connect(host='localhost', db='test',user='sabri',passwd='sabri')
> >>> c=a.cursor()
> >>> c.execute('select id,name from core_category')
> >>> d=c.fetchall()
> >>> d
> ((1L, 'a\xf1os ha que no te v\xed'), (2L, '\xc3\xa1migo que onda!!!'), (3L, 'La l\xc3\xbana!!!'))
> >>> name=d[0][1]
> >>> name.decode('utf8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "encodings/utf_8.py", line 16, in decode
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-4: invalid data
> >>> print name.decode('latin1')
> años ha que no te ví
> >>> name2=d[1][1]
> >>> print name2.decode('utf8')
> ámigo que onda!!!
> >>> name2=d[2][1]
> >>> print name2.decode('utf8')
> La lúna!!!
>

Again, absent your issuing a 'set names utf8' on the connection, MySQL is going to send back latin1, regardless of the table encoding. Which makes the 2nd two results the odd looking ones.

> When i see my data through the django admin console the only one which
> i see ok is the first one. See it:
> años ha que no te ví
> La lúna!!!
> ámigo que onda!!!
>
> Thing is than django takes data as latin1 and save it in that
> encoding. When show it, (my utf8 encoding data) do it in a bizarre
> way, because interpret it in latin1.

It's MySQL that has the default of latin1 everywhere, not Django.
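
To make the byte-level picture concrete, here is a minimal Python 3 sketch (the session quoted above is Python 2, so this is a restatement, not the same code). It takes the three byte strings exactly as they appear in that session and shows which ones are valid utf-8. The names `classify` and `simulate_latin1_connection` are hypothetical helpers made up for illustration, and the second one is only a model resting on an assumption: that MySQL treats the incoming bytes as latin1, stores the resulting characters in a utf8 table, and converts them back to latin1 when sending results. It does not talk to a real server.

```python
# Byte strings copied from the MySQLdb session quoted above.
rows = [
    b"a\xf1os ha que no te v\xed",
    b"\xc3\xa1migo que onda!!!",
    b"La l\xc3\xbana!!!",
]


def classify(raw):
    """Hypothetical helper: 'utf8' if raw is valid UTF-8, else 'latin1'.

    This is only a heuristic: any byte string decodes as latin1, so
    'latin1' here really means "not valid UTF-8".
    """
    try:
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "latin1"


def simulate_latin1_connection(text):
    """Model (an assumption, not MySQL itself) of a utf8 round trip
    over a connection that MySQL believes is latin1 in both directions.

    Returns (bytes stored in a utf8 table, bytes sent back to the client).
    """
    sent = text.encode("utf-8")              # what the client really sends
    seen_by_server = sent.decode("latin-1")  # each byte read as one latin1 char
    stored = seen_by_server.encode("utf-8")  # those chars stored in a utf8 table
    returned = stored.decode("utf-8").encode("latin-1")  # converted for the client
    return stored, returned


if __name__ == "__main__":
    for raw in rows:
        print(classify(raw), repr(raw))
    stored, returned = simulate_latin1_connection("\u00e1migo que onda!!!")
    print(repr(stored), repr(returned))
```

Under that model, utf8 data sent over a connection MySQL thinks is latin1 comes back byte-for-byte the same as it went in, which is why rows 2 and 3 decode cleanly as utf-8 in the session even though the connection was never told about utf8, while the stored bytes are double-encoded.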
When you supply utf8 data to MySQL without going through Django, you need to be careful to tell MySQL that the data is utf8-encoded. I believe there is a way (described in the MySQL doc page I cited above) to globally change your MySQL config so it will expect/supply utf8 instead of latin1, so you might want to look into that.

Karen

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---