On Thu, Dec 11, 2008 at 10:02 PM, I.A <iq...@marimore.co.jp> wrote:

>
> Hi,
>
> I'm wondering the correct way to save unicode data in MySQL through
> django's models.
> The application I'm building reads a raw email and then stores the
> text data into the database. I process the email like so:
>
> import email
> mess822 = email.message_from_string(email_raw_data)
> mail_unicode_data = unicode(mess822.get_payload(),
>                             mess822.get_content_charset('utf-8'))
>
> Nearly all of the email that I get are from Japanese clients emails,
> mess822.get_content_charset() will usually return 'iso-2022-jp'
>
> Below is the model i'm using
>
> class ClientsEmail(models.Model):
>     mail_text = models.CharField(max_length=255)
>
> And below is the mysql character settings:
>
> mysql> SHOW VARIABLES LIKE 'char%';
>
> +--------------------------+----------------------------------+
> | Variable_name            | Value                            |
> +--------------------------+----------------------------------+
> | character_set_client     | utf8                             |
> | character_set_connection | utf8                             |
> | character_set_database   | utf8                             |
> | character_set_filesystem | binary                           |
> | character_set_results    | utf8                             |
> | character_set_server     | utf8                             |
> | character_set_system     | utf8                             |
> | character_sets_dir       | /usr/local/share/mysql/charsets/ |
> +--------------------------+----------------------------------+
>
> After getting the payload data from the email, I do
>
> ce = ClientsEmail(mail_text = mail_unicode_data)
> ce.save()
>
> and everything looks fine. My question is, is it required/better to
> do
> ce = ClientsEmail(mail_text = mail_unicode_data.encode('utf-8'))
> instead since the db is set for utf-8? or is this part silently
> handled by django models?
>
> From what I gather in this thread
>
> http://groups.google.com/group/django-users/browse_thread/thread/9ba1edf317a9c3e7/caf70ecc9ea72d97?lnk=gst&q=unicode+mysql#caf70ecc9ea72d97
>
> It looks like it's required, but I would like to be sure.
>

Better to rely on the current documentation:
http://docs.djangoproject.com/en/dev/ref/unicode/

than a 2.5-year-old conversation on the users list (there have been many changes in 2.5 years, including Unicode support throughout Django).

In short: no, you don't have to re-encode into a UTF-8 bytestring to create your model instance. Django accepts unicode, and the unicode will be encoded to a bytestring using the correct encoding before it gets stored in the DB.

Karen
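To make the point concrete, here is a minimal sketch of the decode step, under a few assumptions. It is written for Python 3, where `str` plays the role of Python 2's `unicode`, so it is not the exact code from the question. The raw message and the variable names are made up for illustration. The Django call appears only as a comment, because it needs a configured project to run. The final `encode` is only there to show roughly what the database layer is described as doing on your behalf; it is not Django's actual code.

```python
import email

# Hypothetical raw message for illustration: a short ISO-2022-JP body.
body_text = "こんにちは"
raw_message = (
    b"Content-Type: text/plain; charset=iso-2022-jp\r\n"
    b"Content-Transfer-Encoding: 7bit\r\n"
    b"\r\n" + body_text.encode("iso-2022-jp") + b"\r\n"
)

msg = email.message_from_bytes(raw_message)

# Use the charset declared in the message, falling back to utf-8 if none is given.
charset = msg.get_content_charset("utf-8")

# decode=True returns the payload as bytes; decode those bytes to text once.
mail_text = msg.get_payload(decode=True).decode(charset).strip()

# mail_text is now a text (unicode) string. This is the value to hand to the
# model as-is, with no manual .encode('utf-8'):
#
#     ce = ClientsEmail(mail_text=mail_text)
#     ce.save()

# Illustration only: the kind of conversion the database layer performs for you.
utf8_bytes = mail_text.encode("utf-8")
```

The idea is to decode once at the boundary where the bytes come in, and to pass text everywhere after that. Encoding by hand before saving is the step the reply says to skip.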