Malcolm, I noticed that Django's current serialization doesn't deal with unicode well.
When using MySQLdb with utf8 as the encoding, CharFields return unicode objects. base.get_string_field breaks because it calls str() on the value, which forces Python to encode it with sys.getdefaultencoding() (ascii unless it has been changed). As a result, the XML serializer dies as soon as a value contains a non-ASCII character. The (default) JSON serializer sidesteps the issue by using ensure_ascii, which escapes unicode using \uXXXX notation.

I did a minimal patch to make unicode serialization work for me, but I think this is an area that needs some love on the unicode branch. My patch probably isn't a reasonable solution in general.

Another issue is that MySQL requires "utf8" as the encoding string, but Python's SAX parser requires "utf-8". Writing a file with DEFAULT_CHARSET='utf8' (and my patch applied) works, but I have to hand-edit the encoding declaration to 'utf-8' to make SAX happy.

Cheers,
Jeremy
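In case it helps to see the failure in isolation, here is a rough sketch rather than the actual serializer code. It is written so that it also runs on a current Python, where str() no longer encodes anything, so the hypothetical helper to_ascii_bytes() stands in for what str() does to a unicode value when the default encoding is ascii. The standard-library json module is likewise only a stand-in for whatever JSON library the serializer uses.

```python
import json

# The same character used in the new regression test row.
heart = u"\u2665"


def to_ascii_bytes(value):
    """Hypothetical stand-in for str(unicode_value) under an ascii default encoding."""
    return value.encode("ascii")


# The implicit ascii encode is what blows up for non-ASCII text.
try:
    to_ascii_bytes(heart)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8, which is the approach my patch takes, succeeds.
utf8_bytes = heart.encode("utf-8")

# ensure_ascii escapes the character instead of encoding it,
# which is why the JSON path never hits the error.
escaped = json.dumps(heart, ensure_ascii=True)
```

The explicit encode only works around the problem for one hard-coded charset, which is part of why I don't think the patch is a general solution.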
Index: django/core/serializers/base.py
===================================================================
--- django/core/serializers/base.py (revision 164)
+++ django/core/serializers/base.py (working copy)
@@ -59,6 +59,9 @@
             value = getattr(obj, "get_%s_url" % field.name, lambda: None)()
         else:
             value = field.flatten_data(follow=None, obj=obj).get(field.name, "")
+        if isinstance(value, unicode):
+            value = value.encode('utf-8')
+
         return str(value)
 
     def start_serialization(self):
Index: tests/regressiontests/serializers_regress/tests.py
===================================================================
--- tests/regressiontests/serializers_regress/tests.py (revision 164)
+++ tests/regressiontests/serializers_regress/tests.py (working copy)
@@ -91,6 +91,7 @@
     (data_obj, 13, CharData, "null"),
     (data_obj, 14, CharData, "NULL"),
     (data_obj, 15, CharData, None),
+    (data_obj, 16, CharData, u"\u2665"),
     (data_obj, 20, DateData, datetime.date(2006,6,16)),
     (data_obj, 21, DateData, None),
     (data_obj, 30, DateTimeData, datetime.datetime(2006,6,16,10,42,37)),
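
On the "utf8" versus "utf-8" mismatch, one possible way to avoid hand-editing the declaration is to normalize the charset name before writing it out. This is only a sketch and not part of the patch: xml_encoding_name() is a hypothetical helper, and it assumes a Python whose codecs.lookup() returns a CodecInfo object with a name attribute.

```python
import codecs


def xml_encoding_name(charset):
    """Hypothetical helper: map a charset alias to Python's canonical codec name."""
    return codecs.lookup(charset).name


# "utf8" (the spelling MySQL wants) becomes "utf-8" in the XML declaration.
declaration = '<?xml version="1.0" encoding="%s"?>' % xml_encoding_name("utf8")
```

I haven't checked how this behaves for every charset that might end up in DEFAULT_CHARSET, so treat it as a starting point only.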