I have a Django app that processes emails, and it is often handed emails containing non-ASCII characters. My understanding is that Python and Django handle Unicode just fine and somewhat transparently. I was told, however, that I need to set my database tables to UTF-8 encoding, and I have done so. Yet I still frequently get errors like this one when my app encounters such characters:
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/var/spool/filter/email_archive/store_emails/views.py", line 84, in mail_detail
    return render_to_response('mail_detail.html', {'mail': ourmail,
  File "/usr/lib/python2.4/site-packages/django/shortcuts/__init__.py", line 20, in render_to_response
    return HttpResponse(loader.render_to_string(*args, **kwargs), **httpresponse_kwargs)
  File "/usr/lib/python2.4/site-packages/django/template/loader.py", line 108, in render_to_string
    return t.render(context_instance)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 178, in render
    return self.nodelist.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 779, in render
    bits.append(self.render_node(node, context))
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 792, in render_node
    return node.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/loader_tags.py", line 97, in render
    return compiled_parent.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 178, in render
    return self.nodelist.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 779, in render
    bits.append(self.render_node(node, context))
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 792, in render_node
    return node.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/loader_tags.py", line 24, in render
    result = self.nodelist.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 779, in render
    bits.append(self.render_node(node, context))
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 792, in render_node
    return node.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/defaulttags.py", line 243, in render
    return self.nodelist_true.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 779, in render
    bits.append(self.render_node(node, context))
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 792, in render_node
    return node.render(context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 831, in render
    return _render_value_in_context(output, context)
  File "/usr/lib/python2.4/site-packages/django/template/__init__.py", line 811, in _render_value_in_context
    value = force_unicode(value)
  File "/usr/lib/python2.4/site-packages/django/utils/encoding.py", line 92, in force_unicode
    raise DjangoUnicodeDecodeError(s, *e.args)
DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 1468: unexpected code byte. You passed in "\nGood Day,\n\n\n\nWe offer a part time job on your computer.
<text of spam containing unicode deleted>
There is a 0x92 in position 1468 just as the error says.
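For reference, the same failure can probably be reproduced outside Django with a few lines. This is only a sketch: the sample bytes and the try_decode helper are made up for illustration, and it is written for a current Python 3 interpreter rather than the 2.4 install shown in the traceback. It assumes the 0x92 came from a Windows-1252 "smart" apostrophe, which is a guess about the spam, not something the traceback proves.

```python
# Hypothetical sample: a byte string containing 0x92, which is the
# right single quotation mark in Windows-1252 (cp1252).
raw = b"We offer a part time job on your computer. Don\x92t miss out."


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# UTF-8 fails: 0x92 is a continuation byte, so it cannot start a character.
text, err = try_decode(raw, "utf-8")

# cp1252 succeeds and maps 0x92 to U+2019.
text_cp1252, err_cp1252 = try_decode(raw, "cp1252")

print("utf-8 error:", err)
print("cp1252 text:", text_cp1252)
```

If this guess is right, the stored message is not UTF-8 at all, whatever encoding the table is set to.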
Do I need to be doing a .encode('utf-8') before putting anything into
the db? I cannot seem to get a clear answer on this. Some say no, some
say yes. Do I need to do any decoding or anything on data pulled out
of the db? I have been told that MySQL should be handling all of this
for me.
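To make the question concrete, here is a rough sketch of the kind of decode step I am asking about, applied once when the mail comes in rather than at the database layer. The to_text name, the declared_charset parameter, and the fallback order are all hypothetical; none of this is Django or MySQL API, and it is written for Python 3.

```python
def to_text(raw, declared_charset=None):
    """Decode raw message bytes to text, trying a few encodings in order.

    Hypothetical helper: tries the charset the message declares (if any),
    then UTF-8, then Windows-1252.
    """
    candidates = [declared_charset, "utf-8", "cp1252"]
    for encoding in candidates:
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or a charset name Python does not know:
            # move on to the next candidate.
            continue
    # Last resort: keep what decodes as UTF-8 and replace the rest
    # with U+FFFD so the caller always gets text back.
    return raw.decode("utf-8", "replace")


print(to_text(b"Don\x92t miss out"))
print(to_text(b"caf\xc3\xa9", "utf-8"))
```

With something like this, only already-decoded text would ever reach the model, so the template layer should not have to guess an encoding.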
I have been banging my head on this particular error off and on for a
couple of weeks and cannot seem to find the solution.
Any pointers appreciated.
--
Tracy Reed
http://tracyreed.org

