#12849: django's development server raises an encoding exception when trying to
colorize non-ascii text
------------------------------------------------+---------------------------
Reporter: jype                                  | Owner: nobody
Status: closed                                  | Milestone: 1.2
Component: django-admin.py runserver            | Version: SVN
Resolution: fixed                               | Keywords:
Stage: Accepted                                 | Has_patch: 1
Needs_docs: 0                                   | Needs_tests: 0
Needs_better_patch: 0                           |
------------------------------------------------+---------------------------
Comment (by russellm):
Replying to [comment:13 kmtracey]:
> I'm not sure the committed fix is the best alternative, though. When
> printing out management command errors I think it would be better to use
> sys.stderr.encoding, if it exists, since that is where we are sending the
> output. Windows for example won't be using utf-8 as the terminal encoding
> (by default), so the fix as committed will result in unreadable output for
> non-ASCII exception data on Windows. Better than an exception I suppose
> but I think it would be better to attempt to use the right encoding, and
> also specify replace rather than strict for error handling so that if the
> data can't be encoded in the target charset then we still don't raise an
> exception.
There are two issues here.
Firstly, whatever encoding we choose, there are going to be problems. On
platforms where stderr is ASCII (or equivalent), there is no reliable way
to print non-ASCII characters. So, we need to choose ignore/replace (to
fail silently) or strict (which will raise the same errors being reported
by this ticket). So on ASCII terminals, we're always going to have
problems -- it's just a matter of how we hide (or raise) the errors. That
said, for the record, my ANSI_X3.4-1968 test box actually manages to print
the special Unicode characters correctly as long as the bytestring is
encoded as UTF-8. This is why I checked in the patch I did.
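
To make the trade-off concrete, here is a minimal sketch of what each error
handler does with the test string when the target encoding is ASCII. The
helper name encode_for_terminal is made up for illustration; it is not part
of Django.

```python
# Sample text: the non-ASCII column name used as the test case in this ticket.
text = u'hello\u00c2\u00c3'


def encode_for_terminal(value, encoding, errors):
    """Hypothetical helper: encode text, or raise UnicodeEncodeError."""
    return value.encode(encoding, errors)


# 'replace' substitutes '?' for every character the codec cannot represent.
replaced = encode_for_terminal(text, 'ascii', 'replace')

# 'ignore' silently drops those characters.
ignored = encode_for_terminal(text, 'ascii', 'ignore')

# 'strict' raises, which is the kind of error reported in this ticket.
try:
    encode_for_terminal(text, 'ascii', 'strict')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Encoding to UTF-8 always succeeds; whether the terminal renders the bytes
# correctly is a separate question.
utf8 = encode_for_terminal(text, 'utf-8', 'strict')
```

With replace or ignore the output is lossy but nothing is raised; with
strict the caller has to handle the exception.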
Secondly, if we do anything in this area, we're going to need to audit
pretty much all the current management commands. At the moment, there is a
certain amount of confusion regarding how and when text is encoded for
display; for example, sqlall builds everything in unicode, then encodes to
UTF-8 before returning the value for the base command framework to print.
We would be well served to do a full teardown here and ensure we keep
unicode right up to the last moment, encoding only at the point of
output -- but that's a much bigger patch.
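
As a rough illustration of keeping unicode until the last moment, combined
with the sys.stderr.encoding suggestion quoted above, a helper might look
something like the sketch below. write_text and its fallback parameter are
hypothetical, not existing Django API, and falling back to UTF-8 when the
stream reports no encoding is an assumption.

```python
import io
import sys


def write_text(text, stream=None, fallback='utf-8'):
    """Hypothetical helper: encode only at the output boundary.

    Uses the stream's own encoding when it reports one, otherwise the
    fallback, and never raises on unencodable characters.
    """
    if stream is None:
        stream = sys.stderr
    # Streams may lack an encoding attribute, or report None (e.g. pipes).
    encoding = getattr(stream, 'encoding', None) or fallback
    data = text.encode(encoding, 'replace')
    # Text streams on Python 3 expose the byte stream as .buffer; byte
    # streams (and Python 2 file objects) are written to directly.
    target = getattr(stream, 'buffer', stream)
    target.write(data)
    return data


# An ASCII-only text stream: unencodable characters become '?'.
ascii_raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(ascii_raw, encoding='ascii')
ascii_out = write_text(u'hello\u00c2\u00c3', ascii_stream)

# A plain byte stream with no encoding attribute: the fallback is used.
bytes_raw = io.BytesIO()
bytes_out = write_text(u'hello\u00c2\u00c3', bytes_raw)
```

Everything upstream of such a helper would pass unicode around untouched,
which is the part that requires auditing the existing commands.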
[12849] fixed every observable case that I could generate, on UTF-8 and
ASCII terminals. I have no doubt that there are cp1252 or KOI8-R terminals
that will still have problems, but we'll need some new test cases (and
platforms on which to test them).
> Which brings me to: I'd still like to understand when this problem crops
> up, for the colorize case. Unicode query string parms are percent-encoded
> in output -- under what circumstances is the server being asked to
> colorize unicode data containing non-ASCII characters?
The test case I've been using is to have a database model with a field
that has db_column=u'hello\u00c2\u00c3'. This is output as the column name
under sqlall, which broke when it was passed into str() at the start of
colorize.
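
The failure mode can be approximated outside Django as follows. Under
Python 2, str() on a unicode object implicitly encodes with the default
ASCII codec; the sketch emulates that with an explicit encode('ascii') so
that it behaves the same on any Python version. Both functions are
simplified stand-ins, not Django's actual colorize(), and the escape codes
are only illustrative.

```python
# Illustrative ANSI escape sequences for bold text and reset.
BOLD, RESET = b'\x1b[1m', b'\x1b[0m'


def colorize_implicit_ascii(text):
    """Stand-in for calling str(text) on unicode under Python 2."""
    return BOLD + text.encode('ascii') + RESET


def colorize_utf8(text):
    """Stand-in for encoding explicitly to UTF-8 before wrapping."""
    return BOLD + text.encode('utf-8') + RESET


# The column name from the test case described above.
column = u'hello\u00c2\u00c3'

try:
    colorize_implicit_ascii(column)
    raised = False
except UnicodeEncodeError:
    raised = True

fixed = colorize_utf8(column)
```

Pure-ASCII names pass through either function unchanged, which is why the
problem only shows up with non-ASCII identifiers such as this column name.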
--
Ticket URL: <http://code.djangoproject.com/ticket/12849#comment:14>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.