Andrew McNamara wrote:
> 
> I'm not altogether sure there. The parsing state machine is all
> written in C, and deals with signed chars - I expect we'll need two
> versions of that (or one version that's compiled twice using
> pre-processor macros). Quite a large job. Suggestions gratefully
> received.

How about using UTF-8 internally?  Change nothing in _csv.c, but in
csv.py encode any unicode strings to UTF-8 on the way into _csv and
decode them on the way out.  File-like objects passed in by the user
can be wrapped in proxies that take care of encoding and decoding user
strings, as well as transcoding between UTF-8 and the user's chosen
file encoding.

All that encoding and decoding may slow things down, but your original
fast _csv module will still be there when you need it.

- Anders
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
