Jervis Whitley <jervi...@gmail.com> added the comment:

Hi all,

This patch takes the approach of assuming UTF-8 encoding
for files opened in 'rb' mode.

That is:

1. Check whether each line is a unicode or a bytes object.
2. If it is bytes, get a char* reference to its internal buffer.
3. Use PyUnicode_FromString to create a new unicode object from the
   char*. This step assumes UTF-8.
4. Get a Py_UNICODE reference to the new unicode object's internal
   buffer and continue as before.
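
For anyone who wants to try the idea without building the patch, the steps above can be roughly sketched in pure Python. This is only an illustration and not the code in csv.diff: the helper name as_text_lines is made up, and bytes.decode stands in for PyUnicode_FromString (which additionally expects a NUL-terminated buffer, something this sketch does not model).

```python
import csv


def as_text_lines(lines, encoding="utf-8"):
    """Yield str lines, decoding any bytes line as UTF-8.

    Hypothetical helper, not part of the csv module or of the patch.
    """
    for line in lines:
        # Step 1: is this line text or bytes?
        if isinstance(line, bytes):
            # Steps 2-3: build a str from the raw bytes, assuming UTF-8.
            # Invalid UTF-8 raises UnicodeDecodeError.
            line = line.decode(encoding)
        # Step 4: from here on the line is handled as ordinary text.
        yield line


# Lines as they might come from a file opened with 'rb', plus one str line.
raw_lines = [b"name,count\r\n", b"caf\xc3\xa9,2\r\n", "tea,3\r\n"]
rows = list(csv.reader(as_text_lines(raw_lines)))
```

Bytes that are not valid UTF-8 make the decode step fail, which is the main cost of hard-coding the encoding.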

Is this in the right direction at all?

Cheers,

Jervis

----------
message_count: 9.0 -> 10.0
nosy: +jdwhitley
nosy_count: 5.0 -> 6.0
Added file: http://bugs.python.org/file13279/csv.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4847>
_______________________________________