Andrew McNamara wrote:
Yes, although it would be nice to retain the 8-bit versions as well.

You can do so by using latin-1 as the default encoding. Works great!
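
For readers unfamiliar with the latin-1 trick: latin-1 maps every byte value 0-255 to the Unicode code point with the same number, so decoding arbitrary 8-bit data and encoding it again is lossless. A minimal sketch in present-day Python 3 (not code from this thread; the variable names are illustrative only):

```python
# Assumption: Python 3, where bytes and str are distinct types.
raw = bytes(range(256))            # every possible 8-bit value

# Decoding with latin-1 never fails: byte N becomes code point N.
text = raw.decode("latin-1")
assert [ord(ch) for ch in text] == list(range(256))

# Encoding back yields the original bytes unchanged.
roundtrip = text.encode("latin-1")
assert roundtrip == raw
```

This is what lets a Unicode-only code path also serve 8-bit input, at the price of one decode and one encode pass.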

Yep, although that means we wear the cost of decoding and encoding for all 8-bit input.

Right, but it makes the code very clean and straightforward. Again, it depends on what you need. If performance is critical, then you probably need a C version written using the same trick as _sre.c...

What does the _sre.c code do?

It comes in two versions: one for 8-bit strings, the other for Unicode.

Depends on your needs: CSV files tend to be small enough
to do the decoding in one call in memory.
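
A rough sketch of the "decode in one call, in memory" approach, written against the present-day Python 3 csv module rather than the 2005 one. The helper name `parse_csv_bytes` and the latin-1 default are assumptions made for illustration:

```python
import csv
import io

def parse_csv_bytes(data, encoding="latin-1"):
    """Decode the whole payload at once, then parse it from memory.

    Hypothetical helper; only sensible when the file fits in RAM.
    """
    text = data.decode(encoding)
    # newline="" leaves line endings alone so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline="")))

sample = b"name,city\r\nJos\xe9,M\xfcnchen\r\n"
rows = parse_csv_bytes(sample)
```

Here `rows` holds two lists of strings, with the 8-bit bytes `\xe9` and `\xfc` decoded to "é" and "ü".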

We are routinely dealing with multi-gigabyte csv files - which is why the
original 2001 vintage csv module was written as a C state machine.
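
For files too large to hold in memory, decoding can be done incrementally instead. The following is only a sketch in present-day Python 3, with a hypothetical helper name `iter_rows`; it shows the streaming shape of the problem and says nothing about whether the per-row decoding cost is acceptable at multi-gigabyte scale:

```python
import csv
import io

def iter_rows(binary_stream, encoding="latin-1"):
    """Yield CSV rows one at a time from a binary file-like object.

    Hypothetical helper: the wrapper decodes in chunks, so memory use
    stays bounded regardless of file size.
    """
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    yield from csv.reader(text_stream)

# Usage with an in-memory stream standing in for a large file on disk.
stream_rows = list(iter_rows(io.BytesIO(b"a,b\r\n1,2\r\n")))
```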

I see, but are you sure that the typical Python user will have the same requirements, enough to make it worth the effort (and complexity)?

I've written a few CSV parsers and writers myself over the years
and the requirements were different every time, in terms
of being flexible in the parsing phase, the interfaces and
the performance needs. Haven't yet found a one-size-fits-all
solution and don't really expect to any more :-)

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)