On Jul 6, 5:31 am, Neil Cerutti <[EMAIL PROTECTED]> wrote:
>
> Mostly you can use the default 'excel' dialect and be quite
> happy, since Excel is the main reason anybody still cares about
> this unnecessarily hard to parse (it requires more than one
> character of lookahead for no reason except bad design) data
> format.

One cares about this format because people create data files of millions of rows (far exceeding the capacity of pre-2007 Excel) in many imaginative xSV dialects, some of which are not handled by the Python csv module.

I don't know what you mean by "requires more than one character of lookahead" -- any non-Mickey-Mouse implementation of a csv reader will use a finite state machine with about half-a-dozen states, and data structures no more complicated than (1) completed rows received so far, (2) completed fields in the current row, and (3) bytes in the current field. When a new input byte arrives, what to do can be determined based on only that byte and the current state; no look-ahead into the input stream is required, nor is any look-back into those data structures.
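
To make the state-machine point concrete, here is a minimal sketch of such a reader. It is not the csv module's actual implementation; the function name parse_csv, the state names, and the lenient handling of a stray character after a closing quote are all my own assumptions for illustration. It works on characters of a str rather than raw bytes, and keeps exactly the three data structures listed above.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names, chosen for this sketch only.
    START_FIELD = auto()      # at the beginning of a field
    IN_UNQUOTED = auto()      # inside an unquoted field
    IN_QUOTED = auto()        # inside a quoted field
    QUOTE_IN_QUOTED = auto()  # just saw a quote while inside a quoted field
    AFTER_CR = auto()         # just saw a CR that ended a row


def parse_csv(text, delimiter=",", quote='"'):
    rows = []    # (1) completed rows received so far
    fields = []  # (2) completed fields in the current row
    field = []   # (3) characters in the current field
    state = State.START_FIELD

    def end_field():
        fields.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(list(fields))
        fields.clear()

    for ch in text:
        # Each decision uses only `ch` and `state`: no look-ahead, no look-back.
        if state is State.AFTER_CR:
            state = State.START_FIELD
            if ch == "\n":
                continue  # CRLF: the CR already ended the row
        if state is State.START_FIELD:
            if ch == quote:
                state = State.IN_QUOTED
            elif ch == delimiter:
                end_field()
            elif ch == "\r":
                end_row()
                state = State.AFTER_CR
            elif ch == "\n":
                end_row()
            else:
                field.append(ch)
                state = State.IN_UNQUOTED
        elif state is State.IN_UNQUOTED:
            if ch == delimiter:
                end_field()
                state = State.START_FIELD
            elif ch == "\r":
                end_row()
                state = State.AFTER_CR
            elif ch == "\n":
                end_row()
                state = State.START_FIELD
            else:
                field.append(ch)
        elif state is State.IN_QUOTED:
            if ch == quote:
                state = State.QUOTE_IN_QUOTED
            else:
                field.append(ch)  # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED:
            if ch == quote:
                field.append(quote)  # doubled quote is an escaped quote
                state = State.IN_QUOTED
            elif ch == delimiter:
                end_field()
                state = State.START_FIELD
            elif ch == "\r":
                end_row()
                state = State.AFTER_CR
            elif ch == "\n":
                end_row()
                state = State.START_FIELD
            else:
                # Assumption: be lenient and treat the rest as unquoted data.
                field.append(ch)
                state = State.IN_UNQUOTED

    # Flush a final row that was not terminated by a newline.
    if fields or state in (State.IN_UNQUOTED, State.IN_QUOTED,
                           State.QUOTE_IN_QUOTED):
        end_row()
    return rows
```

The doubled-quote case is presumably the one that looks like it needs look-ahead. Giving "just saw a quote inside a quoted field" its own state means the next character alone settles whether that quote was an escape or the end of the field. Note that this sketch returns [''] for a blank line, which differs from the csv module's [].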