On Tue, 05 Apr 2011 01:44:34 -0400, Jesse Phillips
<[email protected]> wrote:
> I have implemented an input range based CSV parser that works on text
> input[1]. I combined my original implementation with some details of
> David's implementation[2]. It is not ready for formal review as I need to
> update and polish documentation and probably consolidate unit tests.
[snip]
* You should support input ranges. It's fine to detect slicing and optimize for
it, but you should support simple input ranges as well.
* I'd think being able to retrieve the headings from the CSV would be a
good [optional] feature.
* Exposing the tokenizer would be useful.
* Regarding buffering, it's okay for the tokenizer to expose buffering in
its API (and users should be able to supply their own buffers), but I
don't think an unbuffered version of csvText itself is correct;
csvByRecord or csvText!(T).byRecord would be more appropriate. And
anyways, since you're only using strings, why is there any buffering going
on at all? string values should simply be sliced, not buffered. Buffering
should only come into play with input ranges.
* There should be a way to specify other separators; I've started using
tab-separated files, since commas show up in a lot of data.
* Any thought of parsing a file into a tuple of arrays? Writing CSV?
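
To make the slicing vs. buffering point and the separator point concrete,
here is a rough sketch of what I mean. The names splitFields and
splitFieldsSliced are made up for illustration and are not from your module
or from Phobos. It only splits a single record, ignores quoting and escaping
entirely, and assumes a recent Phobos (std.range.primitives), so treat it as
an outline rather than a proposed implementation:

```d
import std.array : appender;
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.traits : isSomeChar;

// Hypothetical helper: works on any input range of characters.
// Nothing can be sliced here, so each field is copied into a buffer.
string[] splitFields(Range)(Range input, dchar sep = ',')
    if (isInputRange!Range && isSomeChar!(ElementType!Range))
{
    string[] fields;
    auto field = appender!string();
    for (; !input.empty; input.popFront())
    {
        dchar c = input.front;
        if (c == sep)
        {
            fields ~= field.data;       // finish the current field
            field = appender!string();  // start a fresh buffer
        }
        else
        {
            field.put(c);
        }
    }
    fields ~= field.data; // last field (possibly empty)
    return fields;
}

// Hypothetical helper: string input only. Every field is a slice of
// `input`, so no character data is copied. The separator is assumed
// to be ASCII, which makes comparing UTF-8 code units safe.
string[] splitFieldsSliced(string input, char sep = ',')
{
    string[] fields;
    size_t start = 0;
    foreach (i, char c; input)
    {
        if (c == sep)
        {
            fields ~= input[start .. i];
            start = i + 1;
        }
    }
    fields ~= input[start .. $];
    return fields;
}

unittest
{
    import std.algorithm.iteration : filter;

    string line = "name\tcity, state\tzip";

    // Tab-separated, so the comma inside a field is harmless.
    auto sliced = splitFieldsSliced(line, '\t');
    assert(sliced == ["name", "city, state", "zip"]);
    assert(sliced[0].ptr is line.ptr); // a view into `line`, not a copy

    // filter yields one character at a time and cannot be sliced.
    auto lazyInput = line.filter!(c => c != '\r');
    assert(splitFields(lazyInput, '\t') == sliced);
}
```

A real parser would presumably pick between the two paths with a static if
on whether the range supports slicing, and would let the caller supply the
buffer for the non-slicing path.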