Re: [phobos] CSVRange: RFC

David Simcha Sat, 29 Jan 2011 21:53:03 -0800

Jesse,

I was unaware of your efforts. At first glance, your lib looks pretty good. I definitely think Phobos needs a real CSV parser, as I seem to write ad-hoc ones all the time. Since your module mostly looks a little further along and better engineered than mine (mine was really just a prototype that I spent about half a day on), maybe we should focus on getting yours up to Phobos quality. The one major feature yours is missing, though, is the ability for csvText() to extract a subset of the available columns by header. I also like the idea of doing things by column header instead of hard coding the column order because it's less brittle if the layout changes.


--David Simcha

On 1/29/2011 10:47 PM, Jesse Phillips wrote:

That is about the same as what I have, though I was attempting to
handle custom delimiters for fields, records, and quote.
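
For illustration, here is a minimal sketch of parsing with a caller-chosen field delimiter and quote character. It assumes the std.csv module that ships with current Phobos (which postdates this thread), not the API of either library discussed here; parseDelimited is a hypothetical helper name, and custom record delimiters are not covered.

```d
import std.array : array;
import std.csv : csvReader;
import std.stdio : writeln;

/// Hypothetical helper: parses `text` into rows of string cells,
/// using the given field delimiter and quote character.
string[][] parseDelimited(string text, char delim, char quote)
{
    string[][] rows;
    // csvReader yields one record at a time; each record is a range of cells.
    foreach (record; csvReader!string(text, delim, quote))
        rows ~= record.array;
    return rows;
}

void main()
{
    // Semicolon-separated fields, single quote as the quote character.
    auto rows = parseDelimited("a;'b;c';d\n1;2;3", ';', '\'');
    foreach (row; rows)
        writeln(row);
}
```

The quoted cell 'b;c' keeps its embedded semicolon because the delimiter is only significant outside quotes.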

https://github.com/he-the-great/JPDLibs/tree/csv

But about your code. I was getting a Range Violation with your
unittests active. Also you don't handle a quoted empty field
correctly. Otherwise you pass the unittest I ported from mine:

https://gist.github.com/802502

On Sat, Jan 29, 2011 at 3:44 PM, David Simcha <[email protected]> wrote:

I've written a small module for reading CSV and similar delimited files.
I've been meaning to do this for a while.  Basically, it allows reading a
CSV file with O(1) memory usage (i.e. it can be parsed one character at a
time) to a range of ranges of cells.  Quotes, escaped quotes, etc. are
handled properly.  I tested it on a nasty CSV file produced by Affymetrix,
and it works rather well.
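
To make the quote handling concrete, here is a rough sketch of scanning a record one character at a time while tracking only an "inside quotes" flag. It is not the csvRange implementation: splitRecord is a hypothetical helper, and unlike the lazy O(1) design described above it buffers a whole record in memory for simplicity.

```d
import std.stdio : writeln;

/// Hypothetical helper: splits one CSV record into fields,
/// examining a single character per step.
string[] splitRecord(string line, char delim = ',', char quote = '"')
{
    string[] fields;
    char[] cell;           // the cell currently being built
    bool inQuotes = false; // the only parser state besides the cell

    for (size_t i = 0; i < line.length; ++i)
    {
        immutable c = line[i];
        if (inQuotes)
        {
            if (c == quote)
            {
                // A doubled quote inside a quoted field is an escaped quote.
                if (i + 1 < line.length && line[i + 1] == quote)
                {
                    cell ~= quote;
                    ++i;
                }
                else
                    inQuotes = false; // closing quote
            }
            else
                cell ~= c; // delimiters are literal inside quotes
        }
        else if (c == quote)
            inQuotes = true;
        else if (c == delim)
        {
            fields ~= cell.idup;
            cell.length = 0;
        }
        else
            cell ~= c;
    }
    fields ~= cell.idup; // the last field has no trailing delimiter
    return fields;
}

void main()
{
    writeln(splitRecord(`a,"",c`));          // includes a quoted empty field
    writeln(splitRecord(`"x ""y"" z",1,2`)); // includes escaped quotes
}
```

A quoted empty field falls out of the same logic: the opening and closing quotes toggle the flag without appending anything to the cell.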

CSVRange also allows for iteration over rows as a range of structs.  For
example, let's say you had a file:

Height,Weight,Shoe Size
6.5,210,13
...

You could read this file lazily into a range of structs with something like:

struct Person
{
    float height;
    uint weight;
    uint shoeSize;
}

auto csvRange = csvFile(someCharacterRange, ',');
auto structs = csvStructRange(csvRange, ["Height", "Weight", "Shoe Size"]);

// Iterate lazily through the rows.
foreach(s; structs) {
    // Do stuff.
}

Note that this still works even if you have tons of columns you don't care
about in the file.
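
As a point of comparison, the same header-driven idea can be sketched with the std.csv module in current Phobos. This is an assumption for illustration only: std.csv postdates this thread and is not the csvRange API above, and readPeople plus the sample data are hypothetical.

```d
import std.array : array;
import std.csv : csvReader;
import std.stdio : writeln;

struct Person
{
    float height;
    uint weight;
    uint shoeSize;
}

/// Hypothetical helper: reads only the three named columns into Person
/// values. The header names are matched to the struct fields in order.
Person[] readPeople(string text)
{
    return csvReader!Person(text, ["Height", "Weight", "Shoe Size"]).array;
}

void main()
{
    // "Name" and "City" are extra columns the struct does not care about.
    auto text = "Name,Height,Weight,Shoe Size,City\n"
              ~ "Al,6.5,210,13,Baltimore\n"
              ~ "Bo,5.75,150,9,Boston";

    foreach (p; readPeople(text))
        writeln(p.height, " ", p.weight, " ", p.shoeSize);
}
```

Because the columns are selected by name, adding or removing unrelated columns in the file should not require changing the struct or the call.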

Code:

http://dsource.org/projects/scrapple/browser/trunk/csvRange/csvRange.d

Docs:

http://cis.jhu.edu/~dsimcha/csvRange.html


_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos

