I believe/hope the proposed solution will work for most cases, although there's 
still a bunch of performance work left to be done. I think the decoupling 
problem isn't as hard as it might seem since there are very clearly distinct 
stages in parsing a CSV file. But we'll find out if the indirection I've 
introduced causes performance problems when things can't be inlined.
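
As a rough illustration of the decoupling idea (written in Python rather than Julia, and not taken from CSVReaders.jl), the sketch below keeps the parsing stage separate from the stage that populates a data structure. The names `read_csv`, `ColumnsSink`, `ListOfDictsSink`, and the `start`/`add_row`/`finish` methods are all hypothetical, chosen only for this example.

```python
import csv
import io


class ColumnsSink:
    """Hypothetical sink: stores the data as one list per column."""

    def start(self, header):
        self.header = header
        self.columns = {name: [] for name in header}

    def add_row(self, fields):
        for name, value in zip(self.header, fields):
            self.columns[name].append(value)

    def finish(self):
        return self.columns


class ListOfDictsSink:
    """Hypothetical sink: stores each row as a dict keyed by column name."""

    def start(self, header):
        self.header = header
        self.rows = []

    def add_row(self, fields):
        self.rows.append(dict(zip(self.header, fields)))

    def finish(self):
        return self.rows


def read_csv(text, sink):
    """Parse CSV text and hand each row to the sink.

    The parser knows nothing about the target data structure; it only
    calls the three sink methods (the indirection mentioned above).
    """
    reader = csv.reader(io.StringIO(text))
    sink.start(next(reader))      # first line is treated as the header
    for fields in reader:
        sink.add_row(fields)      # fields are left as unparsed strings
    return sink.finish()
```

The same `read_csv` call can then fill either structure, e.g. `read_csv("a,b\n1,2\n", ColumnsSink())`. Whether the extra method calls cost anything measurable depends on the language and on what can be inlined, which is the open question raised above.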

While writing this package, I found the two most challenging problems to be:

(A) The mismatch between CSV files, which provide one row at a time, and 
Julia's column-major arrays, which favor filling one column at a time.
(B) The inability to easily resize! a matrix.
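
To make (A) and (B) concrete, here is a minimal sketch, again in Python and not drawn from the package itself. It assumes a flat buffer laid out column by column, the way a column-major matrix is stored. The class `ColumnMajorBuffer` and its methods are hypothetical. Because each column occupies a contiguous block, appending a row touches one slot per column, and growing the buffer means relocating every column.

```python
class ColumnMajorBuffer:
    """Hypothetical growable numeric table stored column by column."""

    def __init__(self, ncols, capacity=4):
        self.ncols = ncols
        self.nrows = 0
        self.capacity = capacity                  # rows allocated per column
        self.data = [0.0] * (capacity * ncols)    # flat column-major storage

    def _grow(self):
        # Doubling the row capacity changes where every column starts,
        # so each column has to be copied to its new offset.
        new_capacity = self.capacity * 2
        new_data = [0.0] * (new_capacity * self.ncols)
        for j in range(self.ncols):
            old_start = j * self.capacity
            new_start = j * new_capacity
            new_data[new_start:new_start + self.nrows] = \
                self.data[old_start:old_start + self.nrows]
        self.data = new_data
        self.capacity = new_capacity

    def push_row(self, row):
        if len(row) != self.ncols:
            raise ValueError("row has the wrong number of fields")
        if self.nrows == self.capacity:
            self._grow()
        # One row arrives at a time, but its values are scattered:
        # one write into each column's block.
        for j, value in enumerate(row):
            self.data[j * self.capacity + self.nrows] = value
        self.nrows += 1

    def column(self, j):
        start = j * self.capacity
        return self.data[start:start + self.nrows]
```

A simpler alternative is to keep one independently growable vector per column and assemble the matrix only once the row count is known, at the cost of a final copy.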

 -- John

On Dec 8, 2014, at 5:16 AM, Stefan Karpinski <[email protected]> wrote:

> Doh. Obfuscate the code quick, before anyone uses it! This is very nice and 
> something I've always felt like we need for data formats like CSV – a way of 
> decoupling the parsing of the format from the populating of a data structure 
> with that data. It's a tough problem.
> 
> On Mon, Dec 8, 2014 at 8:08 AM, Tom Short <[email protected]> wrote:
> Exciting, John! Although your documentation may be "very sparse", the code is 
> nicely documented.
> 
> On Mon, Dec 8, 2014 at 12:35 AM, John Myles White <[email protected]> 
> wrote:
> Over the last month or so, I've been slowly working on a new library that 
> defines an abstract toolkit for writing CSV parsers. The goal is to provide 
> an abstract interface that users can implement in order to provide functions 
> for reading data into their preferred data structures from CSV files. In 
> principle, this approach should allow us to unify the code behind Base's 
> readcsv and DataFrames's readtable functions.
> 
> The library is still very much a work-in-progress, but I wanted to let others 
> see what I've done so that I can start getting feedback on the design.
> 
> Because the library makes heavy use of Nullables, you can only try out the 
> library on Julia 0.4. If you're interested, it's available at 
> https://github.com/johnmyleswhite/CSVReaders.jl
> 
> For now, I've intentionally given very sparse documentation to discourage 
> people from seriously using the library before it's officially released. But 
> there are some examples in the README that should make clear how the library 
> is intended to be used.
> 
>  -- John
> 
> 
> 
