[Jprogramming] big csvs

Scott Locklin Mon, 11 Nov 2013 16:50:39 -0800

So, I'm having a hard time loading some very simple but relatively large csv 
files into J for processing. The problem arises because J boxes every atom 
individually. Since (headers excluded) the columns of a csv are generally all 
of the same type, this seems wasteful. In this case everything is an int (about 
100M rows of 3 columns of long ints), so it seems particularly wasteful. Having 
to invoke ". on every element to create the array also seems wasteful for 
routine csv work. Is there some csv loader trick I am missing, or a 
particularly J-like way of doing this?



I was going to write a quick C function to do it for numeric arrays, but a 
general solution for numeric csvs seems worth having for real world problems. 
One can always use tr, cut or sed to strip out the character columns and 
process those later by the standard tables/csv means.
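
For concreteness, here is a rough sketch of the sort of C function meant above. 
It is only an illustration under stated assumptions, not a tested loader: the 
names parse_int_row and read_int_csv are made up for this example, every field 
is assumed to be a plain base-10 integer that fits in 64 bits, there is no 
quoting, ncols is at least 1, and no line is longer than 4095 characters. The 
result is a single flat, row-major block of int64_t rather than one box per 
atom.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one line of comma-separated integers into out[0..ncols-1].
   Returns 0 on success, -1 on a non-integer field, an out-of-range
   value, or the wrong number of fields. */
static int parse_int_row(const char *line, size_t ncols, int64_t *out)
{
    const char *p = line;
    for (size_t c = 0; c < ncols; c++) {
        char *end;
        errno = 0;
        long long v = strtoll(p, &end, 10);
        if (end == p || errno == ERANGE)
            return -1;                  /* not a number, or overflow */
        out[c] = (int64_t)v;
        p = end;
        if (c + 1 < ncols) {
            if (*p != ',')
                return -1;              /* too few fields */
            p++;
        }
    }
    while (*p == '\r' || *p == '\n')    /* tolerate the line ending */
        p++;
    return *p == '\0' ? 0 : -1;         /* anything left means extra fields */
}

/* Read an all-integer csv from fp into one malloc'd row-major array.
   skip_header != 0 discards the first line. On success returns the
   array and stores the row count in *nrows; on a parse or allocation
   failure returns NULL. The caller frees the result. */
int64_t *read_int_csv(FILE *fp, size_t ncols, int skip_header, size_t *nrows)
{
    char line[4096];                    /* assumption: lines fit in here */
    size_t cap = 1024, n = 0;
    int64_t *data = malloc(cap * ncols * sizeof *data);
    if (!data)
        return NULL;
    *nrows = 0;
    if (skip_header && !fgets(line, sizeof line, fp))
        return data;                    /* empty file: zero rows */
    while (fgets(line, sizeof line, fp)) {
        if (line[0] == '\n' || line[0] == '\r')
            continue;                   /* skip blank lines */
        if (n == cap) {                 /* grow by doubling */
            cap *= 2;
            int64_t *tmp = realloc(data, cap * ncols * sizeof *data);
            if (!tmp) {
                free(data);
                return NULL;
            }
            data = tmp;
        }
        if (parse_int_row(line, ncols, data + n * ncols) != 0) {
            free(data);
            return NULL;
        }
        n++;
    }
    *nrows = n;
    return data;
}
```

At 8 bytes per value, 100M rows of 3 columns comes to about 2.4 GB in this 
layout, so it still needs a 64-bit process with room to spare. Getting the 
array into J would presumably go through the cd foreign-function interface; 
that wiring is not shown here.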


It's a bit frustrating: J is outrageously good at the sort of task I needed 
to do on the ints (I. basically), but the overhead of loading the data was 
time consuming.

-Scott
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
