Scott, I'd be interested in hearing the results and any timings from your final
approach. I've worked in this area too, after finding the addons to be
slower than I wanted. I had some success using memory-mapped files on CSVs
by making every line a fixed width: pad all lines with extra spaces to the
length of the longest line found.
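
A minimal sketch of the padding idea, for illustration only. It works on an
in-memory string rather than a real file, and the names txt, mat and fixed are
made up for the example. It assumes LF line ends and a trailing LF on the last
line.

```j
NB. Sketch only: pad each line of LF-terminated text to the longest line.
txt =: '1,2,3',LF,'40,50,60',LF,'7,8,9',LF
mat =: ];._2 txt        NB. cut at each LF; shorter rows are padded with spaces
$ mat                   NB. 3 8
fixed =: , mat ,. LF    NB. put LF back on every row; each record is now 9 bytes
# fixed                 NB. 27
```

With every record the same length, row i starts at byte offset i times the
record length, which is what makes the file convenient to map.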

I've also had excellent performance using memory-mapped files (JINT)
on 5M+ int64s. Access is basically instantaneous.
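
A rough sketch of mapping a file of raw integers, assuming 64-bit J with the
jmf addon installed. The scratch file name and the noun name ints are
hypothetical, and the five-element file just stands in for a large one.

```j
require 'jmf'
fn =: jpath '~temp/ints.bin'       NB. hypothetical scratch file
(3 (3!:4) i. 5) 1!:2 <fn           NB. write five raw 8-byte integers
JINT map_jmf_ 'ints';fn            NB. map the file onto the noun ints
total =: +/ ints                   NB. use it like any integer list
unmap_jmf_ 'ints'                  NB. release the mapping
```

Mapping does not read the file up front, so the cost of opening it should not
depend much on the number of integers.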

On Mon, Nov 11, 2013 at 9:02 PM, bill lam wrote:

> If you are sure they are well formed, numeric only, and have no missing items,
> then you do not need that addon, e.g.
>
>    a=: 0 : 0
> 1,2,3
> 4,5,6
> )
>    a
> 1,2,3
> 4,5,6
>
>    ".;._2 a
> 1 2 3
> 4 5 6
>
> Beware: if it contains negative numbers, you might need to replace
> the - with _ first.
>
> Mon, 11 Nov 2013, Scott Locklin wrote:
> > Pascal wrote:
> >
> > >Can you be more specific about the code?
> > >I assume that you looked into cut ;. ?
> > >an alternative to boxing might be to strip out the commas, and then run
> > >0&". on the string.  Not sure that is faster though.
> >
> > I'm sorry for being unclear. I did something like this:
> >
> > loadd 'tables/csv'
> > datloc=: '/path/to/csvs/'
> >
> > ip=: ".> readcsv datloc,'chunk1.csv'
> > ip=: ip, ".> readcsv datloc,'chunk2.csv'
> > (etc)
> >
> > ip is fairly small, but the boxed array read in by readcsv is bloody
> > enormous. readcsv is also pretty slow. I solved the problem with chunking
> > the csvs, but waiting around for several minutes seemed a very un J-like
> > experience. By comparison, the binsearch I needed to do took a fraction of
> > a second (it took almost a half hour in R, which has no native binsearch).
> >
> > -Scott, who will definitely be taking Eric up on his kind offer
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>