I repeated your test at a much larger size, dat1=: 1e6 # dat, and
the memory usage is 36x the byte size of the csv.
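
For reference, a minimal sketch of that measurement, reusing the dat
from your message below (dat1 and d1 are just illustrative names):

   load 'tables/csv'
   dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23
   dat1=: 1e6 # dat                 NB. copy each row a million times
   d1=: makecsv dat1
   'time space'=: timespacex 'fixcsv d1'
   space % # d1                     NB. bytes used per csv byte, about 36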
I think this is reasonable for J, because the parser uses several integer
arrays of the same length as the csv character array. Each integer is
8 bytes long, so the total byte size of just four such integer arrays is
already 32x the byte size of the csv.
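
You can see the 8x per array directly; a small check (d and i here are
hypothetical names, just for illustration):

   d=: 1e6 $ 'a,b,c,'         NB. a 1e6-character csv-like string
   i=: i. # d                 NB. one integer array of the same length
   (7!:5 <'i') % 7!:5 <'d'    NB. close to 8: each integer cell is 8 bytes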

I don't think this is a bug in J. If you are concerned about memory
efficiency, you should do it in C. Put the other way: if efficient csv
handling could be done in J script, then the special csv code in Jd
would not be needed.


On Tue, May 5, 2020 at 11:55 AM Aaron Ash <[email protected]> wrote:

> Hi,
>
> I've noticed that the tables/dsv addon seems to have an extremely high
> memory growth factor when processing csv data:
>
> load 'tables/csv'
> dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23
> d=: makecsv dat
> # d
> NB. 45 chars long
> timespacex 'fixcsv d'
> NB. 2.28e_5 4864
> 4864 % 45
> NB. 108.089 factor of memory growth
>
> This makes loading many datasets effectively impossible even on
> reasonably specced machines.
>
> A 1GB csv file would then require about 108GB of memory to load, which
> seems extreme to the point where I would consider this a bug.
>
> Someone on IRC mentioned that larger datasets should generally be
> loaded into Jd, and that's fair enough, but I would still expect to be
> able to load csv data reasonably quickly and memory-efficiently.
>
> Is this a bug? Is there a better library to use for csv data?
>
> Cheers,
>
> Aaron.