Thanks for the links. I didn't use them in this particular case, but I've
bookmarked them for the future.
I used 'taketo' and 'dropto' on a 2.5G file, and I'm surprised at how fast
they are.
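For readers who have not met these two verbs, here is a minimal sketch of how they behave on a short string. It assumes the standard J library definitions, in which the left argument is the search string and the right argument is the text; the names `text` and `marker` are made up for illustration, and this is not necessarily how they were applied to the 2.5G file.

```j
NB. Hypothetical sample data, not taken from the real file.
text=: 'header stuff<data>payload'
marker=: '<data>'

NB. taketo keeps everything in text before the first occurrence of marker.
before=: marker taketo text

NB. dropto drops that same prefix, so the result starts at marker.
from=: marker dropto text

NB. The two pieces always rejoin to the original text.
echo (before , from) -: text
```

Both verbs locate the first match with E. and then take or drop a prefix, which is plausibly why they stay fast on large character arrays.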
On 26 Dec 2015, at 9:56, Devon McCormick wrote:
I've developed an adverb for working with large files:
http://code.jsoftware.com/wiki/NYCJUG/2014-05-13#Streaming_Through_Large_Files
An example of using this code can be found here:
http://code.jsoftware.com/wiki/User:Devon_McCormick/Code/largeFileVet
and the updated version of the code here:
http://code.jsoftware.com/wiki/User:Devon_McCormick/Code/WorkOnLargeFiles
A good place to start might be with an example of using a simple version of
an adverb that makes minimal assumptions about the logical structure of the
file:
http://code.jsoftware.com/wiki/User:Devon_McCormick/Code/WorkOnLargeFiles/SimpleFile
The complete code is here:
http://code.jsoftware.com/wiki/User:Devon_McCormick/Code/workOnLargeFile.ijs
On Fri, Dec 25, 2015 at 7:37 AM, Raul Miller <[email protected]> wrote:
When you want to see how J will proceed, you can set up an experiment
and use echo to show what is happening when.
That said, your "0 verb will operate on a box at a time (or rather a pair
of boxes at a time, since it's dyadic; the "0/ verb is monadic, but it
still operates on a pair of boxes at a time).
So you'll be reading in a pair of files at a time, and accumulating the
results of your myverb in your J session.
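The kind of experiment described above might look like the following sketch. Everything in it is a stand-in: `small`, `big`, `combine`, `S` and `B` are hypothetical names replacing fread, readbigfile, myverb and the real file lists, and it applies the verb dyadically with "0 to two matching lists of boxes rather than with "0/ to a single table, which is an assumption about the intended pairing.

```j
NB. Stand-in for fread: announce the call, then return the opened name.
small=: 3 : 0
echo 'reading small ', > y
> y
)

NB. Stand-in for readbigfile: same idea for the big file.
big=: 3 : 0
echo 'reading big ', > y
> y
)

NB. Stand-in for myverb: announce the pair, return a small result.
combine=: 4 : 0
echo 'combining ', x, ' with ', y
# x , y
)

S=: 's1';'s2';'s3'   NB. hypothetical small file names
B=: 'b1';'b2';'b3'   NB. hypothetical big file names

NB. Rank 0 pairs the boxes one at a time; the echoed lines show the order.
results=: S (small@[ combine big@])"0 B
```

If the echoed lines come out grouped per pair (both reads, then the combine, then the next pair), then only one big file is being read at a time, and only the small results accumulate in `results`.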
I hope this helps,
--
Raul
On Fri, Dec 25, 2015 at 4:51 AM, Ryan Eckbo <[email protected]> wrote:
I'm processing some big files on the order of 2G, extracting >= 250M of
data from each. I have to memory map them to get the data:
readbigfile=: 3 : 0
JCHAR map_jmf_ 'f';y
NB. get data from f
unmap_jmf_'f'
)
I have about 150 of these files together with matching smaller ones, and I
need to do something like this:
(fread@[ myverb readbigfile@])"0/ SmallFiles,.Bigfiles
My question is how the J runtime is going to execute this: is it going to
proceed line by line, or try to read all the big files at once? If the
former, is the memory freed right after execution? In general I don't know
how to deal with huge arrays.
Thanks for any help,
Ryan
----------------------------------------------------------------------
For information about J forums see
http://www.jsoftware.com/forums.htm
--
Devon McCormick, CFA
Quantitative Consultant