Thanks for the links. I tried the fapplylines adverb, but the computer ground along for 30 minutes or so before I pulled the plug. By then it was using about 10 GB of (mainly virtual) memory. There are 40M lines in my file.
I will use the Unix split command to make lots of little files and then run (myverb fapplylines)&.> fname over them to solve the problem.

Thanks,
Matthew.

2009/8/26 Sherlock, Ric <[email protected]>:
>> From: Raul Miller
>>
>> On Tue, Aug 25, 2009 at 6:46 PM, Matthew Brand<[email protected]> wrote:
>> > I have a 5GB text file with data which I want to process one line at a
>> > time. Is there any way to stream through the file one line at a time
>> > without reading in the whole thing and splitting it into lines? I
>> > could break the file into smaller files and
>> > proc...@split_into_lines@open them, but I wonder if there is an easier
>> > way?
>>
>> The way most languages implement this involves reading in
>> blocks, finding line boundaries (lines will often span blocks),
>> assembling lines and making them available one at a time
>> to the program.
>>
>> I do not know if anyone has bothered making this kind of
>> facility for J, but J does support reading blocks. See
>> "Indexed Read" at http://www.jsoftware.com/help/dictionary/dx001.htm
>
> See the verb freadblock and the adverb fapplylines in the files script.
>    open '~system/main/files.ijs'
>    open 'files'
>
> See also this post/thread
> http://www.jsoftware.com/pipermail/programming/2007-May/006694.html
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
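
For reference, the block-reading approach Raul describes in the quoted text (read a block, find the line boundaries, carry any partial line over to the next block) can be sketched roughly as below. This is an illustration in Python, not J, and it is not how freadblock or fapplylines are implemented; the name iter_lines and the 1 MB default block size are my own assumptions.

```python
import io

def iter_lines(stream, block_size=1 << 20):
    """Yield lines (without the trailing newline) from a binary stream,
    reading one fixed-size block at a time.

    iter_lines and block_size are hypothetical names chosen for this sketch.
    """
    pending = b""  # partial line carried over from the previous block
    while True:
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        # Everything before the last newline is a complete line; the
        # remainder may continue in the next block.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line
    if pending:
        # Final line that had no trailing newline.
        yield pending

# Tiny in-memory demo with a deliberately small block size so that
# lines span block boundaries.
demo = io.BytesIO(b"ab\ncd\nef")
print(list(iter_lines(demo, block_size=2)))  # [b'ab', b'cd', b'ef']
```

Only one block plus at most one partial line is held in memory at a time, so memory use stays bounded however large the file is. On a real file the stream would come from open(path, "rb"), where path is a placeholder; Python's own file iteration already does the equivalent, so the sketch is only meant to show the idea.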
