Link should be http://www.jsoftware.com/jwiki/Scripts/Working%20with%20Big%20Files
R.E. Boss

> -----Original Message-----
> From: [email protected] [mailto:programming-[email protected]] On Behalf Of Devon McCormick
> Sent: Thursday, 27 August 2009 3:56
> To: Programming forum
> Subject: Re: [Jprogramming] streaming through a large text file
>
> These could be made to work on files >4GB using the bigfiles code (Windows
> only) but they would have to be re-written to do that. You'd have to use
> "bixread" instead of 1!:11 and deal with extended integers - see
> http://www.jsoftware.com/jwiki/Scripts/Working with Big Files for more on
> this if you're interested.
>
> On Wed, Aug 26, 2009 at 6:42 PM, Sherlock, Ric <[email protected]> wrote:
>
> > Is the reason that fapplylines & freadblock don't work on files >4GB
> > because a 32bit system can't represent the index into the file as a
> > 32bit integer?
> > In other words they may well work OK on a 64bit system?
> >
> > I think bigfiles.ijs is Windows only? If so it would be an alternative
> > if using a 32bit Windows system, but it sounds like Matthew is on Linux.
> >
> > > From: Don Guinn
> > >
> > > Use bigfiles.ijs
> > >
> > > On Wed, Aug 26, 2009 at 4:09 PM, Devon McCormick wrote:
> > >
> > > > I thought I'd try this code but it doesn't work with very large
> > > > files (>4 GB).
> > > >
> > > > On Wed, Aug 26, 2009 at 11:46 AM, R.E. Boss wrote:
> > > >
> > > > > > Chris Burke wrote:
> > > > > > > Matthew Brand wrote:
> > > > > > > Thanks for the links. I tried the fapplylines adverb but the
> > > > > > > computer grinds along for 30 minutes or so before I pulled the
> > > > > > > plug. It ends up using 10Gb of (mainly virtual) memory. There
> > > > > > > are 40M lines in my file.
> > > > > > >
> > > > > > > I will use the unix split command to make lots of little files
> > > > > > > and (myverb fapplylines)&.> fname to solve the problem.
> > > > > >
> > > > > > There should be little difference between processing lots of
> > > > > > small files, and one big file in chunks.
> > > > > >
> > > > > > What processing is being done? What result is being accumulated?
> > > > > >
> > > > > > Why not test on a small file first and find out what is taking
> > > > > > time - and only then try on the full file?
> > > > >
> > > > > My guess is we can improve the efficiency of your code by at least
> > > > > a factor 2 (= Hui's constant).
> >
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>
> --
> Devon McCormick, CFA
> ^me^ at acm.org is my preferred e-mail
