The "getlines" functionality is now part of the "files" script, under the name "freadblock".

  [Jprogramming] Scanning a large file
  [EMAIL PROTECTED] Tue, 16 May 2006
  http://www.jsoftware.com/pipermail/programming/2006-May/002277.html

  Jforum: Appeal for efficient way of handling csv files
  http://www.jsoftware.com/pipermail/general/2001-February/005259.html

Also there is fapplylines

   R=: ''
   3 : 'R=: R,0".y' fapplylines 'sumcol-input.txt'
   +/R
500

Or better,

   R=: 0
   3 : 'R=: R+0".y' fapplylines 'sumcol-input.txt'
   R
500

A new definition can be given with catenation,

   +/ 0&". fcatlines 'sumcol-input.txt'
500

Further, it can be made to accept stdin, but stdin can only be read
in its entirety, so there is no special processing beyond a direct read.
In a stand-alone script:

echo +/ 0&". ;._2 CR -.~ LF ,~^:(~: {:) (1!:1) 3
exit ''

> jconsole test.ijs < sumcol-input.txt
500

Also:

fcatall=: (;._2)( @ (CR -.~ LF ,~^:(~: {:) 1!:1) )

echo +/ 0&". fcatall 3
exit ''


NB. =========================================================
fcatlines=: 1 : 0
0 u fcatlines y
:
y=. 8 u: y
s=. 1!:4 <y
if. s = _1 do. return. end.
p=. 0
dat=. ''
r=. ''
while. p < s do.
  b=. 1e6 <. s-p
  dat=. dat, 1!:11 y;p,b
  p=. p + b
  if. p = s do.
    len=. #dat=. dat, LF -. {:dat
  elseif. (#dat) < len=. 1 + dat i:LF do.
    'file not in LF-delimited lines' 13!:8[3
  end.
  if. x do.
    r=. r,u ;.2 len {. dat
  else.
    r=. r,u ;._2 CR -.~ len {. dat
  end.
  dat=. len }. dat
end.
r
)


--- Joey K Tuttle <[EMAIL PROTECTED]> wrote:

> But about line at a time processing -- I have, on occasion, wanted
> to process lines in a file. A utility I wrote to do that is:
>
> getlines =: 3 : 0
> 100000 getlines y    NB. Default BS is 100,000 bytes
> :
> bs =. x
> fs =. fsize fn =. > 0{y
> fl =. bs <. fs -fp =. > 1{y
> buf =. ir fn;fp,fl
> if. (fs = fp =. fp + fl) do. fp =. _1 end.
> drop =. (<:#buf)-buf i: NL
> if. ((drop ~: 0) *. fp = _1 ) do. echo '** Unexpected EOF **' end.
> fp =. _1 >. fp - drop
> fn;fp;buf }.~ -drop
> )
>
>
> NB. The right argument is 3 boxed things
> NB. 'filename' ; file_pointer ; line_buffer
> NB. So, to use it on the file above and limit the input to 50
> NB. bytes (you can see why this is quite silly!)
>
>    ] work =: 50 getlines 'sumcol-input.txt';0;''
> +----------------+--+------------------------------------------------+
> |sumcol-input.txt|48|276 498 -981 770 -401 702 966 950 -853 -53 -293 |
> +----------------+--+------------------------------------------------+
>    ] work =: 50 getlines work
> +----------------+--+-------------------------------------------------+
> |sumcol-input.txt|97|604 288 892 -697 204 96 408 880 -7 -817 422 -261 |
> +----------------+--+-------------------------------------------------+
>    ] work =: 50 getlines work
> +----------------+---+-------------------------------------------------+
> |sumcol-input.txt|146|-485 -77 826 184 864 -751 626 812 -369 -353 -371 |
> +----------------+---+-------------------------------------------------+
>
> NB. Eventually, we get to the following results - and you
> NB. can see that in an iteration we would use 2{work for
> NB. our calculation and checking the file pointer to see
> NB. if EOF (indicated by _1) had come along yet ...
>
>    ] work =: 50 getlines work
> +----------------+----+-----------------------------------------------+
> |sumcol-input.txt|4362|338 248 494 130 404 358 600 -639 -125 -33 -965 |
> +----------------+----+-----------------------------------------------+
>    ] work =: 50 getlines work
> +----------------+--+-------------------------------+
> |sumcol-input.txt|_1|752 474 -731 758 -573 4 38 264 |
> +----------------+--+-------------------------------+
>
> NB. Clearly this isn't a "j way" to do things. The default
> NB. buffer size of 100000 is something of a reality check.
> NB. That size buffer will work through a very large file
> NB. about as fast as 1000000 byte chunks. But there is a
> NB. big overhead if you have very small buffers (or one
> NB. line at a time.)
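
For completeness, the iteration described in the quoted comments (use 2{work for the calculation, stop when the file pointer becomes _1) might be sketched as follows. This is only a sketch, not part of the quoted utility: it assumes getlines has been loaded as quoted, and the definitions of fsize, ir and NL below are guesses at helpers that the quoted message uses but does not show. The name sumlines is made up for illustration.

```j
NB. Assumed helpers, not shown in the quoted message:
fsize =: 1!:4 @ <     NB. size in bytes of the file named by an unboxed string
ir    =: 1!:11        NB. indexed read:  ir name;start,length
NL    =: LF           NB. line terminator that getlines searches for

NB. sumlines (hypothetical name): sum every number in an LF-delimited file,
NB. reading it one block of whole lines at a time with getlines.
sumlines =: 3 : 0
work  =. y ; 0 ; ''         NB. 'filename' ; file_pointer ; line_buffer
total =. 0
while. 1 do.
  work  =. getlines work    NB. monadic call, so the default 100000-byte buffer
  total =. total + +/ , 0&". ;._2 CR -.~ 2 {:: work
  if. _1 = 1 {:: work do. break. end.    NB. _1 marks EOF
end.
total
)
```

Calling sumlines 'sumcol-input.txt' gives a result that can be compared against +/ 0&". fcatlines 'sumcol-input.txt' above. As with fcatlines, the file is expected to end with LF.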
