Hi -
I'm processing 9 GB of CSV files (the biggest file is 220 MB or so). I'm not
sure if it's because of the size of the files or the code I've written to keep
track of the domain objects I'm interested in, but I'm getting out-of-memory
errors & crashes in Pharo 3 on Mac with the latest VM. I haven't checked other
VMs.
I'm going to profile my own code and attempt to split the files manually for
now to see what else it could be.
Right now I'm doing something similar to:
| file reader |
file := '/path/to/file/myfile.csv' asFileReference readStream.
reader := NeoCSVReader on: file.
reader
    recordClass: MyClass;
    skipHeader;
    addField: #myField:;
    ....
reader do: [ :eachRecord |
    self seeIfRecordIsInterestingAndIfSoKeepIt: eachRecord ].
file close.
Is there a facility in NeoCSVReader to read a file in batches (e.g. 1000 lines
at a time), or an easy way to do that?
Thanks
Paul