On Wed, 13 Feb 2002, James Edward Gray II wrote:

> I know Perl frees you from a lot of memory concerns

Well, you don't have to fiddle around with pointers & such, but that
doesn't mean that you're really "free from memory concerns". The issues
are still there; they just aren't shoved in your face as they are in C. 

> I have a script that reads in these very large text files, does
> calculations on the data, and spits out some output. 

Does the whole file have to be read in before producing any of the
calculations or output? Can you get away with reading in chunks at a time? 
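
If the calculations can be kept as running totals, the difference between
slurping and streaming is the whole ballgame. Something like this -- completely
untested, and assuming a made-up file name and a made-up format of one number
per line, so adjust for your real data:

    use strict;

    # Streaming: only the current line is ever held in memory,
    # and the "result" is just a couple of running scalars.
    my ($sum, $count) = (0, 0);
    open(FILE, "bigfile.txt") or die "Can't open bigfile.txt: $!";
    while (my $line = <FILE>) {
        chomp $line;
        $sum += $line;
        $count++;
    }
    close FILE;
    print "Average: ", $sum / $count, "\n" if $count;

    # Slurping: the whole file sits in @lines at once. Convenient,
    # but with a very large file this is where the memory goes.
    # open(FILE, "bigfile.txt") or die "Can't open bigfile.txt: $!";
    # my @lines = <FILE>;
    # close FILE;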

It isn't exactly an answer to your question (well, maybe it is...), but it
might be instructive to examine the two main strategies for parsing XML
documents: SAX and DOM. The SAX model is event driven: it hands you document
data as it is scanned from the file, storing nothing and letting you modify
nothing. The DOM model, by contrast, slurps everything into
an internal data structure that can be modified at will -- but can't begin
working until the whole file has been scanned. 

Either of these models could be appropriate in a given circumstance, but
in your case, where the file sizes are big and the available resources are
unknown, I would think that the most efficient approach would be more like
SAX than like DOM. This isn't necessarily the case if, for example, you
need to get the whole data structure into memory at once (some problems
just can't be subdivided), but it's worth considering alternatives. 
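
If your data ever does happen to be XML, the usual SAX-ish way to do this in
Perl is with handler callbacks via the XML::Parser module. A rough, untested
sketch -- the file name and the 'amount' tag are made up for the example:

    use strict;
    use XML::Parser;

    my $total     = 0;
    my $in_amount = 0;

    # Handlers fire as the parser scans the file; nothing is kept
    # except what we choose to accumulate ourselves.
    my $parser = XML::Parser->new(
        Handlers => {
            Start => \&start_element,
            End   => \&end_element,
            Char  => \&char_data,
        }
    );
    $parser->parsefile("records.xml");
    print "Total: $total\n";

    # Called for each opening tag as it is scanned.
    sub start_element {
        my ($expat, $element, %attrs) = @_;
        $in_amount = 1 if $element eq 'amount';
    }

    # Called for each closing tag.
    sub end_element {
        my ($expat, $element) = @_;
        $in_amount = 0 if $element eq 'amount';
    }

    # Called for text between tags; only a running total survives.
    sub char_data {
        my ($expat, $text) = @_;
        $total += $text if $in_amount and $text =~ /\S/;
    }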

As for probing how much RAM is available, I can't answer that one for you,
but I suspect that even if you can get that piece of information it won't
necessarily be very helpful, other than to confirm that processing is
going to take a long time some of the time... 


--
Chris Devers

"Okay, Gene... so, -1 x -1 should equal what?" "A South American!"    
[....] "no human can understand the Timecube" and Gene responded
 without missing a beat "Yeah.  I'm not human."
