At 7:53 PM +0000 1/22/01, John Delacour wrote:
>Reading from a 7 Mb file, I'm getting an "out of memory!" error
>after succeeding in printing some hits using the following script
>
>
>$tmp = $ENV{TMPDIR};
>mkdir $tmp, 0;
>$fout = $tmp."temp.out";
>open FOUT, ">$fout";
>$fin = 'BU2:Gutenberg Folder:gutenberg_html.tar';
>open FIN, $fin;
>while (<FIN>) {
>/akespear/ and print FOUT;
> }
>
>What am I doing wrong? I thought that doing things this way I was
>only putting one line in memory and the hits are printed as lines
>in the out file just as I wish. The in file is a Unix tar file.
The definition of "line" is determined by whatever is in $/, which is
usually "\015" on a Mac.  If the tar file was produced on a Unix
machine, the end-of-line character is almost certainly "\012", so
eventually you'll run past whatever \015's the tar format happens to
put in the file and try to read a huge chunk of it as a single
"line" -- hence the "out of memory!" error.
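One fix, if you know the input really does use Unix line endings, is to set $/ explicitly before reading.  A minimal sketch (the filenames here are hypothetical):

```perl
# Scan a Unix-produced file line by line on MacPerl, where $/
# defaults to "\015".  Setting $/ to "\012" makes <FIN> return one
# LF-terminated line at a time instead of one huge chunk.
local $/ = "\012";   # Unix end-of-line

open FIN,  'gutenberg_html.tar' or die "open input: $!";
open FOUT, '>temp.out'          or die "open output: $!";
while (<FIN>) {
    print FOUT if /akespear/;
}
close FIN;
close FOUT;
```

Localizing $/ keeps the change from leaking into the rest of the program.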
tar files aren't really designed to be read this way. You'll be much
happier extracting the individual files in the archive and reading
them directly one by one. A recent Archive::Tar can do the
extraction without reading everything into memory first, or you can
use something like Stuffit Expander.
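With Archive::Tar you can walk the members of the archive and search each one's contents directly; a sketch along these lines (the archive name is hypothetical, and the exact API has varied between Archive::Tar versions):

```perl
use Archive::Tar;

# Read the archive's table of contents, then search each member.
my $tar = Archive::Tar->new('gutenberg_html.tar')
    or die "can't read archive";

for my $file ($tar->list_files) {
    my $content = $tar->get_content($file);
    next unless defined $content;
    # Members came from Unix, so split on "\012".
    for my $line (split /\012/, $content) {
        print "$file: $line\n" if $line =~ /akespear/;
    }
}
```

This searches one member at a time rather than treating the whole tar file as a single stream of "lines".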
>JD
--
Paul Schinder
[EMAIL PROTECTED]