On Wed, 2004-11-10 at 23:13 -0500, Uri Guttman wrote:

> sorry i missed the meeting but i have a nasty cold i am fighting off.

I just got over that one :-(

As for transposing a matrix that won't fit in ram... that's easy. Mail
it to someone who has more ram.

Aaron "Gordian Knot" Sherman, at your service ;-)

Seriously, while mmap is ideal in C, in Perl I would just build an array
of tell()s, one for the start of each line in the file, and then walk
through the lines, storing for each line the offset just past the last
delimiter that I'd seen. That way, every read looks like this:

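        # XXX WARNING, untested pseudo-code
        # @tells:   byte offset of the start of each line
        # @offsets: bytes of each line already consumed (start at 0),
        #           or -1 once the line has no fields left
        next if $offsets[$thisline] == -1;
        # sysseek, not seek: sysread bypasses the buffered I/O layer
        sysseek(INPUT,$tells[$thisline]+$offsets[$thisline],0);
        $bigbuf = '';
        while (1) {
                # sysread returns 0 at EOF and undef on error
                if (!sysread(INPUT,$buf,$blocksz)) {
                        $offsets[$thisline] = -1; # Ran off the end of the file
                        last;
                }
                # Proper CSV parsing left as an exercise...
                if ($buf =~ /[,\n]/) {
                        $bigbuf .= $`; # Keep what came before the delimiter
                        $sep = $&;
                        if ($sep eq "\n") {
                                $offsets[$thisline] = -1; # Done here
                        } else {
                                # Skip this field plus its trailing comma
                                $offsets[$thisline] += length($bigbuf)+1;
                        }
                        last;
                } else {
                        $bigbuf .= $buf;
                }
        }
        handle_one_field($bigbuf);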
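
The fragment above takes @tells and @offsets as given. Here is a rough
sketch of how the index could be built and how the passes might be driven,
so that each pass over the lines emits one row of the transposed matrix.
The sub names (build_line_index, next_field, transpose_csv) are made up for
illustration, the default block size is arbitrary, and the code assumes a
rectangular matrix with plain comma-separated fields (no quoting, no
embedded newlines). Treat it as a starting point rather than a finished
tool.

```perl
use strict;
use warnings;

# Record the byte offset at which every line of the file starts.
sub build_line_index {
    my ($fh) = @_;
    my @tells;
    until (eof($fh)) {
        push @tells, tell($fh);
        readline($fh);              # Skip to the start of the next line
    }
    return @tells;
}

# Return the next unread field of line $n, or undef once the line is used up.
# Naive split on ',' and "\n"; quoted CSV fields are NOT handled.
sub next_field {
    my ($fh, $tells, $offsets, $n, $blocksz) = @_;
    return undef if $offsets->[$n] == -1;
    sysseek($fh, $tells->[$n] + $offsets->[$n], 0) or die "sysseek: $!";
    my $field = '';
    while (1) {
        my $got = sysread($fh, my $buf, $blocksz);
        die "sysread: $!" unless defined $got;
        if ($got == 0) {            # EOF with no trailing newline
            $offsets->[$n] = -1;
            last;
        }
        if ($buf =~ /[,\n]/) {
            # $-[0] is the position of the delimiter inside this block
            $field .= substr($buf, 0, $-[0]);
            if (substr($buf, $-[0], 1) eq "\n") {
                $offsets->[$n] = -1;                    # Line finished
            } else {
                $offsets->[$n] += length($field) + 1;   # Field plus comma
            }
            last;
        }
        $field .= $buf;             # No delimiter yet, keep reading
    }
    return $field;
}

# Write the transpose of the file at $path to the handle $out.
# Each pass over all the lines produces one output row.
sub transpose_csv {
    my ($path, $out, $blocksz) = @_;
    $blocksz ||= 4096;

    # Index with buffered I/O on one handle, then read with sysread on
    # another, so the two I/O styles never share a handle.
    open(my $idx, '<:raw', $path) or die "open $path: $!";
    my @tells = build_line_index($idx);
    close($idx);

    open(my $in, '<:raw', $path) or die "open $path: $!";
    my @offsets = (0) x @tells;
    while (1) {
        my @row;
        for my $n (0 .. $#tells) {
            my $field = next_field($in, \@tells, \@offsets, $n, $blocksz);
            push @row, $field if defined $field;
        }
        last unless @row;           # Every line is exhausted
        print {$out} join(',', @row), "\n";
    }
    close($in);
}
```

Calling transpose_csv('matrix.csv', \*STDOUT) would print the transposed
file, with 'matrix.csv' standing in for whatever the real input is. Memory
use is two integers per input line plus one output row; the price is one
seek per field, which is where the kernel's file cache earns its keep.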

Let the kernel file buffer do your heavy lifting for you.

-- 

_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm
