Vishal Vasan <[EMAIL PROTECTED]> asked:
> I am trying to sort a file with records. I have tried to 
> slurp all the records present in the file to memory but the 
> memory has run out.

Read your input file in chunks that are small enough to fit
in memory. Sort each chunk as you go and save it to its own
temporary file. Once you have a set of sorted temp files,
merge them into a new output file.
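
The chunking step could be sketched roughly like this. It is only
an illustration, not tested against your data: the sub name
split_into_sorted_chunks(), the line limit and the key callback are
all names I made up, and it assumes one record per line with a
numeric sort key.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read $infile in chunks of at most $max_lines
# lines, sort each chunk by the numeric key that $keyfn returns for a
# line, and write each sorted chunk to its own temp file.
# Returns the list of temp file names.
sub split_into_sorted_chunks {
  my( $infile, $max_lines, $keyfn ) = @_;

  open( my $in, '<', $infile )
    or die "Failed to open input file '$infile': $!";

  my( @tempfiles, @chunk );

  # sort the lines collected so far and write them to a new temp file
  my $flush = sub {
    return unless @chunk;

    my( $tmpfh, $tmpname ) =
      tempfile( 'sortchunk_XXXXXX', TMPDIR => 1, UNLINK => 0 );

    # compute each key once, sort on it, then drop it again
    print {$tmpfh} map  { $_->[1] }
                   sort { $a->[0] <=> $b->[0] }
                   map  { [ $keyfn->( $_ ), $_ ] } @chunk
      or die "Failed to write to '$tmpname': $!";

    close( $tmpfh ) or die "Failed to close '$tmpname': $!";

    push @tempfiles, $tmpname;
    @chunk = ();
  };

  while( defined( my $line = <$in> ) ){
    push @chunk, $line;
    $flush->() if @chunk >= $max_lines;
  }

  $flush->();    # write the last, possibly shorter, chunk
  close( $in );

  return @tempfiles;
}
```

You would call it along the lines of
split_into_sorted_chunks( 'records.txt', 100_000, \&my_key_function )
and pick the line limit so that one chunk fits comfortably in memory.
The temp files are not deleted automatically, so unlink them once
the merge is done.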

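Here's some sample code to do the merging. You'll have
to provide your own generate_sorting_value() to pick out
the key to sort on; the merge compares keys numerically.
@tempfiles is a list of filenames with sorted partitions
of your data, and $outfile is the name of the output file.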
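
As an illustration only, a key function might look like the sketch
below. The record layout is purely my assumption (each line starting
with a date in YYYY-MM-DD form); adapt the regex to whatever your
records really look like.

```perl
use strict;
use warnings;

# Assumed record layout (hypothetical): every line begins with an ISO
# date such as "2004-03-17 some payload". The date is returned as the
# number YYYYMMDD, so a numeric comparison sorts chronologically.
sub generate_sorting_value {
  my $line = shift;

  if( $line =~ /^(\d{4})-(\d{2})-(\d{2})/ ){
    return "$1$2$3" + 0;
  }

  die "Cannot extract a sort key from line: $line";
}
```

Dying on a line without a key is a deliberate choice here: a record
that silently sorts as 0 would end up in the wrong place without
any warning.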



use IO::File;

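# Read the next line from the filehandle "object" and compute its
# sorting value. Returns the object, or undef once the file is exhausted.
sub dateline {
  my $obj = shift;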

  if( defined( my $line = $obj->{'fh'}->getline() ) ){

    $obj->{'line'} = $line;
    $obj->{'val'} = generate_sorting_value( $line );

  } else {

    if( $obj->{'fh'}->eof() ){
      $obj->{'fh'}->close();
    } else {
      die "Error reading from '" . $obj->{'name'} . "': $!";
    }

    $obj = undef;
  }

  return $obj;
}

my @infh;

foreach my $file ( @tempfiles ){

  if( my $fh = IO::File->new( $file, 'r' ) ){

    my $obj = dateline( { 'fh' => $fh, 'name' => $file } );

    push @infh, $obj if defined $obj;

  } else {
    die "Failed to open input file '$file': $!";
  }

}

# open the output file for writing
my $outfh = IO::File->new( $outfile, 'w' )
  or die "Failed to open output file '$outfile': $!";

while( @infh >= 2 ){
  @infh = sort { $a->{'val'} <=> $b->{'val'} } @infh;

  # write the line with the lowest sorting value
  $outfh->print( $infh[0]->{'line'} );

  # discard filehandle "object" if there are no more lines pending
  shift @infh unless dateline( $infh[0] );
}

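# at most one input is left: write its pending line, then copy the rest
if( @infh ){
  $outfh->print( $infh[0]->{'line'} );

  while( defined( my $line = $infh[0]->{'fh'}->getline() ) ){
    $outfh->print( $line );
  }
}

$outfh->close() or die "Failed to close output file '$outfile': $!";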

__END__

HTH
Thomas
