Re: ~9M lines of data

David Reitz Mon, 14 Oct 2002 19:13:22 -0700

If you're trying to grab each entire CREATE TABLE statement, why not slurp
the file and use a regex (assuming, of course, that you have the RAM to hold
the whole dump)?  I have no idea how fast or slow this would be in your case...

Something like this:

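 --- (begin snip) ---
#!/usr/bin/perl
use strict;
use warnings;

my( $fileContents );

# Undefining $/ makes <DDL> return the whole file in a single read.
$/ = undef;

open( DDL, "infile" ) || die( "can't open infile: $!" );
$fileContents = <DDL>;
close( DDL );

open( OUT, "> outfile" ) || die( "cannot open outfile: $!" );

# NOTE -- I've used this regex before for a small project, so that's why
# there are (in this case) unneeded () matches...
# The /s flag lets . match newlines, so .*? replaces the slower (.|\s)*?.
while( $fileContents =~
       m/(CREATE\s+TABLE\s+(\w+)\s+\(\s+(.*?)\)\;)/igs )
{
    # $1 is the whole statement; separate statements with a blank line.
    print OUT "$1\n\n";
}

close( OUT );
 --- (end snip) ---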
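
If RAM does turn out to be a problem, a line-by-line variant is also possible. The following is only a minimal sketch, not something I have run against a real dump: it assumes each CREATE TABLE statement ends on a line that begins with ");" (as in the sample quoted below), and the sub name extract_create_tables and the command-line file arguments are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy every "CREATE TABLE ... );" block from the
# input handle to the output handle, one line at a time, so the whole
# dump never has to fit in memory. Returns the number of blocks copied.
sub extract_create_tables {
    my ( $in, $out ) = @_;
    my $inside = 0;    # true while we are between CREATE TABLE and ");"
    my $count  = 0;

    while ( my $line = <$in> ) {
        # A CREATE TABLE line opens a block.
        $inside = 1 if $line =~ /^\s*CREATE\s+TABLE\s/i;
        next unless $inside;

        print {$out} $line;

        # ASSUMPTION: the statement closes on a line starting with ");".
        if ( $line =~ /^\s*\);/ ) {
            print {$out} "\n";    # blank line between statements
            $inside = 0;
            $count++;
        }
    }
    return $count;
}

# Runs only when both file names are given: perl script.pl infile outfile
if ( @ARGV == 2 ) {
    open( my $in,  '<', $ARGV[0] ) || die( "can't open $ARGV[0]: $!" );
    open( my $out, '>', $ARGV[1] ) || die( "cannot open $ARGV[1]: $!" );
    my $n = extract_create_tables( $in, $out );
    close( $in );
    close( $out ) || die( "cannot close $ARGV[1]: $!" );
    print "Copied $n CREATE TABLE statements\n";
}
```

The trade-off is that this depends on the dump's line layout, whereas the slurp-and-regex version above does not care where the line breaks fall.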

> ----- Original Message -----
> From: "iudicium ferat" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Monday, October 14, 2002 12:25 PM
> Subject: ~9M lines of data
>
>
> > I am somewhat beating my head against a brick wall here - so I think "Hey!
> > This sounds like a Fun With Perl project :)"
> >
> > Here is the challenge -
> >
> > You are presented with a MySQL Schema dump that is less than 9 million rows;
> > you should read the data row by row, finding each CREATE TABLE statement,
> > and displaying the next ~50 lines INCLUDING this line - do this recursively
> > until end of file is reached.
> >
> > My brute force efforts, while slow, are generally working - I am wondering -
> > could this become MORE FUN ?
> >
> > My brute force method -
> >
> > #!/usr/local/bin/perl -w
> >
> > use strict;
> > use diagnostics;
> >
> > $|++;
> >
> > my $ctr;
> > my $flag;
> >
> > open (RFILE, "<Daily_bb551_20021410000103286.dump") or
> >     die "\n\nCan't open Schema.Raw for reading: $!\n\n";
> > open (NFILE, ">Schema.Cooked") or
> >     die "\n\nCan't open Schema.Cooked for writing: $!\n\n";
> >
> > while (<RFILE>) {
> >
> >   $ctr++;    # Count lines read...
> >   s/\n/ /;   # newline to space...
> >   s/^\s+//;  # compress leading whitespace...
> >   s/\s+$//;  # compress trailing whitespace...
> >   next unless length; # anything to process?
> >
> >   # CREATE TABLE
> >   if (/^CREATE\sTABLE\s\w+\s\(/) {
> >      print NFILE "\n\n$_\n";
> >      $flag++;
> >      next;
> >   }
> >
> >   # End of segment
> >   if (/^\#\sDumping\sdata\sfor\stable\s\'/) {
> >      --$flag;
> >      next;
> >   }
> >
> >   # Get data, while $flag
> >   if ($flag) {
> >      print NFILE " $_\n ";
> >      next;
> >   }
> >
> >   # Everything else is skipped
> >   # during natural looping...
> >
> > } # end while loop
> >
> > close (RFILE) or die "\n\nCan't close Schema.Raw: $!\n\n";
> > close (NFILE) or die "\n\nCan't close Schema.Cooked: $!\n\n";
> > print "\nProcessed $ctr lines...\n\n";
> >
> > __END__
> >
> > Example of the snapshot -
> >
> > # Table structure for table 'addressbook'
> > #
> > CREATE TABLE addressbook (
> > ..
> >   .  Get everything here ...
> >     .
> >   PRIMARY KEY (pk1,sos_id_pk2),
> >   KEY addrbk_users_fk_i (users_pk1,users_sos_id_pk2)
> > );
> >
> > #
> > # Dumping data for table 'addressbook'
> >
> > The line above denotes end of this segment;
> > skip all data until next segment area...
> >
> >
> > Any helpful mumblings would be appreciated :)
> >
> > Thx!
> > -Bill-  :]
> > _Sx____________________
> >   ('>    iudicium ferat
> >   //\   Have Computer -
> >   v_/_    Will Hack...
> >
> >
>
