On 8/13/07, Luba Pardo <[EMAIL PROTECTED]> wrote:
> Dear list:
> I wrote a script that takes a list of ids from an input file and store these
> in an array in a pairwise-like manner (if total list is n then the array is
> (2 ^n)-n). I need to extract for each pair of ids a certain value from a
> huge file that contains the pair of ids and the value (format of the file:
> col1 col2 id1 id2 value).
> The script works but it is takes too long, specially because the second file
> is too big (more than 600 MB).
> I would like to increase the speed of the script, but I haven't quite worked
> what is the best way to do it.
> Any tip?
> Thanks in advance,
> L. Pardo
> ps, I am attaching the script
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> http://learn.perl.org/

Beyond being a mess of poorly indented code that uses C-style idioms (instead of Perl idioms), your biggest problem is that you are splitting the values in the same arrays over and over again. You should move the splitting of @a3 and @a4 outside of the nested loops at the end, so that each element is split once rather than once per comparison. Other wastes of time and space include (but are not limited to) building a file just to read it in again, and reading entire files into memory when all that is done with the array is to loop over it.

Overall, your description of the problem seems to lend itself to a hash tied to a dbm file whose keys are the combined ids from the big file (rebuild the dbm if the big file is newer than the dbm). Once you have that, your complicated loop that checks to see if the paired ids are in the big file becomes

    for my $pair (@pairs) {
        my $key = "@$pair";
        if ($ids{$key}) {
            print "$ids{$key}\n";
        } else {
            print $not_found "@$pair\n";
        }
    }

The code to build the dbm file would look something like this

    use DB_File;

    my %ids;
    tie %ids, 'DB_File', "bigfile_db"
        or die "Cannot tie bigfile_db: $!";
    while (<$bigfile>) {
        # the first two fields kept here are the ids used as the key;
        # adjust the indices to match the columns in your file
        my @fields = (split /\s+/)[3,4,5,6];
        # store this line with either configuration of the keys
        $ids{"@fields[0,1]"} = "@fields";
        $ids{"@fields[1,0]"} = "@fields";
    }
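
For the "rebuild the dbm if the big file is newer than the dbm" part, a minimal sketch of the check could look like the following. The helper name dbm_is_stale and the file names in the usage comment are made up for illustration, and it assumes the dbm lives in a single file at exactly the path you pass to tie (which is how DB_File behaves, but other dbm flavours may add suffixes).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns 1 when the dbm
# file is missing or older than the big text file it was built from,
# and 0 otherwise.
sub dbm_is_stale {
    my ($bigfile, $dbfile) = @_;

    # Element 9 of stat's return list is the last-modified time.
    my $big_mtime = (stat $bigfile)[9];
    die "cannot stat $bigfile: $!" unless defined $big_mtime;

    # No dbm yet means it has to be built.
    return 1 unless -e $dbfile;

    my $db_mtime = (stat $dbfile)[9];
    return $big_mtime > $db_mtime ? 1 : 0;
}

# Usage sketch (file names are placeholders):
#   if (dbm_is_stale("bigfile.txt", "bigfile_db")) {
#       unlink "bigfile_db";
#       # ... tie %ids and rerun the loop that fills it ...
#   }
```

That way the 600 MB file only gets parsed when it has actually changed; every other run goes straight to the tied hash lookups.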