Jan Eden wrote:
I had the following task: Open a file, read it and merge all pairs
of lines containing a certain number of tabs. Example:

Blablabla
abc cab bca
123 453 756
Blablabla
Blablabla

Here, lines 2 and 3 should be merged, while the other lines
should remain untouched. Expected result:

Blablabla
abc 123
cab 453
bca 756
Blablabla
Blablabla

While I managed to get this done, I doubt that I found a good
(fast) solution. So before I move on to the large files which have
to be processed, I'd like to get your input for a better solution.

This is how I did it:

#!/usr/bin/perl -w

use strict;

my (@merge_one, @merge_two, @merge_three);

open (FILE, "file.txt") or die "Cannot open the input file";

my @input_file = <FILE>;

foreach (0..$#input_file) {
    chomp $input_file[$_];
    my $next = $_+ 1;
    chomp $input_file[$next] if $input_file[$next];
    if ($input_file[$_] =~ m/\t/ && $input_file[$next] =~ m/\t/) {
        @merge_one = split /\t/, $input_file[$_];
        @merge_two = split /\t/, $input_file[$next];
        for (0..$#merge_two) {
            $merge_three[$_] = $merge_one[$_] . " " . $merge_two[$_];
        }
        $input_file[$_] = join "\n", @merge_three;
        print $input_file[$_], "\n\n";
        $input_file[$next] = '';
        (@merge_one, @merge_two, @merge_three) = ();
    }
}

my $output = join "\n", @input_file;

open (OUTFILE, ">input_file2.txt");
print OUTFILE $output;

Not bad IMO.

One improvement would be to process the file line by line instead of
reading all of it into memory. The example below treats a line as a
merge candidate only if it contains exactly two tabs:


    open my $infile, 'file.txt' or die "Can't open ... $!";
    open my $outfile, '> input_file2.txt' or die "Can't open ... $!";
    my @pairs;

    sub merge {
        my $ref = shift;
        my @merged;
        while ( my $line = shift @$ref ) {
            chomp $line;
            my @tmp = split /\t/, $line;
            push @{ $merged[$_] }, $tmp[$_] for 0..$#tmp;
        }
        @merged
    }

    while (<$infile>) {
        if ( tr/\t// == 2 and @pairs <= 1 ) {
            push @pairs, $_;
        } elsif ( @pairs == 1 ) {
            print $outfile shift @pairs;
            print $outfile $_;
        } else {
            print $outfile "@$_\n" for merge( \@pairs );
            print $outfile $_;
        }
    }
    print $outfile "@$_\n" for merge( \@pairs );
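
To try the idea without touching any files, the same column-wise merge can be sketched on in-memory data. This is only a sketch under a few assumptions: merge_lines and merge_pairs are made-up names, the tab count is passed in as a parameter, and a tab line that has no partner is passed through unchanged, which may or may not be what you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split each line on tabs and regroup the fields by
# column, returning one space-joined output line per column.
sub merge_lines {
    my @lines = @_;
    my @columns;
    for my $line (@lines) {
        chomp $line;
        my @fields = split /\t/, $line;
        push @{ $columns[$_] }, $fields[$_] for 0 .. $#fields;
    }
    return map { join( ' ', @$_ ) . "\n" } @columns;
}

# Hypothetical helper: merge each adjacent pair of lines that contain
# exactly $tabs tab characters; everything else is passed through as is.
sub merge_pairs {
    my ( $tabs, @input ) = @_;
    my ( @output, $held );
    for my $line (@input) {
        # tr/\t// with an empty replacement list only counts the tabs
        my $is_candidate = ( $line =~ tr/\t// ) == $tabs;
        if ( $is_candidate and defined $held ) {
            push @output, merge_lines( $held, $line );
            undef $held;
        } elsif ($is_candidate) {
            $held = $line;    # wait for a possible partner line
        } else {
            push @output, $held if defined $held;    # unpaired candidate
            undef $held;
            push @output, $line;
        }
    }
    push @output, $held if defined $held;
    return @output;
}

# Sample data from the original question, with real tabs between fields
my @sample = (
    "Blablabla\n",
    "abc\tcab\tbca\n",
    "123\t453\t756\n",
    "Blablabla\n",
    "Blablabla\n",
);
print merge_pairs( 2, @sample );
```

Since only one pending line is ever held, memory use stays constant, so the same loop body should work unchanged inside a while (<$infile>) loop on the large files.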

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
