This should do:

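#!/usr/bin/perl

use strict;
use warnings;

# The file data is decoded as UTF-8, so encode it again on output.
binmode STDOUT, ':encoding(UTF-8)';

# Avoid naming lexicals $a and $b: those are the variables sort uses.
open my $fh_a, '<:encoding(UTF-8)', 'a' or die "Unable to open a: $!";
open my $fh_b, '<:encoding(UTF-8)', 'b' or die "Unable to open b: $!";

# Remember the first field of every line in file a.
my %pair;

while ( my $line = <$fh_a> ) {
  my @line = split(" ", $line);
  next unless @line;    # skip blank lines
  $pair{$line[0]} = 1;
}

# Print each line of file b whose first field was not seen in file a.
while ( my $line = <$fh_b> ) {
  my @line = split(" ", $line);
  next if @line && $pair{$line[0]};
  print $line;
}

close $fh_a;
close $fh_b;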

A bit simplified of course, and using strict, because as your code grows
it's the only way to stay sane.
Now there are a few issues with this, the main one being that the lookup
table built from the first file is held entirely in memory, so as your
files grow you might run into trouble with memory usage. In that case you
might want to process that file in chunks rather than loading every key at
once. Also keep in mind that every action (like chomp) takes time, time
that in this case is totally not needed, so stripping those pointless
steps out will help make things go faster, which will certainly make a
difference as the file sizes grow.
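
To illustrate the point about unneeded steps, here is a minimal sketch
that only ever looks at the first field of each line and keeps nothing in
memory except those keys. The names first_field, read_keys and
filter_lines are made up for this example, and the sample data is
invented; it runs on in-memory strings so it can be tried without the
real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first whitespace-separated field of a line, or undef for a
# blank line. The limit of 2 tells split to stop after the first field.
sub first_field {
    my ($line) = @_;
    my ($key) = split ' ', $line, 2;
    return $key;
}

# Collect the first field of every line read from $fh into a hash.
sub read_keys {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        my $key = first_field($line);
        $seen{$key} = 1 if defined $key;
    }
    return \%seen;
}

# Copy lines from $in to $out unless their first field is in %$seen.
sub filter_lines {
    my ( $seen, $in, $out ) = @_;
    while ( my $line = <$in> ) {
        my $key = first_field($line);
        next if defined $key && $seen->{$key};
        print {$out} $line;
    }
    return;
}

# Demo on invented in-memory data (not the real files from the question).
my $exclude = "id1 x\nid3 y\n";
my $data    = "id1 10\nid2 20\nid3 30\nid4 40\n";
my $result  = '';

open my $ex_fh,  '<', \$exclude or die "Unable to read exclude data: $!";
open my $in_fh,  '<', \$data    or die "Unable to read input data: $!";
open my $out_fh, '>', \$result  or die "Unable to write result: $!";

filter_lines( read_keys($ex_fh), $in_fh, $out_fh );
close $out_fh;

print $result;    # prints the id2 and id4 lines
```

No chomp is needed here because each line is printed back exactly as it
was read, newline included.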

Lastly I would suggest adding comments to the code so you can much more
easily hand this over to the next person who might want to understand what
you are doing or how you are doing it, even when you are no longer there
to answer those questions (or when, after a few years, you no longer
remember yourself).
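
Tying this back to the original script, which took its three file names
from the command line: below is a rough sketch of how that could look
with commented code and with die messages that name the failing file, as
Uri suggests further down. The helper names open_or_die and filter_files,
and the script name filter.pl in the usage line, are placeholders I made
up, not anything from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Open $path in the given mode, or die with a message naming the file.
sub open_or_die {
    my ( $mode, $path ) = @_;
    open my $fh, $mode, $path
        or die "Unable to open '$path' with mode '$mode': $!\n";
    return $fh;
}

# Write to $output_path every line of $input_path whose first field does
# not occur as a first field in $exclude_path.
sub filter_files {
    my ( $exclude_path, $input_path, $output_path ) = @_;
    my $exclude = open_or_die( '<:encoding(UTF-8)', $exclude_path );
    my $input   = open_or_die( '<:encoding(UTF-8)', $input_path );
    my $output  = open_or_die( '>:encoding(UTF-8)', $output_path );

    # Remember the first field of every line in the exclude file.
    my %seen;
    while ( my $line = <$exclude> ) {
        my ($key) = split ' ', $line;
        $seen{$key} = 1 if defined $key;
    }

    # Copy across only the lines whose first field was not seen.
    while ( my $line = <$input> ) {
        my ($key) = split ' ', $line;
        next if defined $key && $seen{$key};
        print {$output} $line;
    }

    close $output or die "Unable to close '$output_path': $!\n";
    return;
}

# Run only when arguments are given; the script name must come first:
#   perl filter.pl exclude.txt input.txt output.txt
if (@ARGV) {
    die "Usage: $0 EXCLUDE INPUT OUTPUT\n" unless @ARGV == 3;
    filter_files(@ARGV);
}
```

If the script name is left off the command line, perl treats the first
data file as the program to compile, which would match the kind of errors
quoted below.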

Regards,

Rob

On Wed, Oct 30, 2019 at 7:04 AM Uri Guttman <u...@stemsystems.com> wrote:

> On 10/29/19 10:48 PM, 刘东 wrote:
>
> Dear everyone:
> I am trying to write a Perl script to delete from the file
> carp-carp01_TKD181002053-1_1_sg.txt the content that also appears in the
> file carp01_1_both.txt, so as to get a new file based on
> carp-carp01_TKD181002053-1_1_sg.txt but excluding the content of
> carp01_1_both.txt.
> However, when I run this script, it does not work, and it displays the
> following information:
> ...
> Semicolon seems to be missing at carp01_1_both.txt line 44993.
> Number found where operator expected at carp01_1_both.txt line 44994, near
> "55659 1"
>     (Missing operator before  1?)
> Number found where operator expected at carp01_1_both.txt line 44994, near
> "ATCACG    55"
>     (Do you need to predeclare ATCACG?)
> Number found where operator expected at carp01_1_both.txt line 44994, near
> "55    116"
>     (Missing operator before     116?)
> syntax error at carp01_1_both.txt line 1, near "979:"
>
>
> it appears that perl is trying to compile one of your data files. show the
> command line where you run your script
>
>
> perl script:
> #!/usr/bin/perl -w
>
> better to use warnings than -w. also use strict is important
>
> open(NAME,"<$ARGV[0]")|| die;
> open(SECON,"<$ARGV[1]")|| die;
> open(SELEC,">$ARGV[2]")|| die;
>
> your die lines should say which file failed to open
>
> uri
>
>
