Hi all,
I need to go through a list of newsgroup files and, in each file, delete
every post that contains quoted lines, but ONLY IF those lines actually
repeat the non-quoted text.
For now I'm working on a single file in the directory, but soon I will
have to work on a multi-file directory, and I can't get the script to
work on two files at the same time.

FIRST step:
I create control files that keep only the non-quoted lines (every line
starting with '>' is skipped):

#!/usr/bin/perl -w
use strict;
my $var = 0;
$^I = '';
@ARGV = </Users/pes/Desktop/TC4ctrl/*.txt>;
while(<>)
     {
     next if /^>/x;
         print;
         print STDOUT ++$var, " - $ARGV extracted\n" if (eof);
     }

It works. I show it only because my English isn't very good, so that
you can understand what I mean.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SECOND step:
I remove all the repeated lines from my control file using:
perl -e'while(<>){print unless $seen{$_}++}' <test.txt >test2.txt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
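One way the first two steps might be folded into a single pass, as a minimal sketch rather than a definitive version: a hash shared by all the input files drops the quoted lines and the repeated lines at once. The helper name unique_unquoted and the idea of passing the files on the command line are assumptions for illustration, not part of the scripts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the non-quoted lines of @$lines that have
# not been seen before, recording each of them in %$seen as it goes.
sub unique_unquoted {
    my ($lines, $seen) = @_;
    return grep { !/^>/ && !$seen->{$_}++ } @$lines;
}

# Runs only when file names are given on the command line (an assumption;
# the scripts above use hard-coded paths instead).
if (@ARGV) {
    my %seen;    # shared by every file, so repeats across files are dropped too
    for my $file (@ARGV) {
        open my $in, '<', $file or die "Cannot open $file: $!";
        print unique_unquoted([<$in>], \%seen);
        close $in;
    }
}
```

Saved under a hypothetical name such as collect.pl, it could be run as: perl collect.pl /Users/pes/Desktop/TC4ctrl/*.txt >test2.txt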
THIRD step:
I open the control file, then I check whether each quoted line repeats
other text.
If a line does, I delete the whole post.
This step does not work, so I wonder whether I should rethink the whole
process:

#!/usr/bin/perl -w

use strict;
my ($temp,$actual,$var);
my $del = 0;

my $fileName = "/Users/pes/Desktop/TC4ctrl/test2.txt";
open (INPUT, $fileName) or die "File not opnd cos $!";
my $control = <INPUT>;
close(INPUT);
$^I = '';

@ARGV = </Users/pes/Desktop/TC4/*.txt>;
while(<>){
# if it's the end of a post...
         if (/={8}/g)
                 {
# ... the script  knows if it must print the entire post...
                 if ($del == 0)
                         {
                         $temp .= "\n========\n";
                         print $temp;
                         $temp = "";
                         }
# ... or if it must delete it
                 else
                         {
                         $temp = "";
                         $del = 0;
                         }
                 }
# if there's a quoted line, it compares the line without the quoting
# with all the lines in the control file
# my intention is that at the first repetition it sets
# the erasure variable $del to 1 ("yes, delete it!") and continues
# the while loop...... this block does not work
         elsif (/^>/g)
                 {
                 $actual = $';
                 while ($control)
                         {
                         $del = ($control =~ /$actual/m);
                         next;
                         }
                 }
         else
                 {
                 $temp .= $_;
                 }
         print STDOUT ++$var, " - I cleaned $ARGV\n" if (eof);
         }
print "Done!\n";
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
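A possible rework of the third step, offered as a sketch under stated assumptions rather than a definitive fix. In the script above, "my $control = <INPUT>" reads only the first line of the control file, "while ($control)" can never end because $control never changes, and $del is overwritten by every new quoted line. The sketch loads the whole control file into a hash, compares each unquoted line exactly (no regex), and sets $del only once per post. The helper names load_control and filter_posts, the command-line arguments, and the ".clean" output suffix are all hypothetical. Unlike the script above, it keeps the quoted lines of the posts that survive and also keeps a trailing post that has no closing separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read every line of the control file into a hash,
# trimmed of surrounding whitespace, for exact lookups.
sub load_control {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my %control;
    while (my $line = <$in>) {
        $line =~ s/^\s+|\s+$//g;
        $control{$line} = 1 if length $line;
    }
    close $in;
    return \%control;
}

# Hypothetical helper: given the lines of one file, return the text of the
# posts in which no quoted line repeats a line of the control set.
sub filter_posts {
    my ($lines, $control) = @_;
    my ($post, $del, $out) = ('', 0, '');
    for my $line (@$lines) {
        if ($line =~ /={8}/) {                      # end of a post
            $out .= $post . $line unless $del;
            ($post, $del) = ('', 0);                # reset for the next post
        }
        elsif ($line =~ /^(?:>\s*)+(.*?)\s*$/) {    # quoted line, markers stripped
            $del = 1 if exists $control->{$1};      # never set back to 0 mid-post
            $post .= $line;
        }
        else {
            $post .= $line;
        }
    }
    # Keep a trailing post without a separator, unless it is marked for deletion.
    $out .= $post unless $del || $post eq '';
    return $out;
}

# Runs only with arguments: the control file first, then the files to clean.
if (@ARGV >= 2) {
    my ($control_path, @files) = @ARGV;
    my $control = load_control($control_path);
    my $count = 0;
    for my $file (@files) {
        open my $in, '<', $file or die "Cannot open $file: $!";
        my @lines = <$in>;
        close $in;
        open my $out, '>', "$file.clean" or die "Cannot write $file.clean: $!";
        print {$out} filter_posts(\@lines, $control);
        close $out;
        print ++$count, " - I cleaned $file\n";
    }
    print "Done!\n";
}
```

Because filter_posts is called once per file, $post and $del start fresh for every file, so nothing leaks from one file into the next in a multi-file directory. Writing to a separate ".clean" file instead of editing in place leaves the originals untouched while the logic is being checked.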

thank you all


all'adr


