Mihir Kamdar wrote:
Hi,
Hello,
I need your help with the following.
I have a CSV file with many records and I want to remove duplicate
records. A record need not be entirely identical to count as a
duplicate: I only have to check whether its 2nd, 3rd, 7th and 8th fields
are the same as those of an earlier record. If they are, then remove
either the earlier or the later entry. I have written something like the
code below to achieve this.
#!/usr/bin/perl
open(FILE,"</home/user71/RangerDatasource/Customization/TelekomMalaysia/Scripts/Tests/cprogs/files/sample1");
my $line;
my %hash;
my @file;
while ($line=readline(FILE))
{
my @cdr=split (/,/, $line) ;
$hash{$cdr[2],$cdr[3],$cdr[6],$cdr[7]}="@cdr"; # Add some more cdr key fields if you want.
}
close FILE ;
open my $f, '>', 'outputsample1' or
die 'Failed to open outputsample1';
while (($key, $value) = each %hash)
{
print $f $value."\n";
}
close $f;
But I am not getting the desired result.
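A minimal sketch of what the script above does to a single record may help narrow down the symptom. The eight-field line is made up for illustration, not taken from the real file, and the default value of $" (a single space) is assumed:

```perl
use strict;
use warnings;

# Made-up sample record; real CDR lines will look different.
my $line = "a,b,c,d,e,f,g,h\n";

# split keeps the trailing newline attached to the last field.
my @cdr = split /,/, $line;

# Interpolating an array joins its elements with $" (a space by default),
# so the commas are lost: "a b c d e f g h\n"
my $value = "@cdr";

# Appending "\n" on output then yields a blank line after the record.
my $printed = $value . "\n";

print $printed;
```

Apart from the lost commas and the extra blank line, each later record with the same key overwrites the earlier one in %hash, and each %hash returns entries in no particular order, so the output order will generally differ from the input order.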
You don't need two loops for that, just one:
#!/usr/bin/perl
use strict;
use warnings;
my $in_file =
'/home/user71/RangerDatasource/Customization/TelekomMalaysia/Scripts/Tests/cprogs/files/sample1';
open my $in, '<', $in_file or die "Cannot open '$in_file' $!";
open my $out, '>', 'outputsample1' or die "Failed to open outputsample1 $!";
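my %hash;
while ( <$in> ) {
    # Build the key from the fields at zero-based indices 2, 3, 6 and 7,
    # as in the original script; for the 2nd, 3rd, 7th and 8th fields
    # the indices would be 1, 2, 6, 7.
    my $key = join ',', ( split /,/ )[ 2, 3, 6, 7 ];
    # Print a record only the first time its key is seen.
    print $out $_ unless $hash{ $key }++;
}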
close $out;
close $in;
__END__
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
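
The question leaves open whether the earlier or the later duplicate should survive, and it counts fields from 1 while the scripts index from 0. The following is only a sketch of both variants under those assumptions: the helper names dedupe_first and dedupe_last and the sample records are hypothetical, and the fields are assumed to contain no quoted or embedded commas.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first record seen for each key.
# @$fields holds 1-based field numbers, as in the question's wording.
sub dedupe_first {
    my ($records, $fields) = @_;
    my @idx = map { $_ - 1 } @$fields;    # convert to 0-based indices
    my (%seen, @kept);
    for my $rec (@$records) {
        # Missing fields are treated as empty strings.
        my $key = join ',',
            map { defined $_ ? $_ : '' } (split /,/, $rec, -1)[@idx];
        push @kept, $rec unless $seen{$key}++;
    }
    return @kept;
}

# Hypothetical helper: keep the last record seen for each key,
# while preserving the relative order of the surviving records.
sub dedupe_last {
    my ($records, $fields) = @_;
    my @out = reverse dedupe_first([ reverse @$records ], $fields);
    return @out;
}

# Made-up records: r1 and r2 agree on fields 2, 3, 7 and 8.
my @records = (
    'r1,A,B,x,x,x,C,D,first',
    'r2,A,B,y,y,y,C,D,second',
    'r3,E,F,z,z,z,G,H,third',
);

my @first = dedupe_first(\@records, [ 2, 3, 7, 8 ]);   # keeps r1 and r3
my @last  = dedupe_last(\@records,  [ 2, 3, 7, 8 ]);   # keeps r2 and r3

print "$_\n" for @first, @last;
```

Both helpers hold all records in memory, which is convenient for a small test but not for a very large file; the single-pass loop above streams the input and is the better fit when only the first occurrence is wanted.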