brahmanyam <[EMAIL PROTECTED]> wrote:
: > Hi,
: > I am reading a flat text file of 100,000 lines. Each line has at
: > most 10 characters of data.
: > I want to eliminate duplicate lines and blank lines from that file,
: > i.e. something like sort -u in Unix.
:
: Got plenty of memory? =o)
:
: open IN, $file or die $!;
: my %uniq;
: while (<IN>) {
:     $uniq{$_}++;
: }
: print sort keys %uniq;
If you want to keep the lines in their original order, you have
a couple of options:
* Keep a hash of lines already seen, and push lines that are not
  already in the hash onto an array ([EMAIL PROTECTED] mentioned
  this one already; a minimal sketch of this follows the example
  below)
* Use the Tie::IxHash module from CPAN:
use Tie::IxHash;

open IN, $file or die $!;
tie my %uniq => 'Tie::IxHash';   # keys() will preserve insertion order
while (<IN>) {
    next unless /\S/;            # skip blank lines, per the original post
    $uniq{$_}++;
}
print keys %uniq;
close IN;
When you tie() a hash to this module, keys() will return the hash keys
in the order in which they were created.
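
For completeness, here is a minimal sketch of the first option (the
%seen and @lines names are my own, and I am assuming you still want
blank lines dropped):

open IN, $file or die $!;
my (%seen, @lines);
while (<IN>) {
    next unless /\S/;                   # skip blank lines
    push @lines, $_ unless $seen{$_}++; # keep the first occurrence only
}
close IN;
print @lines;

This avoids the CPAN dependency at the cost of tracking the order
yourself in @lines.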
-- tdk