I want to get a daily list of all the directories under a fairly large (by home standards) news hierarchy.
I know a little about using File::Find but wonder if there is a better way.
Here are the things one runs into with File::Find.
If you run it looking for directories (-d), it still takes a really long time, and it also returns all the stub names that don't actually contain any files,
like comp/os.
I can think of a few ways to get down to the unique directories that actually have files at their end, like:
comp/os/linux/misc
But not without actually finding the numbered files in there.
For example, if I set File::Find looking for /^\d+$/, then in my case every match will be a full path to a posting. The trouble there is that there are literally millions of numbered files under those paths.
I was trying to think of something crazy like putting File::Find in a while loop that lasts out as soon as a numbered file is found.
Then some way to force a chdir, but not to that same path.
Can someone help me with this? Understand that this is not really urgent, since I do know how to get it done the long way, though I'm sure the script below is a pretty sorry way of doing it. I used Cwd because I couldn't quite figure out how to use File::Find's no_chdir option. Cramming the paths into a hash as keys and letting the duplicates cancel is also probably a pretty hokey way of getting them down to one each.
#! /usr/bin/perl -w
use File::Find;
use Cwd;

if(!$ARGV[0] || $ARGV[0] eq "help"){
    usage();
    exit;
}else{
    @top_dir = @ARGV;
    @ARGV = ();
}

$file = "./uniq_dir_under_news";
my ($our_dir, $absolute, $uniq_dirs, %uniq_dirs);

find(\&wanted, @top_dir);
open(FILE,">$file") or die "Can't open $file: $!";

sub wanted {
    $our_dir = getcwd;
    if($_ =~ /^\d+/){
        ## This print is just to let me know its running
        print "$our_dir/$_\n";
        $uniq_dirs{$our_dir} = $_;
    }
}

foreach $key (keys %uniq_dirs){
    push @uniq_dirs,$key;
}
for(sort @uniq_dirs){
    print FILE "$_\n";
    print "$_\n";
}
close(FILE);
As you may guess, this takes quite a while with 6.3 GB under /news.
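On the no_chdir question raised above, here is a minimal sketch of how that option might be used, not a tested replacement for the script. The helper name dirs_with_articles is made up for illustration, and the sketch assumes that every name consisting only of digits is an article file. With no_chdir set, File::Find does not change directory, $_ holds the full path, and $File::Find::dir holds the directory being scanned, so Cwd is not needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the sorted list of directories under @roots
# that directly contain at least one entry whose name is all digits.
sub dirs_with_articles {
    my @roots = @_;
    my %seen;
    find(
        {
            no_chdir => 1,    # do not chdir; $_ is then the full path
            wanted   => sub {
                # Assumption: an all-digit last path component is an article.
                # $File::Find::dir already names the containing directory.
                $seen{$File::Find::dir} = 1 if m{/\d+\z};
            },
        },
        @roots
    );
    return sort keys %seen;
}

# Print one directory per line for the roots given on the command line.
print "$_\n" for dirs_with_articles(@ARGV);
```

Using the hash only for its keys is the usual Perl way to drop duplicates, so that part of the original approach is reasonable.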
You are doing too much work: File::Find::find() already supplies the current directory in $File::Find::dir and the full path name in $File::Find::name, so there is no need for Cwd.
#!/usr/bin/perl
use warnings;
use strict;

use File::Find;

if ( !@ARGV or $ARGV[0] eq 'help' ) {
    usage();
    exit 0;
}

my @top_dir = splice @ARGV;
my $file    = './uniq_dir_under_news';

my %uniq_dirs;
find( sub {
    if ( /^\d/ ) {
        ## This print is just to let me know its running
        print "$File::Find::name\n";
        ## Record the directory only when it holds a numbered file
        $uniq_dirs{ $File::Find::dir }++;
    }
}, @top_dir );

open FILE, '>', $file or die "Can't open $file: $!";

for ( sort keys %uniq_dirs ) {
    print FILE "$_\n";
    print "$_\n";
}

close FILE;
__END__
John -- use Perl; program fulfillment
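The "stop as soon as a numbered file is found" idea from the question can be approximated with File::Find's preprocess option, which is called with the names read from each directory before any of them are visited. What follows is only a sketch under stated assumptions: leaf_groups is a made-up name, and every all-digit name is assumed to be an article, so a group directory whose own name is all digits would be mistaken for one and not descended into. The directory listing is still read in full. What this avoids is the per-entry work File::Find would otherwise do for each numbered file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the sorted list of directories under @roots
# that hold at least one all-digit name, without visiting those names.
sub leaf_groups {
    my @roots = @_;
    my %seen;
    find(
        {
            wanted     => sub { },    # nothing to do per visited entry
            preprocess => sub {
                # @_ holds the names just read from $File::Find::dir.
                my ( @keep, $has_article );
                for my $entry (@_) {
                    if ( $entry =~ /^\d+\z/ ) {
                        $has_article = 1;    # assumed to be an article
                    }
                    else {
                        push @keep, $entry;  # still visited by find()
                    }
                }
                $seen{$File::Find::dir} = 1 if $has_article;
                return @keep;    # numbered names are dropped here
            },
        },
        @roots
    );
    return sort keys %seen;
}

# Print one directory per line for the roots given on the command line.
print "$_\n" for leaf_groups(@ARGV);
```

Whether this is noticeably faster on a spool with millions of articles depends on the filesystem, so it is worth timing against the plain version before relying on it.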