Harry Putnam wrote:
I want to get a daily list of all the directories under a kind of
large (by home standards) news hierarchy.

I know a little about using File::Find but wonder if there is a better
way.

Here are the things one runs into with File::Find.

If you run it looking for directories (-d), it still takes a really
long time, and it also returns all the stub directories that contain
no files themselves, such as comp/os.

I can think of a few ways to get down to the unique directories that
actually have files at their end, like:

comp/os/linux/misc

But not without actually finding the numbered files in there.

For example, if I set File::Find looking for /^\d+$/, then in my case
it has to walk the full path to every posting.  The trouble is that
there are literally millions of numbered files under those paths.

I was trying to think of something crazy like putting File::Find in a
while loop that bails out (last) as soon as a numbered file is found.

Then some way to force a chdir, but not to that same path.

Can someone help me with this?  It is not really urgent, since I do
know how to get it done the long way (below), though I'm sure it is a
pretty sorry way of doing it.  I used Cwd because I couldn't quite
figure out how to use File::Find's no_chdir option.  Cramming the
paths into a hash as keys and letting the duplicates cancel is
probably also a pretty hokey way of getting them down to one each.

#! /usr/bin/perl -w
use File::Find;
use Cwd;
if(!$ARGV[0] || $ARGV[0] eq "help"){
    usage();
    exit;
}else{
    @top_dir = @ARGV;
    @ARGV = ();
}
$file = "./uniq_dir_under_news";
my ($our_dir, $absolute, $uniq_dirs, %uniq_dirs);
find(\&wanted, @top_dir);
open(FILE,">$file") or die "Can't open $file: $!";
sub wanted {
    $our_dir = getcwd;
    if($_ =~ /^\d+/){
        ## This print is just to let me know its running
        print "$our_dir/$_\n";
        $uniq_dirs{$our_dir} = $_;
    }
}
foreach $key (keys %uniq_dirs){
    push @uniq_dirs,$key;
}
for(sort @uniq_dirs){
    print FILE "$_\n";
    print "$_\n";
}
close(FILE);


As you may guess, this takes quite a while with 6.3 GB under
/news.

You are doing too much work: File::Find::find() already supplies the full path name in $File::Find::name and the current directory in $File::Find::dir, so there is no need for Cwd.


#!/usr/bin/perl
use warnings;
use strict;

use File::Find;

if ( !@ARGV or $ARGV[0] eq 'help' ) {
    usage();
    exit 0;
    }

my @top_dir = splice @ARGV;
my $file = './uniq_dir_under_news';

my %uniq_dirs;
find( sub {
    if ( /^\d/ ) {
        ## This print is just to let me know it's running
        print "$File::Find::name\n";
        ## Record only directories that actually hold numbered files
        $uniq_dirs{ $File::Find::dir }++;
        }
    }, @top_dir );

open FILE, '>', $file or die "Can't open $file: $!";

for ( sort keys %uniq_dirs ) {
    print FILE "$_\n";
    print "$_\n";
    }

close FILE;

__END__
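
If the millions of per-file callbacks turn out to be the real cost, one possible refinement is File::Find's preprocess option, which is handed each directory's list of names before find() examines them. The sketch below is only an illustration under two assumptions that are not from the thread: article files have all-digit names, and no subdirectory does (a group component such as alt/2600 would be misread as an article). The sub name dirs_with_articles is made up for the example. Each directory listing is still read in full, but the numbered files are dropped from the list before find() stats them or calls wanted, so the per-file work should shrink considerably.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use File::Find;

# Hypothetical helper: return the sorted list of directories under
# @top_dir that directly contain at least one all-digit entry.
# ASSUMPTION: all-digit names are article files, everything else is
# a subdirectory (or '.' / '..', which find() skips on its own).
sub dirs_with_articles {
    my @top_dir = @_;
    my %uniq_dirs;

    find( {
        # preprocess is called once per directory with the names read
        # from it; only the names it returns are examined further.
        preprocess => sub {
            my ( @others, $has_article );
            for my $name ( @_ ) {
                if ( $name =~ /^\d+$/ ) {
                    $has_article = 1;
                    }
                else {
                    push @others, $name;
                    }
                }
            # $File::Find::dir is the directory being preprocessed
            $uniq_dirs{ $File::Find::dir }++ if $has_article;
            return @others;
            },
        # Nothing to do per entry; the work happens in preprocess.
        wanted => sub { },
        }, @top_dir );

    return sort keys %uniq_dirs;
    }

print "$_\n" for dirs_with_articles( @ARGV );
```

Called with the top of the spool as its argument, this prints one line per directory that holds articles. Stub directories such as comp/os are never recorded because they contain no all-digit names.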



John
--
use Perl;
program
fulfillment
