All the timing notes here are from my 700 MHz laptop with filesystem
encryption.

I did this in Perl because Python doesn't have a `readdir` that lets
you iterate over the files in the directory one at a time; it only has
a `listdir` that constructs a list of all of them in memory and
returns it.  Maybe that would have been fine.
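
For concreteness, here is the streaming idiom in question as a minimal
sketch (the directory name is the same one the commands below use):
readdir hands back one entry per call, so memory use stays constant no
matter how many files the directory holds.

perl -e 'opendir my $dh, "mboxtmp" or die "mboxtmp: $!";
         # one entry per readdir call; no in-memory list of all names
         while (defined(my $entry = readdir $dh)) { print "$entry\n" }
         closedir $dh'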

Like everything else posted to kragen-hacks without any notice to the
contrary, this program is in the public domain; I abandon any
copyright in it.

# wrote this to clean up a directory with 500 000 files in it.  After
# 8 hours of rm -rf, rm had only cut it down from 800 000 to 500 000
# files, which is roughly 10 files per second.

# without the unlink, this took only 99 seconds to run over all 
# 500 000 files.

# with unlink, it got through 18479 files in 3m2.7s, which is about
# 100 files per second, so it should be done after roughly 5000
# seconds.

# A second run got through 60893 files in 5m23s, or 323s, which is 188
# files per second.  The discrepancy isn't reassuring about how
# accurately I'm measuring performance, but at least I know there's no
# substantial N^2 term.

# third run: 234502 files in 20m45s, or 1245s: again 188 files per
# second.  But then the next time around, it took 80 seconds without
# successfully readdirring anything.
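
As a sanity check of the rates quoted above (this arithmetic is mine,
not part of the original runs):

perl -e 'printf "%.0f files/s\n", 18479  / (3*60 + 2.7);   # ~101
         printf "%.0f files/s\n", 60893  / (5*60 + 23);    # ~188
         printf "%.0f files/s\n", 234502 / (20*60 + 45)'   # ~188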

time perl -e 'opendir MB, "mboxtmp" or die; 
     while (my $x = readdir MB) { print "$x\n" }
' | perl -ne '$| = 1; 
              chomp; 
              unlink "mboxtmp/$_" or warn "$_: $!";
              print "$.\r"'

time perl -e 'opendir MB, "mboxtmp.2" or die; 
     while (my $x = readdir MB) { print "$x\n" }
' | perl -ne '$| = 1; 
              chomp; 
              unlink "mboxtmp.2/$_" or warn "$_: $!";
              print "$.\r"'
