All the timing notes here are from my 700MHz laptop with filesystem
encryption.
I did this in Perl because Python doesn't have a `readdir` that lets
you iterate over the files in the directory one at a time; it only has
a `listdir` that constructs a list of all of them in memory and
returns it. Maybe that would have been fine.
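(For what it's worth, later Pythons grew exactly the lazy iterator this wanted: `os.scandir`, added in Python 3.5, long after this was written. A rough sketch of what the Python side could have looked like, had it existed:)

```python
import os

# os.scandir (Python 3.5+) yields directory entries one at a time,
# so a 500 000-file directory never has to fit into a single list.
# Unlike readdir, it also skips "." and ".." for you.
def iter_names(path):
    with os.scandir(path) as it:
        for entry in it:
            yield entry.name
```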
Like everything else posted to kragen-hacks without any notice to the
contrary, this program is in the public domain; I abandon any
copyright in it.
# wrote this to clean up a directory with 500 000 files in it. After
# 8 hours of rm -rf, rm had only cut it down from 800 000 to 500 000
# files.
# without the unlink, this took only 99 seconds to run over all
# 500 000 files.
# with unlink, it got through 18479 files in 3m2.7s, so that's
# actually 100 files per second, so it should be done after 5000
# seconds.
# A second run got through 60893 files in 5m23s, or 323s, which is 188
# files per second. That doesn't reassure me that I'm measuring
# performance accurately, but at least it shows there's no substantial
# N^2 term.
# third run: 234502 files in 20m45s. Again 188 files per second. But
# then the next time around, it took 80 seconds without successfully
# readdirring anything.
time perl -e 'opendir MB, "mboxtmp" or die "mboxtmp: $!";
while (defined(my $x = readdir MB)) {
    print "$x\n" unless $x eq "." or $x eq "..";
}
' | perl -ne '$| = 1;
chomp;
unlink "mboxtmp/$_" or warn "$_: $!";
print "$.\r"'
time perl -e 'opendir MB, "mboxtmp.2" or die "mboxtmp.2: $!";
while (defined(my $x = readdir MB)) {
    print "$x\n" unless $x eq "." or $x eq "..";
}
' | perl -ne '$| = 1;
chomp;
unlink "mboxtmp.2/$_" or warn "$_: $!";
print "$.\r"'
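(For comparison, the whole pipeline could be collapsed into one process. A sketch in modern Python using `os.scandir`; the function name `remove_all` and the progress interval are mine, and note that unlinking entries while iterating the same directory is fine in practice on Linux but is technically filesystem-dependent:)

```python
import os
import sys

def remove_all(path):
    """Unlink every file in path, streaming names one at a time
    instead of materializing the whole listing in memory."""
    count = 0
    with os.scandir(path) as it:
        for entry in it:
            try:
                os.unlink(entry.path)
                count += 1
            except OSError as e:
                # analogous to the Perl warn "$_: $!"
                print("%s: %s" % (entry.name, e), file=sys.stderr)
            if count % 1000 == 0:
                # progress line, like the Perl print "$.\r"
                sys.stderr.write("%d\r" % count)
    return count
```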