All the timing notes here are from my 700MHz laptop with filesystem encryption.
I did this in Perl because Python doesn't have a `readdir` that lets you iterate over the files in a directory one at a time; it only has a `listdir` that constructs a list of all of them in memory and returns it. Maybe that would have been fine.

Like everything else posted to kragen-hacks without any notice to the contrary, this program is in the public domain; I abandon any copyright in it.

# Wrote this to clean up a directory with 500 000 files in it.  After
# 8 hours of rm -rf, rm had only cut it down from 800 000 to 500 000
# files.

# Without the unlink, this took only 99 seconds to run over all
# 500 000 files.

# With the unlink, it got through 18479 files in 3m2.7s, so that's
# actually 100 files per second, so it should be done after 5000
# seconds.

# A second run got through 60893 files in 5m23s, or 323s, which is 188
# files per second.  This isn't reassuring that I'm measuring
# performance accurately, but at least I know there's no substantial
# N^2 term.

# Third run: 234502 files in 20m45s.  Again 188 files per second.  But
# then the next time around, it took 80 seconds without successfully
# readdirring anything.

time perl -e 'opendir MB, "mboxtmp" or die; while (my $x = readdir MB) { print "$x\n" }' |
  perl -ne '$| = 1; chomp; unlink "mboxtmp/$_" or warn "$_: $!"; print "$.\r"'

time perl -e 'opendir MB, "mboxtmp.2" or die; while (my $x = readdir MB) { print "$x\n" }' |
  perl -ne '$| = 1; chomp; unlink "mboxtmp.2/$_" or warn "$_: $!"; print "$.\r"'
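For what it's worth, later versions of Python (3.5+) did grow `os.scandir`, which yields directory entries lazily rather than building the whole list the way `listdir` does. The same streaming unlink-with-progress loop could be sketched roughly like this; the function name `clean_dir` is just my invention here, and unlinking while scanning carries the same caveats as the Perl readdir loop above:

```python
import os
import sys

def clean_dir(path):
    """Stream directory entries and unlink each one, printing a
    running count, like the "$.\\r" progress display in the Perl
    pipeline."""
    count = 0
    with os.scandir(path) as entries:  # lazy iterator, not a full in-memory list
        for entry in entries:
            try:
                os.unlink(entry.path)
            except OSError as e:
                # Warn and keep going, like "unlink ... or warn" in Perl
                print("%s: %s" % (entry.name, e), file=sys.stderr)
                continue
            count += 1
            # Carriage return keeps the counter on one line, like $| = 1 + "\r"
            print("%d\r" % count, end="", flush=True)
    return count
```

Whether this would have kept up with half a million files as well as two piped Perl processes did, I can't say.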