On Tue, Nov 05, 2013 at 02:29:10PM +0000, Jonathan Dowland wrote:
> On Tue, Nov 05, 2013 at 03:13:10PM +0400, Reco wrote:
> > find . -type f -name 'popularity-*' -print0 | xargs -0rn 20 rm -f
>
> I idly wonder (don't know) to what extent find might parallelize the
> unlinks with -delete. A cursory scan of the semantics would suggest it
> could potentially do so: it's not clear that a single unlink failing
> should stop future unlinks (merely spew errors and consider the -delete
> operation as a whole to have failed).
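[Editorial aside: the quoted pipeline can be made explicitly parallel with xargs' -P flag. A minimal sketch, assuming GNU xargs; the scratch directory and file counts are purely illustrative:]

```shell
# Create a scratch directory with some test files (illustrative names).
mkdir -p /tmp/popcon-test && cd /tmp/popcon-test
for x in $(seq 1 100); do : > "popularity-$x"; done

# Delete in parallel: up to 4 rm processes, 20 file names per invocation.
# -0 pairs with find's -print0 so unusual file names are handled safely;
# -r (GNU) skips running rm when find produces no output.
find . -type f -name 'popularity-*' -print0 | xargs -0 -r -n 20 -P 4 rm -f

ls | wc -l   # should print 0
cd / && rmdir /tmp/popcon-test
```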
xargs parallelism is optional. The point is that you have one process which
finds files and another one (or a group of others) which deletes them.
That helps utilize multiple CPUs.

> > Arguably the fastest way to delete all this mess should be
> >
> > perl -e 'for(<popularity-*>){((stat)[9]<(unlink))}'
>
> Not sure why loading perl (>1.6M) should be faster than find (~300K)
> and I think '-delete' behaviour is essentially unlink under the hood.

It's not the binary size that matters, it's the algorithm:

$ for x in $(seq 1 500000); do echo somefile > $x; done
$ time perl -e 'for(<*>){((stat)[9]<(unlink))}'

real    0m24.047s
user    0m4.785s
sys     0m16.926s

$ for x in $(seq 1 500000); do echo somefile > $x; done
$ time find -type f -delete

real    4m27.799s
user    0m0.831s
sys     0m17.961s

Basically, the difference is that find issues an fstatat64 syscall for each
file, while this perl one-liner uses lstat64 and stat64 syscalls. Use strace
to check it in your environment. On another OS results could be different.

Reco

Archive: http://lists.debian.org/20131105151518.GA19598@x101h