Re: rm -rf is too slow on large files and directory structure(Around 30000)
Bilal mk <bilalh...@gmail.com> writes:

> I am using xfs filesystem and also did the fsck. DMA is enabled. Also
> performed xfs defragmentation (xfs_fsr). But still an issue, not only
> rm -rf but also the cp command.

Traditionally XFS is super slow when deleting lots of little files -- much, _much_, slower than ext3, for instance... [I guess it's supposed to be better now, but I dunno, I've only experienced the slow version.]

-Miles

--
Freebooter, n. A conqueror in a small way of business, whose annexations lack the sanctifying merit of magnitude.

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/buolintkjer@dhlpc061.dev.necel.com
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Is there any chance the OP might have the filesystem mounted with the 'sync' option?

Archive: http://lists.debian.org/1bobsw2fc7@pfeifferfamily.net
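One quick way to answer that question is to scan the mount table for the option. This is a sketch, not from the thread; the helper reads /proc/mounts by default but accepts any file in that format, which also makes it easy to test:

```shell
#!/bin/sh
# Sketch: report filesystems mounted with the 'sync' option.
# Usage: check_sync.sh [mounts-file]   (defaults to /proc/mounts)
f=${1:-/proc/mounts}
# Field 4 of /proc/mounts is the comma-separated option list; the
# (^|,)...(,|$) anchors avoid false hits on options like "async".
awk '$4 ~ /(^|,)sync(,|$)/ { print $2, "is mounted sync" }' "$f"
```

If the filesystem shows up here, remounting without `sync` would be the first thing to try before blaming rm.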
Re: Re: rm -rf is too slow on large files and directory structure(Around 30000)
Christofer C. Bell wrote:
> cbell@circe:~/test$ time find rm -type f -exec rm {} \+

There isn't any need to escape the '+' character.

  time find rm -type f -exec rm {} +

> It doesn't seem possible to run a similar test for unlink as it appears
> it only operates on 1 file at a time. So it does seem that rm with the
> find and/or xargs options you provided is the best way to go (at least
> for this test case).

I definitely recommend using "-exec rm {} +" over using xargs because the find method has been incorporated into the POSIX standard. All operating systems will have it. The xargs -0 method is a GNU extension and won't be available portably. Once you decide to use GNU extensions (such as xargs -0) then you might as well use a different GNU extension and use -delete instead. In for a penny, in for a pound. Using -delete is almost certainly the fastest method since it doesn't spawn any external processes.

  time find rm -type f -delete

Bob
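Bob's two recommendations can be exercised side by side on a throwaway tree. A minimal sketch (the scratch directory and file count are made up for illustration):

```shell
#!/bin/sh
# Sketch: two ways to bulk-delete files found by find.
set -e

d=$(mktemp -d)
for i in $(seq 1 500); do : > "$d/file$i"; done

# Portable POSIX form: find batches names, so only a few rm
# processes are spawned instead of one per file.
find "$d" -type f -exec rm {} +

for i in $(seq 1 500); do : > "$d/file$i"; done

# GNU extension: find unlinks the files itself, spawning no
# external process at all -- usually the fastest of the three.
find "$d" -type f -delete

rmdir "$d"
```

Either way the per-process startup cost that dominates `-exec rm {} \;` disappears.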
Re: rm -rf is too slow on large files and directory structure(Around 30000)
On Thu, 16 Feb 2012, Bilal mk wrote:
> I am using xfs filesystem and also did the fsck. DMA is enabled. Also
> performed xfs defragmentation (xfs_fsr). But still an issue, not only
> rm -rf but also the cp command.

Until quite recently XFS was notable for being slow to delete. Others have noted that this is greatly improved in recent kernels, but even with older kernels there is quite a bit of tuning that you can do to improve the delete performance. Your favourite search engine should give you good results. I put down some notes for myself here a while back: http://www.practicalsysadmin.com/wiki/index.php/XFS_optimisation

Cheers, Rob
--
Email: rob...@timetraveller.org  Linux counter ID #16440
IRC: Solver (OFTC & Freenode)
Web: http://www.practicalsysadmin.com
Free Open Source: The revolution that quietly changed the world
"One ought not to believe anything, save that which can be proven by nature and the force of reason" -- Frederick II (26 December 1194 – 13 December 1250)
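For reference, the tuning Rob alludes to is typically done via XFS mount options. A hedged illustration only -- the device and mount point are placeholders, the values are examples rather than recommendations, and `delaylog` exists only on kernels from 2.6.35 up to the point where it became the default behaviour:

```
# /etc/fstab fragment -- illustrative values, not a recommendation.
# logbufs/logbsize enlarge the in-memory journal buffers, which helps
# metadata-heavy workloads such as mass deletes; delaylog batches
# journal writes (kernel >= 2.6.35, later the default).
/dev/sdb1  /data  xfs  logbufs=8,logbsize=256k,delaylog  0  2
```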
Re: Re: rm -rf is too slow on large files and directory structure(Around 30000)
On Wed, Feb 15, 2012 at 4:51 PM, Clive Standbridge <list-u...@tgstandbridges.plus.com> wrote:
>> But may provide some benefit when removing a large number (30000) of
>> files (at least empty ones).
>>
>> cbell@circe:~/test$ time find rm -type f -exec rm {} \;
>>
>> real    0m48.127s
>> user    1m32.926s
>> sys     0m38.750s
>
> First thought - how much of that 48 seconds was spent on launching
> 30000 instances of rm? It would be instructive to try
>
>   time find rm -type f -exec rm {} \+
>
> or the more traditional xargs:
>
>   time find rm -type f -print0 | xargs -0 -r rm
>
> Both of those commands should minimise the number of rm instances.
> Similarly for unlink.

Here are the test results:

cbell@circe:~/test$ time find rm -type f -exec rm {} \+

real    0m0.953s
user    0m0.064s
sys     0m0.884s

cbell@circe:~/test$ time find rm -type f -print0 | xargs -0 -r rm

real    0m0.823s
user    0m0.080s
sys     0m0.824s

It doesn't seem possible to run a similar test for unlink as it appears it only operates on 1 file at a time. So it does seem that rm with the find and/or xargs options you provided is the best way to go (at least for this test case).

--
Chris

Archive: http://lists.debian.org/CAOEVnYuAu8Q7Vh+ECsg=kkmv0j6N3zQqvm=mdg-wz+4qckt...@mail.gmail.com
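To reproduce numbers like these, a throwaway fixture can be built along these lines. A sketch only -- the directory name matches the "rm" test directory used in the thread, and the file count mirrors the OP's roughly 30000 files:

```shell
#!/bin/sh
# Sketch: populate a directory with many empty files for delete benchmarks.
set -e
d=rm                       # test directory, as in the commands above
n=30000                    # roughly the OP's file count
mkdir -p "$d"
# seq generates 1..n, sed prefixes the path, xargs batches the touches.
seq 1 "$n" | sed "s|^|$d/file|" | xargs touch
echo "created $(find "$d" -type f | wc -l) files"
```

With the fixture in place, each variant (`-exec rm {} \;`, `-exec rm {} +`, `xargs -0`, `-delete`) can be timed against a fresh copy.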
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Jude DaShiell <jdash...@shellworld.net> wrote:
> Anyone heard of the unlink command?

Yes. And your point is...?

Chris

Archive: http://lists.debian.org/dvls09xg3r@news.roaima.co.uk
Re: rm -rf is too slow on large files and directory structure(Around 30000)
On Wed, Feb 15, 2012 at 1:38 AM, Jude DaShiell <jdash...@shellworld.net> wrote:
> Anyone heard of the unlink command?

unlink is slower than rm removing a 1.5GB file (at least on ext3):

cbell@circe:~$ time rm test1

real    0m0.278s
user    0m0.000s
sys     0m0.264s

cbell@circe:~$ time unlink test2

real    0m0.375s
user    0m0.000s
sys     0m0.364s

But it may provide some benefit when removing a large number (30000) of files (at least empty ones).

cbell@circe:~/test$ time find rm -type f -exec rm {} \;

real    0m48.127s
user    1m32.926s
sys     0m38.750s

cbell@circe:~/test$ time find unlink -type f -exec unlink {} \;

real    0m46.167s
user    1m32.194s
sys     0m39.346s

I suspect that removing a large number of non-zero byte files will be slower with unlink than rm.

--
Chris

Archive: http://lists.debian.org/CAOEVnYs3775N=sbvgto0y-odzrlnrj5t3bqrpjkt1fh5w7v...@mail.gmail.com
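The single-file comparison is easy to redo. A sketch, with arbitrary file names and sizes -- `truncate` creates sparse files, so on filesystems that support holes the setup is cheap even at 1.5GB:

```shell
#!/bin/sh
# Sketch: time rm vs unlink on one large (sparse) file each.
set -e
truncate -s 1500M test1    # sparse: allocates almost no real space
truncate -s 1500M test2

time rm test1              # coreutils rm (recent versions use unlinkat(2))
time unlink test2          # coreutils unlink: thin wrapper over unlink(2)
```

Since both commands end up in the same kernel unlink path, any measured gap is noise or kernel-side, which is the point Bob makes in the next message.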
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Christofer C. Bell wrote:
> unlink is slower than rm removing a 1.5GB file (at least on ext3):
> ...
> I suspect that removing a large number of non-zero byte files will be
> slower with unlink than rm.

If it is, then that points to a kernel performance issue, because there is very little difference between them. Until recently rm used unlink(2) and there would have been no difference at all. Recent versions of coreutils now use unlinkat(2) instead, for improved security. Any difference in performance would be in the realm of the kernel internals. It doesn't seem to me like there should be any significant difference.

Bob
Re: Re: rm -rf is too slow on large files and directory structure(Around 30000)
> But may provide some benefit when removing a large number (30000) of
> files (at least empty ones).
>
> cbell@circe:~/test$ time find rm -type f -exec rm {} \;
>
> real    0m48.127s
> user    1m32.926s
> sys     0m38.750s

First thought - how much of that 48 seconds was spent on launching 30000 instances of rm? It would be instructive to try

  time find rm -type f -exec rm {} \+

or the more traditional xargs:

  time find rm -type f -print0 | xargs -0 -r rm

Both of those commands should minimise the number of rm instances. Similarly for unlink.

--
Cheers, Clive

Archive: http://lists.debian.org/20120215225049.ga8...@rimmer.esmertec.com
Re: rm -rf is too slow on large files and directory structure(Around 30000)
On Wed, Feb 15, 2012 at 12:14 PM, Bob Proulx <b...@proulx.com> wrote:
>> I tried to remove a 5GB directory. In that directory around 30000
>> files and directories. It will take more than 30 min to complete.
>
> A large number of files consuming a large number of blocks will take a
> significant amount of time to process. That is all there is to it.
> Some filesystems are faster than others. What filesystem are you
> using? On what type of cpu? If you happen to be destroying an entire
> filesystem then you could simply unmount it and make a new filesystem
> on top of it.
>
>> There is no other cpu intensive process running. After sometime it
>> goes to D state and unable to kill that process.
>
> If you have processes stuck in the D state (uninterruptible sleep)
> then something bad has happened. This would indicate a bug. It sounds
> like you are having kernel bugs. You may need to fsck your
> filesystems. I would double check that dma is enabled to your drives.

I am using xfs filesystem and also did the fsck. DMA is enabled. Also performed xfs defragmentation (xfs_fsr). But still an issue, not only rm -rf but also the cp command.

USER       PID %CPU %MEM   VSZ  RSS TTY  STAT START TIME COMMAND
root      1134  0.0  0.0     0    0 ?    D    10:18 0:00 [kdmflush]

I have also tested the disk with smartmontools, but it reported no issues. My kernel version is 2.6.32-5-amd64. I have also used the same configuration (same kernel) and same hardware on another machine, but on that machine there is no issue. Is it a kernel bug or a hardware issue? Any suggestion for troubleshooting or fixing this issue?

Thanks

>> I have also tried find with xargs method to remove. It will also take
>> long time to complete
>>
>> find /directory | xargs rm -rf
>
> I doubt the problem is in rm since it has already been optimized to be
> quite fast. The newer versions have even more optimization. But it
> isn't worth the trouble to do anything other than wait. Most of the
> time will be spent in the kernel organizing the now free blocks.
>
> If you want to experiment you could try find.
>
>   find /directory -depth -delete
>
> That is basically the same as rm -rf but using find only.
>
> Bob
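Bob's find-only alternative is easy to sanity-check on a scratch tree before pointing it at real data. A minimal sketch with made-up paths:

```shell
#!/bin/sh
# Sketch: remove an entire tree with find alone -- no rm processes spawned.
set -e
d=$(mktemp -d)
mkdir -p "$d/a/b/c"
touch "$d/a/file1" "$d/a/b/file2" "$d/a/b/c/file3"

# -depth makes find visit children before their parents, so each
# directory is empty by the time -delete reaches it. (GNU -delete
# implies -depth anyway; being explicit matches Bob's command.)
find "$d" -depth -delete

test ! -e "$d" && echo "tree removed"
```

Unlike the OP's `find /directory | xargs rm -rf`, this also copes with file names containing spaces or newlines, since no names ever pass through a pipe.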
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Bilal mk wrote:
> I tried to remove a 5GB directory. In that directory around 30000
> files and directories. It will take more than 30 min to complete.

A large number of files consuming a large number of blocks will take a significant amount of time to process. That is all there is to it. Some filesystems are faster than others. What filesystem are you using? On what type of cpu? If you happen to be destroying an entire filesystem then you could simply unmount it and make a new filesystem on top of it.

> There is no other cpu intensive process running. After sometime it
> goes to D state and unable to kill that process.

If you have processes stuck in the D state (uninterruptible sleep) then something bad has happened. This would indicate a bug. It sounds like you are having kernel bugs. You may need to fsck your filesystems. I would double check that dma is enabled to your drives.

> I have also tried find with xargs method to remove. It will also take
> long time to complete
>
> find /directory | xargs rm -rf

I doubt the problem is in rm since it has already been optimized to be quite fast. The newer versions have even more optimization. But it isn't worth the trouble to do anything other than wait. Most of the time will be spent in the kernel organizing the now free blocks.

If you want to experiment you could try find.

  find /directory -depth -delete

That is basically the same as rm -rf but using find only.

Bob
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Bilal mk:
> I tried to remove a 5GB directory. In that directory around 30000
> files and directories. It will take more than 30 min to complete.
> There is no other cpu intensive process running. After sometime it
> goes to D state and unable to kill that process.

Removing that many files is I/O bound. The CPU doesn't play any significant role in it. The only way to speed things up is to either get faster storage (an SSD with a high IOPS value for random writing) or to try another filesystem. IIRC XFS is good at what you are doing. I cannot recommend it, though, as I don't have any recent experience with it.

J.
--
I wish I was gay. [Agree] [Disagree]
http://www.slowlydownward.com/NODATA/data_enter2.html
Re: rm -rf is too slow on large files and directory structure(Around 30000)
On 2/15/2012 12:55 AM, Jochen Spieker wrote:
> Bilal mk:
>> I tried to remove a 5GB directory. In that directory around 30000
>> files and directories. It will take more than 30 min to complete.
>> There is no other cpu intensive process running. After sometime it
>> goes to D state and unable to kill that process.
>
> Removing that many files is I/O bound.

This isn't correct. Removing a kernel source tree is fast, and it contains on the order of 4500 directories and 50K files. EXT4 can 'rm -rf' the kernel source in 2-3 seconds. XFS prior to delaylog could take a minute or two; with delaylog it's 4 seconds. So that's 2-4 seconds to remove a directory tree of 50K files. The OP's system is taking forever and then freezing. So if it's EXT4 he's using, this isn't an IO problem but a bug, or something else, maybe bad hardware.

> The CPU doesn't play any significant role in it.

CPU, and memory, play a very significant role here if the filesystem is XFS. Delayed logging takes all of the journal log writes and buffers them, so duplicate changes to the metadata are rolled up into a single physical IO. With enough metadata changes it becomes CPU bound. But we're talking lots of metadata if we have a modern fast CPU. I can't speak to EXTx behavior in this regard as I'm not familiar with it.

> The only way to speed things up is to either get faster storage (an
> SSD with a high IOPS value for random writing) or to try another
> filesystem. IIRC XFS is good at what you are doing. I cannot recommend
> it, though, as I don't have any recent experience with it.

XFS is absolutely *horrible* with this workload prior to kernel 2.6.35, when delayed logging was introduced. So if this is your workload, and you want XFS, you need mainline 2.6.35, better still 3.0.0.

--
Stan

Archive: http://lists.debian.org/4f3b5eca.4010...@hardwarefreak.com
Re: rm -rf is too slow on large files and directory structure(Around 30000)
Anyone heard of the unlink command?

On Wed, 15 Feb 2012, Stan Hoeppner wrote:
> XFS is absolutely *horrible* with this workload prior to kernel
> 2.6.35, when delayed logging was introduced. So if this is your
> workload, and you want XFS, you need mainline 2.6.35, better still
> 3.0.0.

Jude
jdashiel-at-shellworld-dot-net http://www.shellworld.net/~jdashiel/nj.html

Archive: http://lists.debian.org/alpine.bsf.2.01.1202150237390@freire1.furyyjbeyq.arg