Re: [gentoo-user] Speed up `du'

2008-05-25 Thread Stroller


On 25 May 2008, at 03:56, Hemmann, Volker Armin wrote:

> ...
> reiserfs and xfs your barriers by default.


This sentence no parse.

Stroller.
--
gentoo-user@lists.gentoo.org mailing list



[gentoo-user] Speed up `du'

2008-05-24 Thread reader
Is there any way to speed up the du command?  I mean short of having
cron run it on target directories and store results.  (not really
speeding up but at least not having to wait for a result)

I've seen various mentions of du being slow but don't recall any
mentions of how to speed it up.

I use Reiserfs with default sizes.  In some situations, like a large
cache of nntp messages of several GB, I might wait 5-10 minutes or more
for du to get the size of the directory.

Are there other file systems that can return a result of `du' faster?

I'm curious how `df' computes sizes so much quicker.  Even after
rm'ing a large amount of data... `df' sees the difference right away.

Or maybe there is some other tool or technique that can quickly tell
me the size of a directory or set of directories.




Re: [gentoo-user] Speed up `du'

2008-05-24 Thread Allan Gottlieb
At Sat, 24 May 2008 16:49:09 -0500 [EMAIL PROTECTED] wrote:

> Is there any way to speed up the du command?  I mean short of having
> cron run it on target directories and store results.  (not really
> speeding up but at least not having to wait for a result)
>
> I've seen various mentions of du being slow but don't recall any
> mentions of how to speed it up.
>
> I use Reiserfs with default sizes.  In some situations, like a large
> cache of nntp messages of several GB, I might wait 5-10 minutes or more
> for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker.  Even after
> rm'ing a large amount of data... `df' sees the difference right away.

I can't help with speeding up du, but can explain df's speed.
The totals df reports are kept in the superblock.  Each operation that
changes size updates the superblock and df just reads the result.
(In a sense it is like your cron solution above for du :-) .)
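
To make the point above concrete, here is a minimal sketch in Python,
assuming a Unix-like system. It asks the kernel for the filesystem-wide
counters through statvfs(), which is a single call no matter how many
files exist. The function name filesystem_usage is made up for
illustration; this is not df's actual source.

```python
import os


def filesystem_usage(path="/"):
    """Return (total, used, available) in bytes for the filesystem holding path.

    One statvfs() call reads counters the filesystem already maintains,
    so the cost does not depend on the number of files.
    """
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize      # size of the whole filesystem
    free = st.f_bfree * st.f_frsize        # free space, including root-reserved blocks
    available = st.f_bavail * st.f_frsize  # free space usable by unprivileged users
    used = total - free
    return total, used, available


if __name__ == "__main__":
    total, used, available = filesystem_usage("/")
    print(f"total={total} used={used} available={available}")
```

Note that this only answers "how full is the filesystem", never "how big
is this directory", which is why it cannot replace du.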

> Or maybe there is some other tool or technique that can quickly tell
> me the size of a directory or set of directories.

allan



Re: [gentoo-user] Speed up `du'

2008-05-24 Thread Willie Wong
On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover [EMAIL PROTECTED] 
squawked:
> Is there any way to speed up the du command?  I mean short of having
> cron run it on target directories and store results.  (not really
> speeding up but at least not having to wait for a result)
>
> I've seen various mentions of du being slow but don't recall any
> mentions of how to speed it up.
>
> I use Reiserfs with default sizes.  In some situations, like a large
> cache of nntp messages of several GB, I might wait 5-10 minutes or more
> for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker.  Even after
> rm'ing a large amount of data... `df' sees the difference right away.
>
> Or maybe there is some other tool or technique that can quickly tell
> me the size of a directory or set of directories.

I am pretty sure the problem with du is that it actually looks,
recursively, at every single file and computes the size that way. So
the time you have to wait is mostly due to disk IO (and caching would
also explain why if you run du twice in a row the answer returns much
more quickly). So, if you know what the bottleneck directory is (for
example, the directory of nntp messages), the tricks in 

 http://gentoo-wiki.com/TIP_Speeding_up_portage

should probably work just as well. 
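
To illustrate why the walk is expensive, here is a rough sketch in
Python of the kind of traversal I am describing. It assumes a Unix-like
system where st_blocks counts 512-byte units; the name disk_usage is
made up, and the real du handles many more cases (options, mount
points, error reporting).

```python
import os
import stat


def disk_usage(root):
    """Sum the allocated bytes under root, roughly in the spirit of du.

    Every file and directory costs one lstat() call, which is where the
    disk I/O goes when the inodes are not already cached.
    """
    seen = set()  # (device, inode) pairs, so hard-linked files count once
    total = 0
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.lstat(path)  # do not follow symlinks
        except OSError:
            continue  # vanished or unreadable entry: skip it
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue
        seen.add(key)
        total += st.st_blocks * 512  # allocated size, not apparent size
        if stat.S_ISDIR(st.st_mode):
            try:
                with os.scandir(path) as entries:
                    stack.extend(entry.path for entry in entries)
            except OSError:
                pass  # directory we are not allowed to read
    return total


if __name__ == "__main__":
    print(disk_usage("."))
```

The work grows with the number of entries rather than with the amount
of data, so a spool of many small messages is close to the worst case.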

HTH, 

W
-- 
"You're very sure of your facts," he said at last, "I
couldn't trust the thinking of a man who takes the Universe
- if there is one - for granted."
Sortir en Pantoufles: up 533 days, 21:55



Re: [gentoo-user] Speed up `du'

2008-05-24 Thread Stroller


On 25 May 2008, at 00:24, Willie Wong wrote:

> On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover
> [EMAIL PROTECTED] squawked:
>> ...
>> I use Reiserfs with default sizes.  In some situations, like a large
>> cache of nntp messages of several GB, I might wait 5-10 minutes or
>> more for du to get the size of the directory.
>
> I am pretty sure the problem with du is that it actually looks,
> recursively, at every single file and computes the size that way.


What he said.


>> Or maybe there is some other tool or technique that can quickly tell
>> me the size of a directory or set of directories.


Keep all the files in a honkin' big tarball.
:P
If you need to read these files on the fly then I'm afraid you'll
have to write a kernel filesystem extension (or find one?) that will
read them out of the tar file, slowing all read & write actions down.
But, hey, `du` on the tarball will complete in no time at all!! ;)


In seriousness, another thing to do is keep these files on a separate
partition, if you can. Basically a user's ~ which includes
both .maildir and "My HiDef Videos" is non-optimal.



>> Are there other file systems that can return a result of `du' faster?



All filesystems have their advantages & disadvantages.

http://www.debian-administration.org/articles/388
Reading the above I _think_ the test most similar in function to
running `du` on many small files is the "Directory listing and file
search into the previous file tree" test, at which ReiserFS is fastest.


I need to look into this myself soon, to try & get the best speed on a
3 GB corpus of email. I was expecting EXT3 to be best - when you
create the filesystem you can specify the blocksize. It's possible
that the author of the filesystems comparison could have chosen
options when formatting his EXT3 disk that affected the speed of the
results - a journal would make writes slower, for instance (not sure
about reads).


Stroller.



Re: [gentoo-user] Speed up `du'

2008-05-24 Thread Hemmann, Volker Armin
On Sunday, 25 May 2008, Stroller wrote:
> On 25 May 2008, at 00:24, Willie Wong wrote:
>> On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover
>> [EMAIL PROTECTED] squawked:
>>> ...
>>> I use Reiserfs with default sizes.  In some situations, like a large
>>> cache of nntp messages of several GB, I might wait 5-10 minutes
>>> or more for du to get the size of the directory.
>>
>> I am pretty sure the problem with du is that it actually looks,
>> recursively, at every single file and computes the size that way.
>
> What he said.
>
>>> Or maybe there is some other tool or technique that can quickly tell
>>> me the size of a directory or set of directories.
>
> Keep all the files in a honkin' big tarball.
>
> :P
>
> If you need to read these files on the fly then I'm afraid you'll
> have to write a kernel filesystem extension (or find one?) that will
> read them out of the tar file, slowing all read & write actions down.
> But, hey, `du` on the tarball will complete in no time at all!! ;)
>
> In seriousness, another thing to do is keep these files on a separate
> partition, if you can. Basically a user's ~ which includes
> both .maildir and "My HiDef Videos" is non-optimal.
>
>>> Are there other file systems that can return a result of `du' faster?
>
> All filesystems have their advantages & disadvantages.
>
> http://www.debian-administration.org/articles/388

one thing the article does not mention:

reiserfs and xfs your barriers by default.

ext3 does not. And if you turn on barriers (as a mount option) you lose 30% of
its speed.

Of course, if you care about data integrity, LVM is ruled out too - for the 
same reason.

So if you care about data integrity and speed at the same time, ext3 is ruled
out.  XFS is broken on a monthly basis (just search the lkml archives for
xfs. It is sickening). That leaves reiserfs as the only sane choice.