On Thu, Mar 15, 2012 at 11:42:24AM +0100, Jacek Luczak wrote:
That was not an SVN server. It was a build host holding checkouts of SVN
projects.
The many files/dirs case is common for VCSes, and SVN is not the only
one that would be affected here.
Well, with SVN it's 2x or 3x the number of
2012/3/10 Ted Ts'o ty...@mit.edu:
Hey Jacek,
I'm curious about the parameters of the set of directories on your production
server. On an ext4 file system, assuming you've copied the
directories over, what is the result of this command pipeline when
you are cd'ed into the top of the directory
2012/3/11 Ted Ts'o ty...@mit.edu:
Well, my goal in proposing this optimization is that it helps for the
medium size directories in the cold cache case. The ext4 user who
first kicked off this thread was using his file system for an SVN
server, as I recall. I could easily believe that he has
On Tue, 13 Mar 2012, Ted Ts'o wrote:
On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
I think a format change would be preferable to runtime sorting.
Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a
On Wed, Mar 14, 2012 at 4:12 PM, Lukas Czerner lczer...@redhat.com wrote:
On Tue, 13 Mar 2012, Ted Ts'o wrote:
On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
I think a format change would be preferable to runtime sorting.
Are you volunteering to spearhead the design and
On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
I kind of like the idea about having the separate btree with inode
numbers for the directory reading, just because it does not affect
allocation policy nor the write performance which is a good thing. Also
it has been done before
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination inode isn't used to index
entries for a readdir tree because of (wait for it) hard links. You end
up
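To illustrate the hard-link point, here is a minimal Python sketch (not from the thread; the file names `a.txt` and `b.txt` and the helper names are hypothetical). It shows two entries in the same directory sharing one inode number, which is why an index keyed only by the inode number cannot tell the entries apart.

```python
import os
import tempfile


def inode_collisions(directory):
    """Group entry names by inode number and return groups with several names."""
    by_inode = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # entry.inode() is the inode number recorded in the directory entry.
            by_inode.setdefault(entry.inode(), []).append(entry.name)
    return {ino: sorted(names) for ino, names in by_inode.items() if len(names) > 1}


def demo():
    """Create a file plus a hard link to it and report the shared inode."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "a.txt")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, os.path.join(d, "b.txt"))  # second name, same inode
        return inode_collisions(d)
```

An index for readdir ordering would therefore need a key that stays unique per entry, for example the inode number combined with something entry-specific; that pairing is an assumption of this note, not a design stated in the thread.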
On 3/13/2012 5:33 PM, Ted Ts'o wrote:
Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...
Do you really think it is that much easier? Even if it is easier, it is
still an ugly
On Wed, 14 Mar 2012, Ted Ts'o wrote:
On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
I kind of like the idea about having the separate btree with inode
numbers for the directory reading, just because it does not affect
allocation policy nor the write performance which is a
On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote:
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination inode isn't used to index
entries for a
On Wed, Mar 14, 2012 at 10:28:20AM -0400, Phillip Susi wrote:
Do you really think it is that much easier? Even if it is easier,
it is still an ugly kludge. It would be much better to fix the
underlying problem rather than try to paper over it.
I don't think the choice is obvious. A
On Wed, Mar 14, 2012 at 03:34:13PM +0100, Lukas Czerner wrote:
You can make it be a RO_COMPAT change instead of an INCOMPAT change,
yes.
Does it have to be RO_COMPAT change though ? Since this would be both
forward and backward compatible.
The challenge is how do you notice if the file
On 03/14/2012 12:48 PM, Ted Ts'o wrote:
On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote:
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination
On Wed, Mar 14, 2012 at 08:50:02AM -0400, Ted Ts'o wrote:
On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
I kind of like the idea about having the separate btree with inode
numbers for the directory reading, just because it does not affect
allocation policy nor the write
On 3/9/2012 11:48 PM, Ted Ts'o wrote:
I suspect the best optimization for now is probably something like
this:
1) Since the vast majority of directories are less than (say) 256k
(this would be a tunable value), for directories which are less than
this threshold size, the entire directory is
On Tue, Mar 13, 2012 at 03:05:59PM -0400, Phillip Susi wrote:
Why not just separate the hash table from the conventional, mostly
in inode order directory entries? For instance, the first 200k of
the directory could be the normal entries that would tend to be in
inode order ( and e2fsck -D
On 3/13/2012 3:53 PM, Ted Ts'o wrote:
Because that would be a format change.
I think a format change would be preferable to runtime sorting.
What we have today is not a hash table; it's a hashed tree, where we
use a fixed-length key for the tree based on the hash of the file
name. Currently
On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
I think a format change would be preferable to runtime sorting.
Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...
The
What if we use inode number as the hash value? Does it work?
Yongqiang.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 14, 2012 at 10:48:17AM +0800, Yongqiang Yang wrote:
What if we use inode number as the hash value? Does it work?
The whole point of using the tree structure is to accelerate filename
-> inode number lookups. So the namei lookup doesn't have the inode
number; the whole point is to
On 2012-03-09, at 9:48 PM, Ted Ts'o wrote:
On Fri, Mar 09, 2012 at 04:09:43PM -0800, Andreas Dilger wrote:
Just reading this on the plane, so I can't find the exact reference
that I want, but a solution to this problem with htree was discussed
a few years ago between myself and Coly Li.
On Sun, Mar 11, 2012 at 04:30:37AM -0600, Andreas Dilger wrote:
if the userspace process could
feed us the exact set of filenames that will be used in the directory,
plus the exact file sizes for each of the file names...
Except POSIX doesn't allow anything close to this at all.
On Fri, Mar 09, 2012 at 12:29:29PM +0100, Lukas Czerner wrote:
Hi,
I have created a simple script which creates a bunch of files with
random names in the directory and then performs operation like list,
tar, find, copy and remove. I have run it for ext4, xfs and btrfs with
the 4k size
On Fri, Mar 09, 2012 at 04:09:43PM -0800, Andreas Dilger wrote:
I have also run the correlation.py from Phillip Susi on directory with
10 4k files and indeed the name to block correlation in ext4 is pretty
much random :)
Just reading this on the plane, so I can't find the exact
On 2/29/2012 11:44 PM, Theodore Tso wrote:
You might try sorting the entries returned by readdir by inode number
before you stat them. This is a long-standing weakness in
ext3/ext4, and it has to do with how we added hashed tree indexes to
directories in (a) a backwards compatible way, that
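The suggestion above (sort what readdir returns by inode number, then stat) can be sketched in userspace roughly as follows. This is a minimal Python illustration, not code from the thread; the function name `stat_sorted_by_inode` is hypothetical, and it assumes the directory fits comfortably in memory.

```python
import os


def stat_sorted_by_inode(directory):
    """Return (name, stat_result) pairs, statting entries in ascending inode order."""
    with os.scandir(directory) as it:
        # On Linux, DirEntry.inode() comes from the directory entry itself,
        # so sorting does not require a stat() call per file.
        entries = sorted(it, key=lambda e: e.inode())
    # The stat calls now walk the inode table in roughly ascending order
    # instead of in the directory's hash order.
    return [(e.name, e.stat(follow_symlinks=False)) for e in entries]
```

Whether this helps depends on the file system and on whether the cache is cold; the sketch only shows the ordering step.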
2012/3/4 Jacek Luczak difrost.ker...@gmail.com:
2012/3/3 Jacek Luczak difrost.ker...@gmail.com:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 11:05:56AM +0100,
On Fri 02-03-12 14:32:15, Ted Tso wrote:
On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
It would be interesting to have a project where someone added
fallocate() support into libelf, and then added some heuristics into
ext4 so that if a file is fallocated to a precise size, or if
On Mon, Mar 05, 2012 at 12:32:45PM +0100, Jacek Luczak wrote:
2012/3/4 Jacek Luczak difrost.ker...@gmail.com:
2012/3/3 Jacek Luczak difrost.ker...@gmail.com:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason
2012/3/3 Jacek Luczak difrost.ker...@gmail.com:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
I've put both to the test.
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
I've put both to the test. The subjects are acp and spd_readdir used with
tar,
2012/3/1 Ted Ts'o ty...@mit.edu:
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
Yep, ext4 is close to my wife's closet.
Were all of the file systems freshly laid down, or was this an aged
ext4 file system?
Always fresh, recreated for each test - that's why it takes quite
2012/3/1 Chris Mason chris.ma...@oracle.com:
On Wed, Feb 29, 2012 at 11:44:31PM -0500, Theodore Tso wrote:
You might try sorting the entries returned by readdir by inode number before
you stat them. This is a long-standing weakness in ext3/ext4, and it has
to do with how we added hashed
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
I've put both to the test. The subjects are acp and spd_readdir used with
tar, all on ext4:
1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
2) spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_ext4_readir.png
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
I've put both to the test. The subjects are acp and spd_readdir used with
tar, all on ext4:
1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
2) spd_readdir:
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason chris.ma...@oracle.com:
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
I've put both to the test. The subjects are acp and spd_readdir used with
tar, all on ext4:
1) acp:
On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
filefrag will tell you how many extents each file has, any file with
more than one extent is interesting. (The ext4 crowd may have better
suggestions on measuring fragmentation).
You can get a *huge* amount of information
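As a rough illustration of the filefrag suggestion, the Python sketch below runs `filefrag` on a list of paths and keeps the files reported with more than one extent. It is not from the thread: the helper names are hypothetical, and it assumes filefrag's default summary line of the form `path: N extents found`.

```python
import re
import subprocess

# Assumed summary format: "<path>: <N> extent(s) found".
_EXTENTS_RE = re.compile(r"^(?P<path>.*): (?P<count>\d+) extents? found$")


def parse_filefrag_line(line):
    """Return (path, extent_count) for a summary line, or None if it does not match."""
    m = _EXTENTS_RE.match(line.strip())
    if m is None:
        return None
    return m.group("path"), int(m.group("count"))


def fragmented_files(paths):
    """Run filefrag on the given paths and return {path: extents} where extents > 1."""
    proc = subprocess.run(
        ["filefrag", *paths], capture_output=True, text=True, check=False
    )
    result = {}
    for line in proc.stdout.splitlines():
        parsed = parse_filefrag_line(line)
        if parsed is not None and parsed[1] > 1:
            result[parsed[0]] = parsed[1]
    return result
```

Calling `fragmented_files` requires `filefrag` from e2fsprogs to be installed; the parser can be used on saved output without it.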
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
2012/2/29 Chris Mason chris.ma...@oracle.com:
On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
[ btrfs faster than ext for find and cp -a ]
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
I will try to answer the question from
On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak difrost.ker...@gmail.com wrote:
While I was about to grab acp I noticed seekwatcher, which made my day :)
seekwatcher run of tar cf to eliminate writes (all done on 3.2.7):
1) btrfs: http://dozzie.jarowit.net/~dozzie/luczajac/tar_btrfs.png
2)
2012/3/1 Hillf Danton dhi...@gmail.com:
On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak difrost.ker...@gmail.com wrote:
While I was about to grab acp I noticed seekwatcher, which made my day :)
seekwatcher run of tar cf to eliminate writes (all done on 3.2.7):
1) btrfs:
On Thu, Mar 01, 2012 at 03:03:53PM +0100, Jacek Luczak wrote:
2012/3/1 Hillf Danton dhi...@gmail.com:
On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak difrost.ker...@gmail.com
wrote:
While I was about to grab acp I noticed seekwatcher, which made my day :)
seekwatcher run of tar cf to
On Wed, Feb 29, 2012 at 11:44:31PM -0500, Theodore Tso wrote:
You might try sorting the entries returned by readdir by inode number before
you stat them. This is a long-standing weakness in ext3/ext4, and it has
to do with how we added hashed tree indexes to directories in (a) a backwards
2012/3/1 Chris Mason chris.ma...@oracle.com:
On Thu, Mar 01, 2012 at 03:03:53PM +0100, Jacek Luczak wrote:
2012/3/1 Hillf Danton dhi...@gmail.com:
On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak difrost.ker...@gmail.com
wrote:
While I was about to grab acp I noticed seekwatcher, which
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
2012/3/1 Chris Mason chris.ma...@oracle.com:
XFS will probably beat btrfs in this test. Their directory indexes
reflect on disk layout very well.
True, but not that fast on small files.
Except the question I've raised in
2012/3/1 Chris Mason chris.ma...@oracle.com:
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
2012/3/1 Chris Mason chris.ma...@oracle.com:
XFS will probably beat btrfs in this test. Their directory indexes
reflect on disk layout very well.
True, but not that fast on small
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
Yep, ext4 is close to my wife's closet.
Were all of the file systems freshly laid down, or was this an aged
ext4 file system?
Also you should beware that if you have a workload which is heavy
parallel I/O, with lots of random,
On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
Hi All,
Long story short: We've found that operations on a directory structure
holding many dirs take ages on ext4.
The Question: Why is there such a huge difference between ext4 and btrfs? See
below test results for real values.
Hi Chris,
the last one was borked :) Please check this one.
-jacek
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
Hi All,
/*Sorry for sending incomplete email, hit wrong button :) I guess I
can't use Gmail */
Long story short: We've found that operations on a directory structure
On Wed, 29 Feb 2012, Chris Mason wrote:
On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
Hi All,
Long story short: We've found that operations on a directory structure
holding many dirs take ages on ext4.
The Question: Why is there such a huge difference between ext4 and
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
Hi Chris,
the last one was borked :) Please check this one.
-jacek
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
Hi All,
/*Sorry for sending incomplete email, hit wrong button :) I guess I
can't use Gmail */
Long story short: We've
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
Hi Chris,
the last one was borked :) Please check this one.
-jacek
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
Hi All,
/*Sorry for sending incomplete email, hit wrong button :) I guess
On Wed, Feb 29, 2012 at 08:51:58AM -0500, Chris Mason wrote:
On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
Ext4 results:
| Type     | 2.6.39.4-3 | 3.2.7
| Dir cnt  | 17m 40sec  | 11m 20sec
| File cnt | 17m 36sec  | 11m 22sec
| Copy     | 1h 28m     | 1h 27m
|
On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
[ btrfs faster than ext for find and cp -a ]
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
I will try to answer the question from the broken email I've sent.
@Lukas, it was always a fresh FS on top of an LVM logical volume.
2012/2/29 Chris Mason chris.ma...@oracle.com:
On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
[ btrfs faster than ext for find and cp -a ]
2012/2/29 Jacek Luczak difrost.ker...@gmail.com:
I will try to answer the question from the broken email I've sent.
@Lukas, it was
You might try sorting the entries returned by readdir by inode number before
you stat them. This is a long-standing weakness in ext3/ext4, and it has to
do with how we added hashed tree indexes to directories in (a) a backwards
compatible way, that (b) was POSIX compliant with respect to