Re: ChunkFS - measuring cross-chunk references

2007-05-19 Thread Karuna sagar K

On 4/25/07, Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> On Wed, Apr 25, 2007 at 05:50:55AM +0530, Karuna sagar K wrote:
>
> One more set of numbers to calculate would be an estimate of cross-references
> across chunks of block groups -- 1 (=128MB), 2 (=256MB), 4 (=512MB), 8 (=1GB)
> as suggested by Kalpak.



Here is the tool to make such calculations.

Result of running the tool on the / partition of an ext3 file system (each
chunk is 4 block groups):

./cref.sh /dev/hda1 dmp /mnt/test 4

---
Number of files = 221763
Number of directories = 24456
Total size = 8193116 KB
Total data stored = 7179200 KB
Size of block groups = 131072 KB
Number of inodes per block group = 16288
Chunk size = 524288 KB
No. of cross references between directories and sub-directories = 869
No. of cross references between directories and file = 584
Total no. of cross references = 13806 (dir ref = 1453, file ref = 12353)
---
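
As a quick sanity check on these figures, and a rough comparison with the 2007-04-24 run elsewhere in this thread (one block group per chunk), the arithmetic can be written out as in the sketch below. The dictionary layout and key names are made up for illustration and are not output of cref.sh. The two runs were taken on slightly different snapshots (221794 vs 221763 files), so the ratio is indicative only.

```python
# Figures copied from the two result listings in this thread.
# The dict layout and key names are illustrative, not cref.sh output.
BLOCK_GROUP_KB = 131072

runs = {
    # block groups per chunk -> reported counts
    1: {"dir_subdir": 7791, "dir_file": 657, "file_ref": 53570, "total": 62018},
    4: {"dir_subdir": 869, "dir_file": 584, "file_ref": 12353, "total": 13806},
}

for groups_per_chunk, r in runs.items():
    dir_ref = r["dir_subdir"] + r["dir_file"]
    # The reported total is the sum of directory and file references.
    assert dir_ref + r["file_ref"] == r["total"]
    chunk_kb = groups_per_chunk * BLOCK_GROUP_KB
    print(f"chunk = {chunk_kb} KB: dir ref = {dir_ref}, total = {r['total']}")

# How much the total shrinks when going from 1 to 4 block groups per chunk.
reduction = runs[1]["total"] / runs[4]["total"]
print(f"total cross references drop by a factor of about {reduction:.1f}")
```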



> Once we have that, it would be nice if we can get data on results with
> the tool from other people, especially with larger filesystem sizes.




Thanks,
Karuna


cref.tar.bz2
Description: BZip2 compressed data


Re: ChunkFS - measuring cross-chunk references

2007-05-06 Thread Valerie Henson
On Mon, Apr 23, 2007 at 08:13:06PM -0400, Theodore Tso wrote:
> 
> There may also be special things we will need to do to handle
> scenarios such as BackupPC, where if it looks like a directory
> contains a huge number of hard links to a particular chunk, we'll need
> to make sure that directory is either created in the right chunk
> (possibly with hints from the application) or migrated to the right
> chunk (but this might cause the inode number of the directory to
> change --- maybe we allow this as long as the directory has never been
> stat'ed, so that the inode number has never been observed).

Yeah, this is an oddball but real case.  What are the consequences of
inode number changing - increased backup bandwidth?  It seems like it
would have the same effect as "cp -a dir tmp; rm -rf dir; mv tmp dir",
which is certainly legal (and a good way to defragment subtrees).

> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow
> it on 64-bit systems, or we need to consider a migration so that even
> on 32-bit platforms, stat() functions like stat64(), insofar that it
> uses a stat structure which returns a 64-bit ino_t.

A 32-bit inode space probably won't be that hard to do for chunkfs,
although it would limit total file system size.  This problem needs to
be solved in general, I'm afraid - 4 billion inodes is just not that
many now.

-VAL
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ChunkFS - measuring cross-chunk references

2007-05-06 Thread Valerie Henson
On Mon, Apr 23, 2007 at 02:05:47AM +0530, Karuna sagar K wrote:
> Hi,
> 
> The attached code contains a program to estimate the cross-chunk
> references for the ChunkFS file system (idea from Valh). Below are the
> results:

Nice work!  Thank you very much for doing this!

-VAL


Re: ChunkFS - measuring cross-chunk references

2007-05-06 Thread Valerie Henson
On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
> 
> Also, is it considered a cross-chunk reference if a directory entry is
> referencing an inode in another group?  Should there be a continuation
> inode in the local group, or is the directory entry itself enough?

(Sorry for the delay; just moved to Portland these last couple of
weeks.)

It is a cross-chunk reference - we can't calculate the correct link
count for the target file unless we have a quick way to get all the
directory entries pointing to an inode.  My current scheme is to
create a continuation inode for the directory in the chunk containing
the inode (if the chunk containing the inode is full, create new
continuation inodes for both in a new chunk).

-VAL


Re: ChunkFS - measuring cross-chunk references

2007-04-25 Thread Suparna Bhattacharya
On Wed, Apr 25, 2007 at 05:50:55AM +0530, Karuna sagar K wrote:
> On 4/24/07, Theodore Tso <[EMAIL PROTECTED]> wrote:
> >On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
> .
> >It would also be good to distinguish between directories referencing
> >files in another chunk, and directories referencing subdirectories in
> >another chunk (which would be simpler to handle, given the topological
> >restrictions on directories, as compared to files and hard links).
> >
> 
> Modified the tool to distinguish between
> 1. cross references between directories and files
> 2. cross references between directories and sub directories
> 3. cross references within a file (due to huge file size)

One more set of numbers to calculate would be an estimate of cross-references
across chunks of block groups -- 1 (=128MB), 2 (=256MB), 4 (=512MB), 8(=1GB)
as suggested by Kalpak.

Once we have that, it would be nice if we can get data on results with
the tool from other people, especially with larger filesystem sizes.

Regards
Suparna

> 
> Below is the result from the / partition of an ext3 file system:
> 
> Number of files = 221794
> Number of directories = 24457
> Total size = 8193116 KB
> Total data stored = 7187392 KB
> Size of block groups = 131072 KB
> Number of inodes per block group = 16288
> No. of cross references between directories and sub-directories = 7791
> No. of cross references between directories and file = 657
> Total no. of cross references = 62018 (dir ref = 8448, file ref = 53570)
> 
> Thanks for the suggestions.
> 
> >There may also be special things we will need to do to handle
> >scenarios such as BackupPC, where if it looks like a directory
> >contains a huge number of hard links to a particular chunk, we'll need
> >to make sure that directory is either created in the right chunk
> >(possibly with hints from the application) or migrated to the right
> >chunk (but this might cause the inode number of the directory to
> >change --- maybe we allow this as long as the directory has never been
> >stat'ed, so that the inode number has never been observed).
> >
> >The other thing which we should consider is that chunkfs really
> >requires a 64-bit inode number space, which means either we only allow
> >it on 64-bit systems, or we need to consider a migration so that even
> >on 32-bit platforms, stat() functions like stat64(), insofar that it
> >uses a stat structure which returns a 64-bit ino_t.
> >
> >   - Ted
> >
> 
> 
> Thanks,
> Karuna



-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India



Re: ChunkFS - measuring cross-chunk references

2007-04-24 Thread Karuna sagar K

On 4/24/07, Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
.
> It would also be good to distinguish between directories referencing
> files in another chunk, and directories referencing subdirectories in
> another chunk (which would be simpler to handle, given the topological
> restrictions on directories, as compared to files and hard links).



Modified the tool to distinguish between
1. cross references between directories and files
2. cross references between directories and sub directories
3. cross references within a file (due to huge file size)

Below is the result from the / partition of an ext3 file system:

Number of files = 221794
Number of directories = 24457
Total size = 8193116 KB
Total data stored = 7187392 KB
Size of block groups = 131072 KB
Number of inodes per block group = 16288
No. of cross references between directories and sub-directories = 7791
No. of cross references between directories and file = 657
Total no. of cross references = 62018 (dir ref = 8448, file ref = 53570)

Thanks for the suggestions.


> There may also be special things we will need to do to handle
> scenarios such as BackupPC, where if it looks like a directory
> contains a huge number of hard links to a particular chunk, we'll need
> to make sure that directory is either created in the right chunk
> (possibly with hints from the application) or migrated to the right
> chunk (but this might cause the inode number of the directory to
> change --- maybe we allow this as long as the directory has never been
> stat'ed, so that the inode number has never been observed).
>
> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow
> it on 64-bit systems, or we need to consider a migration so that even
> on 32-bit platforms, stat() functions like stat64(), insofar that it
> uses a stat structure which returns a 64-bit ino_t.
>
>    - Ted




Thanks,
Karuna


cref.tar.bz2
Description: BZip2 compressed data


Re: ChunkFS - measuring cross-chunk references

2007-04-24 Thread Theodore Tso
On Mon, Apr 23, 2007 at 06:02:29PM -0700, Arjan van de Ven wrote:
> 
> > The other thing which we should consider is that chunkfs really
> > requires a 64-bit inode number space, which means either we only allow
> 
> does it?
> I'd think it needs a "chunk space" number and a 32 bit local inode
> number ;) (same for blocks)
> 

But that means that the number which gets exported to userspace via
the stat system call will need more than 32 bits worth of ino_t

- Ted



Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Amit Gud

On Mon, 23 Apr 2007, Amit Gud wrote:

> On Mon, 23 Apr 2007, Arjan van de Ven wrote:
>
> > > The other thing which we should consider is that chunkfs really
> > > requires a 64-bit inode number space, which means either we only allow
> >
> > does it?
> > I'd think it needs a "chunk space" number and a 32 bit local inode
> > number ;) (same for blocks)
>
> For inodes, yes: either a 64-bit inode number or some field for the id of
> the chunk in which the inode lives. But for block numbers, you don't,
> because individual chunks manage their part of the whole file system
> independently. Their block bitmaps start at an offset. Inode bitmaps,
> however, remain the same.

In that sense, we can also do without encoding the chunk identifier
into the inode number, and chunkfs would still be fine with it. But we would
then lose the inode uniqueness property, which could well be OK, as it is with
other file systems in which the inode number is not sufficient for unique
identification of an inode.



AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud


Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Amit Gud

On Mon, 23 Apr 2007, Arjan van de Ven wrote:

> > The other thing which we should consider is that chunkfs really
> > requires a 64-bit inode number space, which means either we only allow
>
> does it?
> I'd think it needs a "chunk space" number and a 32 bit local inode
> number ;) (same for blocks)

For inodes, yes: either a 64-bit inode number or some field for the id of
the chunk in which the inode lives. But for block numbers, you don't,
because individual chunks manage their part of the whole file system
independently. Their block bitmaps start at an offset. Inode bitmaps,
however, remain the same.



AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud


Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Arjan van de Ven

> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow

does it?
I'd think it needs a "chunk space" number and a 32 bit local inode
number ;) (same for blocks)



Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Theodore Tso
On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
> > With a blocksize of 4KB, a block group would be 128 MB. In the original
> > Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be
> > possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size
> > increases the number of cross-chunk references will reduce and hence it
> > might be a good idea to present these statistics considering different
> > chunk sizes starting from 512MB up to 2GB.
> 
> Also, given that cross-chunk references will be more expensive to fix, I
> can imagine the allocation policy for chunkfs will try to avoid this if
> possible, further reducing the number of cross-chunk inodes.  I guess it
> should be more clear whether the cross-chunk references are due to inode
> block references, or because of e.g. directories referencing inodes in
> another chunk.

It would also be good to distinguish between directories referencing
files in another chunk, and directories referencing subdirectories in
another chunk (which would be simpler to handle, given the topological
restrictions on directories, as compared to files and hard links).

There may also be special things we will need to do to handle
scenarios such as BackupPC, where if it looks like a directory
contains a huge number of hard links to a particular chunk, we'll need
to make sure that directory is either created in the right chunk
(possibly with hints from the application) or migrated to the right
chunk (but this might cause the inode number of the directory to
change --- maybe we allow this as long as the directory has never been
stat'ed, so that the inode number has never been observed).

The other thing which we should consider is that chunkfs really
requires a 64-bit inode number space, which means either we only allow
it on 64-bit systems, or we need to consider a migration so that even
on 32-bit platforms, stat() functions like stat64(), insofar that it
uses a stat structure which returns a 64-bit ino_t.

- Ted


Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Andreas Dilger
On Apr 23, 2007  15:04 +0530, Kalpak Shah wrote:
> On Mon, 2007-04-23 at 12:49 +0530, Karuna sagar K wrote:
> > The tool estimates the cross-chunk references from an ext2/3 file
> > system. It considers a block group as one chunk and calculates how many
> > block groups each file spans. So, the block group size gives
> > the estimate of chunk size.
> > 
> > The file systems were aged for about 3-4 months on a developer's laptop.
> 
> With a blocksize of 4KB, a block group would be 128 MB. In the original
> Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be
> possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size
> increases the number of cross-chunk references will reduce and hence it
> might be a good idea to present these statistics considering different
> chunk sizes starting from 512MB up to 2GB.

Also, given that cross-chunk references will be more expensive to fix, I
can imagine the allocation policy for chunkfs will try to avoid this if
possible, further reducing the number of cross-chunk inodes.  I guess it
should be more clear whether the cross-chunk references are due to inode
block references, or because of e.g. directories referencing inodes in
another chunk.

Also, is it considered a cross-chunk reference if a directory entry is
referencing an inode in another group?  Should there be a continuation
inode in the local group, or is the directory entry itself enough?

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Kalpak Shah
On Mon, 2007-04-23 at 12:49 +0530, Karuna sagar K wrote:
> Hi,
> 
> The tool estimates the cross-chunk references from an ext2/3 file
> system. It considers a block group as one chunk and calculates how many
> block groups each file spans. So, the block group size gives
> the estimate of chunk size.
> 
> The file systems were aged for about 3-4 months on a developer's laptop.
> 

With a blocksize of 4KB, a block group would be 128 MB. In the original
Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be
possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size
increases the number of cross-chunk references will reduce and hence it
might be a good idea to present these statistics considering different
chunk sizes starting from 512MB up to 2GB.
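
For reference, the 128 MB figure follows from ext2/3 using a single bitmap block per block group, so a full group holds 8 x blocksize blocks. A small sketch of that arithmetic, with the chunk sizes mentioned here expressed as multiples of a block group (the function name is made up for illustration, and it assumes the bitmap block is fully used, as with default mke2fs settings):

```python
MB = 1024 * 1024


def block_group_bytes(block_size):
    """Size of a full ext2/3 block group for a given block size in bytes."""
    # One bitmap block has block_size * 8 bits, one bit per block in the group.
    blocks_per_group = block_size * 8
    return blocks_per_group * block_size


group = block_group_bytes(4096)
print(group // MB, "MB per block group with 4KB blocks")

# Chunk sizes discussed in the thread, as a number of block groups.
for chunk_mb in (512, 1024, 2048):
    print(chunk_mb, "MB chunk =", (chunk_mb * MB) // group, "block groups")
```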

Thanks,
Kalpak.



Re: ChunkFS - measuring cross-chunk references

2007-04-23 Thread Karuna sagar K

Hi,

The tool estimates the cross-chunk references from an ext2/3 file
system. It considers a block group as one chunk and calculates how many
block groups each file spans. So, the block group size gives
the estimate of chunk size.

The file systems were aged for about 3-4 months on a developer's laptop.

I should have given the background before. Below is an explanation of
the tool. Valh and others came up with this idea.

-
Chunkfs will only work if we have "few" cross-chunk references.  We
can estimate the effect of chunk size on the number of these
references using an existing ext2/3 file system and treating the block
groups as though they are chunks.  The basic idea is that we figure
out what the block group boundaries are and then find out which files
and directories span two or more block groups.

Step 1:
---

Get a real-world ext2/3 file system. A file system which has been in
use is required. One from a laptop or a server of any sort will do
fine.

Step 2:
---

Figure out where the block group boundaries are on disk. Two things
are to be known:

1. Which inode numbers are in which block group?
2. Which blocks are in which block group?

At the end of this step we should have a list that looks something like:

Block group 1: Inodes 11-343, blocks 1000-2
Block group 2: Inodes 344-576, blocks 2-4
[...]
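
On ext2/3 both mappings are plain arithmetic on superblock fields, so the list does not have to be stored explicitly. A minimal sketch, assuming the inodes-per-group, blocks-per-group and first-data-block values have already been read from the superblock (for example from dumpe2fs output). The function names are made up for illustration, and groups are numbered from 0 here:

```python
def inode_to_group(ino, inodes_per_group):
    """Block group holding inode `ino` (ext2/3 inode numbers start at 1)."""
    return (ino - 1) // inodes_per_group


def block_to_group(block, blocks_per_group, first_data_block=0):
    """Block group holding `block`.

    first_data_block is 1 on file systems with 1KB blocks and 0 otherwise.
    """
    return (block - first_data_block) // blocks_per_group


# Example values: 16288 inodes per group as reported in this thread, and
# 32768 blocks per group (a 131072 KB group, assuming 4KB blocks).
print(inode_to_group(16288, 16288))   # last inode of group 0
print(inode_to_group(16289, 16288))   # first inode of group 1
print(block_to_group(32768, 32768))   # first block of group 1
```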

Step 3:
---

For each file, get the inode number and use mapping from step 2 to
figure out which block group it is in.  Now use bmap() on each block
in the file, and find out the block number.  Use mapping from step 2
to figure out which block groups it has data in. For each file, record
the list of all block groups.

For each directory, get the inode number and map that to a block
group. Then get the inode numbers of all entries in the directory
(ignore symlinks) and map them to a block group.  For each directory,
record the list of all block groups.
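
The directory half of this step needs only inode numbers, which can be read without special privileges. A rough sketch using Python's os.scandir follows. It is not the attached tool, the function name is hypothetical, and the per-block bmap() lookup for regular files is left out because it needs a privileged ioctl.

```python
import os


def directory_groups(path, inodes_per_group):
    """Return the set of block groups used by a directory and its entries."""
    # Start with the block group of the directory's own inode.
    groups = {(os.stat(path).st_ino - 1) // inodes_per_group}
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_symlink():
                continue  # symlinks are ignored, as described in the step above
            # entry.inode() is the inode number stored in the directory entry.
            groups.add((entry.inode() - 1) // inodes_per_group)
    return groups
```

Calling `directory_groups("/usr/bin", 16288)` on an ext2/3 mount would give the groups for one directory; walking the whole tree (for example with os.walk) would repeat this for every directory.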

Step 4:
---

Count the number of cross-chunk references this file system would
need.  This is done by going through each directory and file, and
adding up the number of block groups it uses MINUS one.  So if a file
was in block groups 3, 7, and 24, then you would add 2 to the total
number of cross-chunk references.  If a file was only in block group
2, then you would add 0 to the total.
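
The counting rule can be sketched as follows. The groups_per_chunk parameter illustrates treating several consecutive block groups as one chunk. How cref.sh actually merges groups is not shown in this thread, so the aligned grouping used here is an assumption, and the function name is made up.

```python
def count_cross_refs(groups_per_object, groups_per_chunk=1):
    """Sum (number of chunks used - 1) over all files and directories.

    groups_per_object: iterable of sets of block group numbers, one set per
    file or directory (the lists recorded in step 3).
    """
    total = 0
    for groups in groups_per_object:
        if not groups:
            continue
        # Map each block group to its chunk, assuming chunks are aligned
        # runs of `groups_per_chunk` consecutive block groups.
        chunks = {g // groups_per_chunk for g in groups}
        total += len(chunks) - 1
    return total


# The example from the text: one file in groups 3, 7 and 24, one in group 2.
example = [{3, 7, 24}, {2}]
print(count_cross_refs(example))                      # adds 2 + 0
print(count_cross_refs(example, groups_per_chunk=8))  # groups 3 and 7 share a chunk
```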


On 4/22/07, Amit Gud <[EMAIL PROTECTED]> wrote:

> Karuna sagar K wrote:
> > Hi,
> >
> > The attached code contains a program to estimate the cross-chunk
> > references for the ChunkFS file system (idea from Valh). Below are the
> > results:
> >
>
> Nice to see some numbers! But would be really nice to know:
>
> - what the chunk size is
> - how the files were created or, more vaguely, how 'aged' the fs is
> - what is the chunk allocation algorithm
>
>
> Best,
> AG
> --
> May the source be with you.
> http://www.cis.ksu.edu/~gud





Thanks,
Karuna


Re: ChunkFS - measuring cross-chunk references

2007-04-22 Thread Amit Gud

Karuna sagar K wrote:

> Hi,
>
> The attached code contains a program to estimate the cross-chunk
> references for the ChunkFS file system (idea from Valh). Below are the
> results:



Nice to see some numbers! But would be really nice to know:

- what the chunk size is
- how the files were created or, more vaguely, how 'aged' the fs is
- what is the chunk allocation algorithm


Best,
AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud
