Re: ChunkFS - measuring cross-chunk references
On 4/25/07, Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> On Wed, Apr 25, 2007 at 05:50:55AM +0530, Karuna sagar K wrote:
>
> One more set of numbers to calculate would be an estimate of
> cross-references across chunks of block groups -- 1 (=128MB), 2 (=256MB),
> 4 (=512MB), 8 (=1GB) as suggested by Kalpak.

Here is the tool to make such calculations.

Result of running the tool on the / partition of an ext3 file system (each chunk is 4 times a block group):

./cref.sh /dev/hda1 dmp /mnt/test 4
---
Number of files = 221763
Number of directories = 24456
Total size = 8193116 KB
Total data stored = 7179200 KB
Size of block groups = 131072 KB
Number of inodes per block group = 16288
Chunk size = 524288 KB
No. of cross references between directories and sub-directories = 869
No. of cross references between directories and file = 584
Total no. of cross references = 13806 (dir ref = 1453, file ref = 12353)
---

> Once we have that, it would be nice if we can get data on results with
> the tool from other people, especially with larger filesystem sizes.

Thanks,
Karuna

Attachment: cref.tar.bz2 (BZip2 compressed data)
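A rough sketch of how merging block groups into larger chunks changes the count. It assumes a chunk is a run of consecutive, zero-based block groups (group number divided by groups per chunk); whether cref.sh groups them exactly this way is an assumption, and the sample data and the names `chunk_of` and `cross_refs` are hypothetical.

```python
BLOCK_GROUP_KB = 131072  # block group size reported in the results above


def chunk_of(block_group, groups_per_chunk):
    """Map a zero-based block group number to a chunk number (assumed layout)."""
    return block_group // groups_per_chunk


def cross_refs(objects, groups_per_chunk):
    """Sum (distinct chunks touched - 1) over all files and directories."""
    total = 0
    for groups in objects:
        chunks = {chunk_of(g, groups_per_chunk) for g in groups}
        total += len(chunks) - 1
    return total


# Hypothetical data: the block groups touched by four files or directories.
sample = [[3, 7, 24], [2], [0, 1, 2, 3], [5, 8]]

for n in (1, 2, 4, 8):
    print(f"{n} group(s) per chunk = {BLOCK_GROUP_KB * n} KB:",
          cross_refs(sample, n), "cross references")
```

With 4 block groups per chunk the chunk size works out to 524288 KB, which matches the "Chunk size" line in the output above. On the sample data the count falls as chunks grow, which is the trend the real numbers are meant to quantify.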
Re: ChunkFS - measuring cross-chunk references
On Mon, Apr 23, 2007 at 08:13:06PM -0400, Theodore Tso wrote:
>
> There may also be special things we will need to do to handle
> scenarios such as BackupPC, where if it looks like a directory
> contains a huge number of hard links to a particular chunk, we'll need
> to make sure that directory is either created in the right chunk
> (possibly with hints from the application) or migrated to the right
> chunk (but this might cause the inode number of the directory to
> change --- maybe we allow this as long as the directory has never been
> stat'ed, so that the inode number has never been observed).

Yeah, this is an oddball but real case. What are the consequences of the inode number changing - increased backup bandwidth? It seems like it would have the same effect as "cp -a dir tmp; rm -rf dir; mv tmp dir", which is certainly legal (and a good way to defragment subtrees).

> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow
> it on 64-bit systems, or we need to consider a migration so that even
> on 32-bit platforms, stat() functions like stat64(), insofar that it
> uses a stat structure which returns a 64-bit ino_t.

A 32-bit inode space probably won't be that hard to do for chunkfs, although it would limit total file system size. This problem needs to be solved in general, I'm afraid - 4 billion inodes is just not that many now.

-VAL
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ChunkFS - measuring cross-chunk references
On Mon, Apr 23, 2007 at 02:05:47AM +0530, Karuna sagar K wrote:
> Hi,
>
> The attached code contains program to estimate the cross-chunk
> references for ChunkFS file system (idea from Valh). Below are the
> results:

Nice work! Thank you very much for doing this!

-VAL
Re: ChunkFS - measuring cross-chunk references
On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
>
> Also, is it considered a cross-chunk reference if a directory entry is
> referencing an inode in another group? Should there be a continuation
> inode in the local group, or is the directory entry itself enough?

(Sorry for the delay; just moved to Portland these last couple of weeks.)

It is a cross-chunk reference - we can't calculate the correct link count for the target file unless we have a quick way to get all the directory entries pointing to an inode. My current scheme is to create a continuation inode for the directory in the chunk containing the inode (if the chunk containing the inode is full, create new continuation inodes for both in a new chunk).

-VAL
Re: ChunkFS - measuring cross-chunk references
On Wed, Apr 25, 2007 at 05:50:55AM +0530, Karuna sagar K wrote:
> On 4/24/07, Theodore Tso <[EMAIL PROTECTED]> wrote:
> >On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
> >
> >It would also be good to distinguish between directories referencing
> >files in another chunk, and directories referencing subdirectories in
> >another chunk (which would be simpler to handle, given the topological
> >restrictions on directories, as compared to files and hard links).
> >
>
> Modified the tool to distinguish between
> 1. cross references between directories and files
> 2. cross references between directories and sub directories
> 3. cross references within a file (due to huge file size)

One more set of numbers to calculate would be an estimate of cross-references across chunks of block groups -- 1 (=128MB), 2 (=256MB), 4 (=512MB), 8 (=1GB) as suggested by Kalpak.

Once we have that, it would be nice if we can get data on results with the tool from other people, especially with larger filesystem sizes.

Regards
Suparna

>
> Below is the result from / partition of ext3 file system:
>
> Number of files = 221794
> Number of directories = 24457
> Total size = 8193116 KB
> Total data stored = 7187392 KB
> Size of block groups = 131072 KB
> Number of inodes per block group = 16288
> No. of cross references between directories and sub-directories = 7791
> No. of cross references between directories and file = 657
> Total no. of cross references = 62018 (dir ref = 8448, file ref = 53570)
>
> Thanks for the suggestions.
>
> >There may also be special things we will need to do to handle
> >scenarios such as BackupPC, where if it looks like a directory
> >contains a huge number of hard links to a particular chunk, we'll need
> >to make sure that directory is either created in the right chunk
> >(possibly with hints from the application) or migrated to the right
> >chunk (but this might cause the inode number of the directory to
> >change --- maybe we allow this as long as the directory has never been
> >stat'ed, so that the inode number has never been observed).
> >
> >The other thing which we should consider is that chunkfs really
> >requires a 64-bit inode number space, which means either we only allow
> >it on 64-bit systems, or we need to consider a migration so that even
> >on 32-bit platforms, stat() functions like stat64(), insofar that it
> >uses a stat structure which returns a 64-bit ino_t.
> >
> > - Ted
> >
>
> Thanks,
> Karuna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: ChunkFS - measuring cross-chunk references
On 4/24/07, Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
>
> It would also be good to distinguish between directories referencing
> files in another chunk, and directories referencing subdirectories in
> another chunk (which would be simpler to handle, given the topological
> restrictions on directories, as compared to files and hard links).

Modified the tool to distinguish between
1. cross references between directories and files
2. cross references between directories and sub directories
3. cross references within a file (due to huge file size)

Below is the result from the / partition of an ext3 file system:

Number of files = 221794
Number of directories = 24457
Total size = 8193116 KB
Total data stored = 7187392 KB
Size of block groups = 131072 KB
Number of inodes per block group = 16288
No. of cross references between directories and sub-directories = 7791
No. of cross references between directories and file = 657
Total no. of cross references = 62018 (dir ref = 8448, file ref = 53570)

Thanks for the suggestions.

> There may also be special things we will need to do to handle
> scenarios such as BackupPC, where if it looks like a directory
> contains a huge number of hard links to a particular chunk, we'll need
> to make sure that directory is either created in the right chunk
> (possibly with hints from the application) or migrated to the right
> chunk (but this might cause the inode number of the directory to
> change --- maybe we allow this as long as the directory has never been
> stat'ed, so that the inode number has never been observed).
>
> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow
> it on 64-bit systems, or we need to consider a migration so that even
> on 32-bit platforms, stat() functions like stat64(), insofar that it
> uses a stat structure which returns a 64-bit ino_t.
>
>  - Ted

Thanks,
Karuna

Attachment: cref.tar.bz2 (BZip2 compressed data)
Re: ChunkFS - measuring cross-chunk references
On Mon, Apr 23, 2007 at 06:02:29PM -0700, Arjan van de Ven wrote:
>
> > The other thing which we should consider is that chunkfs really
> > requires a 64-bit inode number space, which means either we only allow
>
> does it?
> I'd think it needs a "chunk space" number and a 32 bit local inode
> number ;) (same for blocks)
>

But that means that the number which gets exported to userspace via the stat system call will need more than 32 bits worth of ino_t.

 - Ted
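A minimal sketch of the point being made here, assuming the exported inode number is formed by placing a chunk id above a 32-bit chunk-local inode number. The thread does not specify an encoding; `pack_ino`, `unpack_ino`, and the bit layout are hypothetical.

```python
LOCAL_INO_BITS = 32  # assumed width of the chunk-local inode number


def pack_ino(chunk_id, local_ino):
    """Combine a chunk id and a 32-bit local inode number into one integer."""
    if not 0 <= local_ino < (1 << LOCAL_INO_BITS):
        raise ValueError("local inode number does not fit in 32 bits")
    return (chunk_id << LOCAL_INO_BITS) | local_ino


def unpack_ino(ino):
    """Split an exported inode number back into (chunk_id, local_ino)."""
    return ino >> LOCAL_INO_BITS, ino & ((1 << LOCAL_INO_BITS) - 1)


def fits_in_32_bits(ino):
    """True if the value could be returned in a 32-bit ino_t."""
    return ino < (1 << 32)


exported = pack_ino(chunk_id=1, local_ino=11)
print(exported, unpack_ino(exported), fits_in_32_bits(exported))
```

Under this layout only inodes in chunk 0 produce a value that fits in 32 bits, so any file system with more than one chunk needs a wider ino_t at the stat interface.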
Re: ChunkFS - measuring cross-chunk references
On Mon, 23 Apr 2007, Amit Gud wrote:

> On Mon, 23 Apr 2007, Arjan van de Ven wrote:
>
>> > The other thing which we should consider is that chunkfs really
>> > requires a 64-bit inode number space, which means either we only allow
>>
>> does it?
>> I'd think it needs a "chunk space" number and a 32 bit local inode
>> number ;) (same for blocks)
>
> For inodes, yes, either 64-bit inode or some field for the chunk id in
> which the inode is. But for block numbers, you don't. Because individual
> chunks manage part of the whole file system in an independent way. They
> have their block bitmaps starting at an offset. Inode bitmaps, however,
> remain the same.

In that sense, we can also do without encoding the chunk identifier in the inode number, and chunkfs would still be fine with it. But we would then lose the inode uniqueness property, which could well be OK, as it is with other file systems in which the inode number is not sufficient to identify an inode uniquely.

AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud
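One possible reading of the block and inode numbering described here, as a sketch. It assumes chunks are equal-sized, contiguous ranges of global block numbers (so a chunk's block bitmap "starts at an offset") while inode numbers restart in every chunk. The constant and `locate_block` are hypothetical, not taken from any chunkfs code.

```python
BLOCKS_PER_CHUNK = 131072  # hypothetical: a 512 MB chunk of 4 KB blocks


def locate_block(global_block):
    """Return (chunk number, bit index in that chunk's block bitmap)."""
    return divmod(global_block, BLOCKS_PER_CHUNK)


# A global block number already identifies its chunk, so no chunk id is needed.
print(locate_block(BLOCKS_PER_CHUNK + 5))

# Inode numbers are chunk-local in this sketch, so the same number can recur.
inodes = [(0, 11), (1, 11)]                  # (chunk id, local inode number)
bare_numbers = {ino for _, ino in inodes}
print(len(bare_numbers), len(set(inodes)))
```

The bare inode numbers collapse to a single value while the (chunk id, inode) pairs stay distinct, which is the loss of uniqueness mentioned above.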
Re: ChunkFS - measuring cross-chunk references
On Mon, 23 Apr 2007, Arjan van de Ven wrote:

> > The other thing which we should consider is that chunkfs really
> > requires a 64-bit inode number space, which means either we only allow
>
> does it?
> I'd think it needs a "chunk space" number and a 32 bit local inode
> number ;) (same for blocks)

For inodes, yes, either 64-bit inode or some field for the chunk id in which the inode is. But for block numbers, you don't. Because individual chunks manage part of the whole file system in an independent way. They have their block bitmaps starting at an offset. Inode bitmaps, however, remain the same.

AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud
Re: ChunkFS - measuring cross-chunk references
> The other thing which we should consider is that chunkfs really
> requires a 64-bit inode number space, which means either we only allow

does it?
I'd think it needs a "chunk space" number and a 32 bit local inode number ;) (same for blocks)
Re: ChunkFS - measuring cross-chunk references
On Mon, Apr 23, 2007 at 02:53:33PM -0600, Andreas Dilger wrote:
> > With a blocksize of 4KB, a block group would be 128 MB. In the original
> > Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be
> > possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size
> > increases the number of cross-chunk references will reduce and hence it
> > might be a good idea to present these statistics considering different
> > chunk sizes starting from 512MB up to 2GB.
>
> Also, given that cross-chunk references will be more expensive to fix, I
> can imagine the allocation policy for chunkfs will try to avoid this if
> possible, further reducing the number of cross-chunk inodes. I guess it
> should be more clear whether the cross-chunk references are due to inode
> block references, or because of e.g. directories referencing inodes in
> another chunk.

It would also be good to distinguish between directories referencing files in another chunk, and directories referencing subdirectories in another chunk (which would be simpler to handle, given the topological restrictions on directories, as compared to files and hard links).

There may also be special things we will need to do to handle scenarios such as BackupPC, where if it looks like a directory contains a huge number of hard links to a particular chunk, we'll need to make sure that directory is either created in the right chunk (possibly with hints from the application) or migrated to the right chunk (but this might cause the inode number of the directory to change --- maybe we allow this as long as the directory has never been stat'ed, so that the inode number has never been observed).

The other thing which we should consider is that chunkfs really requires a 64-bit inode number space, which means either we only allow it on 64-bit systems, or we need to consider a migration so that even on 32-bit platforms, stat() functions like stat64(), insofar that it uses a stat structure which returns a 64-bit ino_t.

 - Ted
Re: ChunkFS - measuring cross-chunk references
On Apr 23, 2007 15:04 +0530, Kalpak Shah wrote:
> On Mon, 2007-04-23 at 12:49 +0530, Karuna sagar K wrote:
> > The tool estimates the cross-chunk references from an ext2/3 file
> > system. It considers a block group as one chunk and calculates how many
> > block groups does a file span across. So, the block group size gives
> > the estimate of chunk size.
> >
> > The file systems were aged for about 3-4 months on a developers laptop.
>
> With a blocksize of 4KB, a block group would be 128 MB. In the original
> Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be
> possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size
> increases the number of cross-chunk references will reduce and hence it
> might be a good idea to present these statistics considering different
> chunk sizes starting from 512MB up to 2GB.

Also, given that cross-chunk references will be more expensive to fix, I can imagine the allocation policy for chunkfs will try to avoid this if possible, further reducing the number of cross-chunk inodes. I guess it should be more clear whether the cross-chunk references are due to inode block references, or because of e.g. directories referencing inodes in another chunk.

Also, is it considered a cross-chunk reference if a directory entry is referencing an inode in another group? Should there be a continuation inode in the local group, or is the directory entry itself enough?

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Re: ChunkFS - measuring cross-chunk references
On Mon, 2007-04-23 at 12:49 +0530, Karuna sagar K wrote:
> Hi,
>
> The tool estimates the cross-chunk references from an ext2/3 file
> system. It considers a block group as one chunk and calculates how many
> block groups does a file span across. So, the block group size gives
> the estimate of chunk size.
>
> The file systems were aged for about 3-4 months on a developers laptop.
>

With a blocksize of 4KB, a block group would be 128 MB. In the original Chunkfs paper, Valh had mentioned 1GB chunks and I believe it will be possible to use 2GB, 4GB or 8GB chunks in the future. As the chunk size increases, the number of cross-chunk references will decrease, and hence it might be a good idea to present these statistics for different chunk sizes, starting from 512MB up to 2GB.

Thanks,
Kalpak.
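The 128 MB figure follows from the default ext2/3 layout, where a block group is covered by a single block-sized bitmap with one bit per block. A small worked sketch, assuming that default (mke2fs can be told to use fewer blocks per group); `block_group_bytes` is a hypothetical helper name.

```python
MB = 1024 * 1024


def block_group_bytes(block_size):
    """Size of a block group when one bitmap block covers the whole group."""
    blocks_per_group = 8 * block_size   # one bit per block in one bitmap block
    return blocks_per_group * block_size


group = block_group_bytes(4096)
print(group // MB, "MB per block group")

# How many block groups make up each of the suggested chunk sizes.
for chunk_mb in (512, 1024, 2048):
    print(chunk_mb, "MB chunk =", (chunk_mb * MB) // group, "block groups")
```

So the suggested 512MB, 1GB and 2GB chunk sizes correspond to treating 4, 8 and 16 consecutive block groups as one chunk.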
Re: ChunkFS - measuring cross-chunk references
Hi,

The tool estimates the cross-chunk references from an ext2/3 file system. It considers a block group as one chunk and calculates how many block groups a file spans. So, the block group size gives the estimate of chunk size.

The file systems were aged for about 3-4 months on a developer's laptop.

I should have given the background before. Below is the explanation of the tool. Valh and others came up with this idea.

-
Chunkfs will only work if we have "few" cross-chunk references. We can estimate the effect of chunk size on the number of these references using an existing ext2/3 file system and treating the block groups as though they are chunks. The basic idea is that we figure out what the block group boundaries are and then find out which files and directories span two or more block groups.

Step 1:
---
Get a real-world ext2/3 file system. A file system which has been in use is required. One from a laptop or a server of any sort will do fine.

Step 2:
---
Figure out where the block group boundaries are on disk. Two things need to be known:

1. Which inode numbers are in which block group?
2. Which blocks are in which block group?

At the end of this step we should have a list that looks something like:

Block group 1: Inodes 11-343, blocks 1000-2
Block group 2: Inodes 344-576, blocks 2-4
[...]

Step 3:
---
For each file, get the inode number and use the mapping from step 2 to figure out which block group it is in. Now use bmap() on each block in the file, and find out the block number. Use the mapping from step 2 to figure out which block groups it has data in. For each file, record the list of all block groups.

For each directory, get the inode number and map that to a block group. Then get the inode numbers of all entries in the directory (ignore symlinks) and map them to a block group. For each directory, record the list of all block groups.

Step 4:
---
Count the number of cross-chunk references this file system would need. This is done by going through each directory and file, and adding up the number of block groups it uses MINUS one. So if a file was in block groups 3, 7, and 24, then you would add 2 to the total number of cross-chunk references. If a file was only in block group 2, then you would add 0 to the total.

On 4/22/07, Amit Gud <[EMAIL PROTECTED]> wrote:
> Karuna sagar K wrote:
> > Hi,
> >
> > The attached code contains program to estimate the cross-chunk
> > references for ChunkFS file system (idea from Valh). Below are the
> > results:
> >
>
> Nice to see some numbers! But would be really nice to know:
>
> - what the chunk size is
> - how the files were created or, more vaguely, how 'aged' the fs is
> - what is the chunk allocation algorithm
>
> Best,
> AG
> --
> May the source be with you.
> http://www.cis.ksu.edu/~gud

Thanks,
Karuna
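Steps 2 to 4 of the message above can be sketched roughly as follows. This is not the attached tool: it assumes the standard ext2/3 layout in which inode numbers start at 1 and both inodes and blocks are assigned to groups in order, takes a file's block list as given rather than calling bmap(), and uses hypothetical function names and a hypothetical example file.

```python
def inode_group(ino, inodes_per_group):
    """Zero-based block group holding an inode (inode numbers start at 1)."""
    return (ino - 1) // inodes_per_group


def block_group(block, blocks_per_group, first_data_block=0):
    """Zero-based block group holding a block (first_data_block is 0 for 4 KB blocks)."""
    return (block - first_data_block) // blocks_per_group


def groups_used(ino, blocks, inodes_per_group, blocks_per_group):
    """Step 3: every block group touched by a file's inode and data blocks."""
    groups = {inode_group(ino, inodes_per_group)}
    groups.update(block_group(b, blocks_per_group) for b in blocks)
    return groups


def cross_chunk_refs(groups_per_object):
    """Step 4: add (number of block groups used - 1) for each file or directory."""
    return sum(len(set(groups)) - 1 for groups in groups_per_object)


# Hypothetical file: inode 20000 with data blocks 40000 and 70000.
print(sorted(groups_used(20000, [40000, 70000],
                         inodes_per_group=16288, blocks_per_group=32768)))

# The example from step 4: groups 3, 7, 24 add 2; group 2 alone adds 0.
print(cross_chunk_refs([[3, 7, 24], [2]]))
```

For a directory, the same counting applies with the set built from the directory's own inode plus the inode numbers of its entries, as described in step 3.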
Re: ChunkFS - measuring cross-chunk references
Karuna sagar K wrote:
> Hi,
>
> The attached code contains program to estimate the cross-chunk
> references for ChunkFS file system (idea from Valh). Below are the
> results:
>

Nice to see some numbers! But would be really nice to know:

- what the chunk size is
- how the files were created or, more vaguely, how 'aged' the fs is
- what is the chunk allocation algorithm

Best,
AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud