Re: How were some of the lustre e2fsprogs test cases generated?
On Feb 18, 2008 19:36 -0500, Theodore Ts'o wrote: One minor correction --- the clusterfs e2fsprogs extents code checks to see if the ee_leaf_hi field is non-zero, and complains if so. However, it ignores the ee_start_hi field for interior (non-leaf) nodes in the extent tree, and a number of tests do have non-zero ee_start_hi fields which cause my version of e2fsprogs to (rightly) complain. If you fix this, a whole bunch of tests will fail as a result, and not exercise the code paths that the tests were apparently trying to exercise. Which is what is causing me a bit of worry and wonder about how those test cases were originally generated The original CFS extents kernel patch had a bug where the _hi fields were not initialized correctly to zero. The CFS exents e2fsck patches would clear the _hi fields in the extents and index blocks, but I disabled that in the upstream patch submission because it will be incorrect for 48-bit filesystems. That's the high_bits_ok check in e2fsck_ext_block_verify() for error PR_1_EXTENT_HI, that only allows the high bits when there are 2^32 blocks in the filesystem. It's possible I made a mistake when I added that part of the patch, but the regression tests still passed. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How were some of the lustre e2fsprogs test cases generated?
On Tue, Feb 19, 2008 at 04:40:32AM -0700, Andreas Dilger wrote:
> No, it hasn't always been true that we cleared the _hi fields in the
> kernel code.  But it has been a year or more since we found this bug,
> and all CFS e2fsprogs releases since then have cleared the _hi fields,
> and there has not been any other e2fsprogs that supports extents, so
> we expect that there are no filesystems left in the field with this
> issue, and even then the current code will prefer to clear the _hi
> bits instead of considering the whole extent corrupt.

I checked again, and it looks like the interim code is indeed clearing
the _hi bits.  I managed to confuse myself into thinking it didn't for
index nodes, but I checked again and it seems to be doing the right
thing.

The reason I asked is that the extents code in the 'next' branch of
e2fsprogs *does* consider the whole extent to be corrupt, since in the
long run, once we start using 64-bit block numbers in extent blocks, if
the physical block number (including the high 16 bits) is greater than
s_blocks_count, simply masking off the high 16 bits of the 48-bit
extent block number is probably not the right way of dealing with the
problem.

I think that's probably a safe thing to do, since all of your customers
who might have had a filesystem with non-zero _hi fields have almost
certainly run e2fsck to clear the _hi bits at least once; do you concur
that this is a safe assumption?  Or would you prefer that I add some
code that tries to clear just the _hi bits, perhaps controlled by a
configuration flag in e2fsck.conf?

Regards,

						- Ted
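[A standalone sketch contrasting the two repairs discussed above; the function names and the "permissive" flag are invented for the sketch and are not the real 'next'-branch code.]

```c
/*
 * Sketch of the decision being discussed: assemble the 48-bit physical
 * block from its _lo/_hi halves, compare it against the filesystem
 * size, and either salvage the entry by masking the _hi bits (old CFS
 * behaviour) or treat the whole extent as corrupt ('next' branch).
 */
#include <stdint.h>
#include <stdio.h>

/* Assemble the 48-bit physical start block from its two on-disk halves. */
static uint64_t extent_pblock(uint32_t start_lo, uint16_t start_hi)
{
	return ((uint64_t)start_hi << 32) | start_lo;
}

/* Returns 0 if the extent is usable (possibly after repair), -1 if corrupt. */
static int check_extent_start(uint32_t *start_lo, uint16_t *start_hi,
			      uint64_t s_blocks_count, int permissive)
{
	uint64_t pblk = extent_pblock(*start_lo, *start_hi);

	if (pblk < s_blocks_count)
		return 0;			/* in range, nothing to do */

	if (permissive && *start_hi &&
	    extent_pblock(*start_lo, 0) < s_blocks_count) {
		*start_hi = 0;			/* salvage by masking _hi bits */
		return 0;
	}
	return -1;				/* extent considered corrupt */
}

int main(void)
{
	uint32_t lo = 5000;
	uint16_t hi = 0x0002;			/* stray high bits */
	uint64_t blocks = 1ULL << 22;		/* a small filesystem */

	printf("strict: %d\n", check_extent_start(&lo, &hi, blocks, 0));
	printf("permissive: %d\n", check_extent_start(&lo, &hi, blocks, 1));
	printf("hi after permissive repair: %d\n", (int)hi);
	return 0;
}
```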
Re: How were some of the lustre e2fsprogs test cases generated?
On Feb 19, 2008 07:29 -0500, Theodore Ts'o wrote: On Tue, Feb 19, 2008 at 04:40:32AM -0700, Andreas Dilger wrote: No, it hasn't always been true that we cleared the _hi fields in the kernel code. But, it has been a year or more since we found this bug, and all CFS e2fsprogs releases since then have cleared the _hi fields, and there has not been any other e2fsprogs that supports extents, so we expect that there are no filesystems left in the field with this issue, and even then the current code will prefer to clear the _hi bits instead of considering the whole extent corrupt. I checked again, and it looks like the interim code is indeed clearing the _hi bits. I managed to confuse myself into thinking it didn't for index nodes, but I checked again and it seems to be doing the right thing. The reason why I asked is that the extents code in the 'next' branch of e2fsprogs *does* consider the whole extent to be corrupt, since in the long run once we start 64-bit block number extent blocks, if the physical block number (including the high 16 bits) is greater than s_blocks_count, simply masking off the high 16 bits of the 48 bit extent block is probably not the right way of dealing with the problem. I think that's probably a safe thing to do since all of your customers who might have had a filesystem with non-zero _hi fields have almost certainly run e2fsck to clear the _hi bits at least once; do you concur that is a safe assumption? Or would you prefer that I add some code that tries to clear just the _hi bits, perhaps controlled by a configuration flag in e2fsck.conf? I'm OK with either. We might consider patching e2fsck to return to the more permissive CFS behaviour with _hi bits for our own releases, or just leave it. Checking back in our patches, we fixed the kernel code in July '06 and the e2fsck code in Jan '07, so I hope people have run an e2fsck on their filesystems in the last 1.5 years. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How were some of the lustre e2fsprogs test cases generated?
On Mon, Feb 18, 2008 at 07:06:58PM -0500, Theodore Ts'o wrote: The clusterfs e2fsprogs code doesn't notice this, because it apparently ignores ee_start_hi field entirely. One minor correction --- the clusterfs e2fsprogs extents code checks to see if the ee_leaf_hi field is non-zero, and complains if so. However, it ignores the ee_start_hi field for interior (non-leaf) nodes in the extent tree, and a number of tests do have non-zero ee_start_hi fields which cause my version of e2fsprogs to (rightly) complain. If you fix this, a whole bunch of tests will fail as a result, and not exercise the code paths that the tests were apparently trying to exercise. Which is what is causing me a bit of worry and wonder about how those test cases were originally generated Regards, - Ted - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html