Hi Bob,
On 27/01/14 15:46, Bob Peterson wrote:
----- Original Message -----
| This function included naive implementations of gfs2_setbit and
| gfs2_bitfit so these have been replaced with calls to those functions.
| The 'type' parameter has been replaced with 'state' and the type
| defines removed in favour of the GFS2_BLKST_* values. As these were the
| same for both meta and data block allocations, data_alloc() has been
| removed and its callers updated to use meta_alloc().
|
| Signed-off-by: Andrew Price <[email protected]>
Hi,
I've been staring at this patch for a long time now.
The patch looks correct. However, my concern is this:
The fsck.gfs2 tool is designed to repair either GFS2 or GFS1 file systems,
and GFS1 uses bitmaps differently from GFS2. An indirect block, for example,
will be marked as a data block in a GFS2 file system, but "meta" in GFS1,
(which translates to unlinked dinode in a GFS2 bitmap).
Of course, since this is upstream code, it's tempting to say that there
won't be any GFS1 file systems hanging around. However, in theory, users
might take a GFS1 file system from a legacy system and try migrate it to
a newer system, upstream, RHEL7, Fedora, whatever, in which case they
want to use gfs2_convert. But gfs2_convert doesn't do any error checking,
so we recommend running fsck before the convert. However, on newer OSes,
there isn't a gfs_fsck, there's only fsck.gfs2, which should handle both.
I'm certainly not suggesting dropping gfs1 support in fsck.gfs2 with
this patch, and it really shouldn't be semantically different to before
these changes anyway since it's just a code reorganisation.
So we need to ask some hard questions:
(1) Does it matter? If the fsck is only being run for the sake of sanity
for gfs2_convert, allocating those "metadata" blocks as "data" blocks
will likely be all right, since gfs2_convert will need to convert them
anyway.
I'm not sure where the problem you're describing might be. Before this
patch both meta_alloc() and data_alloc() allocated blocks using the
GFS2_BLKST_USED state:
- case DATA:
- case META:
- state = GFS2_BLKST_USED;
- break;
so their effects were identical and that's why I felt it ok to combine
those functions. I've not got my hands very dirty with gfs1 code but I
understand that, whereas gfs2 has _UNLINKED and _DINODE, gfs1 has
_FREEMETA and _USEDMETA, so perhaps we just need to look at whether the
gfs1 code is calling the correct allocation function for the block state
it requires.
(2) From the patch, it looks like "data" and "meta" are being treated the
same anyway, so is there a bug in today's fsck.gfs2? This might
show up if there's a GFS1 file system that has enough damage as to
push a pile of directory entries into lost+found, thus increasing its
size enough to add indirect blocks. Due to that same bug, fsck.gfs2
might not catch the discrepancy, so we probably should check it by hand
using gfs2_edit.
(3) If there is a bug with how fsck.gfs2 handles GFS1 bitmaps, do we need
to fix it? Or is it too much work for too little gain?
If we do find a bug here I think we should fix it if at all possible. If
we ever do drop support for gfs1 in fsck.gfs2 it should be done in a
more considered way than letting the gfs1 bits go rotten :)
(4) I've got a test case I run called fsck.gfs2.nightmare2.sh, which tests
fsck against my entire collection of metadata sets, both GFS and GFS2.
It can run for days, depending on the hardware. Do the GFS1 metadata
sets still pass with this patch?
Well I would like to subject these patches to those tests in any case
(and add some of them to the in-tree test suite where possible), but the
intention here was to better organise the code rather than change any
functionality so I would be surprised if this patch introduced the bug
that you're theorizing rather than simply highlighting it.
Andy