Re: [Linux-cluster] fsck.gfs2 The root dinode block is destroyed.

2016-03-29 Thread Bob Peterson
----- Original Message -----
> Great Thanks.
> 
> This particular mount has two partitions.  I get the following when I
> use the savemeta/savemetaslow commands.  Is this ok or do I need to do
> something different to gather the information?  I wasn't sure if the
> information was just on the first partition or on each partition.
> 
> [root@-dr tmp]# gfs2_edit savemetaslow /dev/sdd1 /tmp/sdd1.meta
> There are 488281088 blocks of 4096 bytes in the destination device.
> Reading resource groups...Done. File system size: 0.0
> Metadata saved to file /tmp/sdd1.meta (gzipped, level 9).
> [root@-dr tmp]# gfs2_edit savemetaslow /dev/sdd2 /tmp/sdd2.meta
> Either the super block is corrupted, or this is not a GFS2 filesystem
> Unable to read superblock.
> [root@dr tmp]# gfs2_edit savemetaslow /dev/sdd /tmp/sdd.meta
> Either the super block is corrupted, or this is not a GFS2 filesystem
> Unable to read superblock.
> [root@-dr tmp]#

Hm. If the GFS2 superblock is gone, it sounds like there was major
damage. I've seen hardware problems do that, like when a RAID controller
makes scrambled eggs out of a device. Perhaps you can use dd to save off
the first 1MB of metadata (which should just be gfs2 metadata) and I can
see what's left for you.
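
A quick sanity check on a device or on the saved 1MB image is to see whether the GFS2 magic number is still present where the superblock normally lives. The sketch below assumes the usual on-disk layout (superblock at byte offset 65536, big-endian magic 0x01161970). The function name has_gfs2_magic is made up for illustration, and a match only shows that the header bytes survived, not that the superblock is intact.

```shell
# Hypothetical helper: report whether a device or saved image still has
# the GFS2 magic number at the usual superblock location.
# Assumptions: superblock at byte offset 65536 (64 KiB) and the on-disk
# big-endian magic 0x01161970.
has_gfs2_magic() {
    img="$1"
    # Read 4 bytes at offset 65536 and print them as plain hex digits.
    magic=$(dd if="$img" bs=1 skip=65536 count=4 2>/dev/null \
            | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "01161970" ]
}
```

Usage might look like `has_gfs2_magic /dev/sdd1 && echo "magic found"`. The check only reads from the path it is given.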

Regards,

Bob Peterson
Red Hat File Systems

-- 
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


Re: [Linux-cluster] fsck.gfs2 The root dinode block is destroyed.

2016-03-29 Thread Megan .
Great Thanks.

This particular mount has two partitions.  I get the following when I
use the savemeta/savemetaslow commands.  Is this ok or do I need to do
something different to gather the information?  I wasn't sure if the
information was just on the first partition or on each partition.

[root@-dr tmp]# gfs2_edit savemetaslow /dev/sdd1 /tmp/sdd1.meta
There are 488281088 blocks of 4096 bytes in the destination device.
Reading resource groups...Done. File system size: 0.0
Metadata saved to file /tmp/sdd1.meta (gzipped, level 9).
[root@-dr tmp]# gfs2_edit savemetaslow /dev/sdd2 /tmp/sdd2.meta
Either the super block is corrupted, or this is not a GFS2 filesystem
Unable to read superblock.
[root@dr tmp]# gfs2_edit savemetaslow /dev/sdd /tmp/sdd.meta
Either the super block is corrupted, or this is not a GFS2 filesystem
Unable to read superblock.
[root@-dr tmp]#



On Tue, Mar 29, 2016 at 10:34 AM, Bob Peterson wrote:
> ----- Original Message -----
>> Good Morning,
>>
>> We have a large cluster with 50 gfs2 SAN mounts.  The mounts range in
>> size from 1TB to 15TB each.  We have some with 6-8TB of data but most
>> average around 3TB used right now.  We were doing network testing a
>> while back to check our redundancy in case of a switch failure, and the
>> tests failed multiple times.  We ended up having the SAN mounts yanked
>> out from under the cluster.  Long story short, we seem to have
>> corruption.  I can still bring the volumes up with the cluster, but
>> when I take everything down and run fsck I get the following:
>>
>>
>> (ran with fsck -n /dev/$device)
>>
>> Found a copy of the root directory in a journal at block: 0x501ca.
>> Damaged root dinode not fixed.
>> The root dinode should be at block 0x2f3b98b7 but it seems to be destroyed.
>> Found a copy of the root directory in a journal at block: 0x501d2.
>> Damaged root dinode not fixed.
>> The root dinode should be at block 0x28a3ac7f but it seems to be destroyed.
>> Found a copy of the root directory in a journal at block: 0x501da.
>> Damaged root dinode not fixed.
>> Unable to locate the root directory.
>> Can't find any dinodes that might be the root; using master - 1.
>> Found a possible root at: 0x16
>> The root dinode block is destroyed.
>> At this point I recommend reinitializing it.
>> Hopefully everything will later be put into lost+found.
>> The root dinode was not reinitialized; aborting.
>>
>>
>> This particular device had 4698 "seems to be destroyed..  found a
>> copy" messages before the final, "Can't find any dinodes" message.  I
>> fear that we have a number of mounts in this state.
>>
>> Is there any way to recover?  Thanks in advance.
>
> Hi Megan,
>
> I can't tell what's "really" going on unless I examine the GFS2 file system
> metadata up close. If you save the metadata (gfs2_edit savemeta <device> <output file>)
> and also the first 1MB of the block device, and somehow get it to me, I might
> be able to figure out what's going on, how it got that way, and what to do to
> recover it. Ordinarily, the root directory appears early in the metadata and
> it should not be deleted. What was the history of the file system? Was it
> converted from GFS1 with gfs2_convert or something?
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems



Re: [Linux-cluster] fsck.gfs2 The root dinode block is destroyed.

2016-03-29 Thread Bob Peterson
----- Original Message -----
> Good Morning,
> 
> We have a large cluster with 50 gfs2 SAN mounts.  The mounts range in
> size from 1TB to 15TB each.  We have some with 6-8TB of data but most
> average around 3TB used right now.  We were doing network testing a
> while back to check our redundancy in case of a switch failure, and the
> tests failed multiple times.  We ended up having the SAN mounts yanked
> out from under the cluster.  Long story short, we seem to have
> corruption.  I can still bring the volumes up with the cluster, but
> when I take everything down and run fsck I get the following:
> 
> 
> (ran with fsck -n /dev/$device)
> 
> Found a copy of the root directory in a journal at block: 0x501ca.
> Damaged root dinode not fixed.
> The root dinode should be at block 0x2f3b98b7 but it seems to be destroyed.
> Found a copy of the root directory in a journal at block: 0x501d2.
> Damaged root dinode not fixed.
> The root dinode should be at block 0x28a3ac7f but it seems to be destroyed.
> Found a copy of the root directory in a journal at block: 0x501da.
> Damaged root dinode not fixed.
> Unable to locate the root directory.
> Can't find any dinodes that might be the root; using master - 1.
> Found a possible root at: 0x16
> The root dinode block is destroyed.
> At this point I recommend reinitializing it.
> Hopefully everything will later be put into lost+found.
> The root dinode was not reinitialized; aborting.
> 
> 
> This particular device had 4698 "seems to be destroyed..  found a
> copy" messages before the final, "Can't find any dinodes" message.  I
> fear that we have a number of mounts in this state.
> 
> Is there any way to recover?  Thanks in advance.

Hi Megan,

I can't tell what's "really" going on unless I examine the GFS2 file system
metadata up close. If you save the metadata (gfs2_edit savemeta <device> <output file>)
and also the first 1MB of the block device, and somehow get it to me, I might
be able to figure out what's going on, how it got that way, and what to do to
recover it. Ordinarily, the root directory appears early in the metadata and
it should not be deleted. What was the history of the file system? Was it
converted from GFS1 with gfs2_convert or something?
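
For reference, capturing the first 1MB of the block device could be done roughly as follows. This is only a sketch: the function name save_first_mib and the output path in the usage note are placeholders, not anything specific to this cluster, and the dd options used are the standard ones.

```shell
# Hypothetical helper (name is illustrative): copy the first 1 MiB of a
# block device, or of any file, into an image file for later inspection.
save_first_mib() {
    src="$1"   # device or file to read from, e.g. a GFS2 partition
    out="$2"   # where to write the 1 MiB image
    # 256 blocks of 4096 bytes = 1048576 bytes (1 MiB). dd only reads
    # from "$src" and writes to "$out", so double-check the argument order.
    dd if="$src" of="$out" bs=4096 count=256
}
```

Usage might be `save_first_mib /dev/sdd1 /tmp/sdd1.first1mb`, run alongside `gfs2_edit savemeta /dev/sdd1 /tmp/sdd1.meta` for the metadata itself.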

Regards,

Bob Peterson
Red Hat File Systems
