Re: [Ocfs2-users] ocfs2 goes to read only

2016-03-25 Thread Eric Ren
Hi,

On 03/24/2016 10:45 PM, gjprabu wrote:
> Hi Team and Joseph,
>
> The ocfs2 file system goes into read-only mode when mounted after a reboot.
> We tried to fix it with fsck, but it aborted with the error "fsck.ocfs2:
> dir_indexed.c:1441: ocfs2_dx_dir_search: Assertion
> `entry_list->de_num_used > 0' failed.". We are in a critical situation,
> please give us a solution ASAP.
>

Could you provide the core dump file?
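
If fsck.ocfs2 didn't leave one behind, you can usually capture it by
re-running the check read-only with core dumps enabled, for example (the
device name is taken from your mount output; where the core file ends up
depends on your kernel.core_pattern):

  ulimit -c unlimited
  fsck.ocfs2 -fn /dev/rbd0   # -n: check only, change nothing; should abort the same way
  ls core*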

If this is urgent, I think it's OK to CC the ocfs2-devel mailing list.

A bit off topic: this is the second time I've noticed you hitting issues
while using a ceph RBD as an ocfs2 volume, right? If so, that's good news
for ocfs2, but I'm afraid you may be the first one to try this usage.
Could you elaborate on your reasons for using ocfs2 in this scenario, and
on its strengths and pain points? I'm looking forward to hearing your story ;-)

Actually, I'm getting hands-on with ceph myself now, and I hope to try out
the same setup you're exploring, so that we can reproduce your issues and
we developers can actually help ;-)
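
For reference, the setup I plan to try looks roughly like this (image name,
size and slot count are just examples):

  rbd create --size 102400 rbd/ocfs2test    # 100 GB test image
  rbd map rbd/ocfs2test                     # appears as e.g. /dev/rbd0
  mkfs.ocfs2 -L ocfs2test -N 4 /dev/rbd0    # 4 node slots
  mount -t ocfs2 /dev/rbd0 /mnt/test        # o2cb cluster stack must be running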

Eric

>
> [root@ceph-zclient1 home]# cd sas/cde/
>
> [root@ceph-zclient1 cde]# pwd
>
> /home/sas/cde
>
> [root@ceph-zclient1 cde]# mkdir test123213
>
> mkdir: cannot create directory ‘test123213’: Read-only file system
>
> [root@ceph-zclient1 cde]# mount | grep ocfs
>
> ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw,relatime)
>
> /dev/rbd0 on /home/sas/cde type ocfs2 
> (ro,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,coherency=full,user_xattr,acl)
>
>
>fsck.ocfs2 -y -f /dev/rbd/rbd/labs
>
> fsck.ocfs2 1.8.0
>
> Checking OCFS2 filesystem in /dev/rbd/rbd/cdelabs:
>
>Label:  label
>
>UUID:   EDE38A7C7D45498D889CA6943589B3C1
>
>Number of blocks:   402653184
>
>Block size: 4096
>
>Number of clusters: 402653184
>
>Cluster size:   4096
>
>Number of slots:   25
>
>
>
> /dev/rbd/rbd/labs was run with -f, check forced.
>
> Pass 0a: Checking cluster allocation chains
>
> Pass 0b: Checking inode allocation chains
>
> [CHAIN_BITS] Chain 229 in allocator inode 62 has 746 bits marked free out of 
> 16384 total bits but the block groups in the chain have 747 free out of 16384 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 224 in allocator inode 62 has 1096 bits marked free out of 
> 16384 total bits but the block groups in the chain have 1099 free out of 
> 16384 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 223 in allocator inode 62 has 41 bits marked free out of 
> 16384 total bits but the block groups in the chain have 43 free out of 16384 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 222 in allocator inode 62 has 1797 bits marked free out of 
> 16384 total bits but the block groups in the chain have 1812 free out of 
> 16384 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 219 in allocator inode 62 has 946 bits marked free out of 
> 16384 total bits but the block groups in the chain have 978 free out of 16384 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 215 in allocator inode 62 has 927 bits marked free out of 
> 16384 total bits but the block groups in the chain have 929 free out of 16384 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 214 in allocator inode 62 has 1391 bits marked free out of 
> 16384 total bits but the block groups in the chain have 1468 free out of 
> 16384 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 212 in allocator inode 62 has 1346 bits marked free out of 
> 16384 total bits but the block groups in the chain have 1347 free out of 
> 16384 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 210 in allocator inode 62 has 1165 bits marked free out of 
> 16384 total bits but the block groups in the chain have 1166 free out of 
> 16384 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 190 in allocator inode 62 has 786 bits marked free out of 
> 17408 total bits but the block groups in the chain have 787 free out of 17408 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 189 in allocator inode 62 has 1291 bits marked free out of 
> 17408 total bits but the block groups in the chain have 1297 free out of 
> 17408 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 187 in allocator inode 62 has 925 bits marked free out of 
> 17408 total bits but the block groups in the chain have 991 free out of 17408 
> total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 180 in allocator inode 62 has 1131 bits marked free out of 
> 17408 total bits but the block groups in the chain have 1146 free out of 
> 17408 total.  Fix this by updating the chain record? y
>
> [CHAIN_BITS] Chain 179 in allocator inode 62 has 1071 bits marked free out of 
> 17408 total bits but the block groups in the chain have 1072 free out of 
> 17408 total.  Fix 

Re: [Ocfs2-users] fsck.ocfs2 loops + hangs but does not check

2016-03-25 Thread Joseph Qi
Hi Michael,
Yes, currently the best way is to copy out as much data as possible,
recreate the ocfs2 volume, and then restore the data.
I haven't encountered this issue before and don't know which case can
lead to it, so I'm sorry I can't give you advice on how to avoid it.
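
In case it helps, a minimal sketch of that cycle (device, mount points and
rsync flags are examples, not taken from your setup):

  mount -t ocfs2 -o ro /dev/sdX /mnt/broken    # single node, read-only
  rsync -aHAX /mnt/broken/ /mnt/external/      # copy out what is readable
  umount /mnt/broken
  mkfs.ocfs2 -L data -N 4 /dev/sdX             # recreate the volume
  mount -t ocfs2 /dev/sdX /mnt/new
  rsync -aHAX /mnt/external/ /mnt/new/         # restore the data
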
But I suggest you keep following the patches in the latest kernels and
apply the read-only related ones (both ocfs2 and jbd2). We have indeed
submitted several patches to fix read-only issues.
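
For example, in a mainline kernel tree something like this lists candidate
fixes (the starting tag v4.4 is just an example):

  git log --oneline v4.4.. -- fs/ocfs2 fs/jbd2 | grep -iE 'read.?only|remount'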

Thanks,
Joseph
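
P.S. On the dd out / edit / dd in approach discussed below: the mechanics
would be roughly as follows, assuming a 4 KB block size (the device name is
a placeholder; attempt this only on an unmounted device and with a full
backup):

  # copy out the group descriptor block of record #153
  dd if=/dev/sdX of=gd153.bin bs=4096 skip=4090944512 count=1
  # fix the next-group pointer (bg_next_group) in gd153.bin with a hex
  # editor, then write the block back to the same offset
  dd if=gd153.bin of=/dev/sdX bs=4096 seek=4090944512 count=1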

On 2016/3/26 0:41, Michael Ulbrich wrote:
> Joseph,
> 
> thanks again for your help!
> 
> Currently I'm dumping out 4 TB of data from the broken ocfs2 device to
> an external disk. I have shut down the cluster and have the fs mounted
> read-only on a single node. It seems that the data structures are still
> intact and that the file system problems are bound to internal data
> areas (DLM?) which are not in use in the single node r/o mount use case.
> 
> Will create a new ocfs2 device and restore the data later.
> 
> Besides taking metadata backups with o2image, is there any advice you
> would give to avoid similar situations in the future?
> 
> All the best ... Michael
> 
> On 03/25/2016 01:36 AM, Joseph Qi wrote:
>> Hi Michael,
>>
>> On 2016/3/24 21:47, Michael Ulbrich wrote:
>>> Hi Joseph,
>>>
>>> thanks for this information although this does not sound too optimistic ...
>>>
>>> So, if I understand you correctly, if we had a metadata backup from
>>> o2image _before_ the crash we could have looked up the missing info to
>>> remove the loop from group chain 73, right?
>> If we have a metadata backup, we can use o2image to restore it back, but
>> this may lose some data.
>>
>>>
>>> But how could the loop issue be fixed and at the same time the damage to
>>> the data be minimized? There is a recent file level backup from which
>>> damaged or missing files could be restored later.
>>>
>>> ##    Block#       Total   Used    Free    Contig   Size
>>> 151   4054438912   15872   2152    13720   10606    1984
>>> 152   4094595072   15872   10753   5119    5119     1984
>>> 153   4090944512   15872   1818    14054   9646     1984  <--
>>> 154   4083643392   15872   571     15301   4914     1984
>>> 155   4510758912   15872   4834    11038   6601     1984
>>> 156   4492506112   15872   6532    9340    5119     1984
>>>
>>> Could you describe a "brute force" way how to dd out and edit record
>>> #153 to remove the loop and minimize potential loss of data at the same
>>> time? So that fsck would have a chance to complete and fix the remaining
>>> issues?
>> This is dangerous unless we know exactly what info the block should
>> store.
>>
>> My idea is to find out the actual block of record #154 and make block
>> 4090944512 of record #153 point to it. This is somewhat complicated and
>> should only be done with a deep understanding of the disk layout.
>>
>> I have gone through the fsck.ocfs2 patches, and found that the following
>> may help: commit efca4b0f2241 (Break a chain loop in group desc).
>> But as you said, you have already upgraded to version 1.8.4, so I'm
>> sorry, currently I don't have a better idea.
>>
>> Thanks,
>> Joseph
>>>
>>> Thanks a lot for your help ... Michael
>>>
>>> On 03/24/2016 02:10 PM, Joseph Qi wrote:
>>>> Hi Michael,
>>>> So I think the block of record #153 has gone wrong: its next pointer
>>>> points to block 4083643392, the block of record #19.
>>>> But the problem is we don't know the right contents for the block of
>>>> record #153; otherwise we could dd it out, edit it, and dd it back in
>>>> to fix it.
>>>>
>>>> Thanks,
>>>> Joseph
>>>>
>>>> On 2016/3/24 18:38, Michael Ulbrich wrote:
>>>>> Hi Joseph,
>>>>>
>>>>> ok, got it! Here's the loop in chain 73:
>>>>>
>>>>> Group Chain: 73   Parent Inode: 13   Generation: 1172963971
>>>>> CRC32:    ECC: 
>>>>> ##    Block#       Total   Used    Free    Contig   Size
>>>>> 0     4280773632   15872   11487   4385    1774     1984
>>>>> 1     2583263232   15872   5341    10531   5153     1984
>>>>> 2     4543613952   15872   5329    10543   5119     1984
>>>>> 3     4532662272   15872   10753   5119    5119     1984
>>>>> 4     4539963392   15872   3223    12649   7530     1984
>>>>> 5     4536312832   15872   5219    10653   5534     1984
>>>>> 6     4529011712   15872   6047    9825    3359     1984
>>>>> 7     4525361152   15872   4475    11397   5809     1984
>>>>> 8     4521710592   15872   3182    12690   5844     1984
>>>>> 9     4518060032   15872   5881    9991    5131     1984
>>>>> 10    4236966912   15872   10753   5119    5119     1984
>>>>> 11    4098245632   15872   10756   5116    3388     1984
>>>>> 12    4514409472   15872   8826    7046    5119     1984
>>>>> 13    3441144832   15872   15      15857   9680     1984
>>>>> 14    4404892672   15872   7563    8309    5119     1984
>>>>> 15    4233316352   15872   9398    6474    5114     1984
>>>>> 16    448882       15872   6358    9514    5119     1984

Re: [Ocfs2-users] fsck.ocfs2 loops + hangs but does not check

2016-03-25 Thread Michael Ulbrich
Joseph,

thanks again for your help!

Currently I'm dumping out 4 TB of data from the broken ocfs2 device to
an external disk. I have shut down the cluster and have the fs mounted
read-only on a single node. It seems that the data structures are still
intact and that the file system problems are bound to internal data
areas (DLM?) which are not in use in the single node r/o mount use case.

Will create a new ocfs2 device and restore the data later.

Besides taking metadata backups with o2image, is there any advice you
would give to avoid similar situations in the future?
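
For now I'll probably start with periodic metadata images, something like
this (the path is just an example; o2image should be run against the
unmounted or read-only device):

  o2image /dev/sdX /backup/ocfs2-meta.$(date +%F).img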

All the best ... Michael

On 03/25/2016 01:36 AM, Joseph Qi wrote:
> Hi Michael,
> 
> On 2016/3/24 21:47, Michael Ulbrich wrote:
>> Hi Joseph,
>>
>> thanks for this information although this does not sound too optimistic ...
>>
>> So, if I understand you correctly, if we had a metadata backup from
>> o2image _before_ the crash we could have looked up the missing info to
>> remove the loop from group chain 73, right?
> If we have a metadata backup, we can use o2image to restore it back, but
> this may lose some data.
> 
>>
>> But how could the loop issue be fixed and at the same time the damage to
>> the data be minimized? There is a recent file level backup from which
>> damaged or missing files could be restored later.
>>
>> ##    Block#       Total   Used    Free    Contig   Size
>> 151   4054438912   15872   2152    13720   10606    1984
>> 152   4094595072   15872   10753   5119    5119     1984
>> 153   4090944512   15872   1818    14054   9646     1984  <--
>> 154   4083643392   15872   571     15301   4914     1984
>> 155   4510758912   15872   4834    11038   6601     1984
>> 156   4492506112   15872   6532    9340    5119     1984
>>
>> Could you describe a "brute force" way how to dd out and edit record
>> #153 to remove the loop and minimize potential loss of data at the same
>> time? So that fsck would have a chance to complete and fix the remaining
>> issues?
> This is dangerous unless we know exactly what info the block should
> store.
> 
> My idea is to find out the actual block of record #154 and make block
> 4090944512 of record #153 point to it. This is somewhat complicated and
> should only be done with a deep understanding of the disk layout.
> 
> I have gone through the fsck.ocfs2 patches, and found that the following
> may help: commit efca4b0f2241 (Break a chain loop in group desc).
> But as you said, you have already upgraded to version 1.8.4, so I'm
> sorry, currently I don't have a better idea.
> 
> Thanks,
> Joseph
>>
>> Thanks a lot for your help ... Michael
>>
>> On 03/24/2016 02:10 PM, Joseph Qi wrote:
>>> Hi Michael,
>>> So I think the block of record #153 has gone wrong: its next pointer
>>> points to block 4083643392, the block of record #19.
>>> But the problem is we don't know the right contents for the block of
>>> record #153; otherwise we could dd it out, edit it, and dd it back in
>>> to fix it.
>>>
>>> Thanks,
>>> Joseph
>>>
>>> On 2016/3/24 18:38, Michael Ulbrich wrote:
>>>> Hi Joseph,
>>>>
>>>> ok, got it! Here's the loop in chain 73:
>>>>
>>>> Group Chain: 73   Parent Inode: 13   Generation: 1172963971
>>>> CRC32:    ECC: 
>>>> ##    Block#       Total   Used    Free    Contig   Size
>>>> 0     4280773632   15872   11487   4385    1774     1984
>>>> 1     2583263232   15872   5341    10531   5153     1984
>>>> 2     4543613952   15872   5329    10543   5119     1984
>>>> 3     4532662272   15872   10753   5119    5119     1984
>>>> 4     4539963392   15872   3223    12649   7530     1984
>>>> 5     4536312832   15872   5219    10653   5534     1984
>>>> 6     4529011712   15872   6047    9825    3359     1984
>>>> 7     4525361152   15872   4475    11397   5809     1984
>>>> 8     4521710592   15872   3182    12690   5844     1984
>>>> 9     4518060032   15872   5881    9991    5131     1984
>>>> 10    4236966912   15872   10753   5119    5119     1984
>>>> 11    4098245632   15872   10756   5116    3388     1984
>>>> 12    4514409472   15872   8826    7046    5119     1984
>>>> 13    3441144832   15872   15      15857   9680     1984
>>>> 14    4404892672   15872   7563    8309    5119     1984
>>>> 15    4233316352   15872   9398    6474    5114     1984
>>>> 16    448882       15872   6358    9514    5119     1984
>>>> 17    3901115392   15872   9932    5940    3757     1984
>>>> 18    4507108352   15872   6557    9315    6166     1984
>>>> 19    4083643392   15872   571     15301   4914     1984  <--
>>>> 20    4510758912   15872   4834    11038   6601     1984
>>>> 21    4492506112   15872   6532    9340    5119     1984
>>>> 22    4496156672   15872   10753   5119    5119     1984
>>>> 23    4503457792   15872   10718   5154    5119     1984
>>>> ...
>>>> 154   4083643392   15872   571     15301   4914     1984  <--
>>>> 155   4510758912   15872   4834