Hi Pranith, thanks to you! 2-3 days are fine, don’t worry. However, if you can give me the details of how to compile the glfsheal binary you mention, we could quickly check whether everything is fine with the fix before you release. Just let me know what you prefer. Waiting 2-3 days is not a problem for me though, as this is not a critical server and I could even recreate the volumes. Thanks again,
Alessandro
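
P.S. In case it helps, this is roughly what I would expect the glfsheal-only rebuild to look like. It is only my guess, assuming the fix comes as a patch on top of the 3.7.0 source tree (the patch file name below is just a placeholder), so please correct me if the steps are different:

    # hypothetical sketch: build a patched glfsheal from source
    git clone https://github.com/gluster/glusterfs.git
    cd glusterfs
    git checkout v3.7.0
    # apply the fix here, e.g.: patch -p1 < glfsheal-fix.patch
    ./autogen.sh && ./configure
    make
    # the heal helper is built under heal/src/ (possibly heal/src/.libs/, via libtool);
    # back up the installed binary before replacing it
    cp /usr/sbin/glfsheal /usr/sbin/glfsheal.orig
    cp heal/src/glfsheal /usr/sbin/glfsheal

I could then test it directly with "/usr/sbin/glfsheal adsnet-vm-01", which, judging from the core, is what "gluster volume heal adsnet-vm-01 info" runs internally.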
> On 29 May 2015, at 11:54, Pranith Kumar Karampuri
> <[email protected]> wrote:
>
>
>
> On 05/29/2015 03:16 PM, Alessandro De Salvo wrote:
>> Hi Pranith,
>> I’m definitely sure the log is correct, but you are right that there is no
>> sign of a crash in it (even checking with grep!).
>> However, I see core dumps (e.g. core.19430) in /var/log/gluster, created
>> every time I issue the heal info command.
>> From gdb I see this:
> Thanks for providing the information, Alessandro. We will fix this issue. I am
> wondering how we can unblock you in the interim. There is a plan to release
> 3.7.1 in 2-3 days, I think; I can try to get this fix into that release. Let
> me know if you can wait that long. Another possibility is to compile just the
> glfsheal binary with the fix, which "gluster volume heal <volname> info"
> invokes internally. Let me know.
>
> Pranith.
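
[For reference, the backtrace quoted below came from opening the core directly with gdb; on my side it was roughly:

    gdb /usr/sbin/glfsheal /var/log/gluster/core.19430
    (gdb) bt

(the core file name differs from run to run).]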
>>
>>
>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
>> Copyright (C) 2013 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /usr/sbin/glfsheal...Reading symbols from
>> /usr/lib/debug/usr/sbin/glfsheal.debug...done.
>> done.
>> [New LWP 19430]
>> [New LWP 19431]
>> [New LWP 19434]
>> [New LWP 19436]
>> [New LWP 19433]
>> [New LWP 19437]
>> [New LWP 19432]
>> [New LWP 19435]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `/usr/sbin/glfsheal adsnet-vm-01'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 inode_unref (inode=0x7f7a1e27806c) at inode.c:499
>> 499 table = inode->table;
>> (gdb) bt
>> #0 inode_unref (inode=0x7f7a1e27806c) at inode.c:499
>> #1 0x00007f7a265e8a61 in fini (this=<optimized out>) at qemu-block.c:1092
>> #2 0x00007f7a39a53791 in xlator_fini_rec (xl=0x7f7a2000b9a0) at xlator.c:463
>> #3 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000d450) at xlator.c:453
>> #4 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000e800) at xlator.c:453
>> #5 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000fbb0) at xlator.c:453
>> #6 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20010f80) at xlator.c:453
>> #7 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20012330) at xlator.c:453
>> #8 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a200136e0) at xlator.c:453
>> #9 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20014b30) at xlator.c:453
>> #10 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20015fc0) at xlator.c:453
>> #11 0x00007f7a39a54eea in xlator_tree_fini (xl=<optimized out>) at
>> xlator.c:545
>> #12 0x00007f7a39a90b25 in glusterfs_graph_deactivate (graph=<optimized out>)
>> at graph.c:340
>> #13 0x00007f7a38d50e3c in pub_glfs_fini (fs=fs@entry=0x7f7a3a6b6010) at
>> glfs.c:1155
>> #14 0x00007f7a39f18ed4 in main (argc=<optimized out>, argv=<optimized out>)
>> at glfs-heal.c:821
>>
>>
>> Thanks,
>>
>>
>> Alessandro
>>
>>> On 29 May 2015, at 11:12, Pranith Kumar Karampuri
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> On 05/29/2015 02:37 PM, Alessandro De Salvo wrote:
>>>> Hi Pranith,
>>>> many thanks for the help!
>>>> The volume info of the problematic volume is the following:
>>>>
>>>> # gluster volume info adsnet-vm-01
>>>>
>>>> Volume Name: adsnet-vm-01
>>>> Type: Replicate
>>>> Volume ID: f8f615df-3dde-4ea6-9bdb-29a1706e864c
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gwads02.sta.adsnet.it:/gluster/vm01/data
>>>> Brick2: gwads03.sta.adsnet.it:/gluster/vm01/data
>>>> Options Reconfigured:
>>>> nfs.disable: true
>>>> features.barrier: disable
>>>> features.file-snapshot: on
>>>> server.allow-insecure: on
>>> Are you sure the attached log is correct? I do not see any backtrace in the
>>> log file to indicate there is a crash :-(. Could you run "grep -i crash
>>> /var/log/glusterfs/*" to see if some other file contains the crash? If
>>> that also fails, would it be possible for you to provide the backtrace of
>>> the core by opening it with gdb?
>>>
>>> Pranith
>>>>
>>>> The log is attached.
>>>> I just wanted to add that the heal info command works fine on the other
>>>> volumes hosted by the same machines, so it is only this volume that is
>>>> causing problems.
>>>> Thanks,
>>>>
>>>> Alessandro
>>>>
>>>>
>>>>
>>>>
>>>>> On 29 May 2015, at 10:50, Pranith Kumar Karampuri
>>>>> <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 05/29/2015 02:18 PM, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>> On 05/29/2015 02:13 PM, Alessandro De Salvo wrote:
>>>>>>> Hi,
>>>>>>> I'm facing a strange issue with split-brain reporting.
>>>>>>> I upgraded to 3.7.0, after stopping all gluster processes as
>>>>>>> described in the wiki, on all servers hosting the volumes. The upgrade
>>>>>>> and the restart went fine, and the volumes are accessible.
>>>>>>> However, I had two files in split brain that I did not heal before
>>>>>>> upgrading, so I tried a full heal with 3.7.0. The heal was launched
>>>>>>> correctly, but when I now perform a heal info there is no output,
>>>>>>> while the heal statistics say there are actually 2 files in split
>>>>>>> brain. In the logs I see something like this:
>>>>>>>
>>>>>>> glustershd.log:
>>>>>>> [2015-05-29 08:28:43.008373] I
>>>>>>> [afr-self-heal-entry.c:558:afr_selfheal_entry_do]
>>>>>>> 0-adsnet-gluster-01-replicate-0: performing entry selfheal on
>>>>>>> 7fd1262d-949b-402e-96c2-ae487c8d4e27
>>>>>>> [2015-05-29 08:28:43.012690] W
>>>>>>> [client-rpc-fops.c:241:client3_3_mknod_cbk]
>>>>>>> 0-adsnet-gluster-01-client-1: remote operation failed: Invalid
>>>>>>> argument. Path: (null)
>>>>>> Hey, could you let us know the "gluster volume info" output? Please let us
>>>>>> know the backtrace printed in /var/log/glusterfs/glfsheal-<volname>.log
>>>>>> as well.
>>>>> Please attach the /var/log/glusterfs/glfsheal-<volname>.log file to this
>>>>> thread so that I can take a look.
>>>>>
>>>>> Pranith
>>>>>>
>>>>>> Pranith
>>>>>>>
>>>>>>>
>>>>>>> So, it seems like the files to be healed are not correctly identified,
>>>>>>> or at least their path is null.
>>>>>>> Also, every time I issue "gluster volume heal <volname> info", a core
>>>>>>> dump is generated in the log area.
>>>>>>> All servers are using the latest CentOS 7.
>>>>>>> Any idea why this might be happening and how to solve it?
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Alessandro
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>>
>
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
