Hi Pranith, thanks to you! 2-3 days are fine, don’t worry. However, if you can give me the details of how to compile the glfsheal binary you mention, we could quickly check whether everything is fine with the fix before you release. Just let me know what you prefer. Waiting 2-3 days is not a problem for me though, as this is not a critical server and I could even recreate the volumes. Thanks again,
Alessandro
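
P.S. In case it helps, this is roughly what I would expect the glfsheal-only rebuild to look like. It is only my guess, assuming the fix comes as a patch on top of the 3.7.0 source tree (the patch file name below is just a placeholder), so please correct me if the steps are different:

    # hypothetical sketch: build a patched glfsheal from source
    git clone https://github.com/gluster/glusterfs.git
    cd glusterfs
    git checkout v3.7.0
    # apply the fix here, e.g.: patch -p1 < glfsheal-fix.patch
    ./autogen.sh && ./configure
    make
    # the heal helper is built under heal/src/ (possibly heal/src/.libs/, via libtool);
    # back up the installed binary before replacing it
    cp /usr/sbin/glfsheal /usr/sbin/glfsheal.orig
    cp heal/src/glfsheal /usr/sbin/glfsheal

I could then test it directly with "/usr/sbin/glfsheal adsnet-vm-01", which, judging from the core, is what "gluster volume heal adsnet-vm-01 info" runs internally.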
> On 29 May 2015, at 11:54, Pranith Kumar Karampuri
> <[email protected]> wrote:
>
>
>
> On 05/29/2015 03:16 PM, Alessandro De Salvo wrote:
>> Hi Pranith,
>> I’m definitely sure the log is correct, but you are right that there is no
>> sign of a crash in it (even checking with grep!).
>> However, I see core dumps (e.g. core.19430) in /var/log/gluster, created
>> every time I issue the heal info command.
>> From gdb I see this:
> Thanks for providing the information, Alessandro. We will fix this issue. I am
> wondering how we can unblock you in the interim. There is a plan to release
> 3.7.1 in 2-3 days, I think; I can try to get this fix into that release. Let
> me know if you can wait that long. Another possibility is to compile just the
> glfsheal binary with the fix, which "gluster volume heal <volname> info"
> invokes internally. Let me know.
>
> Pranith.
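
[For reference, the backtrace quoted below came from opening the core directly with gdb; on my side it was roughly:

    gdb /usr/sbin/glfsheal /var/log/gluster/core.19430
    (gdb) bt

(the core file name differs from run to run).]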
>>
>>
>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
>> Copyright (C) 2013 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /usr/sbin/glfsheal...Reading symbols from
>> /usr/lib/debug/usr/sbin/glfsheal.debug...done.
>> done.
>> [New LWP 19430]
>> [New LWP 19431]
>> [New LWP 19434]
>> [New LWP 19436]
>> [New LWP 19433]
>> [New LWP 19437]
>> [New LWP 19432]
>> [New LWP 19435]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `/usr/sbin/glfsheal adsnet-vm-01'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 inode_unref (inode=0x7f7a1e27806c) at inode.c:499
>> 499 table = inode->table;
>> (gdb) bt
>> #0 inode_unref (inode=0x7f7a1e27806c) at inode.c:499
>> #1 0x00007f7a265e8a61 in fini (this=<optimized out>) at qemu-block.c:1092
>> #2 0x00007f7a39a53791 in xlator_fini_rec (xl=0x7f7a2000b9a0) at xlator.c:463
>> #3 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000d450) at xlator.c:453
>> #4 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000e800) at xlator.c:453
>> #5 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000fbb0) at xlator.c:453
>> #6 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20010f80) at xlator.c:453
>> #7 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20012330) at xlator.c:453
>> #8 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a200136e0) at xlator.c:453
>> #9 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20014b30) at xlator.c:453
>> #10 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20015fc0) at xlator.c:453
>> #11 0x00007f7a39a54eea in xlator_tree_fini (xl=<optimized out>) at
>> xlator.c:545
>> #12 0x00007f7a39a90b25 in glusterfs_graph_deactivate (graph=<optimized out>)
>> at graph.c:340
>> #13 0x00007f7a38d50e3c in pub_glfs_fini (fs=fs@entry=0x7f7a3a6b6010) at
>> glfs.c:1155
>> #14 0x00007f7a39f18ed4 in main (argc=<optimized out>, argv=<optimized out>)
>> at glfs-heal.c:821
>>
>>
>> Thanks,
>>
>>
>> Alessandro
>>
>>> On 29 May 2015, at 11:12, Pranith Kumar Karampuri
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> On 05/29/2015 02:37 PM, Alessandro De Salvo wrote:
>>>> Hi Pranith,
>>>> many thanks for the help!
>>>> The volume info of the problematic volume is the following:
>>>>
>>>> # gluster volume info adsnet-vm-01
>>>>
>>>> Volume Name: adsnet-vm-01
>>>> Type: Replicate
>>>> Volume ID: f8f615df-3dde-4ea6-9bdb-29a1706e864c
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gwads02.sta.adsnet.it:/gluster/vm01/data
>>>> Brick2: gwads03.sta.adsnet.it:/gluster/vm01/data
>>>> Options Reconfigured:
>>>> nfs.disable: true
>>>> features.barrier: disable
>>>> features.file-snapshot: on
>>>> server.allow-insecure: on
>>> Are you sure the attached log is correct? I do not see any backtrace in the
>>> log file to indicate there is a crash :-(. Could you run "grep -i crash
>>> /var/log/glusterfs/*" to see if some other file contains the crash? If
>>> that also fails, would it be possible for you to provide the backtrace of
>>> the core by opening it with gdb?
>>>
>>> Pranith
>>>>
>>>> The log is attached.
>>>> I just wanted to add that the heal info command works fine on the other
>>>> volumes hosted by the same machines, so it is only this volume that is
>>>> causing problems.
>>>> Thanks,
>>>>
>>>> Alessandro
>>>>
>>>>
>>>>
>>>>
>>>>> On 29 May 2015, at 10:50, Pranith Kumar Karampuri
>>>>> <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 05/29/2015 02:18 PM, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>
>>>>>> On 05/29/2015 02:13 PM, Alessandro De Salvo wrote:
>>>>>>> Hi,
>>>>>>> I'm facing a strange issue with split-brain reporting.
>>>>>>> I upgraded to 3.7.0, after stopping all gluster processes as
>>>>>>> described in the wiki, on all servers hosting the volumes. The upgrade
>>>>>>> and the restart went fine, and the volumes are accessible.
>>>>>>> However, I had two files in split brain that I did not heal before
>>>>>>> upgrading, so I tried a full heal with 3.7.0. The heal was launched
>>>>>>> correctly, but when I now perform a heal info there is no output,
>>>>>>> while the heal statistics say there are actually 2 files in split
>>>>>>> brain. In the logs I see something like this:
>>>>>>>
>>>>>>> glustershd.log:
>>>>>>> [2015-05-29 08:28:43.008373] I
>>>>>>> [afr-self-heal-entry.c:558:afr_selfheal_entry_do]
>>>>>>> 0-adsnet-gluster-01-replicate-0: performing entry selfheal on
>>>>>>> 7fd1262d-949b-402e-96c2-ae487c8d4e27
>>>>>>> [2015-05-29 08:28:43.012690] W
>>>>>>> [client-rpc-fops.c:241:client3_3_mknod_cbk]
>>>>>>> 0-adsnet-gluster-01-client-1: remote operation failed: Invalid
>>>>>>> argument. Path: (null)
>>>>>> Hey, could you let us know the "gluster volume info" output? Please let us
>>>>>> know the backtrace printed in /var/log/glusterfs/glfsheal-<volname>.log
>>>>>> as well.
>>>>> Please attach the /var/log/glusterfs/glfsheal-<volname>.log file to this
>>>>> thread so that I can take a look.
>>>>>
>>>>> Pranith
>>>>>>
>>>>>> Pranith
>>>>>>>
>>>>>>>
>>>>>>> So, it seems like the files to be healed are not correctly identified,
>>>>>>> or at least their path is null.
>>>>>>> Also, every time I issue "gluster volume heal <volname> info", a core
>>>>>>> dump is generated in the log area.
>>>>>>> All servers are using the latest CentOS 7.
>>>>>>> Any idea why this might be happening and how to solve it?
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Alessandro
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>>
>
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
