Xavi,
          As I mentioned before, the error can happen for any FOP. I will try 
to run with the TRACE debug level. Is it possible that we are checking for 
this attribute on a directory? A directory does not seem to have this 
attribute set. Also, is the function that checks size and version called 
after it has been decided that a heal should run, or is this check the one 
that decides whether a heal should run?

Thanks and Regards,
Ram


> On Jan 12, 2017, at 2:25 AM, Xavier Hernandez <xhernan...@datalab.es> wrote:
> 
> Hi Ram,
> 
>> On 12/01/17 02:36, Ankireddypalle Reddy wrote:
>> Xavi,
>>          I added some more logging information. The trusted.ec.size field 
>> values are in fact different.
>>           trusted.ec.size    l1 = 62719407423488    l2 = 0
> 
> That's very weird. Directories do not have this attribute. It's only present 
> on regular files. But you said that the error happens while creating the 
> file, so it doesn't make much sense because file creation always sets 
> trusted.ec.size to 0.
> 
> Could you reproduce the problem with diagnostics.client-log-level set to 
> TRACE and send me the log? It will be a big log, but it will give me much 
> more information about what's going on.
> 
> Do you have a mixed setup with nodes of different types, for example mixed 
> 32/64-bit architectures or different operating systems? I ask because 
> 62719407423488 in hex is 0x390B00000000, which has the lower 32 bits set to 
> 0 but has garbage above that.
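
The bit pattern described above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `split_u64` is made up for this example and is not part of GlusterFS.

```python
def split_u64(value):
    """Return the (high, low) 32-bit halves of an unsigned 64-bit integer."""
    if not 0 <= value < 1 << 64:
        raise ValueError("value does not fit in 64 bits")
    return value >> 32, value & 0xFFFFFFFF


# The trusted.ec.size value reported as l1 in the log.
size = 62719407423488
high, low = split_u64(size)

print(hex(size))  # 0x390b00000000
print(hex(high))  # 0x390b
print(hex(low))   # 0x0
```

The low half is zero and the high half is 0x390B, which is what makes the value look like garbage rather than a plausible file size.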
> 
>> 
>>           This is a fairly static setup with no brick or node failures. 
>> Please explain why a heal is being triggered and what could have actually 
>> caused these size xattrs to differ. This is causing random I/O failures and 
>> is impacting the backup schedules.
> 
> The launch of self-heal is normal because it has detected an inconsistency. 
> The real problem is what originates that inconsistency.
> 
> Xavi
> 
>> 
>> [ 2017-01-12 01:19:18.256970] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-12 01:19:18.257015] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=3, bad=4)
>> [2017-01-12 01:19:18.257018] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-8: Heal failed 
>> [Invalid argument]
>> [2017-01-12 01:19:21.002028] E [dict.c:197:key_value_cmp] 
>> 0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-12 01:19:21.002056] E [dict.c:166:log_value] 
>> 0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 
>> = 0 i2 = 0 ]
>> [2017-01-12 01:19:21.002064] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-12 01:19:21.209640] E [dict.c:197:key_value_cmp] 
>> 0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-12 01:19:21.209673] E [dict.c:166:log_value] 
>> 0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 
>> = 0 i2 = 0 ]
>> [2017-01-12 01:19:21.209686] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-12 01:19:21.209719] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-4: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
>> [2017-01-12 01:19:21.209753] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-4: Heal failed 
>> [Invalid argument]
>> 
>> Thanks and Regards,
>> Ram
>> 
>> -----Original Message-----
>> From: Ankireddypalle Reddy
>> Sent: Wednesday, January 11, 2017 9:29 AM
>> To: Ankireddypalle Reddy; Xavier Hernandez; Gluster Devel 
>> (gluster-devel@gluster.org); gluster-us...@gluster.org
>> Subject: RE: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse 
>> volume
>> 
>> Xavi,
>>            I built a debug binary to log more information. This is what is 
>> getting logged. Looks like it is the attribute trusted.ec.size which is 
>> different among the bricks in a sub volume.
>> 
>> In glustershd.log :
>> 
>> [2017-01-11 14:19:45.023845] N [MSGID: 122029] 
>> [ec-generic.c:683:ec_combine_lookup] 0-glusterfsProd-disperse-8: Mismatching 
>> iatt in answers of 'GF_FOP_LOOKUP'
>> [2017-01-11 14:19:45.027718] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.027736] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.027763] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.027781] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.027793] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
>> [2017-01-11 14:19:45.027815] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed 
>> [Invalid argument]
>> [2017-01-11 14:19:45.029035] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.029057] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.029089] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.029105] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.029121] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
>> [2017-01-11 14:19:45.032566] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.029138] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-8: Heal failed 
>> [Invalid argument]
>> [2017-01-11 14:19:45.032585] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.032614] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.032631] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.032638] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
>> [2017-01-11 14:19:45.032654] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed 
>> [Invalid argument]
>> [2017-01-11 14:19:45.037514] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.037536] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.037553] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:19:45.037573] W [MSGID: 122056] 
>> [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
>> xdata in answers of 'LOOKUP'
>> [2017-01-11 14:19:45.037582] W [MSGID: 122053] 
>> [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation 
>> failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
>> [2017-01-11 14:19:45.037599] W [MSGID: 122002] 
>> [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed 
>> [Invalid argument]
>> [2017-01-11 14:20:40.001401] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-3: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> [2017-01-11 14:20:40.001387] E [dict.c:166:key_value_cmp] 
>> 0-glusterfsProd-disperse-5: 'trusted.ec.size' is different in two dicts (8, 
>> 8)
>> 
>> In the mount daemon log:
>> 
>> [2017-01-11 14:20:17.806826] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-0: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.806847] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-0: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807076] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-1: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807099] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-1: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807286] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-10: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807298] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-10: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807409] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-11: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807420] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-11: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807448] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-4: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807462] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-4: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807539] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-2: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807550] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-2: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807723] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-3: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807739] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-3: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.807785] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-5: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.807796] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-5: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.808020] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-9: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.808034] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-9: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.808054] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-6: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.808066] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-6: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.808282] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-8: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.808292] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-8: Invalid 
>> config xattr [Invalid argument]
>> [2017-01-11 14:20:17.809212] E [MSGID: 122001] 
>> [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-7: Invalid or 
>> corrupted config [Invalid argument]
>> [2017-01-11 14:20:17.809228] E [MSGID: 122066] 
>> [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-7: Invalid 
>> config xattr [Invalid argument]
>> 
>> [2017-01-11 14:20:17.812660] I [MSGID: 109036] 
>> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 2-glusterfsProd-dht: 
>> Setting layout of /Folder_01.05.2017_21.15/CV_MAGNETIC/V_31500/CHUNK_402578 
>> with [Subvol_name: glusterfsProd-disperse-0, Err: -1 , Start: 1789569705 , 
>> Stop: 2147483645 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-1, Err: 
>> -1 , Start: 2147483646 , Stop: 2505397586 , Hash: 1 ], [Subvol_name: 
>> glusterfsProd-disperse-10, Err: -1 , Start: 2505397587 , Stop: 2863311527 , 
>> Hash: 1 ], [Subvol_name: glusterfsProd-disperse-11, Err: -1 , Start: 
>> 2863311528 , Stop: 3221225468 , Hash: 1 ], [Subvol_name: 
>> glusterfsProd-disperse-2, Err: -1 , Start: 3221225469 , Stop: 3579139409 , 
>> Hash: 1 ], [Subvol_name: glusterfsProd-disperse-3, Err: -1 , Start: 
>> 3579139410 , Stop: 3937053350 , Hash: 1 ], [Subvol_name: 
>> glusterfsProd-disperse-4, Err: -1 , Start: 3937053351 , Stop: 4294967295 , 
>> Hash: 1 ], [Subvol_name: glusterfsProd-disperse-5, Err: -1 , Start: 0 , 
>> Stop: 357913940 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-6, Err: 
>> -1 , Start: 357913941 , Stop: 715827881 , Hash: 1 ], [Subvol_name: 
>> glusterfsProd-disperse-7, Err: -1 , Start: 715827882 , Stop: 1073741822 , 
>> Hash: 1 ], [Subvol_name: glusterfsProd-disperse-8, Err: -1 , Start: 
>> 1073741823 , Stop: 1431655763 , Hash: 1 ], [Subvol_name: 
>> glusterfsProd-disperse-9, Err: -1 , Start: 1431655764 , Stop: 1789569704 , 
>> Hash: 1 ],
>> 
>> 
>> -----Original Message-----
>> From: gluster-users-boun...@gluster.org 
>> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Ankireddypalle Reddy
>> Sent: Tuesday, January 10, 2017 10:09 AM
>> To: Xavier Hernandez; Gluster Devel (gluster-devel@gluster.org); 
>> gluster-us...@gluster.org
>> Subject: Re: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse 
>> volume
>> 
>> Xavi,
>>           In this case it's the file creation which failed. So I provided 
>> the xattrs of the parent.
>> 
>> Thanks and Regards,
>> Ram
>> 
>> -----Original Message-----
>> From: Xavier Hernandez [mailto:xhernan...@datalab.es]
>> Sent: Tuesday, January 10, 2017 9:10 AM
>> To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org); 
>> gluster-us...@gluster.org
>> Subject: Re: [Gluster-devel] Lot of EIO errors in disperse volume
>> 
>> Hi Ram,
>> 
>>> On 10/01/17 14:42, Ankireddypalle Reddy wrote:
>>> Attachments (2):
>>> 
>>> 1. ec.txt (11.50 KB)
>>>    <https://imap.commvault.com/webconsole/api/contentstore/publicshare/346714/file/ee2d1536c2dc4dff94afb12132b4f8f6/action/download>
>>> 2. ws-glus.log (3.48 MB)
>>>    <https://imap.commvault.com/webconsole/api/contentstore/publicshare/346714/file/cff3e0506e754b9a939db02da1cbbd58/action/download>
>>> 
>>> Xavi,
>>>          We are encountering errors for different kinds of FOPS.
>>>          The open failed for the following file:
>>> 
>>>          cvd_2017_01_10_02_28_26.log:98182 1f9fe 01/10 00:57:10 8414465
>>> [MEDIAFS    ] 20117519-52075477 SingleInstancer_FS::StartDataFile2:
>>> Failed to create the data file
>>> [/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720
>>> /SFILE_CONTAINER_062], error=0xECCC0005:{CQiFile::Open(92)} +
>>> {CQiUTFOSAPI::open(96)/ErrNo.5.(Input/output error)-Open failed,
>>> File=/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_5134
>>> 2720/SFILE_CONTAINER_062, OperationFlag=0xC1, PermissionMode=0x1FF}
>>> 
>>>          I've attached the extended attributes for the directories
>>>          /ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/ and
>>> 
>>> /ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720
>>> from all the bricks.
>>> 
>>>         The attributes look fine to me. I've also attached some log
>>> cuts to illustrate the problem.
>> 
>> I need the extended attributes of the file itself, not the parent 
>> directories.
>> 
>> Xavi
>> 
>>> 
>>> Thanks and Regards,
>>> Ram
>>> 
>>> -----Original Message-----
>>> From: Xavier Hernandez [mailto:xhernan...@datalab.es]
>>> Sent: Tuesday, January 10, 2017 7:53 AM
>>> To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org);
>>> gluster-us...@gluster.org
>>> Subject: Re: [Gluster-devel] Lot of EIO errors in disperse volume
>>> 
>>> Hi Ram,
>>> 
>>> The error is caused by an extended attribute that does not match on
>>> all 3 bricks of the disperse set. The most likely one is
>>> trusted.ec.version, but it could be another.
>>> 
>>> At first sight, I don't see any change from 3.7.8 that could have
>>> caused this. I'll check again.
>>> 
>>> What kind of operations are you doing? This can help me narrow the search.
>>> 
>>> Xavi
>>> 
>>>> On 10/01/17 13:43, Ankireddypalle Reddy wrote:
>>>> Xavi,
>>>>          Thanks. If you could please explain what to look for in the
>>>> extended attributes, I will check and let you know if I find
>>>> anything suspicious. We also noticed that some of these operations
>>>> succeed if retried. Do you know of any communication-related
>>>> errors that are being reported or triaged?
>>>> 
>>>> Thanks and Regards,
>>>> Ram
>>>> 
>>>> -----Original Message-----
>>>> From: Xavier Hernandez [mailto:xhernan...@datalab.es]
>>>> Sent: Tuesday, January 10, 2017 7:23 AM
>>>> To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org);
>>>> gluster-us...@gluster.org
>>>> Subject: Re: [Gluster-devel] Lot of EIO errors in disperse volume
>>>> 
>>>> Hi Ram,
>>>> 
>>>>> On 10/01/17 13:14, Ankireddypalle Reddy wrote:
>>>>> Attachment (1):
>>>>> 
>>>>> 1. ecxattrs.txt (5.92 KB)
>>>>>    <https://imap.commvault.com/webconsole/api/contentstore/publicshare/346714/file/1272e68278744f15bf1a54f2b31b559d/action/download>
>>>>> 
>>>>> Xavi,
>>>>>             Please find attached the extended attributes for a
>>>>> directory from all the bricks. Free space check failed for this with
>>>>> error number EIO.
>>>> 
>>>> What do you mean? What operation did you run to check the free space
>>>> on that directory?
>>>> 
>>>> If it's a recursive check, I need the extended attributes from the
>>>> exact file that triggers the EIO. The attached attributes seem
>>>> consistent and that directory shouldn't cause any problem. Does an 'ls'
>>>> on that directory fail or does it show the contents?
>>>> 
>>>> Xavi
>>>> 
>>>>> 
>>>>> Thanks and Regards,
>>>>> Ram
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Xavier Hernandez [mailto:xhernan...@datalab.es]
>>>>> Sent: Tuesday, January 10, 2017 6:45 AM
>>>>> To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org);
>>>>> gluster-us...@gluster.org
>>>>> Subject: Re: [Gluster-devel] Lot of EIO errors in disperse volume
>>>>> 
>>>>> Hi Ram,
>>>>> 
>>>>> can you execute the following command on all bricks on a file that
>>>>> is giving EIO ?
>>>>> 
>>>>> getfattr -m. -e hex -d <path to file in brick>
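
Once the `getfattr` output has been collected from every brick, the comparison can be automated. The following Python sketch assumes the usual `getfattr -d -e hex` text layout (a `# file:` comment line followed by `name=value` lines). The function names, the brick labels and the sample attribute values are all hypothetical and only serve as an illustration.

```python
def parse_getfattr(text):
    """Parse `getfattr -d -e hex` output into a {name: value} dict."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header comment.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def mismatched_xattrs(per_brick, prefix="trusted.ec."):
    """Return {name: {brick: value}} for attributes that differ across bricks."""
    names = set()
    for attrs in per_brick.values():
        names.update(n for n in attrs if n.startswith(prefix))
    result = {}
    for name in sorted(names):
        values = {brick: attrs.get(name) for brick, attrs in per_brick.items()}
        if len(set(values.values())) > 1:
            result[name] = values
    return result


# Hypothetical sample output, one entry per brick of a disperse set.
sample = {
    "brick1": "# file: f\ntrusted.ec.size=0x0000390b00000000\n",
    "brick2": "# file: f\ntrusted.ec.size=0x0000000000000000\n",
    "brick3": "# file: f\ntrusted.ec.size=0x0000000000000000\n",
}
per_brick = {brick: parse_getfattr(text) for brick, text in sample.items()}
for name, values in mismatched_xattrs(per_brick).items():
    print(name, values)
```

Only attributes under the `trusted.ec.` prefix are compared here. That restriction is a choice made for this sketch, not a statement about which attributes the disperse translator itself checks.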
>>>>> 
>>>>> Xavi
>>>>> 
>>>>>> On 10/01/17 12:41, Ankireddypalle Reddy wrote:
>>>>>> Xavi,
>>>>>>            We have been running 3.7.8 on these servers. We upgraded
>>>>>> to 3.7.18 yesterday. We upgraded all the servers at the same time. The
>>>>>> volume was brought down during the upgrade.
>>>>>> 
>>>>>> Thanks and Regards,
>>>>>> Ram
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Xavier Hernandez [mailto:xhernan...@datalab.es]
>>>>>> Sent: Tuesday, January 10, 2017 6:35 AM
>>>>>> To: Ankireddypalle Reddy; Gluster Devel
>>>>>> (gluster-devel@gluster.org); gluster-us...@gluster.org
>>>>>> Subject: Re: [Gluster-devel] Lot of EIO errors in disperse volume
>>>>>> 
>>>>>> Hi Ram,
>>>>>> 
>>>>>> How did you upgrade gluster? From which version?
>>>>>> 
>>>>>> Did you upgrade one server at a time and wait until self-heal
>>>>>> finished before upgrading the next server?
>>>>>> 
>>>>>> Xavi
>>>>>> 
>>>>>>> On 10/01/17 11:39, Ankireddypalle Reddy wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>>      We upgraded to GlusterFS 3.7.18 yesterday. We are seeing a lot of
>>>>>>> failures in our applications. Most of the errors are EIO. The
>>>>>>> following log lines are commonly seen in the logs:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check]
>>>>>>> 0-StoragePool-disperse-4: Mismatching xdata in answers of 'LOOKUP'"
>>>>>>> repeated 2 times between [2017-01-10 02:46:25.069809] and
>>>>>>> [2017-01-10 02:46:25.069835]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:25.069852] W [MSGID: 122056]
>>>>>>> [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5:
>>>>>>> Mismatching xdata in answers of 'LOOKUP'
>>>>>>> 
>>>>>>> The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check]
>>>>>>> 0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'"
>>>>>>> repeated 2 times between [2017-01-10 02:46:25.069852] and
>>>>>>> [2017-01-10 02:46:25.069873]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:25.069910] W [MSGID: 122056]
>>>>>>> [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-6:
>>>>>>> Mismatching xdata in answers of 'LOOKUP'
>>>>>>> 
>>>>>>> ...
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.520774] I [MSGID: 109036]
>>>>>>> [dht-common.c:9076:dht_log_new_layout_for_dir_selfheal]
>>>>>>> 0-StoragePool-dht: Setting layout of
>>>>>>> /Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854213/CHUNK_51334585 with
>>>>>>> [Subvol_name: StoragePool-disperse-0, Err: -1 , Start: 3221225466 ,
>>>>>>> Stop: 3758096376 , Hash: 1 ], [Subvol_name: StoragePool-disperse-1,
>>>>>>> Err: -1 , Start: 3758096377 , Stop: 4294967295 , Hash: 1 ],
>>>>>>> [Subvol_name: StoragePool-disperse-2, Err: -1 , Start: 0 , Stop:
>>>>>>> 536870910 , Hash: 1 ], [Subvol_name: StoragePool-disperse-3, Err: -1 ,
>>>>>>> Start: 536870911 , Stop: 1073741821 , Hash: 1 ], [Subvol_name:
>>>>>>> StoragePool-disperse-4, Err: -1 , Start: 1073741822 , Stop: 1610612732 ,
>>>>>>> Hash: 1 ], [Subvol_name: StoragePool-disperse-5, Err: -1 , Start:
>>>>>>> 1610612733 , Stop: 2147483643 , Hash: 1 ], [Subvol_name:
>>>>>>> StoragePool-disperse-6, Err: -1 , Start: 2147483644 , Stop: 2684354554 ,
>>>>>>> Hash: 1 ], [Subvol_name: StoragePool-disperse-7, Err: -1 , Start:
>>>>>>> 2684354555 , Stop: 3221225465 , Hash: 1 ],
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.522841] N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-3:
>>>>>>> Mismatching dictionary in answers of 'GF_FOP_XATTROP'
>>>>>>> 
>>>>>>> The message "N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop]
>>>>>>> 0-StoragePool-disperse-3: Mismatching dictionary in answers of
>>>>>>> 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
>>>>>>> 02:46:26.522841] and [2017-01-10 02:46:26.522894]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.522898] W [MSGID: 122040]
>>>>>>> [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-3:
>>>>>>> Failed to get size and version [Input/output error]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.523115] N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-6:
>>>>>>> Mismatching dictionary in answers of 'GF_FOP_XATTROP'
>>>>>>> 
>>>>>>> The message "N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop]
>>>>>>> 0-StoragePool-disperse-6: Mismatching dictionary in answers of
>>>>>>> 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
>>>>>>> 02:46:26.523115] and [2017-01-10 02:46:26.523143]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.523147] W [MSGID: 122040]
>>>>>>> [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-6:
>>>>>>> Failed to get size and version [Input/output error]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.523302] N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-2:
>>>>>>> Mismatching dictionary in answers of 'GF_FOP_XATTROP'
>>>>>>> 
>>>>>>> The message "N [MSGID: 122031]
>>>>>>> [ec-generic.c:1130:ec_combine_xattrop]
>>>>>>> 0-StoragePool-disperse-2: Mismatching dictionary in answers of
>>>>>>> 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
>>>>>>> 02:46:26.523302] and [2017-01-10 02:46:26.523324]
>>>>>>> 
>>>>>>> [2017-01-10 02:46:26.523328] W [MSGID: 122040]
>>>>>>> [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-2:
>>>>>>> Failed to get size and version [Input/output error]
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [root@glusterfs3 Log_Files]# gluster --version
>>>>>>> glusterfs 3.7.18 built on Dec  8 2016 06:34:26
>>>>>>> 
>>>>>>> [root@glusterfs3 Log_Files]# gluster volume info
>>>>>>> 
>>>>>>> Volume Name: StoragePool
>>>>>>> Type: Distributed-Disperse
>>>>>>> Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f
>>>>>>> Status: Started
>>>>>>> Number of Bricks: 8 x (2 + 1) = 24
>>>>>>> Transport-type: tcp
>>>>>>> Bricks:
>>>>>>> Brick1: glusterfs1sds:/ws/disk1/ws_brick
>>>>>>> Brick2: glusterfs2sds:/ws/disk1/ws_brick
>>>>>>> Brick3: glusterfs3sds:/ws/disk1/ws_brick
>>>>>>> Brick4: glusterfs1sds:/ws/disk2/ws_brick
>>>>>>> Brick5: glusterfs2sds:/ws/disk2/ws_brick
>>>>>>> Brick6: glusterfs3sds:/ws/disk2/ws_brick
>>>>>>> Brick7: glusterfs1sds:/ws/disk3/ws_brick
>>>>>>> Brick8: glusterfs2sds:/ws/disk3/ws_brick
>>>>>>> Brick9: glusterfs3sds:/ws/disk3/ws_brick
>>>>>>> Brick10: glusterfs1sds:/ws/disk4/ws_brick
>>>>>>> Brick11: glusterfs2sds:/ws/disk4/ws_brick
>>>>>>> Brick12: glusterfs3sds:/ws/disk4/ws_brick
>>>>>>> Brick13: glusterfs1sds:/ws/disk5/ws_brick
>>>>>>> Brick14: glusterfs2sds:/ws/disk5/ws_brick
>>>>>>> Brick15: glusterfs3sds:/ws/disk5/ws_brick
>>>>>>> Brick16: glusterfs1sds:/ws/disk6/ws_brick
>>>>>>> Brick17: glusterfs2sds:/ws/disk6/ws_brick
>>>>>>> Brick18: glusterfs3sds:/ws/disk6/ws_brick
>>>>>>> Brick19: glusterfs1sds:/ws/disk7/ws_brick
>>>>>>> Brick20: glusterfs2sds:/ws/disk7/ws_brick
>>>>>>> Brick21: glusterfs3sds:/ws/disk7/ws_brick
>>>>>>> Brick22: glusterfs1sds:/ws/disk8/ws_brick
>>>>>>> Brick23: glusterfs2sds:/ws/disk8/ws_brick
>>>>>>> Brick24: glusterfs3sds:/ws/disk8/ws_brick
>>>>>>> Options Reconfigured:
>>>>>>> performance.readdir-ahead: on
>>>>>>> diagnostics.client-log-level: INFO
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks and Regards,
>>>>>>> 
>>>>>>> Ram
>>>>>>> 
>>>>>>> ***************************Legal Disclaimer***************************
>>>>>>> "This communication may contain confidential and privileged
>>>>>>> material for the sole use of the intended recipient. Any
>>>>>>> unauthorized review, use or distribution by others is strictly
>>>>>>> prohibited. If you have received the message by mistake, please
>>>>>>> advise the sender by reply email and delete the message. Thank you."
>>>>>>> **********************************************************************
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Gluster-devel mailing list
>>>>>>> Gluster-devel@gluster.org
>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> _______________________________________________
>> Gluster-users mailing list
>> gluster-us...@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>> 
> 

