I looked at the logs. From the point where the client switches to the new graph (the one created by the add-brick command you shared, which adds bricks 41 through 44; line 3011 onwards in nfs-gfapi.log), I see the following kinds of errors:

1. Lookups to a bunch of files failed with ENOENT on both replicas which protocol/client converts to ESTALE. I am guessing these entries got migrated to other subvolumes leading to 'No such file or directory' errors. DHT and thereafter shard get the same error code and log the following:

[2017-03-17 14:04:26.353444] E [MSGID: 109040] [dht-helper.c:1198:dht_migration_complete_check_task] 17-vmware2-dht: <gfid:a68ce411-e381-46a3-93cd-d2af6a7c3532>: failed to lookup the file on vmware2-dht [Stale file handle]
[2017-03-17 14:04:26.353528] E [MSGID: 133014] [shard.c:1253:shard_common_stat_cbk] 17-vmware2-shard: stat failed: a68ce411-e381-46a3-93cd-d2af6a7c3532 [Stale file handle]

which is fine.

2. The other kind are from AFR logging of possible split-brain which I suppose are harmless too.

[2017-03-17 14:23:36.968883] W [MSGID: 108008] [afr-read-txn.c:228:afr_read_txn] 17-vmware2-replicate-13: Unreadable subvolume -1 found with event generation 2 for gfid 74d49288-8452-40d4-893e-ff4672557ff9. (Possible split-brain)

Since you are saying the bug is hit only on VMs that are undergoing IO while rebalance is running (as opposed to those that remained powered off), rebalance + IO could be causing some issues.

CC'ing DHT devs

Raghavendra/Nithya/Susant,

Could you take a look?

-Krutika

On Sun, Mar 19, 2017 at 4:55 PM, Mahdi Adnan <[email protected]> wrote:

> Thank you for your email mate.
>
> Yes, im aware of this but, to save costs i chose replica 2, this cluster
> is all flash.
>
> In version 3.7.x i had issues with ping timeout, if one hosts went down
> for few seconds the whole cluster hangs and become unavailable, to avoid
> this i adjusted the ping timeout to 5 seconds.
>
> As for choosing Ganesha over gfapi, VMWare does not support Gluster (FUSE
> or gfapi) im stuck with NFS for this volume.
>
> The other volume is mounted using gfapi in oVirt cluster.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> ------------------------------
> *From:* Krutika Dhananjay <[email protected]>
> *Sent:* Sunday, March 19, 2017 2:01:49 PM
> *To:* Mahdi Adnan
> *Cc:* [email protected]
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
> While I'm still going through the logs, just wanted to point out a couple
> of things:
>
> 1. It is recommended that you use 3-way replication (replica count 3) for
> VM store use case
> 2. network.ping-timeout at 5 seconds is way too low. Please change it to
> 30.
>
> Is there any specific reason for using NFS-Ganesha over gfapi/FUSE?
>
> Will get back with anything else I might find or more questions if I have
> any.
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 2:36 PM, Mahdi Adnan <[email protected]>
> wrote:
>
>> Thanks mate,
>>
>> Kindly, check the attachment.
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>> ------------------------------
>> *From:* Krutika Dhananjay <[email protected]>
>> *Sent:* Sunday, March 19, 2017 10:00:22 AM
>> *To:* Mahdi Adnan
>> *Cc:* [email protected]
>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>
>> In that case could you share the ganesha-gfapi logs?
>>
>> -Krutika
>>
>> On Sun, Mar 19, 2017 at 12:13 PM, Mahdi Adnan <[email protected]>
>> wrote:
>>
>>> I have two volumes, one is mounted using libgfapi for ovirt mount, the
>>> other one is exported via NFS-Ganesha for VMWare which is the one im
>>> testing now.
>>>
>>> --
>>>
>>> Respectfully
>>> *Mahdi A. Mahdi*
>>>
>>> ------------------------------
>>> *From:* Krutika Dhananjay <[email protected]>
>>> *Sent:* Sunday, March 19, 2017 8:02:19 AM
>>> *To:* Mahdi Adnan
>>> *Cc:* [email protected]
>>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>>
>>> On Sat, Mar 18, 2017 at 10:36 PM, Mahdi Adnan <[email protected]>
>>> wrote:
>>>
>>>> Kindly, check the attached new log file, i dont know if it's helpful or
>>>> not but, i couldn't find the log with the name you just described.
>>>>
>>> No. Are you using FUSE or libgfapi for accessing the volume? Or is it
>>> NFS?
>>>
>>> -Krutika
>>>
>>>> --
>>>>
>>>> Respectfully
>>>> *Mahdi A. Mahdi*
>>>>
>>>> ------------------------------
>>>> *From:* Krutika Dhananjay <[email protected]>
>>>> *Sent:* Saturday, March 18, 2017 6:10:40 PM
>>>> *To:* Mahdi Adnan
>>>> *Cc:* [email protected]
>>>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>>>
>>>> mnt-disk11-vmware2.log seems like a brick log. Could you attach the
>>>> fuse mount logs? It should be right under /var/log/glusterfs/ directory
>>>> named after the mount point name, only hyphenated.
>>>>
>>>> -Krutika
>>>>
>>>> On Sat, Mar 18, 2017 at 7:27 PM, Mahdi Adnan <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello Krutika,
>>>>>
>>>>> Kindly, check the attached logs.
>>>>>
>>>>> --
>>>>>
>>>>> Respectfully
>>>>> *Mahdi A. Mahdi*
>>>>>
>>>>> ------------------------------
>>>>> *From:* Krutika Dhananjay <[email protected]>
>>>>> *Sent:* Saturday, March 18, 2017 3:29:03 PM
>>>>> *To:* Mahdi Adnan
>>>>> *Cc:* [email protected]
>>>>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>>>>
>>>>> Hi Mahdi,
>>>>>
>>>>> Could you attach mount, brick and rebalance logs?
>>>>>
>>>>> -Krutika
>>>>>
>>>>> On Sat, Mar 18, 2017 at 12:14 AM, Mahdi Adnan <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have upgraded to Gluster 3.8.10 today and ran the add-brick
>>>>>> procedure in a volume contains few VMs.
>>>>>> After the completion of rebalance, i have rebooted the VMs, some of
>>>>>> ran just fine, and others just crashed.
>>>>>> Windows boot to recovery mode and Linux throw xfs errors and does not
>>>>>> boot.
>>>>>> I ran the test again and it happened just as the first one, but i
>>>>>> have noticed only VMs doing disk IOs are affected by this bug.
>>>>>> The VMs in power off mode started fine and even md5 of the disk file
>>>>>> did not change after the rebalance.
>>>>>>
>>>>>> anyone else can confirm this ?
>>>>>>
>>>>>> Volume info:
>>>>>>
>>>>>> Volume Name: vmware2
>>>>>> Type: Distributed-Replicate
>>>>>> Volume ID: 02328d46-a285-4533-aa3a-fb9bfeb688bf
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 22 x 2 = 44
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: gluster01:/mnt/disk1/vmware2
>>>>>> Brick2: gluster03:/mnt/disk1/vmware2
>>>>>> Brick3: gluster02:/mnt/disk1/vmware2
>>>>>> Brick4: gluster04:/mnt/disk1/vmware2
>>>>>> Brick5: gluster01:/mnt/disk2/vmware2
>>>>>> Brick6: gluster03:/mnt/disk2/vmware2
>>>>>> Brick7: gluster02:/mnt/disk2/vmware2
>>>>>> Brick8: gluster04:/mnt/disk2/vmware2
>>>>>> Brick9: gluster01:/mnt/disk3/vmware2
>>>>>> Brick10: gluster03:/mnt/disk3/vmware2
>>>>>> Brick11: gluster02:/mnt/disk3/vmware2
>>>>>> Brick12: gluster04:/mnt/disk3/vmware2
>>>>>> Brick13: gluster01:/mnt/disk4/vmware2
>>>>>> Brick14: gluster03:/mnt/disk4/vmware2
>>>>>> Brick15: gluster02:/mnt/disk4/vmware2
>>>>>> Brick16: gluster04:/mnt/disk4/vmware2
>>>>>> Brick17: gluster01:/mnt/disk5/vmware2
>>>>>> Brick18: gluster03:/mnt/disk5/vmware2
>>>>>> Brick19: gluster02:/mnt/disk5/vmware2
>>>>>> Brick20: gluster04:/mnt/disk5/vmware2
>>>>>> Brick21: gluster01:/mnt/disk6/vmware2
>>>>>> Brick22: gluster03:/mnt/disk6/vmware2
>>>>>> Brick23: gluster02:/mnt/disk6/vmware2
>>>>>> Brick24: gluster04:/mnt/disk6/vmware2
>>>>>> Brick25: gluster01:/mnt/disk7/vmware2
>>>>>> Brick26: gluster03:/mnt/disk7/vmware2
>>>>>> Brick27: gluster02:/mnt/disk7/vmware2
>>>>>> Brick28: gluster04:/mnt/disk7/vmware2
>>>>>> Brick29: gluster01:/mnt/disk8/vmware2
>>>>>> Brick30: gluster03:/mnt/disk8/vmware2
>>>>>> Brick31: gluster02:/mnt/disk8/vmware2
>>>>>> Brick32: gluster04:/mnt/disk8/vmware2
>>>>>> Brick33: gluster01:/mnt/disk9/vmware2
>>>>>> Brick34: gluster03:/mnt/disk9/vmware2
>>>>>> Brick35: gluster02:/mnt/disk9/vmware2
>>>>>> Brick36: gluster04:/mnt/disk9/vmware2
>>>>>> Brick37: gluster01:/mnt/disk10/vmware2
>>>>>> Brick38: gluster03:/mnt/disk10/vmware2
>>>>>> Brick39: gluster02:/mnt/disk10/vmware2
>>>>>> Brick40: gluster04:/mnt/disk10/vmware2
>>>>>> Brick41: gluster01:/mnt/disk11/vmware2
>>>>>> Brick42: gluster03:/mnt/disk11/vmware2
>>>>>> Brick43: gluster02:/mnt/disk11/vmware2
>>>>>> Brick44: gluster04:/mnt/disk11/vmware2
>>>>>> Options Reconfigured:
>>>>>> cluster.server-quorum-type: server
>>>>>> nfs.disable: on
>>>>>> performance.readdir-ahead: on
>>>>>> transport.address-family: inet
>>>>>> performance.quick-read: off
>>>>>> performance.read-ahead: off
>>>>>> performance.io-cache: off
>>>>>> performance.stat-prefetch: off
>>>>>> cluster.eager-lock: enable
>>>>>> network.remote-dio: enable
>>>>>> features.shard: on
>>>>>> cluster.data-self-heal-algorithm: full
>>>>>> features.cache-invalidation: on
>>>>>> ganesha.enable: on
>>>>>> features.shard-block-size: 256MB
>>>>>> client.event-threads: 2
>>>>>> server.event-threads: 2
>>>>>> cluster.favorite-child-policy: size
>>>>>> storage.build-pgfid: off
>>>>>> network.ping-timeout: 5
>>>>>> cluster.enable-shared-storage: enable
>>>>>> nfs-ganesha: enable
>>>>>> cluster.server-quorum-ratio: 51%
>>>>>>
>>>>>> Adding bricks:
>>>>>> gluster volume add-brick vmware2 replica 2
>>>>>> gluster01:/mnt/disk11/vmware2 gluster03:/mnt/disk11/vmware2
>>>>>> gluster02:/mnt/disk11/vmware2 gluster04:/mnt/disk11/vmware2
>>>>>>
>>>>>> starting fix layout:
>>>>>> gluster volume rebalance vmware2 fix-layout start
>>>>>>
>>>>>> Starting rebalance:
>>>>>> gluster volume rebalance vmware2 start
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Respectfully
>>>>>> *Mahdi A. Mahdi*
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> [email protected]
>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
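
For anyone triaging a similar client log, the three message IDs quoted at the top of this mail (109040 from DHT, 133014 from shard, 108008 from AFR) can be tallied with a short script. This is only a rough sketch: the regular expression is inferred from the three excerpts above rather than from any GlusterFS log-format specification, and the names `tally_msgids`, `start_line` and `SAMPLE` are made up here for illustration. Passing `start_line=3011` would restrict the count to entries logged after the graph switch mentioned above.

```python
import re
from collections import Counter

# Header of a client log entry, as it appears in the excerpts quoted above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] xlator-name: message
# (Assumption: inferred from those three lines only.)
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def tally_msgids(lines, start_line=1):
    """Count E/W entries per (level, msgid, xlator), from start_line onwards."""
    counts = Counter()
    for lineno, line in enumerate(lines, start=1):
        if lineno < start_line:
            continue  # skip everything before the point of interest
        m = LOG_RE.match(line.strip())
        if m and m.group("level") in ("E", "W"):
            key = (m.group("level"), int(m.group("msgid")), m.group("xlator"))
            counts[key] += 1
    return counts


# The three log lines quoted in this mail, used as sample input.
SAMPLE = [
    "[2017-03-17 14:04:26.353444] E [MSGID: 109040] "
    "[dht-helper.c:1198:dht_migration_complete_check_task] 17-vmware2-dht: "
    "<gfid:a68ce411-e381-46a3-93cd-d2af6a7c3532>: failed to lookup the file "
    "on vmware2-dht [Stale file handle]",
    "[2017-03-17 14:04:26.353528] E [MSGID: 133014] "
    "[shard.c:1253:shard_common_stat_cbk] 17-vmware2-shard: stat failed: "
    "a68ce411-e381-46a3-93cd-d2af6a7c3532 [Stale file handle]",
    "[2017-03-17 14:23:36.968883] W [MSGID: 108008] "
    "[afr-read-txn.c:228:afr_read_txn] 17-vmware2-replicate-13: Unreadable "
    "subvolume -1 found with event generation 2 for gfid "
    "74d49288-8452-40d4-893e-ff4672557ff9. (Possible split-brain)",
]

if __name__ == "__main__":
    for key, n in sorted(tally_msgids(SAMPLE).items()):
        print(key, n)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `tally_msgids(open("nfs-gfapi.log"), start_line=3011)`.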
