On Sat, Jun 9, 2018 at 9:38 AM, Dan Lavu <[email protected]> wrote:

> Krutika,
>
> Is it also normal for the following messages as well?
>

Yes, this should be fine. It only represents a transient state when multiple
threads/clients are trying to create the same shard at the same time. These
can be ignored.

-Krutika

> [2018-06-07 06:36:22.008492] E [MSGID: 113020] [posix.c:1395:posix_mknod] 0-rhev_vms-posix: setting gfid on /gluster/brick/rhev_vms/.shard/0ab3a16c-1d07-4153-8d01-b9b0ffd9d19b.16158 failed
> [2018-06-07 06:36:22.319735] E [MSGID: 113020] [posix.c:1395:posix_mknod] 0-rhev_vms-posix: setting gfid on /gluster/brick/rhev_vms/.shard/0ab3a16c-1d07-4153-8d01-b9b0ffd9d19b.16160 failed
> [2018-06-07 06:36:24.711800] E [MSGID: 113002] [posix.c:267:posix_lookup] 0-rhev_vms-posix: buf->ia_gfid is null for /gluster/brick/rhev_vms/.shard/0ab3a16c-1d07-4153-8d01-b9b0ffd9d19b.16177 [No data available]
> [2018-06-07 06:36:24.711839] E [MSGID: 115050] [server-rpc-fops.c:170:server_lookup_cbk] 0-rhev_vms-server: 32334131: LOOKUP /.shard/0ab3a16c-1d07-4153-8d01-b9b0ffd9d19b.16177 (be318638-e8a0-4c6d-977d-7a937aa84806/0ab3a16c-1d07-4153-8d01-b9b0ffd9d19b.16177) ==> (No data available) [No data available]
>
> if so what does it mean?
>
> Dan
>
> On Tue, Aug 16, 2016 at 1:21 AM, Krutika Dhananjay <[email protected]>
> wrote:
>
>> Thanks, I just sent http://review.gluster.org/#/c/15161/1 to reduce the
>> log-level to DEBUG. Let's see what the maintainers have to say. :)
>>
>> -Krutika
>>
>> On Tue, Aug 16, 2016 at 5:50 AM, David Gossage <
>> [email protected]> wrote:
>>
>>> On Mon, Aug 15, 2016 at 6:24 PM, Krutika Dhananjay <[email protected]>
>>> wrote:
>>>
>>>> No. The EEXIST errors are normal and can be ignored. This can happen
>>>> when multiple threads try to create the same
>>>> shard in parallel. Nothing wrong with that.
>>>>
>>>>
>>> Other than they pop up as E errors making a user worry hehe
>>>
>>> Is their a known bug filed against that or should I maybe create one to
>>> see if we can get that sent to an informational level maybe?
>>> >>> >>> >>>> -Krutika >>>> >>>> On Tue, Aug 16, 2016 at 1:02 AM, David Gossage < >>>> [email protected]> wrote: >>>> >>>>> On Sat, Aug 13, 2016 at 6:37 AM, David Gossage < >>>>> [email protected]> wrote: >>>>> >>>>>> Here is reply again just in case. I got quarantine message so not >>>>>> sure if first went through or wll anytime soon. Brick logs weren't large >>>>>> so Ill just include as text files this time >>>>>> >>>>> >>>>> Did maintenance over weekend updating ovirt from 3.6.6->3.6.7 and >>>>> after restrating the complaining ovirt node I was able to migrate the 2 vm >>>>> with issues. So not sure why the mount got stale, but I imagine that one >>>>> node couldn't see the new image files after that had occurred? >>>>> >>>>> Still getting a few sporadic errors, but seems much fewer than before >>>>> and never get any corresponding notices in any other log files >>>>> >>>>> [2016-08-15 13:40:31.510798] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.584 >>>>> failed [File exists] >>>>> [2016-08-15 13:40:31.522067] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.584 >>>>> failed [File exists] >>>>> [2016-08-15 17:47:06.375708] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.722 >>>>> failed [File exists] >>>>> [2016-08-15 17:47:26.435198] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.723 >>>>> failed [File exists] >>>>> [2016-08-15 17:47:06.405481] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.722 >>>>> failed [File exists] >>>>> [2016-08-15 17:47:26.464542] E [MSGID: 
113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.723 >>>>> failed [File exists] >>>>> [2016-08-15 18:46:47.187967] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.739 >>>>> failed [File exists] >>>>> [2016-08-15 18:47:41.414312] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.779 >>>>> failed [File exists] >>>>> [2016-08-15 18:47:41.450470] E [MSGID: 113022] >>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.779 >>>>> failed [File exists] >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> The attached file bricks.zip you sent to <[email protected]>;< >>>>>> [email protected]> on 8/13/2016 7:17:35 AM was quarantined. >>>>>> As a safety precaution, the University of South Carolina quarantines .zip >>>>>> and .docm files sent via email. If this is a legitimate attachment < >>>>>> [email protected]>;<[email protected]> may contact the >>>>>> Service Desk at 803-777-1800 ([email protected]) and the attachment >>>>>> file will be released from quarantine and delivered. >>>>>> >>>>>> >>>>>> On Sat, Aug 13, 2016 at 6:15 AM, David Gossage < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> On Sat, Aug 13, 2016 at 12:26 AM, Krutika Dhananjay < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> 1. Could you share the output of `gluster volume heal <VOL> info`? 
>>>>>>>> >>>>>>> Results were same moments after issue occurred as well >>>>>>> Brick ccgl1.gl.local:/gluster1/BRICK1/1 >>>>>>> Status: Connected >>>>>>> Number of entries: 0 >>>>>>> >>>>>>> Brick ccgl2.gl.local:/gluster1/BRICK1/1 >>>>>>> Status: Connected >>>>>>> Number of entries: 0 >>>>>>> >>>>>>> Brick ccgl4.gl.local:/gluster1/BRICK1/1 >>>>>>> Status: Connected >>>>>>> Number of entries: 0 >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 2. `gluster volume info` >>>>>>>> >>>>>>> Volume Name: GLUSTER1 >>>>>>> Type: Replicate >>>>>>> Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f >>>>>>> Status: Started >>>>>>> Number of Bricks: 1 x 3 = 3 >>>>>>> Transport-type: tcp >>>>>>> Bricks: >>>>>>> Brick1: ccgl1.gl.local:/gluster1/BRICK1/1 >>>>>>> Brick2: ccgl2.gl.local:/gluster1/BRICK1/1 >>>>>>> Brick3: ccgl4.gl.local:/gluster1/BRICK1/1 >>>>>>> Options Reconfigured: >>>>>>> cluster.locking-scheme: granular >>>>>>> nfs.enable-ino32: off >>>>>>> nfs.addr-namelookup: off >>>>>>> nfs.disable: on >>>>>>> performance.strict-write-ordering: off >>>>>>> cluster.background-self-heal-count: 16 >>>>>>> cluster.self-heal-window-size: 1024 >>>>>>> server.allow-insecure: on >>>>>>> cluster.server-quorum-type: server >>>>>>> cluster.quorum-type: auto >>>>>>> network.remote-dio: enable >>>>>>> cluster.eager-lock: enable >>>>>>> performance.stat-prefetch: on >>>>>>> performance.io-cache: off >>>>>>> performance.read-ahead: off >>>>>>> performance.quick-read: off >>>>>>> storage.owner-gid: 36 >>>>>>> storage.owner-uid: 36 >>>>>>> performance.readdir-ahead: on >>>>>>> features.shard: on >>>>>>> features.shard-block-size: 64MB >>>>>>> diagnostics.brick-log-level: WARNING >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 3. fuse mount logs of the affected volume(s)? 
>>>>>>>> >>>>>>> [2016-08-12 21:34:19.518511] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.519115] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.519203] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.519226] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.520737] W [MSGID: 108008] >>>>>>> [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: >>>>>>> Unreadable subvolume -1 found with event generation 3 for gfid >>>>>>> e18650c4-02c0-4a5a-bd4c-bbdf5fbd9c88. 
(Possible split-brain) >>>>>>> [2016-08-12 21:34:19.521393] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.522269] E [MSGID: 109040] >>>>>>> [dht-helper.c:1190:dht_migration_complete_check_task] >>>>>>> 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht [Stale >>>>>>> file handle] >>>>>>> [2016-08-12 21:34:19.522341] W [fuse-bridge.c:2227:fuse_readv_cbk] >>>>>>> 0-glusterfs-fuse: 18479997: READ => -1 >>>>>>> gfid=31d7c904-775e-4b9f-8ef7-888218679845 >>>>>>> fd=0x7f00a80bde58 (Stale file handle) >>>>>>> [2016-08-12 21:34:19.521296] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 21:34:19.521357] W [MSGID: 114031] >>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: >>>>>>> remote operation failed [No such file or directory] >>>>>>> [2016-08-12 22:15:08.337528] I [MSGID: 109066] >>>>>>> [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-4 >>>>>>> 35c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta.new >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-4 >>>>>>> 35c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) >>>>>>> [2016-08-12 22:15:12.240026] I [MSGID: 109066] >>>>>>> [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4 >>>>>>> aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta.new >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4 >>>>>>> 
aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) >>>>>>> [2016-08-12 22:15:11.105593] I [MSGID: 109066] >>>>>>> [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-4 >>>>>>> 35c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta.new >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-4 >>>>>>> 35c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) >>>>>>> [2016-08-12 22:15:14.772713] I [MSGID: 109066] >>>>>>> [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4 >>>>>>> aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta.new >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => >>>>>>> /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4 >>>>>>> aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta >>>>>>> (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) >>>>>>> >>>>>>> 4. glustershd logs >>>>>>>> >>>>>>> Nothing recent same on all 3 storage nodes >>>>>>> [2016-08-07 08:48:03.593401] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] >>>>>>> 0-glusterfs: No change in volfile, continuing >>>>>>> [2016-08-11 08:14:03.683287] I [MSGID: 100011] >>>>>>> [glusterfsd.c:1323:reincarnate] 0-glusterfsd: Fetching the volume >>>>>>> file from server... >>>>>>> [2016-08-11 08:14:03.684492] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] >>>>>>> 0-glusterfs: No change in volfile, continuing >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 5. Brick logs >>>>>>>> >>>>>>> Their have been some error in brick logs I hadn't noticed >>>>>>> occurring. 
I've zip'd and attached all 3 nodes logs, but from this >>>>>>> snippet >>>>>>> on one node none of them seem to coincide with the time window when >>>>>>> migration had issues. f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42 shard >>>>>>> refers to an image for a different vm than one I had issues with as >>>>>>> well. >>>>>>> Maybe gluster is trying to do some sort of make shard test before >>>>>>> writing >>>>>>> out changes that would go to that image and that shard file? >>>>>>> >>>>>>> [2016-08-12 18:48:22.463628] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.697 >>>>>>> failed [File exists] >>>>>>> [2016-08-12 18:48:24.553455] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.698 >>>>>>> failed [File exists] >>>>>>> [2016-08-12 18:49:16.065502] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.738 >>>>>>> failed [File exists] >>>>>>> The message "E [MSGID: 113022] [posix.c:1245:posix_mknod] >>>>>>> 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/f9a7 >>>>>>> f3c5-4c13-4020-b560-1f4f7b1e3c42.697 failed [File exists]" repeated >>>>>>> 5 times between [2016-08-12 18:48:22.463628] and [2016-08-12 >>>>>>> 18:48:22.514777] >>>>>>> [2016-08-12 18:48:24.581216] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.698 >>>>>>> failed [File exists] >>>>>>> The message "E [MSGID: 113022] [posix.c:1245:posix_mknod] >>>>>>> 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/f9a7 >>>>>>> f3c5-4c13-4020-b560-1f4f7b1e3c42.738 failed [File exists]" repeated >>>>>>> 5 times between [2016-08-12 18:49:16.065502] and [2016-08-12 >>>>>>> 18:49:16.107746] >>>>>>> [2016-08-12 
19:23:40.964678] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/83794e5d-2225-4560-8df6-7c903c8a648a.1301 >>>>>>> failed [File exists] >>>>>>> [2016-08-12 20:00:33.498751] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.580 >>>>>>> failed [File exists] >>>>>>> [2016-08-12 20:00:33.530938] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.580 >>>>>>> failed [File exists] >>>>>>> [2016-08-13 01:47:23.338036] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/18843fb4-e31c-4fc3-b519-cc6e5e947813.211 >>>>>>> failed [File exists] >>>>>>> The message "E [MSGID: 113022] [posix.c:1245:posix_mknod] >>>>>>> 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/1884 >>>>>>> 3fb4-e31c-4fc3-b519-cc6e5e947813.211 failed [File exists]" repeated >>>>>>> 16 times between [2016-08-13 01:47:23.338036] and [2016-08-13 >>>>>>> 01:47:23.380980] >>>>>>> [2016-08-13 01:48:02.224494] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/ffbbcce0-3c4a-4fdf-b79f-a96ca3215657.211 >>>>>>> failed [File exists] >>>>>>> [2016-08-13 01:48:42.266148] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/18843fb4-e31c-4fc3-b519-cc6e5e947813.177 >>>>>>> failed [File exists] >>>>>>> [2016-08-13 01:49:09.717434] E [MSGID: 113022] >>>>>>> [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on >>>>>>> /gluster1/BRICK1/1/.shard/18843fb4-e31c-4fc3-b519-cc6e5e947813.178 >>>>>>> failed [File exists] >>>>>>> >>>>>>> >>>>>>>> -Krutika >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Aug 13, 2016 at 3:10 AM, David Gossage < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> On 
Fri, Aug 12, 2016 at 4:25 PM, Dan Lavu <[email protected]> wrote: >>>>>>>>> >>>>>>>>>> David, >>>>>>>>>> >>>>>>>>>> I'm seeing similar behavior in my lab, but it has been caused by >>>>>>>>>> healing files in the gluster cluster, though I attribute my problems >>>>>>>>>> to >>>>>>>>>> problems with the storage fabric. See if 'gluster volume heal $VOL >>>>>>>>>> info' >>>>>>>>>> indicates files that are being healed, and if those reduce in >>>>>>>>>> number, can >>>>>>>>>> the VM start? >>>>>>>>>> >>>>>>>>>> >>>>>>>>> I haven't had any files in a state of being healed according to >>>>>>>>> either of the 3 storage nodes. >>>>>>>>> >>>>>>>>> I shut down one VM that has been around awhile a moment ago then >>>>>>>>> told it to start on the one ovirt server that complained previously. >>>>>>>>> It >>>>>>>>> ran fine, and I was able to migrate it off and on the host no issues. >>>>>>>>> >>>>>>>>> I told one of the new VM's to migrate to the one node and within >>>>>>>>> seconds it paused from unknown storage errors no shards showing heals >>>>>>>>> nothing with an error on storage node. Same stale file handle issues. >>>>>>>>> >>>>>>>>> I'll probably put this node in maintenance later and reboot it. >>>>>>>>> Other than that I may re-clone those 2 reccent VM's. maybe images >>>>>>>>> just got >>>>>>>>> corrupted though why it would only fail on one node of 3 if image was >>>>>>>>> bad >>>>>>>>> not sure. >>>>>>>>> >>>>>>>>> >>>>>>>>> Dan >>>>>>>>>> >>>>>>>>>> On Thu, Aug 11, 2016 at 7:52 AM, David Gossage < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Figure I would repost here as well. one client out of 3 >>>>>>>>>>> complaining of stale file handles on a few new VM's I migrated >>>>>>>>>>> over. No >>>>>>>>>>> errors on storage nodes just client. Maybe just put that one in >>>>>>>>>>> maintenance and restart gluster mount? >>>>>>>>>>> >>>>>>>>>>> *David Gossage* >>>>>>>>>>> *Carousel Checks Inc. 
| System Administrator* >>>>>>>>>>> *Office* 708.613.2284 >>>>>>>>>>> >>>>>>>>>>> ---------- Forwarded message ---------- >>>>>>>>>>> From: David Gossage <[email protected]> >>>>>>>>>>> Date: Thu, Aug 11, 2016 at 12:17 AM >>>>>>>>>>> Subject: vm paused unknown storage error one node out of 3 only >>>>>>>>>>> To: users <[email protected]> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Out of a 3 node cluster running oVirt 3.6.6.2-1.el7.centos with >>>>>>>>>>> a 3 replicate gluster 3.7.14 starting a VM i just copied in on one >>>>>>>>>>> node of >>>>>>>>>>> the 3 gets the following errors. The other 2 the vm starts fine. >>>>>>>>>>> All >>>>>>>>>>> ovirt and gluster are centos 7 based. VM on start of the one node >>>>>>>>>>> it tries >>>>>>>>>>> to default to on its own accord immediately puts into paused for >>>>>>>>>>> unknown >>>>>>>>>>> reason. Telling it to start on different node starts ok. node >>>>>>>>>>> with issue >>>>>>>>>>> already has 5 VMs running fine on it same gluster storage plus the >>>>>>>>>>> hosted >>>>>>>>>>> engine on different volume. 
>>>>>>>>>>> >>>>>>>>>>> gluster nodes logs did not have any errors for volume >>>>>>>>>>> nodes own gluster logs had this in log >>>>>>>>>>> >>>>>>>>>>> dfb8777a-7e8c-40ff-8faa-252beabba5f8 couldnt find in .glusterfs >>>>>>>>>>> .shard or images/ >>>>>>>>>>> >>>>>>>>>>> 7919f4a0-125c-4b11-b5c9-fb50cc195c43 is the gfid of the >>>>>>>>>>> bootable drive of the vm >>>>>>>>>>> >>>>>>>>>>> [2016-08-11 04:31:39.982952] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] >>>>>>>>>>> 0-GLUSTER1-client-2: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.983683] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-2: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.984182] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-0: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.984221] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-1: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.985941] W [MSGID: 108008] >>>>>>>>>>> [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: >>>>>>>>>>> Unreadable subvolume -1 found with event generation 3 for gfid >>>>>>>>>>> dfb8777a-7e8c-40ff-8faa-252beabba5f8. 
(Possible split-brain) >>>>>>>>>>> [2016-08-11 04:31:39.986633] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-2: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.987644] E [MSGID: 109040] >>>>>>>>>>> [dht-helper.c:1190:dht_migration_complete_check_task] >>>>>>>>>>> 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht >>>>>>>>>>> [Stale >>>>>>>>>>> file handle] >>>>>>>>>>> [2016-08-11 04:31:39.987751] W [fuse-bridge.c:2227:fuse_readv_cbk] >>>>>>>>>>> 0-glusterfs-fuse: 15152930: READ => -1 >>>>>>>>>>> gfid=7919f4a0-125c-4b11-b5c9-fb50cc195c43 >>>>>>>>>>> fd=0x7f00a80bdb64 (Stale file handle) >>>>>>>>>>> [2016-08-11 04:31:39.986567] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-0: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:31:39.986567] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-1: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.210145] W [MSGID: 108008] >>>>>>>>>>> [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: >>>>>>>>>>> Unreadable subvolume -1 found with event generation 3 for gfid >>>>>>>>>>> dfb8777a-7e8c-40ff-8faa-252beabba5f8. 
(Possible split-brain) >>>>>>>>>>> [2016-08-11 04:35:21.210873] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-1: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.210888] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-2: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.210947] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-0: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.213270] E [MSGID: 109040] >>>>>>>>>>> [dht-helper.c:1190:dht_migration_complete_check_task] >>>>>>>>>>> 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht >>>>>>>>>>> [Stale >>>>>>>>>>> file handle] >>>>>>>>>>> [2016-08-11 04:35:21.213345] W [fuse-bridge.c:2227:fuse_readv_cbk] >>>>>>>>>>> 0-glusterfs-fuse: 15156910: READ => -1 >>>>>>>>>>> gfid=7919f4a0-125c-4b11-b5c9-fb50cc195c43 >>>>>>>>>>> fd=0x7f00a80bf6d0 (Stale file handle) >>>>>>>>>>> [2016-08-11 04:35:21.211516] W [MSGID: 108008] >>>>>>>>>>> [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: >>>>>>>>>>> Unreadable subvolume -1 found with event generation 3 for gfid >>>>>>>>>>> dfb8777a-7e8c-40ff-8faa-252beabba5f8. 
(Possible split-brain) >>>>>>>>>>> [2016-08-11 04:35:21.212013] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-0: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.212081] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-1: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> [2016-08-11 04:35:21.212121] W [MSGID: 114031] >>>>>>>>>>> [client-rpc-fops.c:1572:client3_3_fstat_cbk] >>>>>>>>>>> 0-GLUSTER1-client-2: remote operation failed [No such file or >>>>>>>>>>> directory] >>>>>>>>>>> >>>>>>>>>>> I attached vdsm.log starting from when I spun up vm on offending >>>>>>>>>>> node >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>> [email protected] >>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Gluster-users mailing list >>>>>>>>> [email protected] >>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
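
P.S. For anyone reading the archive who wants to see why the "mknod ... failed
[File exists]" lines quoted above are harmless, here is a rough sketch of the
same kind of race in plain Python. It is only an analogy and not GlusterFS
code: it assumes an exclusive create via os.open() with O_CREAT | O_EXCL
standing in for the brick-side mknod, and the names create_shard, race and
"shard.1" are made up for illustration. Several threads try to create the same
file at once; exactly one succeeds, the rest get EEXIST, and the file exists
afterwards either way.

```python
import os
import tempfile
import threading


def create_shard(path, results, idx):
    """Try to create `path` exclusively and record how it went."""
    try:
        # O_EXCL makes the create fail with EEXIST if the file is already there.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        os.close(fd)
        results[idx] = "created"
    except FileExistsError:
        # Another caller won the race. The file exists, which is all that
        # any of the callers wanted, so this is safe to ignore.
        results[idx] = "exists"


def race(n_threads=8):
    """Start n_threads creators at (roughly) the same instant."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shard.1")  # hypothetical shard file name
        results = [None] * n_threads
        barrier = threading.Barrier(n_threads)

        def worker(i):
            barrier.wait()  # line everyone up before the create call
            create_shard(path, results, i)

        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Check for the file before the temporary directory is cleaned up.
        return results, os.path.exists(path)


if __name__ == "__main__":
    outcomes, present = race()
    print(outcomes.count("created"), "creator,",
          outcomes.count("exists"), "EEXIST, file present:", present)
```

The losing callers do not need to retry or report anything; treating EEXIST as
success is the usual way to handle a "create if missing" step that several
clients can reach at the same time.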
