On Thu, Jan 25, 2018 at 1:49 PM, Samuli Heinonen <[email protected]> wrote:
> Pranith Kumar Karampuri kirjoitti 25.01.2018 07:09:
>> On Thu, Jan 25, 2018 at 2:27 AM, Samuli Heinonen
>> <[email protected]> wrote:
>>>
>>> Hi!
>>>
>>> Thank you very much for your help so far. Could you please give an
>>> example of the command used with the aux-gfid mount to remove locks?
>>> "gluster vol clear-locks" seems to mount the volume by itself.
>>
>> You are correct, sorry; this was implemented around 7 years back and I
>> forgot that bit about it :-(. Essentially it becomes a getxattr syscall
>> on the file. Could you give me the clear-locks command you were trying
>> to execute, and I can probably convert it to the getfattr command?
>
> I have been testing this in a test environment with the command:
> gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode

Could you do an strace of glusterd when this happens? It will have a
getxattr with "glusterfs.clrlk" in the key. You need to execute that
getxattr on the aux-gfid mount.
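Something along these lines should do it (just a sketch: the volume
name, gfid and mount point below come from your test command above, and
the exact xattr key has to be taken from the strace output rather than
from this mail):

  # capture the getxattr calls glusterd (and its children) make while clear-locks runs
  strace -f -e trace=getxattr -o /tmp/glusterd-clrlk.trace -p $(pidof glusterd)

  # in another shell, repeat the command that fails
  gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode

  # find the key that was used; it will contain "glusterfs.clrlk"
  grep glusterfs.clrlk /tmp/glusterd-clrlk.trace

Once you have the key, the same request can be issued by hand on an
aux-gfid mount with getfattr, e.g. (<server> and <key-from-trace> are
placeholders):

  mount -t glusterfs -o aux-gfid-mount <server>:g1 /mnt/g1
  getfattr -n <key-from-trace> /mnt/g1/.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c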
>>> Best regards,
>>> Samuli Heinonen
>>>
>>>> Pranith Kumar Karampuri <mailto:[email protected]>
>>>> 23 January 2018 at 10.30
>>>>
>>>> On Tue, Jan 23, 2018 at 1:38 PM, Samuli Heinonen
>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>
>>>> Pranith Kumar Karampuri kirjoitti 23.01.2018 09:34:
>>>>
>>>> On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen
>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>
>>>> Hi again,
>>>>
>>>> here is more information regarding the issue described earlier.
>>>>
>>>> It looks like self-healing is stuck. According to "heal statistics",
>>>> the crawl began at Sat Jan 20 12:56:19 2018 and it's still going on
>>>> (it's around Sun Jan 21 20:30 when writing this). However,
>>>> glustershd.log says that the last heal was completed at "2018-01-20
>>>> 11:00:13.090697" (which is 13:00 UTC+2). Also, "heal info" has been
>>>> running for over 16 hours now without printing any information. In a
>>>> statedump I can see that the storage nodes have locks on files and
>>>> some of those are blocked. I.e. here again it says that ovirt8z2 is
>>>> holding an active lock even though ovirt8z2 crashed after the lock
>>>> was granted:
>>>>
>>>> [xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
>>>> path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
>>>> mandatory=0
>>>> inodelk-count=3
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
>>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0,
>>>> pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
>>>> connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
>>>> granted at 2018-01-20 10:59:52
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
>>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0,
>>>> pid = 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,
>>>> connection-id=ovirt8z2.xxx.com [1]-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0,
>>>> granted at 2018-01-20 08:57:23
>>>> inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0,
>>>> pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,
>>>> connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,
>>>> blocked at 2018-01-20 10:59:52
>>>>
>>>> I'd also like to add that the volume had an arbiter brick before the
>>>> crash happened. We decided to remove it because we thought that it
>>>> was causing issues. However, now I think that this was unnecessary.
>>>> After the crash the arbiter logs had lots of messages like this:
>>>> [2018-01-20 10:19:36.515717] I [MSGID: 115072]
>>>> [server-rpc-fops.c:1640:server_setattr_cbk]
>>>> 0-zone2-ssd1-vmstor1-server: 37374187: SETATTR
>>>> <gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe>
>>>> (a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted)
>>>> [Operation not permitted]
>>>>
>>>> Is there any way to force self-heal to stop? Any help would be very
>>>> much appreciated :)
>>>>
>>>> Exposing .shard to a normal mount is opening a can of worms. You
>>>> should probably look at mounting the volume with the gfid aux-mount,
>>>> where you can access a file as <path-to-mount>/.gfid/<gfid-string>
>>>> to clear locks on it.
>>>>
>>>> Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
>>>>
>>>> A gfid string will have some hyphens like:
>>>> 11118443-1894-4273-9340-4b212fa1c0e4
>>>>
>>>> That said, the next disconnect on the brick where you successfully
>>>> did the clear-locks will crash the brick. There was a bug in the
>>>> 3.8.x series with clear-locks which was fixed in 3.9.0 with a
>>>> feature. The self-heal deadlock that you witnessed is also fixed in
>>>> the 3.10 release.
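(To make the aux-gfid route concrete for one of the shards in the
statedump quoted above -- only a sketch: the brick path is taken from
the vol info further down in this thread, and trusted.gfid is read as
root on the brick backend, not through a client mount:

  getfattr -n trusted.gfid -e hex \
      /ssd1/zone2-vmstor1/export/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27

The hex value it prints, rewritten in the usual 8-4-4-4-12 form, is the
<gfid-string> to use under <path-to-mount>/.gfid/ on the aux-gfid
mount.)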
>>>> Thank you for the answer. Could you please tell me more about the
>>>> crash? What will actually happen, or is there a bug report about it?
>>>> I just want to make sure that we can do everything to secure the
>>>> data on the bricks. We will look into an upgrade, but we have to
>>>> make sure that the new version works for us and of course get
>>>> self-healing working before doing anything :)
>>>>
>>>> The locks xlator/module maintains a list of locks granted to each
>>>> client. Clear-locks had an issue where it forgot to remove the
>>>> cleared lock from this list, so the client's connection list ends up
>>>> pointing at freed data after a clear-lock. When a disconnect
>>>> happens, all the locks granted to that client need to be unlocked,
>>>> so the process starts traversing this list, and when it tries to
>>>> access the freed data it crashes. I found it while reviewing a
>>>> feature patch sent by the Facebook folks to the locks xlator
>>>> (http://review.gluster.org/14816 [2]) for 3.9.0, and they also fixed
>>>> this bug as part of that feature patch.
>>>>
>>>> Br,
>>>> Samuli
>>>>
>>>> 3.8.x is EOLed, so I recommend you to upgrade to a supported version
>>>> soon.
>>>>
>>>> Best regards,
>>>> Samuli Heinonen
>>>>
>>>> Samuli Heinonen <mailto:[email protected]>
>>>> 20 January 2018 at 21.57
>>>>
>>>> Hi all!
>>>>
>>>> One hypervisor in our virtualization environment crashed and now
>>>> some of the VM images cannot be accessed. After investigation we
>>>> found out that there were lots of images that still had an active
>>>> lock on the crashed hypervisor. We were able to remove locks from
>>>> "regular files", but it doesn't seem possible to remove locks from
>>>> shards.
>>>>
>>>> We are running GlusterFS 3.8.15 on all nodes.
>>>>
>>>> Here is the part of the statedump that shows a shard having an
>>>> active lock on the crashed node:
>>>>
>>>> [xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
>>>> path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
>>>> mandatory=0
>>>> inodelk-count=1
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
>>>> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
>>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0,
>>>> pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770,
>>>> connection-id ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,
>>>> granted at 2018-01-20 08:57:24
>>>>
>>>> If we try to run clear-locks we get the following error message:
>>>> # gluster volume clear-locks zone2-ssd1-vmstor1
>>>> /.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
>>>> Volume clear-locks unsuccessful
>>>> clear-locks getxattr command failed. Reason: Operation not permitted
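(Side note, in case it helps others following the thread: the inode
statedumps quoted here can be regenerated with something like the
following -- a sketch; the dump files usually land in /var/run/gluster
on the storage nodes, but the exact location depends on your
configuration:

  gluster volume statedump zone2-ssd1-vmstor1 inode
  grep -A 10 75353c17-d6b8-485d-9baf-fd6c700e39a1.21 /var/run/gluster/*.dump.*
)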
>>>> Gluster vol info if needed:
>>>> Volume Name: zone2-ssd1-vmstor1
>>>> Type: Replicate
>>>> Volume ID: b6319968-690b-4060-8fff-b212d2295208
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: rdma
>>>> Bricks:
>>>> Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
>>>> Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
>>>> Options Reconfigured:
>>>> cluster.shd-wait-qlength: 10000
>>>> cluster.shd-max-threads: 8
>>>> cluster.locking-scheme: granular
>>>> performance.low-prio-threads: 32
>>>> cluster.data-self-heal-algorithm: full
>>>> performance.client-io-threads: off
>>>> storage.linux-aio: off
>>>> performance.readdir-ahead: on
>>>> client.event-threads: 16
>>>> server.event-threads: 16
>>>> performance.strict-write-ordering: off
>>>> performance.quick-read: off
>>>> performance.read-ahead: on
>>>> performance.io-cache: off
>>>> performance.stat-prefetch: off
>>>> cluster.eager-lock: enable
>>>> network.remote-dio: on
>>>> cluster.quorum-type: none
>>>> network.ping-timeout: 22
>>>> performance.write-behind: off
>>>> nfs.disable: on
>>>> features.shard: on
>>>> features.shard-block-size: 512MB
>>>> storage.owner-uid: 36
>>>> storage.owner-gid: 36
>>>> performance.io-thread-count: 64
>>>> performance.cache-size: 2048MB
>>>> performance.write-behind-window-size: 256MB
>>>> server.allow-insecure: on
>>>> cluster.ensure-durability: off
>>>> config.transport: rdma
>>>> server.outstanding-rpc-limit: 512
>>>> diagnostics.brick-log-level: INFO
>>>>
>>>> Any recommendations how to advance from here?
>>>>
>>>> Best regards,
>>>> Samuli Heinonen
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> [email protected]
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users [3]
>>>>
>>>> --
>>>> Pranith
>>
>> --
>> Pranith
>>
>> Links:
>> ------
>> [1] http://ovirt8z2.xxx.com
>> [2] http://review.gluster.org/14816
>> [3] http://lists.gluster.org/mailman/listinfo/gluster-users

--
Pranith
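P.S. For the archives, the heal-status commands referred to earlier in
the thread are the standard gluster CLI ones, shown here for this
volume:

  gluster volume heal zone2-ssd1-vmstor1 info
  gluster volume heal zone2-ssd1-vmstor1 statistics
  gluster volume heal zone2-ssd1-vmstor1 statistics heal-count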
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
