[Gluster-users] Deletion of old CHANGELOG files in .glusterfs/changelogs
Hi, I have been using geo-replication for over a year now on my GlusterFS 3.7.20 volumes and noticed that the CHANGELOG.<timestamp> files in the .glusterfs/changelogs directory of a brick never get deleted. I have, for example, over 120k files in one of these directories and the number is growing constantly. So my question: does GlusterFS have any mechanism to automatically delete old, already-processed CHANGELOG files? If not, is it safe to delete them manually? Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs
Anyone? Original Message Subject: Deletion of old CHANGELOG files in .glusterfs/changelogs Local Time: March 31, 2017 11:22 PM UTC Time: March 31, 2017 9:22 PM From: m...@protonmail.ch To: Gluster Users Hi, I am using geo-replication since now over a year on my 3.7.20 GlusterFS volumes and noticed that the CHANGELOG. in the .glusterfs/changelogs directory of a brick never get deleted. I have for example over 120k files in one of these directories and it is growing constantly. So my question, does GlusterFS have any mechanism to automatically delete old and processed CHANGELOG files? If not is it safe to delete them manually? Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs
Thanks to all of you for your answers. I will try the archive tool as recommended by Aravinda. Just for your information, I suppose you are aware that having tons of files such as the CHANGELOGs in one single directory is really sub-optimal. It would probably be better to have a two-level hierarchy of sub-directories and an algorithm that distributes the files among them, especially since there is no archiving of these files by default. Just my two cents ;-) Cheers
Original Message Subject: Re: [Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs Local Time: April 5, 2017 8:44 AM UTC Time: April 5, 2017 6:44 AM From: atumb...@redhat.com To: Mohammed Rafi K C, mabi, Gluster Users
Local Time: March 31, 2017 11:22 PM UTC Time: March 31, 2017 9:22 PM From: m...@protonmail.ch To: Gluster Users
Hi, I have been using geo-replication for over a year now on my GlusterFS 3.7.20 volumes and noticed that the CHANGELOG.<timestamp> files in the .glusterfs/changelogs directory of a brick never get deleted. I have, for example, over 120k files in one of these directories and the number is growing constantly. So my question: does GlusterFS have any mechanism to automatically delete old, already-processed CHANGELOG files? If not, is it safe to delete them manually?
I will try to answer the question. I'm not an expert in geo-replication, so I could be wrong here. I think GlusterFS won't delete the changelogs automatically, the reason being that geo-replication is not the author of the changelogs; it is just a consumer, and any other application could use the changelogs as well.
+1 for the reasoning. You can safely delete *all processed* changelogs from the actual changelogs directory and the geo-replication directory. You can look at the stime extended attribute set on the brick root to see the time up to which geo-replication has last synced. If geo-rep is the only consumer, you can use [Aravinda's tool](https://github.com/aravindavk/archive_gluster_changelogs) to move the files to another directory and delete them. Regards, Amar___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
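For anyone in the same situation, a rough way to check the backlog and the sync point before archiving; the brick path below is a placeholder for your own brick, and the exact stime attribute name embeds the master and slave volume UUIDs, hence the grep:

    # count the accumulated changelog files on one brick
    ls /path/to/brick/.glusterfs/changelogs/ | wc -l
    # show the stime xattr on the brick root (hex-encoded), i.e. the time up to which geo-replication has synced
    getfattr -d -m . -e hex /path/to/brick | grep stime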
Re: [Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs
Amazing, thanks Amar for creating the issue! Cheers, M. Original Message Subject: Re: [Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs Local Time: April 6, 2017 4:08 AM UTC Time: April 6, 2017 2:08 AM From: atumb...@redhat.com To: mabi Mohammed Rafi K C , Gluster Users On Wed, Apr 5, 2017 at 10:58 PM, mabi wrote: Thanks to all of you for your answers. I will try the archive tool as recommended by Aravinda. Just for your information I suppose you are aware that having tons of files such as the CHANGELOGS in one single directory is really sub-optimal. Maybe better would be to have a 2 level hierarchy and store the files using an algorithm to distribute the files among that 2 level hierarchy of sub-directories, especially if there is no archiving of these files by default. Just my two cents ;-) These 2 cents helps when summed up later :-) Created a github issues so that we don't miss it : https://github.com/gluster/glusterfs/issues/154 Regards, Amar Cheers Original Message Subject: Re: [Gluster-users] Fw: Deletion of old CHANGELOG files in .glusterfs/changelogs Local Time: April 5, 2017 8:44 AM UTC Time: April 5, 2017 6:44 AM From: atumb...@redhat.com To: Mohammed Rafi K C mabi , Gluster Users Local Time: March 31, 2017 11:22 PM UTC Time: March 31, 2017 9:22 PM From: m...@protonmail.ch To: Gluster Users [](mailto:gluster-users@gluster.org) Hi, I am using geo-replication since now over a year on my 3.7.20 GlusterFS volumes and noticed that the CHANGELOG. in the .glusterfs/changelogs directory of a brick never get deleted. I have for example over 120k files in one of these directories and it is growing constantly. So my question, does GlusterFS have any mechanism to automatically delete old and processed CHANGELOG files? If not is it safe to delete them manually? I will try to answer the question, I'm not an expert in geo-replication, So I could be wrong here. I think GlusterFS won't delete the changelogs automatically, reason being geo-replication is not the author of changelogs, it is just a consumer any other application could use changelogs. +1 for the reasoning. You can safely delete *all processed* changelogs from actual changelogs directory and geo-replication directory. You can look into the stime set as the extended attribute on the root to see the time which geo-replication last synced. If georep is the only consumer, you can use [Aravinda's tool](https://github.com/aravindavk/archive_gluster_changelogs)to move the files to another dir, and delete them. Regards, Amar -- Amar Tumballi (amarts)___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")
Hello, I am using distributed geo-replication with two of my GlusterFS 3.7.20 replicated volumes and just noticed that the geo-replication for one volume is not working anymore. It has been stuck since 2017-02-23 22:39. I tried to stop and restart geo-replication, but it still stays stuck at that specific date and time. Under the DATA field of the geo-replication "status detail" command I can see 3879, and the STATUS is "Active", but still nothing happens. I noticed that the rsync process is running but does not do anything, so I ran strace on the PID of rsync and saw the following:
write(2, "rsync: link_stat \"(unreachable)/"..., 114
It looks like rsync can't read or find a file and stays stuck on that. In the geo-replication log files on the GlusterFS master I can't find any error messages, just informational messages. For example, when I restart the geo-replication I see the following log entries:
[2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] : slave bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}]
[2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] : worker specs: [('/data/private/brick', 'ssh://root@gfs1geo.domain:gluster://localhost:private-geo', '1', False)]
[2017-04-07 21:43:05.823931] I [monitor(monitor):267:monitor] Monitor:
[2017-04-07 21:43:05.824204] I [monitor(monitor):268:monitor] Monitor: starting gsyncd worker
[2017-04-07 21:43:05.930124] I [gsyncd(/data/private/brick):733:main_i] : syncing: gluster://localhost:private -> ssh://root@gfs1geo.domain:gluster://localhost:private-geo
[2017-04-07 21:43:05.931169] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2017-04-07 21:43:08.558648] I [master(/data/private/brick):83:gmaster_builder] : setting up xsync change detection mode
[2017-04-07 21:43:08.559071] I [master(/data/private/brick):367:__init__] _GMaster: using 'rsync' as the sync engine
[2017-04-07 21:43:08.560163] I [master(/data/private/brick):83:gmaster_builder] : setting up changelog change detection mode
[2017-04-07 21:43:08.560431] I [master(/data/private/brick):367:__init__] _GMaster: using 'rsync' as the sync engine
[2017-04-07 21:43:08.561105] I [master(/data/private/brick):83:gmaster_builder] : setting up changeloghistory change detection mode
[2017-04-07 21:43:08.561391] I [master(/data/private/brick):367:__init__] _GMaster: using 'rsync' as the sync engine
[2017-04-07 21:43:11.354417] I [master(/data/private/brick):1249:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/private/ssh%3A%2F%2Froot%40192.168.20.107%3Agluster%3A%2F%2F127.0.0.1%3Aprivate-geo/616931ac8f39da5dc5834f9d47fc7b1a/xsync
[2017-04-07 21:43:11.354751] I [resource(/data/private/brick):1528:service_loop] GLUSTER: Register time: 1491601391
[2017-04-07 21:43:11.357630] I [master(/data/private/brick):510:crawlwrap] _GMaster: primary master with volume id e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5 ...
[2017-04-07 21:43:11.489355] I [master(/data/private/brick):519:crawlwrap] _GMaster: crawl interval: 1 seconds
[2017-04-07 21:43:11.516710] I [master(/data/private/brick):1163:crawl] _GMaster: starting history crawl... turns: 1, stime: (1487885974, 0), etime: 1491601391
[2017-04-07 21:43:12.607836] I [master(/data/private/brick):1192:crawl] _GMaster: slave's time: (1487885974, 0)
Does anyone know how I can find out the root cause of this problem and make geo-replication work again from the point in time where it got stuck? Many thanks in advance for your help.
Best regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
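For reference, the geo-replication commands referred to above take roughly this form; the master volume, slave host, slave volume and the rsync PID are placeholders for the actual names:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> status detail
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> stop
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start
    # attach to the hung rsync worker to see which path it is stuck on
    strace -f -p <RSYNC-PID>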
Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")
Hi Kotresh, I am using the official Debian 8 (jessie) package which has rsync version 3.1.1. Regards, M. Original Message Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)") Local Time: April 10, 2017 6:33 AM UTC Time: April 10, 2017 4:33 AM From: khire...@redhat.com To: mabi Gluster Users Hi Mabi, What's the rsync version being used? Thanks and Regards, Kotresh H R - Original Message - > From: "mabi" > To: "Gluster Users" > Sent: Saturday, April 8, 2017 4:20:25 PM > Subject: [Gluster-users] Geo replication stuck (rsync: link_stat > "(unreachable)") > > Hello, > > I am using distributed geo replication with two of my GlusterFS 3.7.20 > replicated volumes and just noticed that the geo replication for one volume > is not working anymore. It is stuck since the 2017-02-23 22:39 and I tried > to stop and restart geo replication but still it stays stuck at that > specific date and time under the DATA field of the geo replication "status > detail" command I can see 3879 and that it has "Active" as STATUS but still > nothing happens. I noticed that the rsync process is running but does not do > anything, then I did a strace on the PID of rsync and saw the following: > > write(2, "rsync: link_stat \"(unreachable)/"..., 114 > > It looks like rsync can't read or find a file and stays stuck on that. In the > geo-replication log files of GlusterFS master I can't find any error > messages just informational message. For example when I restart the geo > replication I see the following log entries: > > [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] : slave > bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}] > [2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] : > worker specs: [('/data/private/brick', 'ssh:// root@gfs1geo.domain > :gluster://localhost:private-geo', '1', False)] > [2017-04-07 21:43:05.823931] I [monitor(monitor):267:monitor] Monitor: > > [2017-04-07 21:43:05.824204] I [monitor(monitor):268:monitor] Monitor: > starting gsyncd worker > [2017-04-07 21:43:05.930124] I [gsyncd(/data/private/brick):733:main_i] > : syncing: gluster://localhost:private -> ssh:// root@gfs1geo.domain > :gluster://localhost:private-geo > [2017-04-07 21:43:05.931169] I [changelogagent(agent):73:__init__] > ChangelogAgent: Agent listining... 
> [2017-04-07 21:43:08.558648] I > [master(/data/private/brick):83:gmaster_builder] : setting up xsync > change detection mode > [2017-04-07 21:43:08.559071] I [master(/data/private/brick):367:__init__] > _GMaster: using 'rsync' as the sync engine > [2017-04-07 21:43:08.560163] I > [master(/data/private/brick):83:gmaster_builder] : setting up changelog > change detection mode > [2017-04-07 21:43:08.560431] I [master(/data/private/brick):367:__init__] > _GMaster: using 'rsync' as the sync engine > [2017-04-07 21:43:08.561105] I > [master(/data/private/brick):83:gmaster_builder] : setting up > changeloghistory change detection mode > [2017-04-07 21:43:08.561391] I [master(/data/private/brick):367:__init__] > _GMaster: using 'rsync' as the sync engine > [2017-04-07 21:43:11.354417] I [master(/data/private/brick):1249:register] > _GMaster: xsync temp directory: > /var/lib/misc/glusterfsd/private/ssh%3A%2F%2Froot%40192.168.20.107%3Agluster%3A%2F%2F127.0.0.1%3Aprivate-geo/616931ac8f39da5dc5834f9d47fc7b1a/xsync > [2017-04-07 21:43:11.354751] I > [resource(/data/private/brick):1528:service_loop] GLUSTER: Register time: > 1491601391 > [2017-04-07 21:43:11.357630] I [master(/data/private/brick):510:crawlwrap] > _GMaster: primary master with volume id e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5 > ... > [2017-04-07 21:43:11.489355] I [master(/data/private/brick):519:crawlwrap] > _GMaster: crawl interval: 1 seconds > [2017-04-07 21:43:11.516710] I [master(/data/private/brick):1163:crawl] > _GMaster: starting history crawl... turns: 1, stime: (1487885974, 0), etime: > 1491601391 > [2017-04-07 21:43:12.607836] I [master(/data/private/brick):1192:crawl] > _GMaster: slave's time: (1487885974, 0) > > Does anyone know how I can find out the root cause of this problem and make > geo replication work again from the time point it got stuck? > > Many thanks in advance for your help. > > Best regards, > Mabi > > > > > ___ > Gluster-users mailing list > Gluster-users@gluster.org > http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")
Hi Kotresh, Thanks for your hint. Adding the "--ignore-missing-args" option to rsync and restarting geo-replication worked, but it only managed to sync approximately 1/3 of the data before the geo-replication went into status "Failed" this time. Now I have a different type of error, as you can see below from the log extract on my geo-replication slave node:
[2017-04-12 18:01:55.268923] I [MSGID: 109066] [dht-rename.c:1574:dht_rename] 0-myvol-private-geo-dht: renaming /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls.ocTransferId2118183895.part (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) => /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0)
[2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk] 0-glusterfs-fuse: 4786: /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls.ocTransferId2118183895.part -> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1 (Directory not empty)
[2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-PNSR8s
[2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725] -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-: received signum (15), shutting down
[2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-PNSR8s'.
How can I now fix this issue and have geo-replication continue synchronising again? Best regards, M.
Original Message Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)") Local Time: April 11, 2017 9:18 AM UTC Time: April 11, 2017 7:18 AM From: khire...@redhat.com To: mabi, Gluster Users
Hi, Then please set the following rsync config and let us know if it helps.
gluster vol geo-rep <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
Thanks and Regards, Kotresh H R
- Original Message - > From: "mabi" > To: "Kotresh Hiremath Ravishankar" > Cc: "Gluster Users" > Sent: Tuesday, April 11, 2017 2:15:54 AM > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > "(unreachable)") > > Hi Kotresh, > > I am using the official Debian 8 (jessie) package which has rsync version > 3.1.1. > > Regards, > M. > > Original Message > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > "(unreachable)") > Local Time: April 10, 2017 6:33 AM > UTC Time: April 10, 2017 4:33 AM > From: khire...@redhat.com > To: mabi > Gluster Users > > Hi Mabi, > > What's the rsync version being used? > > Thanks and Regards, > Kotresh H R > > - Original Message - > > From: "mabi" > > To: "Gluster Users" > > Sent: Saturday, April 8, 2017 4:20:25 PM > > Subject: [Gluster-users] Geo replication stuck (rsync: link_stat > > "(unreachable)") > > > > Hello, > > > > I am using distributed geo replication with two of my GlusterFS 3.7.20 > > replicated volumes and just noticed that the geo replication for one volume > > is not working anymore. It is stuck since the 2017-02-23 22:39 and I tried > > to stop and restart geo replication but still it stays stuck at that > > specific date and time under the DATA field of the geo replication "status > > detail" command I can see 3879 and that it has "Active" as STATUS but still > > nothing happens.
I noticed that the rsync process is running but does not > > do > > anything, then I did a strace on the PID of rsync and saw the following: > > > > write(2, "rsync: link_stat \"(unreachable)/"..., 114 > > > > It looks like rsync can't read or find a file and stays stuck on that. In > > the > > geo-replication log files of GlusterFS master I can't find any error > > messages just informational message. For example when I restart the geo > > replication I see the following log entries: > > > > [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] : > > slave > > bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}] > > [2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] : > > worker specs: [('/data/private/brick', 'ssh:// root@gfs1geo.domain > > :gluster://localhost:private-geo', '1', False)] > > [2017-
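For reference, the full form of that config change, with placeholder names for the master volume and slave, looks roughly like this; listing the session config afterwards just confirms the option was stored, and a stop/start of the session makes sure it is picked up:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
    # verify the stored value
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config | grep -i rsync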
Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")
Hi Kotresh, Thanks for your feedback. So do you mean I can simply log in to the geo-replication slave node, mount the volume with FUSE, delete the problematic directory, and finally restart geo-replication? I am planning to migrate to 3.8 as soon as I have a backup (geo-replication). Is this issue with DHT fixed in the latest 3.8.x release? Regards, M.
Original Message Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)") Local Time: April 13, 2017 7:57 AM UTC Time: April 13, 2017 5:57 AM From: khire...@redhat.com To: mabi, Gluster Users
Hi, I think the directory Workhours_2017 is deleted on the master, and on the slave it's failing to delete because there might be stale linkto files at the back end. These issues are fixed in DHT in the latest versions, so upgrading to the latest version would solve them. To work around the issue, you might need to clean up the problematic directory on the slave from the backend. Thanks and Regards, Kotresh H R
- Original Message - > From: "mabi" > To: "Kotresh Hiremath Ravishankar" > Cc: "Gluster Users" > Sent: Thursday, April 13, 2017 12:28:50 AM > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > "(unreachable)") > > Hi Kotresh, > > Thanks for your hint, adding the "--ignore-missing-args" option to rsync and > restarting geo-replication worked but it only managed to sync approximately > 1/3 of the data until it put the geo replication in status "Failed" this > time. Now I have a different type of error as you can see below from the log > extract on my geo replication slave node: > > [2017-04-12 18:01:55.268923] I [MSGID: 109066] [dht-rename.c:1574:dht_rename] > 0-myvol-private-geo-dht: renaming > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 > empty.xls.ocTransferId2118183895.part > (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) => > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls > (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) > [2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk] > 0-glusterfs-fuse: 4786: > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 > empty.xls.ocTransferId2118183895.part -> > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1 > (Directory not empty) > [2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc] 0-fuse: > unmounting /tmp/gsyncd-aux-mount-PNSR8s > [2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit] > (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064] > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725] > -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-: > received signum (15), shutting down > [2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting > '/tmp/gsyncd-aux-mount-PNSR8s'. > > How can I fix now this issue and have geo-replication continue synchronising > again? > > Best regards, > M. > > Original Message > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > "(unreachable)") > Local Time: April 11, 2017 9:18 AM > UTC Time: April 11, 2017 7:18 AM > From: khire...@redhat.com > To: mabi > Gluster Users > > Hi, > > Then please use set the following rsync config and let us know if it helps.
> > gluster vol geo-rep :: config rsync-options > "--ignore-missing-args" > > Thanks and Regards, > Kotresh H R > > - Original Message - > > From: "mabi" > > To: "Kotresh Hiremath Ravishankar" > > Cc: "Gluster Users" > > Sent: Tuesday, April 11, 2017 2:15:54 AM > > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > > "(unreachable)") > > > > Hi Kotresh, > > > > I am using the official Debian 8 (jessie) package which has rsync version > > 3.1.1. > > > > Regards, > > M. > > > > Original Message > > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat > > "(unreachable)") > > Local Time: April 10, 2017 6:33 AM > > UTC Time: April 10, 2017 4:33 AM > > From: khire...@redhat.com > > To: mabi > > Gluster Users > > > > Hi Mabi, > > > > What's the rsync version being used? > > > > Thanks and Regards, > > Kotresh H R > > > > - Original Message - > > > From: "mabi" > > > To: "Gluster Users" > > > Sent: Saturday, April 8, 2017 4:20:25 PM
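Before deleting anything, one way to confirm the stale linkto files Kotresh mentions is to look at the directory directly on the slave bricks; the paths below are placeholders. DHT linkto files are zero-byte files with the sticky bit set and a trusted.glusterfs.dht.linkto xattr:

    # on the slave node(s), inspect the problematic directory on the brick itself (the "backend")
    ls -la /path/to/slave-brick/path/to/problematic-directory
    # linkto files carry the trusted.glusterfs.dht.linkto extended attribute
    getfattr -d -m . -e hex /path/to/slave-brick/path/to/problematic-directory/*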
[Gluster-users] Upgrading to 3.8 guide missing
Hello, I am planning to upgrade from 3.7.20 to 3.8.11 but unfortunately the "Upgrading to 3.8" guide is missing: https://gluster.readthedocs.io/en/latest/Upgrade-Guide/README/ Where can I find the instructions to upgrade to 3.8? Best regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Upgrading to 3.8 guide missing
Does anyone know why this guide is missing? Regards, M. Original Message Subject: Upgrading to 3.8 guide missing Local Time: April 14, 2017 11:18 AM UTC Time: April 14, 2017 9:18 AM From: m...@protonmail.ch To: Gluster Users Hello, I am planning to upgrade from 3.7.20 to 3.8.11 but unfortunately the "Upgrading to 3.8" guide is missing: https://gluster.readthedocs.io/en/latest/Upgrade-Guide/README/ Where can I find the instructions to upgrade to 3.8? Best regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed
Thanks for pointing me to the documentation. That's perfect, I can now plan my upgrade to 3.8.11. By the way, I was wondering why a self-heal is part of the upgrade procedure. Is it just a precaution or is it mandatory? Regards M.
Original Message Subject: Re: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed Local Time: April 20, 2017 5:17 PM UTC Time: April 20, 2017 3:17 PM From: nde...@redhat.com To: mabi, gluster-users@gluster.org
On Wed, Apr 19, 2017 at 01:46:14PM -0400, mabi wrote: > Sorry for insisting but where can I find the upgrading to 3.8 guide? > This is the only guide missing from the docs... I would like to > upgrade from 3.7 and would like to follow the documentation to make > sure everything goes well.
The upgrade guide for 3.8 has been lumbering in a GitHub pull request for a while now. I've just updated it again and hope it will be merged soon: https://github.com/gluster/glusterdocs/pull/219 You can see the proposed document here: https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md HTH, Niels
> > Original Message > Subject: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed > Local Time: April 18, 2017 4:34 PM > UTC Time: April 18, 2017 2:34 PM > From: nde...@redhat.com > To: annou...@gluster.org > > [repost from > http://blog.nixpanic.net/2017/04/bugfix-release-glusterfs-3811-has-landed.html] > > Bugfix release GlusterFS 3.8.11 has landed > > Another month has passed, and more bugs have been squashed in the > 3.8 release. Packages should be available or arrive soon at the usual > repositories. The next 3.8 update is expected to be made available just > after the 10th of May. > > Release notes for Gluster 3.8.11 > > This is a bugfix release. The Release Notes for 3.8.0, 3.8.1, 3.8.2, > 3.8.3, 3.8.4, 3.8.5, 3.8.6, 3.8.7, 3.8.8, 3.8.9 and 3.8.10 contain a > listing of all the new features that were added and bugs fixed in the > GlusterFS 3.8 stable release. > > Bugs addressed > > A total of 15 patches have been merged, addressing 13 bugs: > * #1422788: [Replicate] "RPC call decoding failed" leading to IO hang & mount > inaccessible > * #1427390: systemic testing: seeing lot of ping time outs which would lead > to splitbrains > * #1430845: build/packaging: Debian and Ubuntu don't have /usr/libexec/; > results in bad packages > * #1431592: memory leak in features/locks xlator > * #1434298: [Disperse] Metadata version is not healing when a brick is down > * #1434302: Move spit-brain msg in read txn to debug > * #1435645: Disperse: Provide description of disperse.eager-lock option. > * #1436231: Undo pending xattrs only on the up bricks > * #1436412: Unrecognized filesystems (i.e. btrfs, zfs) log many errors about > "getinode size" > * #1437330: Sharding: Fix a performance bug > * #1438424: [Ganesha + EC] : Input/Output Error while creating LOTS of > smallfiles > * #1439112: File-level WORM allows ftruncate() on read-only files > * #1440635: Application VMs with their disk images on sharded-replica 3 > volume are unable to boot after performing rebalance > > ___ > Announce mailing list > annou...@gluster.org > http://lists.gluster.org/mailman/listinfo/announce > ___ > Gluster-users mailing list > Gluster-users@gluster.org > http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed
Thanks for the precisions regarding the healing during the online upgrade procedure. To be on the safe side I will follow the offline upgrade procedure. I am indeed using replication with two nodes. Original Message Subject: Re: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed Local Time: April 22, 2017 9:07 AM UTC Time: April 22, 2017 7:07 AM From: pkara...@redhat.com To: mabi Niels de Vos , gluster-users@gluster.org If your volume has replication/erasure coding then it is mandatory. On Fri, Apr 21, 2017 at 1:05 AM, mabi wrote: Thanks for pointing me to the documentation. That's perfect, I can now plan my upgrade to 3.8.11. By the way I was wondering why is a self-heal part of the upgrade procedure? Is it just in case or is it mandatory? Regards M. Original Message Subject: Re: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed Local Time: April 20, 2017 5:17 PM UTC Time: April 20, 2017 3:17 PM From: nde...@redhat.com To: mabi gluster-users@gluster.org On Wed, Apr 19, 2017 at 01:46:14PM -0400, mabi wrote: > Sorry for insisting but where can I find the upgrading to 3.8 guide? > This is the only guide missing from the docs... I would like to > upgrade from 3.7 and would like to follow the documentation to make > sure everything goes well. The upgrade guide for 3.8 has been lumbering in a HitHub Pull-Request for a while now. I've just updated it again and hope it will be merged soon: https://github.com/gluster/glusterdocs/pull/219 You can see the proposed document here: https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md HTH, Niels > > Original Message > Subject: [Gluster-users] Bugfix release GlusterFS 3.8.11 has landed > Local Time: April 18, 2017 4:34 PM > UTC Time: April 18, 2017 2:34 PM > From: nde...@redhat.com > To: annou...@gluster.org > > [repost from > http://blog.nixpanic.net/2017/04/bugfix-release-glusterfs-3811-has-landed.html] > > Bugfix release GlusterFS 3.8.11 has landed > > An other month has passed, and more bugs have been squashed in the > 3.8 release. Packages should be available or arrive soon at the usual > repositories. The next 3.8 update is expected to be made available just > after the 10th of May. > > Release notes for Gluster 3.8.11 > > This is a bugfix release. The Release Notes for 3.8.0, 3.8.1, 3.8.2, > 3.8.3, 3.8.4, 3.8.5, 3.8.6, 3.8.7, 3.8.8, 3.8.9 and 3.8.10 contain a > listing of all the new features that were added and bugs fixed in the > GlusterFS 3.8 stable release. > > Bugs addressed > > A total of 15 patches have been merged, addressing 13 bugs: > * #1422788: [Replicate] "RPC call decoding failed" leading to IO hang & mount > inaccessible > * #1427390: systemic testing: seeing lot of ping time outs which would lead > to splitbrains > * #1430845: build/packaging: Debian and Ubuntu don't have /usr/libexec/; > results in bad packages > * #1431592: memory leak in features/locks xlator > * #1434298: [Disperse] Metadata version is not healing when a brick is down > * #1434302: Move spit-brain msg in read txn to debug > * #1435645: Disperse: Provide description of disperse.eager-lock option. > * #1436231: Undo pending xattrs only on the up bricks > * #1436412: Unrecognized filesystems (i.e. 
btrfs, zfs) log many errors about > "getinode size" > * #1437330: Sharding: Fix a performance bug > * #1438424: [Ganesha + EC] : Input/Output Error while creating LOTS of > smallfiles > * #1439112: File-level WORM allows ftruncate() on read-only files > * #1440635: Application VMs with their disk images on sharded-replica 3 > volume are unable to boot after performing rebalance > > ___ > Announce mailing list > annou...@gluster.org > http://lists.gluster.org/mailman/listinfo/announce > ___ > Gluster-users mailing list > Gluster-users@gluster.org > http://lists.gluster.org/mailman/listinfo/gluster-users ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users -- Pranith___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
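Since the volumes here are replicated, a quick way to confirm that self-heal has caught up on a volume after each node comes back up, before touching the next one (the volume name is a placeholder):

    gluster volume heal <VOLNAME> info
    # the number of entries listed per brick should drop to zero before upgrading the other node
    gluster volume heal <VOLNAME> statistics heal-count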
[Gluster-users] glustershd: unable to get index-dir on myvolume-client-0
Hi, I have a two-node GlusterFS 3.8.11 replicated volume and just noticed today a lot of the following warning messages in the glustershd.log log file:
[2017-05-01 18:42:18.004747] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
[2017-05-01 18:52:19.004989] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
[2017-05-01 19:02:20.004827] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
Does someone understand what this means and whether I should be concerned or not? Could it be related to the fact that I use ZFS and not XFS as the brick filesystem? Best regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] glustershd: unable to get index-dir on myvolume-client-0
Hi Ravi, Thanks for the pointer, you are totally right, the "dirty" directory is missing on my node1. Here is the output of "ls -la" on both nodes:
node1:
drw--- 2 root root 2 Apr 28 22:15 entry-changes
drw--- 2 root root 2 Mar 6 2016 xattrop
node2:
drw--- 2 root root 3 May 2 19:57 dirty
drw--- 2 root root 2 Apr 28 22:15 entry-changes
drw--- 2 root root 3 May 2 19:57 xattrop
Now what would be the procedure to add the "dirty" directory on node1? Can I simply do a "mkdir dirty" in the indices directory, or do I need to stop the volume first? Regards, M.
Original Message Subject: Re: [Gluster-users] glustershd: unable to get index-dir on myvolume-client-0 Local Time: May 2, 2017 10:56 AM UTC Time: May 2, 2017 8:56 AM From: ravishan...@redhat.com To: mabi, Gluster Users
On 05/02/2017 01:08 AM, mabi wrote: Hi, I have a two-node GlusterFS 3.8.11 replicated volume and just noticed today a lot of the following warning messages in the glustershd.log log file:
[2017-05-01 18:42:18.004747] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
[2017-05-01 18:52:19.004989] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
[2017-05-01 19:02:20.004827] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0
Does someone understand what this means and whether I should be concerned or not? Could it be related to the fact that I use ZFS and not XFS as the brick filesystem?
In replicate volumes, the <brick-path>/.glusterfs/indices directory of each brick must contain these sub-folders: 'dirty', 'entry-changes' and 'xattrop'. From the messages, it looks like these are missing from your first brick (myvolume-client-0). Can you check if that is the case? -Ravi
Best regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] glustershd: unable to get index-dir on myvolume-client-0
Thanks Ravi, I now manually created the missing "dirty" directory and do not get any error messages in the gluster self-heal daemon log file. There is still one warning message which I see often in my brick log file and would be thankful if you could let me know what this means or what the problem could be: [2017-05-07 11:26:22.465194] W [dict.c:1223:dict_foreach_match] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_foreach_match+0x65) [0x7f8795902f45] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/index.so(+0x31d0) [0x7f878d6971d0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_foreach_match+0xe1) [0x7f8795902fc1] ) 0-dict: dict|match|action is NULL [Invalid argument] Any ideas? Original Message Subject: Re: [Gluster-users] glustershd: unable to get index-dir on myvolume-client-0 Local Time: May 3, 2017 3:09 AM UTC Time: May 3, 2017 1:09 AM From: ravishan...@redhat.com To: mabi Gluster Users On 05/02/2017 11:48 PM, mabi wrote: Hi Ravi, Thanks for the pointer, you are totally right the "dirty" directory is missing on my node1. Here is the output of a "ls -la" of both nodes: node1: drw--- 2 root root 2 Apr 28 22:15 entry-changes drw--- 2 root root 2 Mar 6 2016 xattrop node2: drw--- 2 root root 3 May 2 19:57 dirty drw--- 2 root root 2 Apr 28 22:15 entry-changes drw--- 2 root root 3 May 2 19:57 xattrop Now what would be the procedure in order to add the "dirty" directory on node1? Can I simply do an "mkdir dirty" in the indices directory? or do I need to stop the volume before? mkdir should work. The folders are created whenever the brick process is started, so I'm wondering how it went missing in the first place. -Ravi Regards, M. Original Message Subject: Re: [Gluster-users] glustershd: unable to get index-dir on myvolume-client-0 Local Time: May 2, 2017 10:56 AM UTC Time: May 2, 2017 8:56 AM From: ravishan...@redhat.com To: mabi [](mailto:m...@protonmail.ch), Gluster Users [](mailto:gluster-users@gluster.org) On 05/02/2017 01:08 AM, mabi wrote: Hi, I have a two nodes GlusterFS 3.8.11 replicated volume and just noticed today in the glustershd.log log file a lot of the following warning messages: [2017-05-01 18:42:18.004747] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0 [2017-05-01 18:52:19.004989] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0 [2017-05-01 19:02:20.004827] W [MSGID: 108034] [afr-self-heald.c:479:afr_shd_index_sweep] 0-myvolume-replicate-0: unable to get index-dir on myvolume-client-0 Does someone understand what it means and if I should be concerned or not? Could it be related that I use ZFS and not XFS as filesystem? In replicate volumes, the //.glusterfs/indices directory of bricks must contain these sub folders: 'dirty', 'entry-changes' and 'xattrop'. From the messages, it looks like these are missing from your first brick (myvolume-client-0). Can you check if that is the case? -Ravi Best regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
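For anyone else who finds one of these index sub-directories missing, a minimal sketch of the manual fix used here; the brick path is a placeholder and the mode/ownership are simply copied from an existing sibling directory:

    mkdir /path/to/brick/.glusterfs/indices/dirty
    chmod --reference=/path/to/brick/.glusterfs/indices/xattrop /path/to/brick/.glusterfs/indices/dirty
    chown --reference=/path/to/brick/.glusterfs/indices/xattrop /path/to/brick/.glusterfs/indices/dirty
    # alternatively, as Ravi notes, the brick process creates these folders at startup,
    # so restarting the brick process on that node should also recreate the missing directory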
[Gluster-users] Quota limits gone after upgrading to 3.8
Hello, Last week I upgraded my 2-node replica GlusterFS cluster from 3.7.20 to 3.8.11, and on one of the volumes I use the quota feature of GlusterFS. Unfortunately, I just noticed by using the usual command "gluster volume quota myvolume list" that all my quotas on that volume are gone. I had around 10 different quotas set on different directories. Does anyone have an idea where the quotas have vanished? Are they gone for good, and do I need to re-set them all? Regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Quota limits gone after upgrading to 3.8
Hi Sanoj, Thanks for pointing me at this bug, I was not aware of it. As this is a production GlusterFS cluster I would rather not mess with the quota.conf file as you suggested. Instead I will simply re-add all my quotas by running the following command again: gluster volume quota myvolume limit-usage /directory1 100GB
Can you confirm that this is safe to run again? As soon as I have a minute I will complete your survey about quotas. Best, M.
Original Message Subject: Re: [Gluster-users] Quota limits gone after upgrading to 3.8 Local Time: May 9, 2017 6:50 AM UTC Time: May 9, 2017 4:50 AM From: sunni...@redhat.com To: mabi, Gluster Users
Hi mabi, This bug was fixed recently: https://bugzilla.redhat.com/show_bug.cgi?id=1414346. It will be available in the 3.11 release. I plan to back-port the same to earlier releases. Your quota limits are still set and honored; it is only the listing that has gone wrong. Using the list command with a single path should display the limit on that path. The printing of the list gets messed up when the last gfid in the quota.conf file is not present in the FS (due to an rmdir without a remove-limit). You could use the following workaround to get rid of the issue: => Remove exactly the last 17 bytes of /var/lib/glusterd/vols/<volname>/quota.conf Note: keep a backup of quota.conf for safety. If this does not solve the issue, please reply with 1) the quota.conf file 2) the output of the list command (when executed with a path) 3) getfattr -d -m . -e hex <directory> | grep limit
It would be great to have your feedback for quota on this thread (http://lists.gluster.org/pipermail/gluster-users/2017-April/030676.html) Thanks & Regards, Sanoj
On Mon, May 8, 2017 at 7:58 PM, mabi wrote: Hello, Last week I upgraded my 2-node replica GlusterFS cluster from 3.7.20 to 3.8.11, and on one of the volumes I use the quota feature of GlusterFS. Unfortunately, I just noticed by using the usual command "gluster volume quota myvolume list" that all my quotas on that volume are gone. I had around 10 different quotas set on different directories. Does anyone have an idea where the quotas have vanished? Are they gone for good, and do I need to re-set them all? Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
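For reference, a careful way to apply Sanoj's 17-byte workaround, keeping the backup he recommends; the volume name is a placeholder, and GNU truncate with a negative size shrinks the file by that many bytes:

    cp /var/lib/glusterd/vols/<volname>/quota.conf /root/quota.conf.backup
    truncate -s -17 /var/lib/glusterd/vols/<volname>/quota.conf
    gluster volume quota myvolume list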
Re: [Gluster-users] Quota limits gone after upgrading to 3.8
Hi Sanoj, I do understand that my quotas are still working on my GlusterFS volume but are just not displayed in the output of the volume quota list command. I now tested re-adding a quota by running, for example: gluster volume quota myvolume limit-usage /directoryX 50GB
After that I ran the volume quota list command and, luckily enough, my quota is displayed again in the list. So I guess I will re-add the quotas so that they are displayed again in the list. That's the easiest way for me, but I do hope the quotas stay next time I upgrade... Regards, M.
Original Message Subject: Re: [Gluster-users] Quota limits gone after upgrading to 3.8 Local Time: May 10, 2017 8:48 AM UTC Time: May 10, 2017 6:48 AM From: sunni...@redhat.com To: mabi, Gluster Users
Hi Mabi, Note that the limits are still configured and working. Re-adding the limits will not help here (unless you are willing to disable and re-enable quota first). The reason is that if a gfid exists in quota.conf (because a limit was earlier set on it), it does not need to change when the limit changes. The quota.conf file only keeps track of which gfids have a limit set; the actual values of the limits are stored as xattrs on the filesystem. Another workaround, without manually touching quota.conf, is: create a new dummy directory anywhere in the FS and add a limit on this directory. After this you should be able to see the listing. If you remove this dummy directory or the limit on it, you will once again be exposed to the same issue. Regards, Sanoj
On Tue, May 9, 2017 at 10:59 PM, mabi wrote: Hi Sanoj, Thanks for pointing me at this bug, I was not aware of it. As this is a production GlusterFS cluster I would rather not mess with the quota.conf file as you suggested. Instead I will simply re-add all my quotas by running the following command again: gluster volume quota myvolume limit-usage /directory1 100GB Can you confirm that this is safe to run again? As soon as I have a minute I will complete your survey about quotas. Best, M.
Original Message Subject: Re: [Gluster-users] Quota limits gone after upgrading to 3.8 Local Time: May 9, 2017 6:50 AM UTC Time: May 9, 2017 4:50 AM From: sunni...@redhat.com To: mabi, Gluster Users
Hi mabi, This bug was fixed recently: https://bugzilla.redhat.com/show_bug.cgi?id=1414346. It will be available in the 3.11 release. I plan to back-port the same to earlier releases. Your quota limits are still set and honored; it is only the listing that has gone wrong. Using the list command with a single path should display the limit on that path. The printing of the list gets messed up when the last gfid in the quota.conf file is not present in the FS (due to an rmdir without a remove-limit). You could use the following workaround to get rid of the issue: => Remove exactly the last 17 bytes of /var/lib/glusterd/vols/<volname>/quota.conf Note: keep a backup of quota.conf for safety. If this does not solve the issue, please reply with 1) the quota.conf file 2) the output of the list command (when executed with a path) 3) getfattr -d -m . -e hex <directory> | grep limit It would be great to have your feedback for quota on this thread (http://lists.gluster.org/pipermail/gluster-users/2017-April/030676.html) Thanks & Regards, Sanoj
On Mon, May 8, 2017 at 7:58 PM, mabi wrote: Hello, Last week I upgraded my 2-node replica GlusterFS cluster from 3.7.20 to 3.8.11, and on one of the volumes I use the quota feature of GlusterFS. Unfortunately, I just noticed by using the usual command "gluster volume quota myvolume list" that all my quotas on that volume are gone.
I had around 10 different quotas set on different directories. Does anyone have an idea where the quotas have vanished? are they gone for always and do I need to re-set them all? Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
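For completeness, Sanoj's dummy-directory workaround would look roughly like this; the mount point and directory name are placeholders, while "myvolume" is the volume name used in this thread:

    # create the dummy directory through a client mount of the volume
    mkdir /mnt/myvolume/quota-dummy
    gluster volume quota myvolume limit-usage /quota-dummy 1GB
    # the full listing should now display all limits again
    gluster volume quota myvolume list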
[Gluster-users] 120k context switches on GlusterFS nodes
Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening, and there were simply more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of all sizes ranging from a few kB to several MB, and a lot of them of course. Now, I have never seen such a high number of context switches before, so I wanted to ask if this is normal or to be expected? I cannot find any signs of errors or warnings in any log files. My volume is a replicated volume on two nodes with ZFS as the filesystem behind it, and the volume is mounted using FUSE on the client (the cloud server). On that cloud server the glusterfs process was using quite a lot of system CPU, but that server (VM) only has 2 vCPUs, so maybe I should increase the number of vCPUs... Any ideas or recommendations? Regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 120k context switches on GlusterFS nodes
Today I even saw up to 400k context switches for around 30 minutes on my two-node replica... Does anyone else see such high context switch numbers on their GlusterFS nodes? I am wondering what is "normal" and if I should be worried...
Original Message Subject: 120k context switches on GlusterFS nodes Local Time: May 11, 2017 9:18 PM UTC Time: May 11, 2017 7:18 PM From: m...@protonmail.ch To: Gluster Users
Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening, and there were simply more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of all sizes ranging from a few kB to several MB, and a lot of them of course. Now, I have never seen such a high number of context switches before, so I wanted to ask if this is normal or to be expected? I cannot find any signs of errors or warnings in any log files. My volume is a replicated volume on two nodes with ZFS as the filesystem behind it, and the volume is mounted using FUSE on the client (the cloud server). On that cloud server the glusterfs process was using quite a lot of system CPU, but that server (VM) only has 2 vCPUs, so maybe I should increase the number of vCPUs... Any ideas or recommendations? Regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 120k context switches on GlusterFS nodes
I don't know exactly what kind of context switches they were, but what I do know is that it is the "cs" number under "system" when you run vmstat. Also, I use the Percona Linux monitoring template for Cacti (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html), which monitors context switches too. If that's of any use, interrupts were also quite high during that time, with peaks of up to 50k interrupts.
Original Message Subject: Re: [Gluster-users] 120k context switches on GlusterFS nodes Local Time: May 17, 2017 2:37 AM UTC Time: May 17, 2017 12:37 AM From: ravishan...@redhat.com To: mabi, Gluster Users
On 05/16/2017 11:13 PM, mabi wrote: Today I even saw up to 400k context switches for around 30 minutes on my two-node replica... Does anyone else see such high context switch numbers on their GlusterFS nodes? I am wondering what is "normal" and if I should be worried...
Original Message Subject: 120k context switches on GlusterFS nodes Local Time: May 11, 2017 9:18 PM UTC Time: May 11, 2017 7:18 PM From: m...@protonmail.ch To: Gluster Users
Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening, and there were simply more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of all sizes ranging from a few kB to several MB, and a lot of them of course. Now, I have never seen such a high number of context switches before, so I wanted to ask if this is normal or to be expected? I cannot find any signs of errors or warnings in any log files.
What context switch are you referring to (syscall context switches on the bricks?) How did you measure this? -Ravi
My volume is a replicated volume on two nodes with ZFS as the filesystem behind it, and the volume is mounted using FUSE on the client (the cloud server). On that cloud server the glusterfs process was using quite a lot of system CPU, but that server (VM) only has 2 vCPUs, so maybe I should increase the number of vCPUs... Any ideas or recommendations? Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
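For anyone wanting to reproduce the measurement, the system-wide number comes from vmstat as described above; the per-process view is only a suggestion (not something used in this thread) and needs the sysstat package:

    vmstat 5          # the "cs" column is the system-wide number of context switches per second
    pidstat -w 5      # voluntary/involuntary context switches per process, to see whether glusterfsd is the source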
Re: [Gluster-users] 120k context switches on GlusterFS nodes
I have been using GlusterFS in production for over a year now, and what made me write this mail initially is that it is the first time I have seen such high peaks in context switches/interrupts. Otherwise, to answer your question, everything seems to work just fine so far. What has changed on my side is that I was using 3.7 until the end of April and now I am on the latest 3.8.11 version of GlusterFS. So, as you mention, it could be something related to 3.8.
Original Message Subject: Re: [Gluster-users] 120k context switches on GlusterFS nodes Local Time: May 17, 2017 7:49 PM UTC Time: May 17, 2017 5:49 PM From: jlawre...@squaretrade.com To: mabi, Gluster Users
> On May 17, 2017, at 10:20 AM, mabi wrote: > > I don't know exactly what kind of context-switches it was but what I know is > that it is the "cs" number under "system" when you run vmstat. > > Also I use the percona linux monitoring template for cacti > (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html) > which monitors context switches too. If that's of any use interrupts where > also quite high during that time with peaks up to 50k interrupts.
You can't read or write data from the disk or send data over the network from userspace without making system calls. System calls mean context switches. So you should expect to see the CS number scale with load - the whole point of Gluster is to read and write and send data over the network. As far as them being "excessive", I don't know how to think about that without at least a comparison, or better, some evidence that something is doing more work than it "should". (Or best, line numbers where unnecessary work is being performed.) Is there something other than a surprising number to make you think it isn't behaving well? Did the number jump after an upgrade? Do you have other systems doing roughly the same thing with other software that performs better? Keep in mind that, say, a vanilla NFS or SMB server doesn't have the inter-gluster-node overhead, and how much of that traffic there is depends on how you've configured Gluster. -j___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 120k context switches on GlusterFS nodes
I have a single Intel Xeon CPU E5-2620 v3 @ 2.40GHz in each node, and it has 6 cores and 12 threads. I thought this would be enough for GlusterFS. When I check my CPU graphs, everything is pretty much idle and there are hardly any peaks at all on the CPU. During the period of very high context switches my CPU graphs show the following: 1 thread was 100% busy in user CPU, 1 thread was 100% busy in system CPU, leaving 10 of the 12 threads effectively unused... Are there maybe any performance tuning parameters I need to configure in order to make better use of my CPU cores or threads?
Original Message Subject: Re: [Gluster-users] 120k context switches on GlusterFS nodes Local Time: May 18, 2017 7:03 AM UTC Time: May 18, 2017 5:03 AM From: ravishan...@redhat.com To: Pranith Kumar Karampuri, mabi, Gluster Users, Gluster Devel
On 05/17/2017 11:07 PM, Pranith Kumar Karampuri wrote: + gluster-devel
On Wed, May 17, 2017 at 10:50 PM, mabi wrote: I don't know exactly what kind of context switches they were, but what I do know is that it is the "cs" number under "system" when you run vmstat.
Okay, that could be due to the syscalls themselves or pre-emptive multitasking in case there aren't enough CPU cores. I think the spike in numbers is due to more users accessing the files at the same time, like you observed, translating into more syscalls. You can try capturing the gluster volume profile info the next time it occurs and correlate it with the cs count. If you don't see any negative performance impact, I think you don't need to be bothered much by the numbers. HTH, Ravi
Also, I use the Percona Linux monitoring template for Cacti (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html), which monitors context switches too. If that's of any use, interrupts were also quite high during that time, with peaks of up to 50k interrupts.
Original Message Subject: Re: [Gluster-users] 120k context switches on GlusterFS nodes Local Time: May 17, 2017 2:37 AM UTC Time: May 17, 2017 12:37 AM From: ravishan...@redhat.com To: mabi, Gluster Users
On 05/16/2017 11:13 PM, mabi wrote: Today I even saw up to 400k context switches for around 30 minutes on my two-node replica... Does anyone else see such high context switch numbers on their GlusterFS nodes? I am wondering what is "normal" and if I should be worried...
Original Message Subject: 120k context switches on GlusterFS nodes Local Time: May 11, 2017 9:18 PM UTC Time: May 11, 2017 7:18 PM From: m...@protonmail.ch To: Gluster Users
Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening, and there were simply more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of all sizes ranging from a few kB to several MB, and a lot of them of course. Now, I have never seen such a high number of context switches before, so I wanted to ask if this is normal or to be expected? I cannot find any signs of errors or warnings in any log files.
What context switch are you referring to (syscall context switches on the bricks?) How did you measure this? -Ravi
My volume is a replicated volume on two nodes with ZFS as the filesystem behind it, and the volume is mounted using FUSE on the client (the cloud server).
On that cloud server the glusterfs process was using quite a lot of system CPU, but that server (VM) only has 2 vCPUs, so maybe I should increase the number of vCPUs... Any ideas or recommendations? Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users -- Pranith___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
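Following up on Ravi's profiling suggestion and the question about using more cores, a rough sketch of both; "myvolume" stands in for the real volume name, and the event-threads values are only an example to experiment with, not a tested recommendation:

    # capture a profile during the next spike and correlate it with the cs count
    gluster volume profile myvolume start
    gluster volume profile myvolume info
    gluster volume profile myvolume stop
    # knobs controlling how many threads handle network events on the clients and bricks
    gluster volume set myvolume client.event-threads 4
    gluster volume set myvolume server.event-threads 4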
Re: [Gluster-users] 120k context switches on GlusterFS nodes
Sorry for posting again but I was really wondering if it is somehow possible to tune gluster in order to make better use of all my cores (see below for the details). I suspect that is the reason for the high sporadic context switches I have been experiencing. Cheers! Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 18, 2017 8:43 PM UTC Time: May 18, 2017 6:43 PM From: m...@protonmail.ch To: Ravishankar N Pranith Kumar Karampuri , Gluster Users , Gluster Devel I have a single Intel Xeon CPU E5-2620 v3 @ 2.40GHz in each nodes and this one has 6 cores and 12 threads. I thought this would be enough for GlusterFS. When I check my CPU graphs everything is pretty much idle and there is hardly any peeks at all on the CPU. During the very high context switch my CPU graphs shows the following: 1 thread was 100% busy in CPU user 1 thread was 100% busy in CPU system leaving actually 10 other threads out of the total of 12 threads unused... Is there maybe any performance tuning parameters I need to configure in order to make a better use of my CPU cores or threads? Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 18, 2017 7:03 AM UTC Time: May 18, 2017 5:03 AM From: ravishan...@redhat.com To: Pranith Kumar Karampuri , mabi Gluster Users , Gluster Devel On 05/17/2017 11:07 PM, Pranith Kumar Karampuri wrote: + gluster-devel On Wed, May 17, 2017 at 10:50 PM, mabi wrote: I don't know exactly what kind of context-switches it was but what I know is that it is the "cs" number under "system" when you run vmstat. Okay, that could be due to the syscalls themselves or pre-emptive multitasking in case there aren't enough cpu cores. I think the spike in numbers is due to more users accessing the files at the same time like you observed, translating into more syscalls. You can try capturing the gluster volume profile info the next time it occurs and co-relate with the cs count. If you don't see any negative performance impact, I think you don't need to be bothered much by the numbers. HTH, Ravi Also I use the percona linux monitoring template for cacti (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html) which monitors context switches too. If that's of any use interrupts where also quite high during that time with peaks up to 50k interrupts. Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 17, 2017 2:37 AM UTC Time: May 17, 2017 12:37 AM From: ravishan...@redhat.com To: mabi , Gluster Users On 05/16/2017 11:13 PM, mabi wrote: Today I even saw up to 400k context switches for around 30 minutes on my two nodes replica... Does anyone else have so high context switches on their GlusterFS nodes? I am wondering what is "normal" and if I should be worried... Original Message Subject: 120k context switches on GlsuterFS nodes Local Time: May 11, 2017 9:18 PM UTC Time: May 11, 2017 7:18 PM From: m...@protonmail.ch To: Gluster Users [](mailto:gluster-users@gluster.org) Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening and there where just more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of any sizes ranging from a few kB to MB and a lot of course. 
Now I never saw such a high number in context switches in my entire life so I wanted to ask if this is normal or to be expected? I do not find any signs of errors or warnings in any log files. What context switch are you referring to (syscalls context-switch on the bricks?) ? How did you measure this? -Ravi My volume is a replicated volume on two nodes with ZFS as filesystem behind and the volume is mounted using FUSE on the client (the cloud server). On that cloud server the glusterfs process was using quite a lot of system CPU but that server (VM) only has 2 vCPUs so maybe I should increase the number of vCPUs... Any ideas or recommendations? Regards, M. __ _ Gluster-users mailing list Gluster-users@gluster.org [http://lists.gluster.org/ mailman/listinfo/gluster-users](http://lists.gluster.org/mailman/listinfo/gluster-users) ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users -- Pranith___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
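On the observation that only one thread appears busy: it can help to see which glusterfsd/glusterfs thread is actually pinning a core. A rough sketch, assuming the usual process names (pidstat comes from the sysstat package):

# per-thread CPU view of the brick process (one batch iteration of top)
top -b -H -n 1 -p "$(pgrep -o glusterfsd)" | head -n 40
# or sample all threads of the brick process once per second for 10 seconds
pidstat -t -p "$(pgrep -o glusterfsd)" 1 10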
Re: [Gluster-users] 120k context switches on GlusterFS nodes
If this happens again I will try to run the profiling of gluster and post back. Fortunately it does not happen often but I need then to be in front so that I can start/stop the profiling. By the way on that server I just have two clients connected with FUSE. Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 22, 2017 7:45 PM UTC Time: May 22, 2017 5:45 PM From: j...@julianfamily.org To: gluster-users@gluster.org On 05/22/17 10:27, mabi wrote: Sorry for posting again but I was really wondering if it is somehow possible to tune gluster in order to make better use of all my cores (see below for the details). I suspect that is the reason for the high sporadic context switches I have been experiencing. Cheers! In theory, more clients and more diverse filesets. The only way to know would be for you to analyze the traffic pattern and/or profile gluster on your server. There's never some magic "tune software X to operate more efficiently" setting, or else it would be the default (except for the "turbo" button back in the early PC clone days). Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 18, 2017 8:43 PM UTC Time: May 18, 2017 6:43 PM From: m...@protonmail.ch To: Ravishankar N [](mailto:ravishan...@redhat.com) Pranith Kumar Karampuri [](mailto:pkara...@redhat.com), Gluster Users [](mailto:gluster-users@gluster.org), Gluster Devel [](mailto:gluster-de...@gluster.org) I have a single Intel Xeon CPU E5-2620 v3 @ 2.40GHz in each nodes and this one has 6 cores and 12 threads. I thought this would be enough for GlusterFS. When I check my CPU graphs everything is pretty much idle and there is hardly any peeks at all on the CPU. During the very high context switch my CPU graphs shows the following: 1 thread was 100% busy in CPU user 1 thread was 100% busy in CPU system leaving actually 10 other threads out of the total of 12 threads unused... Is there maybe any performance tuning parameters I need to configure in order to make a better use of my CPU cores or threads? Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 18, 2017 7:03 AM UTC Time: May 18, 2017 5:03 AM From: ravishan...@redhat.com To: Pranith Kumar Karampuri [](mailto:pkara...@redhat.com), mabi [](mailto:m...@protonmail.ch) Gluster Users [](mailto:gluster-users@gluster.org), Gluster Devel [](mailto:gluster-de...@gluster.org) On 05/17/2017 11:07 PM, Pranith Kumar Karampuri wrote: + gluster-devel On Wed, May 17, 2017 at 10:50 PM, mabi wrote: I don't know exactly what kind of context-switches it was but what I know is that it is the "cs" number under "system" when you run vmstat. Okay, that could be due to the syscalls themselves or pre-emptive multitasking in case there aren't enough cpu cores. I think the spike in numbers is due to more users accessing the files at the same time like you observed, translating into more syscalls. You can try capturing the gluster volume profile info the next time it occurs and co-relate with the cs count. If you don't see any negative performance impact, I think you don't need to be bothered much by the numbers. HTH, Ravi Also I use the percona linux monitoring template for cacti (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html) which monitors context switches too. If that's of any use interrupts where also quite high during that time with peaks up to 50k interrupts. 
Original Message Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes Local Time: May 17, 2017 2:37 AM UTC Time: May 17, 2017 12:37 AM From: ravishan...@redhat.com To: mabi , Gluster Users On 05/16/2017 11:13 PM, mabi wrote: Today I even saw up to 400k context switches for around 30 minutes on my two nodes replica... Does anyone else have so high context switches on their GlusterFS nodes? I am wondering what is "normal" and if I should be worried... Original Message Subject: 120k context switches on GlsuterFS nodes Local Time: May 11, 2017 9:18 PM UTC Time: May 11, 2017 7:18 PM From: m...@protonmail.ch To: Gluster Users [](mailto:gluster-users@gluster.org) Hi, Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a very high amount of context switches, around 120k. Usually the average is more around 1k-2k. So I checked what was happening and there where just more users accessing (downloading) their files at the same time. These are directories with typical cloud files, which means files of any sizes ranging from a few kB to MB and a lot of course. Now I never saw such a high number in context switches in my entire life so I wanted to ask if this is normal or to be expected? I do not find any signs
Re: [Gluster-users] [ovirt-users] Very poor GlusterFS performance
Dear Krutika, Sorry for asking so naively but can you tell me on what factor do you base that the client and server event-threads parameters for a volume should be set to 4? Is this metric for example based on the number of cores a GlusterFS server has? I am asking because I saw my GlusterFS volumes are set to 2 and would like to set these parameters to something meaningful for performance tuning. My setup is a two node replica with GlusterFS 3.8.11. Best regards, M. Original Message Subject: Re: [Gluster-users] [ovirt-users] Very poor GlusterFS performance Local Time: June 20, 2017 12:23 PM UTC Time: June 20, 2017 10:23 AM From: kdhan...@redhat.com To: Lindsay Mathieson gluster-users , oVirt users Couple of things: 1. Like Darrell suggested, you should enable stat-prefetch and increase client and server event threads to 4. # gluster volume set performance.stat-prefetch on # gluster volume set client.event-threads 4 # gluster volume set server.event-threads 4 2. Also glusterfs-3.10.1 and above has a shard performance bug fix - https://review.gluster.org/#/c/16966/ With these two changes, we saw great improvement in performance in our internal testing. Do you mind trying these two options above? -Krutika On Tue, Jun 20, 2017 at 1:00 PM, Lindsay Mathieson wrote: Have you tried with: performance.strict-o-direct : off performance.strict-write-ordering : off They can be changed dynamically. On 20 June 2017 at 17:21, Sahina Bose wrote: [Adding gluster-users] On Mon, Jun 19, 2017 at 8:16 PM, Chris Boot wrote: Hi folks, I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10 configuration. My VMs run off a replica 3 arbiter 1 volume comprised of 6 bricks, which themselves live on two SSDs in each of the servers (one brick per SSD). The bricks are XFS on LVM thin volumes straight onto the SSDs. Connectivity is 10G Ethernet. Performance within the VMs is pretty terrible. I experience very low throughput and random IO is really bad: it feels like a latency issue. On my oVirt nodes the SSDs are not generally very busy. The 10G network seems to run without errors (iperf3 gives bandwidth measurements of >= 9.20 Gbits/sec between the three servers). To put this into perspective: I was getting better behaviour from NFS4 on a gigabit connection than I am with GlusterFS on 10G: that doesn't feel right at all. 
My volume configuration looks like this: Volume Name: vmssd Type: Distributed-Replicate Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853 Status: Started Snapshot Count: 0 Number of Bricks: 2 x (2 + 1) = 6 Transport-type: tcp Bricks: Brick1: ovirt3:/gluster/ssd0_vmssd/brick Brick2: ovirt1:/gluster/ssd0_vmssd/brick Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter) Brick4: ovirt3:/gluster/ssd1_vmssd/brick Brick5: ovirt1:/gluster/ssd1_vmssd/brick Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter) Options Reconfigured: nfs.disable: on transport.address-family: inet6 performance.quick-read: off performance.read-ahead: off performance.io-cache: off performance.stat-prefetch: off performance.low-prio-threads: 32 network.remote-dio: off cluster.eager-lock: enable cluster.quorum-type: auto cluster.server-quorum-type: server cluster.data-self-heal-algorithm: full cluster.locking-scheme: granular cluster.shd-max-threads: 8 cluster.shd-wait-qlength: 1 features.shard: on user.cifs: off storage.owner-uid: 36 storage.owner-gid: 36 features.shard-block-size: 128MB performance.strict-o-direct: on network.ping-timeout: 30 cluster.granular-entry-heal: enable I would really appreciate some guidance on this to try to improve things because at this rate I will need to reconsider using GlusterFS altogether. Could you provide the gluster volume profile output while you're running your I/O tests. # gluster volume profile start to start profiling # gluster volume profile info for the profile output. Cheers, Chris -- Chris Boot bo...@bootc.net ___ Users mailing list us...@ovirt.org http://lists.ovirt.org/mailman/listinfo/users ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users -- Lindsay ___ Users mailing list us...@ovirt.org http://lists.ovirt.org/mailman/listinfo/users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
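Note that "gluster volume set" takes the volume name as its first argument, which is missing from the commands as quoted above. For the volume in this thread the full commands would look roughly like this (the "gluster volume get" verification assumes a reasonably recent release, which 3.10 is):

# raise the epoll worker thread count on the clients and the bricks
gluster volume set vmssd client.event-threads 4
gluster volume set vmssd server.event-threads 4
# re-enable stat-prefetch, which is off in the volume info above
gluster volume set vmssd performance.stat-prefetch on
# verify what is now configured
gluster volume get vmssd client.event-threads
gluster volume get vmssd server.event-threads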
[Gluster-users] Persistent storage for docker containers from a Gluster volume
Hello, I have a two-node replica GlusterFS 3.8 cluster and am trying to find out the best way to use a GlusterFS volume as persistent storage for docker containers to store their data (e.g. web assets). I was thinking that the simplest method would be to mount my GlusterFS volume for that purpose on all docker nodes using FUSE and then simply start containers which require persistent storage with a bind-type mount. For example here is how I would create my container requiring persistent storage: docker service create --name testcontainer --mount type=bind,source=/mnt/gustervol/testcontainer,target=/mnt alpine What do you think about that? Is this a good way, or is there even a better way? Regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
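To make that concrete, a minimal sketch of the idea on one docker node (hostname, mount point and the trailing container command are only examples, not something suggested on the list):

# mount the Gluster volume on the docker host via FUSE
mkdir -p /mnt/glustervol
mount -t glusterfs node1.example.com:/myvolume /mnt/glustervol
# one directory per container on the shared volume
mkdir -p /mnt/glustervol/testcontainer
# bind-mount that directory into a swarm service
docker service create --name testcontainer \
  --mount type=bind,source=/mnt/glustervol/testcontainer,target=/mnt \
  alpine sleep infinity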
Re: [Gluster-users] Persistent storage for docker containers from a Gluster volume
Anyone? > Original Message > Subject: Persistent storage for docker containers from a Gluster volume > Local Time: June 25, 2017 6:38 PM > UTC Time: June 25, 2017 4:38 PM > From: m...@protonmail.ch > To: Gluster Users > Hello, > I have a two node replica 3.8 GlusterFS cluster and am trying to find out the > best way to use a GlusterFS volume as persistent storage for docker > containers to store their data (e.g. web assets). > I was thinking that the simplest method would be to mount my GlusterFS volume > for that purpose on all docker nodes using FUSE and then simply start > containers which require persistent storage with a mount of bind type. For > example here is how I would create my container requiring persistent storage: > docker service create --name testcontainer --mount > type=bind,source=/mnt/gustervol/testcontainer,target=/mnt alpine > What do you think about that? Is this a good way? or is the even a better way? > > Regards, > M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Arbiter node as VM
Hello, I have a replica 2 GlusterFS 3.8.11 cluster on 2 Debian 8 physical servers using ZFS as the filesystem. Now, in order to avoid a split-brain situation, I would like to add a third node as arbiter. Regarding the arbiter node I have a few questions:
- can the arbiter node be a virtual machine? (I am planning to use Xen as hypervisor)
- can I use ext4 as the file system on my arbiter, or does it need to be ZFS like the two other nodes?
- or should I use XFS with LVM thin provisioning here, as mentioned in the documentation?
- is it OK that my arbiter runs Debian 9 (Linux kernel v4) and my other two nodes run Debian 8 (kernel v3)?
- what about thin provisioning of my volume on the arbiter node (https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/), is this required? On my two other nodes I do not use any thin provisioning nor LVM, just ZFS.
Thanks in advance for your input. Best regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
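For reference, converting an existing replica 2 volume to replica 2 + arbiter is done with add-brick; a sketch with the host and brick names that appear later in this thread (double-check the procedure for your exact 3.8.x release before running it on production data):

# add one arbiter brick, turning the volume into replica 3 arbiter 1
gluster volume add-brick myvolume replica 3 arbiter 1 \
  arbiternode.domain.tld:/srv/glusterfs/myvolume/brick
# the arbiter stores only metadata; watch the resulting heal queue drain
gluster volume heal myvolume info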
Re: [Gluster-users] Persistent storage for docker containers from a Gluster volume
Thank you very much Erekle for your links, they are all relevant to me and very interesting as as you mention I will be mostly serving a lot of rather smaller files from my containers. I will first consider upgrading my gluserfs from 3.8 to 3.10 to take advantage of the small file performance improvements which is afaik newly available in 3.10. Best regards, M. > Original Message > Subject: Re: [Gluster-users] Persistent storage for docker containers from a > Gluster volume > Local Time: June 29, 2017 12:34 PM > UTC Time: June 29, 2017 10:34 AM > From: erekle.magra...@recogizer.de > To: m...@protonmail.ch, gluster-users@gluster.org > > Hi, > > glusterFS is working fine for large files (in most of the cases it's used for > VM image store), with docker you'll generate bunch of small size files and if > you want to have a good performance may be look in [1] and [2]. > > Also two node replica is a bit dangerous in case of high load with small > files there is a good risk of split brain situation, therefore think about > arbiter functionality of gluster [3], I think if you'll apply recommendations > from [1] and [2] and deploy arbiter volume. > > Cheers > > Erekle > > [1] > http://blog.gluster.org/2016/10/gluster-tiering-and-small-file-performance/ > > [2] > https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/Small_File_Performance_Enhancements.html > > [3] > http://events.linuxfoundation.org/sites/events/files/slides/glusterfs-arbiter-VAULT-2016.pdf > > On 29.06.2017 11:55, Raghavendra Talur wrote: > >> On 28-Jun-2017 5:49 PM, "mabi" wrote: >> >>> Anyone? >>> >>>> Original Message >>>> Subject: Persistent storage for docker containers from a Gluster volume >>>> Local Time: June 25, 2017 6:38 PM >>>> UTC Time: June 25, 2017 4:38 PM >>>> From: m...@protonmail.ch >>>> To: Gluster Users >>>> Hello, >>>> I have a two node replica 3.8 GlusterFS cluster and am trying to find out >>>> the best way to use a GlusterFS volume as persistent storage for docker >>>> containers to store their data (e.g. web assets). >>>> I was thinking that the simplest method would be to mount my GlusterFS >>>> volume for that purpose on all docker nodes using FUSE and then simply >>>> start containers which require persistent storage with a mount of bind >>>> type. For example here is how I would create my container requiring >>>> persistent storage: >>>> docker service create --name testcontainer --mount >>>> type=bind,source=/mnt/gustervol/testcontainer,target=/mnt alpine >>>> What do you think about that? Is this a good way? or is the even a better >>>> way? >> If you are using kubernetes, then please have a look at >> https://github.com/gluster/gluster-kubernetes >> Otherwise, what you are suggesting works. >> Raghavendra Talur >> >>>> Regards, >>>> M. >>> ___ >>> Gluster-users mailing list >>> Gluster-users@gluster.org >>> http://lists.gluster.org/mailman/listinfo/gluster-users >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Arbiter node as VM
Thanks for the hints. Now I added the arbiter 1 to my replica 2 using the volume add-brick command and it is now in the healing process in order to copy all the metadata files on my arbiter node. On one of my replica nodes in the brick log file for that particular volume I notice a lot of the following warning message during ongoing healing: [2017-06-30 14:04:42.050120] W [MSGID: 101088] [common-utils.c:3894:gf_backtrace_save] 0-myvolume-index: Failed to save the backtrace. Does anyone have a idea what this is about? The only hint here is the word "index" which for me means it has something to do with indexing. But is this warning normal? anything I can do about it? Regards, M. > Original Message > Subject: Re: [Gluster-users] Arbiter node as VM > Local Time: June 29, 2017 11:55 PM > UTC Time: June 29, 2017 9:55 PM > From: dougti+glus...@gmail.com > To: mabi > Gluster Users > > As long as the VM isn't hosted on one of the two Gluster nodes, that's > perfectly fine. One of my smaller clusters uses the same setup. > As for your other questions, as long as it supports Unix file permissions, > Gluster doesn't care what filesystem you use. Mix & match as you wish. Just > try to keep matching Gluster versions across your nodes. > > On 29 June 2017 at 16:10, mabi wrote: > >> Hello, >> >> I have a replica 2 GlusterFS 3.8.11 cluster on 2 Debian 8 physical servers >> using ZFS as filesystem. Now in order to avoid a split-brain situation I >> would like to add a third node as arbiter. >> Regarding the arbiter node I have a few questions: >> - can the arbiter node be a virtual machine? (I am planning to use Xen as >> hypervisor) >> - can I use ext4 as file system on my arbiter? or does it need to be ZFS as >> the two other nodes? >> - or should I use here XFS with LVM this provisioning as mentioned in the >> - is it OK that my arbiter runs Debian 9 (Linux kernel v4) and my other two >> nodes run Debian 8 (kernel v3)? >> - what about thin provisioning of my volume on the arbiter node >> (https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/) >> is this required? on my two other nodes I do not use any thin provisioning >> neither LVM but simply ZFS. >> Thanks in advance for your input. >> Best regards, >> Mabi >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
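While the arbiter brick is being populated, the heal backlog can be watched like this (volume name as used above):

# number of entries still pending heal, per brick
gluster volume heal myvolume statistics heal-count
# detailed list of pending entries (can be long while the arbiter fills up)
gluster volume heal myvolume info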
[Gluster-users] How to deal with FAILURES count in geo rep
Hello, I have a replica 2 with a remote slave node for geo-replication (GlusterFS 3.8.11 on Debian 8) and for the first time I saw a non-zero number in the FAILURES column when running:
gluster volume geo-replication myvolume remotehost:remotevol status detail
Right now the number under the FAILURES column is 32 and I have a few questions regarding how to deal with that:
- first, what does 32 mean? Is it the number of files which failed to be geo-replicated onto the slave node?
- how can I find out which files failed to replicate?
- how can I make gluster geo-rep retry to replicate these files?
Best regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] set owner:group on root of volume
Hi, By default the owner and group of a GlusterFS volume seem to be root:root. I changed this by first mounting my volume using glusterfs/fuse on a client and then running the following: chown 1000:1000 /mnt/myglustervolume This correctly changed the owner and group of my volume to UID/GID 1000, but 1-2 hours later it was back to root:root. I tried again and the same thing happened again. Am I doing something wrong here? I am using GlusterFS 3.8.11 on Debian 8. Regards, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] set owner:group on root of volume
Just found out I needed to set following two parameters: gluster volume set myvol storage.owner-uid 1000 gluster volume set myvol storage.owner-gid 1000 In case that helps any one else :) > Original Message > Subject: set owner:group on root of volume > Local Time: July 11, 2017 8:15 PM > UTC Time: July 11, 2017 6:15 PM > From: m...@protonmail.ch > To: Gluster Users > Hi, > By default the owner and group of a GlusterFS seems to be root:root now I > changed this by first mounting my volume using glusterfs/fuse on a client and > did the following > chmod 1000:1000 /mnt/myglustervolume > This changed correctly the owner and group to UID/GID 1000 of my volume but > like 1-2 hours later it was back to root:root. I tried again and this happens > again. > Am I doing something wrong here? I am using GlusterFS 3.8.11 on Debian 8. > Regards, > M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
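A small sketch of the same fix with a verification step added (the brick path is just an example; "gluster volume get" needs a release that supports it, which 3.8 does as far as I know):

gluster volume set myvol storage.owner-uid 1000
gluster volume set myvol storage.owner-gid 1000
# confirm the options are stored in the volume configuration
gluster volume get myvol storage.owner-uid
gluster volume get myvol storage.owner-gid
# the brick root should now show the new owner
stat -c '%u:%g %n' /data/myvol/brick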
Re: [Gluster-users] set owner:group on root of volume
Unfortunately the root directory of my volume still get its owner and group resetted to root. Can someone explain why or help with this issue? I need it to be set to UID/GID 1000 and stay like that. Thanks > Original Message > Subject: Re: set owner:group on root of volume > Local Time: July 11, 2017 9:33 PM > UTC Time: July 11, 2017 7:33 PM > From: m...@protonmail.ch > To: Gluster Users > Just found out I needed to set following two parameters: > gluster volume set myvol storage.owner-uid 1000 > gluster volume set myvol storage.owner-gid 1000 > In case that helps any one else :) > >> Original Message >> Subject: set owner:group on root of volume >> Local Time: July 11, 2017 8:15 PM >> UTC Time: July 11, 2017 6:15 PM >> From: m...@protonmail.ch >> To: Gluster Users >> Hi, >> By default the owner and group of a GlusterFS seems to be root:root now I >> changed this by first mounting my volume using glusterfs/fuse on a client >> and did the following >> chmod 1000:1000 /mnt/myglustervolume >> This changed correctly the owner and group to UID/GID 1000 of my volume but >> like 1-2 hours later it was back to root:root. I tried again and this >> happens again. >> Am I doing something wrong here? I am using GlusterFS 3.8.11 on Debian 8. >> Regards, >> M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] set owner:group on root of volume
Anyone has an idea? or shall I open a bug for that? > Original Message > Subject: Re: set owner:group on root of volume > Local Time: July 18, 2017 3:46 PM > UTC Time: July 18, 2017 1:46 PM > From: m...@protonmail.ch > To: Gluster Users > Unfortunately the root directory of my volume still get its owner and group > resetted to root. Can someone explain why or help with this issue? I need it > to be set to UID/GID 1000 and stay like that. > Thanks > >> Original Message >> Subject: Re: set owner:group on root of volume >> Local Time: July 11, 2017 9:33 PM >> UTC Time: July 11, 2017 7:33 PM >> From: m...@protonmail.ch >> To: Gluster Users >> Just found out I needed to set following two parameters: >> gluster volume set myvol storage.owner-uid 1000 >> gluster volume set myvol storage.owner-gid 1000 >> In case that helps any one else :) >> >>> Original Message >>> Subject: set owner:group on root of volume >>> Local Time: July 11, 2017 8:15 PM >>> UTC Time: July 11, 2017 6:15 PM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> Hi, >>> By default the owner and group of a GlusterFS seems to be root:root now I >>> changed this by first mounting my volume using glusterfs/fuse on a client >>> and did the following >>> chmod 1000:1000 /mnt/myglustervolume >>> This changed correctly the owner and group to UID/GID 1000 of my volume but >>> like 1-2 hours later it was back to root:root. I tried again and this >>> happens again. >>> Am I doing something wrong here? I am using GlusterFS 3.8.11 on Debian 8. >>> Regards, >>> M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] set owner:group on root of volume
Hi Vijay, Thanks for your reply. Below the answers to your 3 questions. 1) Rather unlikely as my application does not run as root. That's the point my application runs as UID/GID 1000:1000 and the root of my GlusterFS volume needs to be owned by 1000 so that my application can write there. 2) Nothing in glustershd.log and there was not even a glfsheal-myvolume.log file until I ran "gluster volume heal myvolume info". 3) IIRC yes it does but I will have to test that again tomorrow as I now ran again manually a chown on the root of my volume through a fuse client. Regards, M. > Original Message > Subject: Re: [Gluster-users] set owner:group on root of volume > Local Time: July 23, 2017 8:15 PM > UTC Time: July 23, 2017 6:15 PM > From: vbel...@redhat.com > To: mabi , Gluster Users > On 07/20/2017 03:13 PM, mabi wrote: >> Anyone has an idea? or shall I open a bug for that? > This is an interesting problem. A few questions: > 1. Is there any chance that one of your applications does a chown on the > root? > 2. Do you notice any logs related to metadata self-heal on "/" in the > gluster logs? > 3. Does the ownership of all bricks reset to custom uid/gid after every > restart of the volume? > Thanks, > Vijay >> >> >>> Original Message >>> Subject: Re: set owner:group on root of volume >>> Local Time: July 18, 2017 3:46 PM >>> UTC Time: July 18, 2017 1:46 PM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> >>> Unfortunately the root directory of my volume still get its owner and >>> group resetted to root. Can someone explain why or help with this >>> issue? I need it to be set to UID/GID 1000 and stay like that. >>> >>> Thanks >>> >>> >>> >>>> Original Message >>>> Subject: Re: set owner:group on root of volume >>>> Local Time: July 11, 2017 9:33 PM >>>> UTC Time: July 11, 2017 7:33 PM >>>> From: m...@protonmail.ch >>>> To: Gluster Users >>>> >>>> Just found out I needed to set following two parameters: >>>> >>>> gluster volume set myvol storage.owner-uid 1000 >>>> gluster volume set myvol storage.owner-gid 1000 >>>> >>>> >>>> >>>> In case that helps any one else :) >>>> >>>>> Original Message >>>>> Subject: set owner:group on root of volume >>>>> Local Time: July 11, 2017 8:15 PM >>>>> UTC Time: July 11, 2017 6:15 PM >>>>> From: m...@protonmail.ch >>>>> To: Gluster Users >>>>> >>>>> Hi, >>>>> >>>>> By default the owner and group of a GlusterFS seems to be root:root >>>>> now I changed this by first mounting my volume using glusterfs/fuse >>>>> on a client and did the following >>>>> >>>>> chmod 1000:1000 /mnt/myglustervolume >>>>> >>>>> This changed correctly the owner and group to UID/GID 1000 of my >>>>> volume but like 1-2 hours later it was back to root:root. I tried >>>>> again and this happens again. >>>>> >>>>> Am I doing something wrong here? I am using GlusterFS 3.8.11 on >>>>> Debian 8. >>>>> >>>>> Regards, >>>>> M. >>>>> >>>>> >>>>> >>>> >>> >> >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users >>___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] set owner:group on root of volume
I can now also answer your question 3) so I just did a stop and start of the volume and yes the owner and group of the root directory of my volume gets set again correctly to UID/GID 1000. The problem is that it is now just a mater of time that it somehow gets reseted back to root:root... > Original Message > Subject: Re: [Gluster-users] set owner:group on root of volume > Local Time: July 23, 2017 8:15 PM > UTC Time: July 23, 2017 6:15 PM > From: vbel...@redhat.com > To: mabi , Gluster Users > On 07/20/2017 03:13 PM, mabi wrote: >> Anyone has an idea? or shall I open a bug for that? > This is an interesting problem. A few questions: > 1. Is there any chance that one of your applications does a chown on the > root? > 2. Do you notice any logs related to metadata self-heal on "/" in the > gluster logs? > 3. Does the ownership of all bricks reset to custom uid/gid after every > restart of the volume? > Thanks, > Vijay >> >> >>> Original Message >>> Subject: Re: set owner:group on root of volume >>> Local Time: July 18, 2017 3:46 PM >>> UTC Time: July 18, 2017 1:46 PM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> >>> Unfortunately the root directory of my volume still get its owner and >>> group resetted to root. Can someone explain why or help with this >>> issue? I need it to be set to UID/GID 1000 and stay like that. >>> >>> Thanks >>> >>> >>> >>>> Original Message >>>> Subject: Re: set owner:group on root of volume >>>> Local Time: July 11, 2017 9:33 PM >>>> UTC Time: July 11, 2017 7:33 PM >>>> From: m...@protonmail.ch >>>> To: Gluster Users >>>> >>>> Just found out I needed to set following two parameters: >>>> >>>> gluster volume set myvol storage.owner-uid 1000 >>>> gluster volume set myvol storage.owner-gid 1000 >>>> >>>> >>>> >>>> In case that helps any one else :) >>>> >>>>> Original Message >>>>> Subject: set owner:group on root of volume >>>>> Local Time: July 11, 2017 8:15 PM >>>>> UTC Time: July 11, 2017 6:15 PM >>>>> From: m...@protonmail.ch >>>>> To: Gluster Users >>>>> >>>>> Hi, >>>>> >>>>> By default the owner and group of a GlusterFS seems to be root:root >>>>> now I changed this by first mounting my volume using glusterfs/fuse >>>>> on a client and did the following >>>>> >>>>> chmod 1000:1000 /mnt/myglustervolume >>>>> >>>>> This changed correctly the owner and group to UID/GID 1000 of my >>>>> volume but like 1-2 hours later it was back to root:root. I tried >>>>> again and this happens again. >>>>> >>>>> Am I doing something wrong here? I am using GlusterFS 3.8.11 on >>>>> Debian 8. >>>>> >>>>> Regards, >>>>> M. >>>>> >>>>> >>>>> >>>> >>> >> >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users >>___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] /var/lib/misc/glusterfsd growing and using up space on OS disk
Hello, Today while freeing up some space on my OS disk I discovered that there is a /var/lib/misc/glusterfsd directory which seems to hold data related to geo-replication. In particular there is a hidden sub-directory called ".processed" as you can see here: /var/lib/misc/glusterfsd/woelkli-pro/ssh%3A%2F%2Froot%40192.168.0.10%3Agluster%3A%2F%2F127.0.0.1%3Amyvolume-geo/6d844f56e12ecd14d2e36242f045e38c/.processed which contains one archive file per month, for example:
-rw-r--r-- 1 root root 152494080 Apr 30 23:34 archive_201704.tar
-rw-r--r-- 1 root root 43284480 May 31 23:35 archive_201705.tar
...
These tar files seem to contain the CHANGELOG files of geo-replication. Are these the same files as the ones located in /data/myvolume/brick/.glusterfs/changelogs? As these files take up some space over time and are located on my OS disk, I was wondering if I can safely remove all these processed files? Is there maybe a way of telling GlusterFS to regularly delete these files? Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
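In case it helps, a sketch for sizing up those archives and listing the ones older than roughly six months, without deleting anything (whether deleting them is safe is exactly the open question here):

# total size of the geo-replication working directory
du -sh /var/lib/misc/glusterfsd
# list processed-changelog archives older than ~6 months
find /var/lib/misc/glusterfsd -name 'archive_*.tar' -mtime +180 -ls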
Re: [Gluster-users] How to deal with FAILURES count in geo rep
Can anyone tell me how to find out what is going wrong here? In the meantime the FAILURES count has reached 272 and I can't find anything in the GlusterFS documentation on how to troubleshoot the FAILURES count in geo-replication. Thank you. > Original Message > Subject: How to deal with FAILURES count in geo rep > Local Time: June 30, 2017 8:32 PM > UTC Time: June 30, 2017 6:32 PM > From: m...@protonmail.ch > To: Gluster Users > Hello, > I have a replica 2 with a remote slave node for geo-replication (GlusterFS > 3.8.11 on Debian 8) and for the first time I saw a non-zero number in the > FAILURES column when running: > gluster volume geo-replication myvolume remotehost:remotevol status detail > Right now the number under the FAILURES column is 32 and I have a few questions > regarding how to deal with that: > - first, what does 32 mean? Is it the number of files which failed to be geo-replicated > onto the slave node? > - how can I find out which files failed to replicate? > - how can I make gluster geo-rep retry to replicate these files? > Best regards, > Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
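A hedged starting point for digging into that counter: the per-file errors usually end up in the gsyncd logs under /var/log/glusterfs. A sketch using the session names from the original message (note the double colon for a gluster volume slave; adjust paths if your distribution logs elsewhere):

# session status including the FAILURES column
gluster volume geo-replication myvolume remotehost::remotevol status detail
# master-side gsyncd logs, one directory per master volume
grep -iE 'error|fail' /var/log/glusterfs/geo-replication/myvolume/*.log | tail -n 50
# slave-side logs, if you have access to the slave node
grep -iE 'error|fail' /var/log/glusterfs/geo-replication-slaves/*.log | tail -n 50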
[Gluster-users] Not possible to stop geo-rep after adding arbiter to replica 2
Hello, To my two node replica volume I have added an arbiter node for safety purposes. On that volume I also have geo-replication running and would like to stop it: its status is "Faulty" and it keeps trying over and over to sync without success. I am using GlusterFS 3.8.11. So in order to stop geo-rep I use:
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo stop
but it fails to stop it, as you can see in the output below:
Staging failed on arbiternode.domain.tld. Error: Geo-replication session between myvolume and gfs1geo.domain.tld::myvolume-geo does not exist. geo-replication command failed
How can I now stop geo-replication? Is there a manual way to do that? Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Possible stale .glusterfs/indices/xattrop file?
Hi, Sorry for mailing again but as mentioned in my previous mail, I have added an arbiter node to my replica 2 volume and it seems to have gone fine, except for the fact that there is one single file which needs healing and does not get healed, as you can see here from the output of a "heal info":
Brick node1.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries: 0
Brick node2.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries: 1
Brick arbiternode.domain.tld:/srv/glusterfs/myvolume/brick
Status: Connected
Number of entries: 0
On my node2 the respective .glusterfs/indices/xattrop directory contains two files as you can see below:
ls -lai /data/myvolume/brick/.glusterfs/indices/xattrop
total 76180
10 drw--- 2 root root 4 Jul 29 12:15 .
9 drw--- 5 root root 5 Apr 28 22:15 ..
2798404 -- 2 root root 0 Apr 28 22:51 29e0d13e-1217-41cc-9bda-1fbbf781c397
2798404 -- 2 root root 0 Apr 28 22:51 xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38
I tried to find the real file on my brick that this xattrop file points to using its inode number (command: find /data/myvolume/brick/data -inum 8394642) but it does not find any associated file. So my question here is: is it possible that this is a stale entry which simply did not get deleted from the indices/xattrop directory by gluster for some unknown reason? If yes, is it safe for me to delete these two files, or what would be the correct process in that case? Thank you for your input. Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Not possible to stop geo-rep after adding arbiter to replica 2
I managed to force stopping geo replication using the "force" parameter after the "stop" but there are still other issues related to the fact that my geo replication setup was created before I added the additional arbiter node to my replca. For example when I would like to stop my volume I simply can't and I get the following error: volume stop: myvolume: failed: Staging failed on arbiternode.domain.tld. Error: geo-replication Unable to get the status of active geo-replication session for the volume 'myvolume'. What should I do here? Should I delete the geo-replication and re-create it? Will it then have to sync again all my data? I have around 500 GB of data so I would like to avoid that if possible. Any other solutions? Thanks, M. > Original Message > Subject: Re: [Gluster-users] Not possible to stop geo-rep after adding > arbiter to replica 2 > Local Time: July 29, 2017 12:32 PM > UTC Time: July 29, 2017 10:32 AM > From: ksan...@redhat.com > To: mabi , Rahul Hinduja , Kotresh > Hiremath Ravishankar > Gluster Users > > Adding Rahul and Kothresh who are SME on geo replication > Thanks & Regards > Karan Sandha > > On Sat, Jul 29, 2017 at 3:37 PM, mabi wrote: > >> Hello >> To my two node replica volume I have added an arbiter node for safety >> purpose. On that volume I also have geo replication running and would like >> to stop it is status "Faulty" and keeps trying over and over to sync without >> success. I am using GlusterFS 3.8.11. >> So in order to stop geo-rep I use: >> gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo stop >> but it fails to stop it as you can see in the output below: >> Staging failed on arbiternode.domain.tld. Error: Geo-replication session >> between myvolume and gfs1geo.domain.tld::myvolume-geo does not exist. >> geo-replication command failed >> How can I now stop geo replication? Is there a manual way to do that? >> Regards, >> Mabi >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users > -- > > Regards & Thanks > Karan Sandha___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
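If it does come down to removing and recreating the session, the usual sequence looks roughly like the sketch below (names from this thread; I am not certain whether a recreated session re-crawls everything, so treat this as a sketch rather than a recommendation):

# stop the faulty session ("force" was needed in this case)
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo stop force
# remove the session configuration
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete
# recreate the session (push-pem distributes the ssh keys; force is typically
# needed when the slave volume already contains data) and start it again
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo create push-pem force
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo start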
Re: [Gluster-users] Not possible to stop geo-rep after adding arbiter to replica 2
For sake of completeness here below I pasted the relevant part of the etc-glusterfs-glusterd.vol.log log file from my arbiter node. Strangely enough one of the log message says that the file /var/lib/glusterd/geo-replication/gsyncd_template.conf does not exist but I checked and it does exist on all nodes including the arbiter node. [2017-07-29 14:02:02.872498] E [MSGID: 106293] [glusterd-geo-rep.c:676:glusterd_query_extutil_generic] 0-management: reading data from child failed [2017-07-29 14:02:02.872511] E [MSGID: 106305] [glusterd-geo-rep.c:1944:is_geo_rep_active] 0-management: Unable to get configuration data for myvolume(master), ssh://gfs1geo.domain.tld::myvolume -geo:fe090023-7add-40b2-a6c5-d7e7ea27e12a(slave) [2017-07-29 14:02:02.872588] E [MSGID: 106300] [glusterd-geo-rep.c:2104:glusterd_check_geo_rep_running] 0-management: _get_slave_satus failed [2017-07-29 14:02:02.872607] E [MSGID: 106301] [glusterd-op-sm.c:5423:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Stop', Status : -1 [2017-07-29 14:01:22.986242] W [MSGID: 106029] [glusterd-geo-rep.c:2552:glusterd_get_statefile_name] 0-management: Config file (/var/lib/glusterd/geo-replication/myvolume_gfs1geo.domain.tld_myvolume-geo/gsyncd.conf) missing. Looking for template config file (/var/lib/glusterd/geo-replication/gsyncd_template.conf) [No such file or directory] [2017-07-29 14:01:34.270084] E [MSGID: 106459] [glusterd-geo-rep.c:4601:glusterd_read_status_file] 0-management: Unable to get status data for myvolume(master), gfs1geo.domain.tld::myvolume-geo( slave), /srv/glusterfs/myvolume/brick(brick) > Original Message > Subject: Re: [Gluster-users] Not possible to stop geo-rep after adding > arbiter to replica 2 > Local Time: July 29, 2017 4:07 PM > UTC Time: July 29, 2017 2:07 PM > From: m...@protonmail.ch > To: Karan Sandha > Rahul Hinduja , Kotresh Hiremath Ravishankar > , Gluster Users > I managed to force stopping geo replication using the "force" parameter after > the "stop" but there are still other issues related to the fact that my geo > replication setup was created before I added the additional arbiter node to > my replca. > For example when I would like to stop my volume I simply can't and I get the > following error: > volume stop: myvolume: failed: Staging failed on arbiternode.domain.tld. > Error: geo-replication Unable to get the status of active geo-replication > session for the volume 'myvolume'. > What should I do here? Should I delete the geo-replication and re-create it? > Will it then have to sync again all my data? I have around 500 GB of data so > I would like to avoid that if possible. Any other solutions? > Thanks, > M. > >> Original Message >> Subject: Re: [Gluster-users] Not possible to stop geo-rep after adding >> arbiter to replica 2 >> Local Time: July 29, 2017 12:32 PM >> UTC Time: July 29, 2017 10:32 AM >> From: ksan...@redhat.com >> To: mabi , Rahul Hinduja , Kotresh >> Hiremath Ravishankar >> Gluster Users >> >> Adding Rahul and Kothresh who are SME on geo replication >> Thanks & Regards >> Karan Sandha >> >> On Sat, Jul 29, 2017 at 3:37 PM, mabi wrote: >> >>> Hello >>> To my two node replica volume I have added an arbiter node for safety >>> purpose. On that volume I also have geo replication running and would like >>> to stop it is status "Faulty" and keeps trying over and over to sync >>> without success. I am using GlusterFS 3.8.11. 
>>> So in order to stop geo-rep I use: >>> gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo >>> stop >>> but it fails to stop it as you can see in the output below: >>> Staging failed on arbiternode.domain.tld. Error: Geo-replication session >>> between myvolume and gfs1geo.domain.tld::myvolume-geo does not exist. >>> geo-replication command failed >>> How can I now stop geo replication? Is there a manual way to do that? >>> Regards, >>> Mabi >>> ___ >>> Gluster-users mailing list >>> Gluster-users@gluster.org >>> http://lists.gluster.org/mailman/listinfo/gluster-users >> -- >> >> Regards & Thanks >> Karan Sandha___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file?
Hi Ravi, Thanks for your hints. Below you will find the answer to your questions. First I tried to start the healing process by running: gluster volume heal myvolume and then as you suggested watch the output of the glustershd.log file but nothing appeared in that log file after running the above command. I checked the files which need to be healing using the "heal info" command and it still shows that very same GFID on node2 to be healed. So nothing changed here. The file /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 is only on node2 and not on my nod1 nor on my arbiternode. This file seems to be a regular file and not a symlink. Here is the output of the stat command on it from my node2: File: ‘/data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397’ Size: 0 Blocks: 1 IO Block: 512 regular empty file Device: 25h/37d Inode: 2798404 Links: 2 Access: (/--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2017-04-28 22:51:15.215775269 +0200 Modify: 2017-04-28 22:51:15.215775269 +0200 Change: 2017-07-30 08:39:03.700872312 +0200 Birth: - I hope this is enough info for a starter, else let me know if you need any more info. I would be glad to resolve this weird file which needs to be healed but can not. Best regards, Mabi > Original Message > Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? > Local Time: July 30, 2017 3:31 AM > UTC Time: July 30, 2017 1:31 AM > From: ravishan...@redhat.com > To: mabi , Gluster Users > > On 07/29/2017 04:36 PM, mabi wrote: > >> Hi, >> Sorry for mailing again but as mentioned in my previous mail, I have added >> an arbiter node to my replica 2 volume and it seem to have gone fine except >> for the fact that there is one single file which needs healing and does not >> get healed as you can see here from the output of a "heal info": >> Brick node1.domain.tld:/data/myvolume/brick >> Status: Connected >> Number of entries: 0 >> Brick node2.domain.tld:/data/myvolume/brick >> >> Status: Connected >> Number of entries: 1 >> Brick arbiternode.domain.tld:/srv/glusterfs/myvolume/brick >> Status: Connected >> Number of entries: 0 >> On my node2 the respective .glusterfs/indices/xattrop directory contains two >> files as you can see below: >> ls -lai /data/myvolume/brick/.glusterfs/indices/xattrop >> total 76180 >> 10 drw--- 2 root root 4 Jul 29 12:15 . >> 9 drw--- 5 root root 5 Apr 28 22:15 .. >> 2798404 -- 2 root root 0 Apr 28 22:51 >> 29e0d13e-1217-41cc-9bda-1fbbf781c397 >> 2798404 -- 2 root root 0 Apr 28 22:51 >> xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38 >> I tried to find the real file on my brick where this xattrop file points to >> using its inode number (command: find /data/myvolume/brick/data -inum >> 8394642) but it does not find any associated file. >> So my question here is, is it possible that this is a stale file which just >> forgot to get deleted from the indices/xattrop file by gluster for some >> unknown reason? If yes is it safe for me to delete these two files? or what >> would be the correct process in that case? > > The 'xattrop-6fa...' is the base entry. gfids of files that need heal are > hard linked to this entry, so nothing needs to be done for it. But you need > to find out why '29e0d13...' is not healing. Launch the heal and observe the > glustershd logs for errors. I suppose the inode number for > .glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 is what is 8394642. Is > .glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 a regular file or > symlink? 
Does it exist in the other 2 bricks? What is the link count (as seen > from stat )? > -Ravi > >> Thank you for your input. >> Mabi >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
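For reference, what was attempted here in command form, so others can reproduce it (default glustershd log location assumed):

# trigger an index heal (add "full" for a full sweep of the bricks)
gluster volume heal myvolume
# watch the self-heal daemon log on the node holding the pending entry
tail -f /var/log/glusterfs/glustershd.log
# re-check what is still pending afterwards
gluster volume heal myvolume info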
Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file?
I did a find on this inode number and I could find the file but only on node1 (nothing on node2 and the new arbiternode). Here is an ls -lai of the file itself on node1: -rw-r--r-- 1 www-data www-data 32 Jun 19 17:42 fileKey As you can see it is a 32 bytes file and as you suggested I ran a "stat" on this very same file through a glusterfs mount (using fuse) but unfortunately nothing happened. The GFID is still being displayed to be healed. Just in case here is the output of the stat: File: ‘fileKey’ Size: 32 Blocks: 1 IO Block: 131072 regular file Device: 1eh/30d Inode: 12086351742306673840 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) Access: 2017-06-19 17:42:35.339773495 +0200 Modify: 2017-06-19 17:42:35.343773437 +0200 Change: 2017-06-19 17:42:35.343773437 +0200 Birth: - What else can I do or try in order to fix this situation? > Original Message > Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? > Local Time: July 31, 2017 3:27 AM > UTC Time: July 31, 2017 1:27 AM > From: ravishan...@redhat.com > To: mabi > Gluster Users > > On 07/30/2017 02:24 PM, mabi wrote: > >> Hi Ravi, >> Thanks for your hints. Below you will find the answer to your questions. >> First I tried to start the healing process by running: >> gluster volume heal myvolume >> and then as you suggested watch the output of the glustershd.log file but >> nothing appeared in that log file after running the above command. I checked >> the files which need to be healing using the "heal info" command >> and it still shows that very same GFID on node2 to be healed. So nothing >> changed here. >> The file >> /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> is only on node2 and not on my nod1 nor on my arbiternode. This file seems >> to be a regular file and not a symlink. Here is the output of the stat >> command on it from my node2: >> File: >> ‘/data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397’ >> Size: 0 Blocks: 1 IO Block: 512 regular empty file >> Device: 25h/37d Inode: 2798404 Links: 2 > > Okay, link count of 2 means there is a hardlink somewhere on the brick. Try > the find command again. I see that the inode number is 2798404, not the one > you shared in your first mail. Once you find the path to the file, do a stat > of the file from mount. This should create the entry in the other 2 bricks > and do the heal. But FWIW, this seems to be a zero byte file. > Regards, > Ravi > >> Access: (/--) Uid: ( 0/ root) Gid: ( 0/ root) >> Access: 2017-04-28 22:51:15.215775269 +0200 >> Modify: 2017-04-28 22:51:15.215775269 +0200 >> Change: 2017-07-30 08:39:03.700872312 +0200 >> Birth: - >> I hope this is enough info for a starter, else let me know if you need any >> more info. I would be glad to resolve this weird file which needs to be >> healed but can not. >> Best regards, >> Mabi >> >>> Original Message >>> Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? 
>>> Local Time: July 30, 2017 3:31 AM >>> UTC Time: July 30, 2017 1:31 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch), Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> On 07/29/2017 04:36 PM, mabi wrote: >>> >>>> Hi, >>>> Sorry for mailing again but as mentioned in my previous mail, I have added >>>> an arbiter node to my replica 2 volume and it seem to have gone fine >>>> except for the fact that there is one single file which needs healing and >>>> does not get healed as you can see here from the output of a "heal info": >>>> Brick node1.domain.tld:/data/myvolume/brick >>>> Status: Connected >>>> Number of entries: 0 >>>> Brick node2.domain.tld:/data/myvolume/brick >>>> >>>> Status: Connected >>>> Number of entries: 1 >>>> Brick arbiternode.domain.tld:/srv/glusterfs/myvolume/brick >>>> Status: Connected >>>> Number of entries: 0 >>>> On my node2 the respective .glusterfs/indices/xattrop directory contains >>>> two files as you can see below: >>>> ls -lai /data/myvolume/brick/.glusterfs/indices/xattrop >>>> total 76180 >>>> 10 drw--- 2 root root 4 Jul 29 12:15 . >>>> 9 drw--- 5 root root 5 Apr 28 22:15 .. >>>> 2798404 -- 2 root root 0 Apr 28 22:51 >>>> 29e0d13e-1217-41cc-9bda-1fbbf781c397
Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file?
To quickly resume my current situation: on node2 I have found the following file xattrop/indices file which matches the GFID of the "heal info" command (below is there output of "ls -lai": 2798404 -- 2 root root 0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 As you can see this file has inode number 2798404, so I ran the following command on all my nodes (node1, node2 and arbiternode): sudo find /data/myvolume/brick -inum 2798404 -ls Here below are the results for all 3 nodes: node1: 2798404 19 -rw-r--r-- 2 www-data www-data 32 Jun 19 17:42 /data/myvolume/brick/.glusterfs/e6/5b/e65b77e2-a4c4-4824-a7bb-58df969ce4b0 2798404 19 -rw-r--r-- 2 www-data www-data 32 Jun 19 17:42 /data/myvolume/brick//fileKey node2: 2798404 1 -- 2 root root 0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 2798404 1 -- 2 root root 0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38 arbirternode: NOTHING As you requested I have tried to run on node1 a getfattr on the fileKey file by using the following command: getfattr -m . -d -e hex fileKey but there is no output. I am not familiar with the getfattr command so maybe I am using the wrong parameters, could you help me with that? > Original Message > Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? > Local Time: July 31, 2017 9:25 AM > UTC Time: July 31, 2017 7:25 AM > From: ravishan...@redhat.com > To: mabi > Gluster Users > On 07/31/2017 12:20 PM, mabi wrote: > >> I did a find on this inode number and I could find the file but only on >> node1 (nothing on node2 and the new arbiternode). Here is an ls -lai of the >> file itself on node1: > > Sorry I don't understand, isn't that (XFS) inode number specific to node2's > brick? If you want to use the same command, maybe you should try `find > /data/myvolume/brick -samefile > /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397` > on all 3 bricks. > >> -rw-r--r-- 1 www-data www-data 32 Jun 19 17:42 fileKey >> As you can see it is a 32 bytes file and as you suggested I ran a "stat" on >> this very same file through a glusterfs mount (using fuse) but unfortunately >> nothing happened. The GFID is still being displayed to be healed. Just in >> case here is the output of the stat: >> File: ‘fileKey’ >> Size: 32 Blocks: 1 IO Block: 131072 regular file >> Device: 1eh/30d Inode: 12086351742306673840 Links: 1 >> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >> Access: 2017-06-19 17:42:35.339773495 +0200 >> Modify: 2017-06-19 17:42:35.343773437 +0200 >> Change: 2017-06-19 17:42:35.343773437 +0200 >> Birth: - > > Is this 'fileKey' on node1 having the same gfid (see getfattr output)? Looks > like it is missing the hardlink inside .glusterfs folder since the link count > is only 1. > Thanks, > Ravi > >> What else can I do or try in order to fix this situation? >> >>> Original Message >>> Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? >>> Local Time: July 31, 2017 3:27 AM >>> UTC Time: July 31, 2017 1:27 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> On 07/30/2017 02:24 PM, mabi wrote: >>> >>>> Hi Ravi, >>>> Thanks for your hints. Below you will find the answer to your questions. 
>>>> First I tried to start the healing process by running: >>>> gluster volume heal myvolume >>>> and then as you suggested watch the output of the glustershd.log file but >>>> nothing appeared in that log file after running the above command. I >>>> checked the files which need to be healing using the "heal info" >>>> command and it still shows that very same GFID on node2 to be healed. So >>>> nothing changed here. >>>> The file >>>> /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 >>>> is only on node2 and not on my nod1 nor on my arbiternode. This file >>>> seems to be a regular file and not a symlink. Here is the output of the >>>> stat command on it from my node2: >>>> File: >>>> ‘/data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397’ >>>> Size: 0 Blocks: 1 IO Block: 512 regular empty file >>>> Device: 25h/37
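A note on the getfattr call mentioned above: the trusted.* extended attributes stored on a brick are only visible to root, and getfattr silently prints nothing (rather than an error) when it cannot see them or when the relative path does not resolve from the current directory. Assuming the brick path used in this thread, the call would typically look something like this:

cd /data/myvolume/brick
sudo getfattr -d -m . -e hex fileKey
# or with an absolute path (getfattr then just strips the leading slash in its output header)
sudo getfattr -d -m . -e hex /data/myvolume/brick/fileKey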
Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file?
Now I understand what you mean the the "-samefile" parameter of "find". As requested I have now run the following command on all 3 nodes with the ouput of all 3 nodes below: sudo find /data/myvolume/brick -samefile /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -ls node1: 8404683 0 lrwxrwxrwx 1 root root 66 Jul 27 15:43 /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE node2: 8394638 0 lrwxrwxrwx 1 root root 66 Jul 27 15:43 /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE arbiternode: find: '/data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397': No such file or directory Hope that helps. > Original Message > Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? > Local Time: July 31, 2017 10:55 AM > UTC Time: July 31, 2017 8:55 AM > From: ravishan...@redhat.com > To: mabi > Gluster Users > > On 07/31/2017 02:00 PM, mabi wrote: > >> To quickly resume my current situation: >> on node2 I have found the following file xattrop/indices file which matches >> the GFID of the "heal info" command (below is there output of "ls -lai": >> 2798404 -- 2 root root 0 Apr 28 22:51 >> /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> As you can see this file has inode number 2798404, so I ran the following >> command on all my nodes (node1, node2 and arbiternode): > > ...which is what I was saying is incorrect. 2798404 is an XFS inode number > and is not common to the same file across nodes. So you will get different > results. Use the -samefile flag I shared earlier. > -Ravi > >> sudo find /data/myvolume/brick -inum 2798404 -ls >> Here below are the results for all 3 nodes: >> node1: >> 2798404 19 -rw-r--r-- 2 www-data www-data 32 Jun 19 17:42 >> /data/myvolume/brick/.glusterfs/e6/5b/e65b77e2-a4c4-4824-a7bb-58df969ce4b0 >> 2798404 19 -rw-r--r-- 2 www-data www-data 32 Jun 19 17:42 >> /data/myvolume/brick//fileKey >> node2: >> 2798404 1 -- 2 root root 0 Apr 28 22:51 >> /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> 2798404 1 -- 2 root root 0 Apr 28 22:51 >> /data/myvolume/brick/.glusterfs/indices/xattrop/xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38 >> arbirternode: >> NOTHING >> As you requested I have tried to run on node1 a getfattr on the fileKey file >> by using the following command: >> getfattr -m . -d -e hex fileKey >> but there is no output. I am not familiar with the getfattr command so maybe >> I am using the wrong parameters, could you help me with that? >> >>> Original Message >>> Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? >>> Local Time: July 31, 2017 9:25 AM >>> UTC Time: July 31, 2017 7:25 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> On 07/31/2017 12:20 PM, mabi wrote: >>> >>>> I did a find on this inode number and I could find the file but only on >>>> node1 (nothing on node2 and the new arbiternode). Here is an ls -lai of >>>> the file itself on node1: >>> >>> Sorry I don't understand, isn't that (XFS) inode number specific to node2's >>> brick? 
If you want to use the same command, maybe you should try `find >>> /data/myvolume/brick -samefile >>> /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397` >>> on all 3 bricks. >>> >>>> -rw-r--r-- 1 www-data www-data 32 Jun 19 17:42 fileKey >>>> As you can see it is a 32 bytes file and as you suggested I ran a "stat" >>>> on this very same file through a glusterfs mount (using fuse) but >>>> unfortunately nothing happened. The GFID is still being displayed to be >>>> healed. Just in case here is the output of the stat: >>>> File: ‘fileKey’ >>>> Size: 32 Blocks: 1 IO Block: 131072 regular file >>>> Device: 1eh/30d Inode: 12086351742306673840 Links: 1 >>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>> Access: 2017-06-19 17:42:35.339773495 +0200 >>>> Modify: 2017-06-19 17:42:35.343773437 +0200 >>>> Change: 2017-06-19 17:42:35.343773437 +0200 >&g
Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file?
Thanks Ravi, that seem to have done the trick and now there are no more files to be healed. Just for your and other user's information: the OC_DEFAULT_MODULE was in fact a directory which contains 2 files. So I did a stat on that directory and then the "heal info" would show the 2 following files (and not the GFID anymore): Brick node2:/data/myvolume/brick /user/files_encryption/keys/files_trashbin/files/Library.db-journal.bc.d1501276401/OC_DEFAULT_MODULE/user.shareKey /user/files_encryption/keys/files_trashbin/files/Library.db-journal.bc.d1501276401/OC_DEFAULT_MODULE/fileKey Status: Connected Number of entries: 2 After that I just waited for the self-heal to do it's job on node2 just and it did as you can see below from the output of the glustershd.log file: [2017-07-31 09:40:05.045437] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-myvolume-replicate-0: Completed data selfheal on f1f0e091-2c4c-4a31-bc40-97949462dc4a. sources=0 [1] sinks=2 [2017-07-31 09:40:05.047194] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-myvolume-replicate-0: performing metadata selfheal on f1f0e091-2c4c-4a31-bc40-97949462dc4a [2017-07-31 09:40:05.050996] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-myvolume-replicate-0: Completed metadata selfheal on f1f0e091-2c4c-4a31-bc40-97949462dc4a. sources=0 [1] sinks=2 [2017-07-31 09:40:05.055781] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-myvolume-replicate-0: Completed data selfheal on be2b2097-2b1a-45e1-ad9e-3cf6bf5b4caa. sources=0 [1] sinks=2 [2017-07-31 09:40:05.057026] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-myvolume-replicate-0: performing metadata selfheal on be2b2097-2b1a-45e1-ad9e-3cf6bf5b4caa [2017-07-31 09:40:05.060716] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-myvolume-replicate-0: Completed metadata selfheal on be2b2097-2b1a-45e1-ad9e-3cf6bf5b4caa. sources=0 [1] sinks=2 Thanks again Ravi for your help in this procedure. Now I continue and try to fix my geo-replication issue (which I have documented on the mailing list a few days ago). Best, M. > Original Message > Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? > Local Time: July 31, 2017 11:24 AM > UTC Time: July 31, 2017 9:24 AM > From: ravishan...@redhat.com > To: mabi > Gluster Users > > On 07/31/2017 02:33 PM, mabi wrote: > >> Now I understand what you mean the the "-samefile" parameter of "find". As >> requested I have now run the following command on all 3 nodes with the ouput >> of all 3 nodes below: >> sudo find /data/myvolume/brick -samefile >> /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> -ls >> node1: >> 8404683 0 lrwxrwxrwx 1 root root 66 Jul 27 15:43 >> /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE >> node2: >> 8394638 0 lrwxrwxrwx 1 root root 66 Jul 27 15:43 >> /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 >> -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE >> arbiternode: >> find: >> '/data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397': >> No such file or directory > > Right, so the file OC_DEFAULT_MODULE is missing in this brick It's parent > directory has gfid fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810. > Goal is to do a stat of this file from the fuse mount. 
If you know the > complete path to this file, good. Otherwise you can use this script [1] to > find the path to the parent dir corresponding to the gfid > fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810 like so: > `./gfid-to-dirname.sh /data/myvolume/brick > fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810` > [1] https://github.com/gluster/glusterfs/blob/master/extras/gfid-to-dirname.sh > Try to stat the file from a new (temporary) fuse mount to avoid any caching > effects. > -Ravi > >> Hope that helps. >> >>> Original Message >>> Subject: Re: [Gluster-users] Possible stale .glusterfs/indices/xattrop file? >>> Local Time: July 31, 2017 10:55 AM >>> UTC Time: July 31, 2017 8:55 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> On 07/31/2017 02:00 PM, mabi wrote: >>> >>>> To quickly resume my current situation: >>>> on node2 I have found the following file xattrop/indices file which >>>> matches the GFID of the "heal info" comma
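For readers following along, the procedure described above can be sketched roughly as follows; the volume, host and brick names are the ones used in this thread, the mount point /mnt/tmp is only an example, and the path placeholder has to be filled in with whatever the script prints:

# resolve the parent directory path from its GFID on the brick
./gfid-to-dirname.sh /data/myvolume/brick fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810

# stat the missing entry through a fresh fuse mount so the lookup can trigger the heal
mkdir -p /mnt/tmp
mount -t glusterfs node1.domain.tld:/myvolume /mnt/tmp
stat "/mnt/tmp/<path printed by the script>/OC_DEFAULT_MODULE"
umount /mnt/tmp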
[Gluster-users] Quotas not working after adding arbiter brick to replica 2
Hello, As you might have read in my previous post on the mailing list I have added an arbiter node to my GlusterFS 3.8.11 replica 2 volume. After some healing issues and help of Ravi that could get fixed but now I just noticed that my quotas are all gone. When I run the following command: glusterfs volume quota myvolume list There is no output... In the /var/log/glusterfs/quotad.log I can see the following two lines when running the list command: [2017-08-01 06:46:04.451765] W [dict.c:581:dict_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) [0x7fe868e21f3d] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) [0x7fe868e22d82] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] [2017-08-01 06:46:04.459154] W [dict.c:581:dict_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) [0x7fe868e21f3d] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) [0x7fe868e22d82] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] In case you need this info, I have added by arbiter node to the replica 2 by using this command: gluster volume add-brick myvolume replica 3 arbiter 1 arbiternode.domain.tld:/srv/glusterfs/myvolume/brick How can I get my quotas back working as before? I had defined around 20 quotas on different directories of that volume. Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
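One detail worth double-checking in the command shown above: the management CLI binary is gluster, not glusterfs (the latter is the fuse client), so the empty listing should be reproducible with something along these lines before digging deeper:

sudo gluster volume quota myvolume list
sudo gluster volume info myvolume | grep -i quota   # confirms whether quota is still enabled on the volume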
Re: [Gluster-users] Quotas not working after adding arbiter brick to replica 2
I also just noticed quite a few of the following warning messages in the quotad.log log file: [2017-08-01 07:59:27.834202] W [MSGID: 108027] [afr-common.c:2496:afr_discover_done] 0-myvolume-replicate-0: no read subvols for (null) > Original Message > Subject: [Gluster-users] Quotas not working after adding arbiter brick to > replica 2 > Local Time: August 1, 2017 8:49 AM > UTC Time: August 1, 2017 6:49 AM > From: m...@protonmail.ch > To: Gluster Users > Hello, > As you might have read in my previous post on the mailing list I have added > an arbiter node to my GlusterFS 3.8.11 replica 2 volume. After some healing > issues and help of Ravi that could get fixed but now I just noticed that my > quotas are all gone. > When I run the following command: > glusterfs volume quota myvolume list > There is no output... > In the /var/log/glusterfs/quotad.log I can see the following two lines when > running the list command: > [2017-08-01 06:46:04.451765] W [dict.c:581:dict_unref] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) > [0x7fe868e21f3d] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) > [0x7fe868e22d82] > -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) > [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] > [2017-08-01 06:46:04.459154] W [dict.c:581:dict_unref] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) > [0x7fe868e21f3d] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) > [0x7fe868e22d82] > -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) > [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] > In case you need this info, I have added by arbiter node to the replica 2 by > using this command: > gluster volume add-brick myvolume replica 3 arbiter 1 > arbiternode.domain.tld:/srv/glusterfs/myvolume/brick > How can I get my quotas back working as before? I had defined around 20 > quotas on different directories of that volume. > Regards, > Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] How to delete geo-replication session?
Hi, I would like to delete a geo-replication session on my GlusterFS 3.8.11 replica 2 volume in order to re-create it. Unfortunately the "delete" command does not work as you can see below: $ sudo gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete Staging failed on arbiternode.domain.tld. Error: Geo-replication session between myvolume and arbiternode.domain.tld::myvolume-geo does not exist. geo-replication command failed I also tried with "force" but no luck here either: $ sudo gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete force Usage: volume geo-replication [] [] {create [[ssh-port n] [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} [options...] So how can I delete my geo-replication session manually? Mind that I do not want to reset-sync-time, I would like to delete it and re-create it so that it continues to geo-replicate from where it left off. Thanks, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Quotas not working after adding arbiter brick to replica 2
Hi Sanoj, I copied over the quota.conf file from the affected volume (node 1) and opened it up with a hex editor but can not recognize anything really except for the first few header/version bytes. I have attached it within this mail (compressed with bzip2) as requested. Should I recreate them manually? there where around 10 of them. Or is there a hope of recovering these quotas? Regards, M. > Original Message > Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick to > replica 2 > Local Time: August 2, 2017 1:06 PM > UTC Time: August 2, 2017 11:06 AM > From: sunni...@redhat.com > To: mabi > Gluster Users > > Mabi, > We have fixed a couple of issues in the quota list path. > Could you also please attach the quota.conf file > (/var/lib/glusterd/vols/patchy/quota.conf) > (Ideally, the first few bytes would be ascii characters followed by 17 bytes > per directory on which quota limit is set) > Regards, > Sanoj > > On Tue, Aug 1, 2017 at 1:36 PM, mabi wrote: > >> I also just noticed quite a few of the following warning messages in the >> quotad.log log file: >> [2017-08-01 07:59:27.834202] W [MSGID: 108027] >> [afr-common.c:2496:afr_discover_done] 0-myvolume-replicate-0: no read >> subvols for (null) >> >>> Original Message >>> Subject: [Gluster-users] Quotas not working after adding arbiter brick to >>> replica 2 >>> Local Time: August 1, 2017 8:49 AM >>> UTC Time: August 1, 2017 6:49 AM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> Hello, >>> As you might have read in my previous post on the mailing list I have added >>> an arbiter node to my GlusterFS 3.8.11 replica 2 volume. After some healing >>> issues and help of Ravi that could get fixed but now I just noticed that my >>> quotas are all gone. >>> When I run the following command: >>> glusterfs volume quota myvolume list >>> There is no output... >>> In the /var/log/glusterfs/quotad.log I can see the following two lines when >>> running the list command: >>> [2017-08-01 06:46:04.451765] W [dict.c:581:dict_unref] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) >>> [0x7fe868e21f3d] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) >>> [0x7fe868e22d82] >>> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) >>> [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] >>> [2017-08-01 06:46:04.459154] W [dict.c:581:dict_unref] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) >>> [0x7fe868e21f3d] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) >>> [0x7fe868e22d82] >>> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) >>> [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] >>> In case you need this info, I have added by arbiter node to the replica 2 >>> by using this command: >>> gluster volume add-brick myvolume replica 3 arbiter 1 >>> arbiternode.domain.tld:/srv/glusterfs/myvolume/brick >>> How can I get my quotas back working as before? I had defined around 20 >>> quotas on different directories of that volume. >>> Regards, >>> Mabi >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users quota.conf.bz2 Description: application/bzip ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
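For anyone who wants to look at the same file without a full hex editor, a quick dump of the ASCII header followed by the binary entries can be produced with xxd; going by Sanoj's description above, the bytes after the header should line up in 17-byte entries for a v1.2 file:

xxd /var/lib/glusterd/vols/myvolume/quota.conf | head
ls -l /var/lib/glusterd/vols/myvolume/quota.conf   # (size - header length) / 17 gives the number of stored limits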
Re: [Gluster-users] Quotas not working after adding arbiter brick to replica 2
I tried to re-create manually my quotas but not even that works now. Running the "limit-usage" command as showed below returns success: $ sudo gluster volume quota myvolume limit-usage /userdirectory 50GB volume quota : success but when I list the quotas using "list" nothing appears. What can I do to fix that issue with the quotas? > Original Message > Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick to > replica 2 > Local Time: August 2, 2017 2:35 PM > UTC Time: August 2, 2017 12:35 PM > From: m...@protonmail.ch > To: Sanoj Unnikrishnan > Gluster Users > Hi Sanoj, > I copied over the quota.conf file from the affected volume (node 1) and > opened it up with a hex editor but can not recognize anything really except > for the first few header/version bytes. I have attached it within this mail > (compressed with bzip2) as requested. > Should I recreate them manually? there where around 10 of them. Or is there a > hope of recovering these quotas? > Regards, > M. > >> Original Message >> Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick >> to replica 2 >> Local Time: August 2, 2017 1:06 PM >> UTC Time: August 2, 2017 11:06 AM >> From: sunni...@redhat.com >> To: mabi >> Gluster Users >> >> Mabi, >> We have fixed a couple of issues in the quota list path. >> Could you also please attach the quota.conf file >> (/var/lib/glusterd/vols/patchy/quota.conf) >> (Ideally, the first few bytes would be ascii characters followed by 17 bytes >> per directory on which quota limit is set) >> Regards, >> Sanoj >> >> On Tue, Aug 1, 2017 at 1:36 PM, mabi wrote: >> >>> I also just noticed quite a few of the following warning messages in the >>> quotad.log log file: >>> [2017-08-01 07:59:27.834202] W [MSGID: 108027] >>> [afr-common.c:2496:afr_discover_done] 0-myvolume-replicate-0: no read >>> subvols for (null) >>> >>>> Original Message >>>> Subject: [Gluster-users] Quotas not working after adding arbiter brick to >>>> replica 2 >>>> Local Time: August 1, 2017 8:49 AM >>>> UTC Time: August 1, 2017 6:49 AM >>>> From: m...@protonmail.ch >>>> To: Gluster Users >>>> Hello, >>>> As you might have read in my previous post on the mailing list I have >>>> added an arbiter node to my GlusterFS 3.8.11 replica 2 volume. After some >>>> healing issues and help of Ravi that could get fixed but now I just >>>> noticed that my quotas are all gone. >>>> When I run the following command: >>>> glusterfs volume quota myvolume list >>>> There is no output... 
>>>> In the /var/log/glusterfs/quotad.log I can see the following two lines >>>> when running the list command: >>>> [2017-08-01 06:46:04.451765] W [dict.c:581:dict_unref] >>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) >>>> [0x7fe868e21f3d] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) >>>> [0x7fe868e22d82] >>>> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) >>>> [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] >>>> [2017-08-01 06:46:04.459154] W [dict.c:581:dict_unref] >>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x1f3d) >>>> [0x7fe868e21f3d] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/xlator/features/quotad.so(+0x2d82) >>>> [0x7fe868e22d82] >>>> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) >>>> [0x7fe86f5c2b10] ) 0-dict: dict is NULL [Invalid argument] >>>> In case you need this info, I have added by arbiter node to the replica 2 >>>> by using this command: >>>> gluster volume add-brick myvolume replica 3 arbiter 1 >>>> arbiternode.domain.tld:/srv/glusterfs/myvolume/brick >>>> How can I get my quotas back working as before? I had defined around 20 >>>> quotas on different directories of that volume. >>>> Regards, >>>> Mabi >>> ___ >>> Gluster-users mailing list >>> Gluster-users@gluster.org >>> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
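A quick way to see whether the limit-usage call above actually appends anything to the quota configuration (again going by Sanoj's description of the file layout) is to compare the size of quota.conf before and after setting a limit:

ls -l /var/lib/glusterd/vols/myvolume/quota.conf
sudo gluster volume quota myvolume limit-usage /userdirectory 50GB
ls -l /var/lib/glusterd/vols/myvolume/quota.conf   # should grow by one entry (17 bytes for a v1.2 file)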
Re: [Gluster-users] How to delete geo-replication session?
I am still stuck without being able to delete the geo-replication session. Can anyone help? > Original Message > Subject: How to delete geo-replication session? > Local Time: August 1, 2017 12:15 PM > UTC Time: August 1, 2017 10:15 AM > From: m...@protonmail.ch > To: Gluster Users > Hi, > I would like to delete a geo-replication session on my GluterFS 3.8.11 > replicat 2 volume in order to re-create it. Unfortunately the "delete" > command does not work as you can see below: > $ sudo gluster volume geo-replication myvolume > gfs1geo.domain.tld::myvolume-geo delete > Staging failed on arbiternode.domain.tld. Error: Geo-replication session > between myvolume and arbiternode.domain.tld::myvolume-geo does not exist. > geo-replication command failed > I also tried with "force" but no luck here either: > $ sudo gluster volume geo-replication myvolume > gfs1geo.domain.tld::myvolume-geo delete force > Usage: volume geo-replication [] [] {create [[ssh-port n] > [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause > [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} > [options...] > So how can I delete my geo-replication session manually? > Mind that I do not want to reset-sync-time, I would like to delete it and > re-create it so that it continues to geo replicate where it left from. > Thanks, > M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] GlusterFS 3.8 Debian 8 apt repository broken
Hello, I want to upgrade from 3.8.11 to 3.8.14 on my Debian 8 (jessie) servers but it looks like the official GlusterFS apt repository has a mistake as you can see here: Get:14 http://download.gluster.org jessie InRelease [2'083 B] Get:15 http://download.gluster.org jessie/main amd64 Packages [1'602 B] Fetched 23.7 kB in 2s (10.6 kB/s) W: Failed to fetch http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt/dists/jessie/InRelease Unable to find expected entry 'main/binary-i386/Packages' in Release file (Wrong sources.list entry or malformed file) Here is the content of my /etc/apt/sources.list.d/gluster.list file: deb http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt jessie main Could someone fix this please? Thanks, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] GlusterFS 3.8 Debian 8 apt repository broken
I managed to workaround this issue by addding "[arch=amd64]" to my apt source list for gluster like this: deb [arch=amd64] http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt jessie main In case that can help any others with the same siutation (where they also have i386 arch enabled on the computer). > Original Message > Subject: GlusterFS 3.8 Debian 8 apt repository broken > Local Time: August 4, 2017 10:52 AM > UTC Time: August 4, 2017 8:52 AM > From: m...@protonmail.ch > To: Gluster Users > Hello, > I want to upgrade from 3.8.11 to 3.8.14 on my Debian 8 (jessie) servers but > it looks like the official GlusterFS apt repository has a mistake as you can > see here: > Get:14 http://download.gluster.org jessie InRelease [2'083 B] > Get:15 http://download.gluster.org jessie/main amd64 Packages [1'602 B] > Fetched 23.7 kB in 2s (10.6 kB/s) > W: Failed to fetch > http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt/dists/jessie/InRelease > Unable to find expected entry 'main/binary-i386/Packages' in Release file > (Wrong sources.list entry or malformed file) > Here is the content of my /etc/apt/sources.list.d/gluster.list file: > deb > http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt > jessie main > Could someone fix this please? > Thanks, > Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] GlusterFS 3.8 Debian 8 apt repository broken
I just needed to add "[arch=amd64]" in my sources.list file for the gluster repo then it worked. My laptop (amd64 arch) has i386 architecture enabled (for skype) and I suppose that apt checks then all repositories for i386 index files too. This is just my hypothesis but it would explain this behavior. > Original Message > Subject: Re: [Gluster-users] GlusterFS 3.8 Debian 8 apt repository broken > Local Time: August 4, 2017 12:33 PM > UTC Time: August 4, 2017 10:33 AM > From: kkeit...@redhat.com > To: mabi > Gluster Users > What would the fix be exactly? > The apt repos are built the same way they"ve been built for the last 3+ years > and you"re the first person to trip over whatever it is you"re tripping over. > And there have never been packages for i386 for Debian. > - Original Message - >> From: "mabi" >> To: "Gluster Users" >> Sent: Friday, August 4, 2017 4:56:30 AM >> Subject: Re: [Gluster-users] GlusterFS 3.8 Debian 8 apt repository broken >> >> I managed to workaround this issue by addding "[arch=amd64]" to my apt source >> list for gluster like this: >> >> deb [arch=amd64] >> http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt >> jessie main >> >> >> >> In case that can help any others with the same siutation (where they also >> have i386 arch enabled on the computer). >> >> >> >> >> >> Original Message >> Subject: GlusterFS 3.8 Debian 8 apt repository broken >> Local Time: August 4, 2017 10:52 AM >> UTC Time: August 4, 2017 8:52 AM >> From: m...@protonmail.ch >> To: Gluster Users >> >> Hello, >> >> I want to upgrade from 3.8.11 to 3.8.14 on my Debian 8 (jessie) servers but >> it looks like the official GlusterFS apt repository has a mistake as you can >> see here: >> >> Get:14 http://download.gluster.org jessie InRelease [2"083 B] >> Get:15 http://download.gluster.org jessie/main amd64 Packages [1"602 B] >> Fetched 23.7 kB in 2s (10.6 kB/s) >> W: Failed to fetch >> http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt/dists/jessie/InRelease >> Unable to find expected entry "main/binary-i386/Packages" in Release file >> (Wrong sources.list entry or malformed file) >> >> Here is the content of my /etc/apt/sources.list.d/gluster.list file: >> >> deb >> http://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.14/Debian/jessie/apt >> jessie main >> >> Could someone fix this please? >> >> Thanks, >> Mabi >> >> >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
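For anyone wanting to confirm that hypothesis on their own machine, the extra architectures apt fetches indexes for can be listed with dpkg; an empty output means only the native architecture is enabled:

dpkg --print-foreign-architectures
# if i386 shows up here, pinning third-party repos with [arch=amd64] as above avoids the missing-index error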
Re: [Gluster-users] Quotas not working after adding arbiter brick to replica 2
Thank you very much Sanoj, I ran your script once and it worked. I now have quotas again... Question: do you know in which release this issue will be fixed? > Original Message > Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick to > replica 2 > Local Time: August 4, 2017 3:28 PM > UTC Time: August 4, 2017 1:28 PM > From: sunni...@redhat.com > To: mabi > Gluster Users > > Hi mabi, > This is a likely issue where the last gfid entry in the quota.conf file is > stale (because the directory was deleted with quota limit on it being removed) > (https://review.gluster.org/#/c/16507/) > To fix the issue, we need to remove the last entry (last 17 bytes/ 16bytes > based on quota version) in the file. > Please use the below work around for the same until next upgrade. > you only need to change $vol to the name of volume. > === > > vol= > qconf=/var/lib/glusterd/vols/$vol/quota.conf > qconf_bk="$qconf".bk > cp $qconf $qconf_bk > grep "GlusterFS Quota conf | version: v1.2" > /var/lib/glusterd/vols/v5/quota.conf > if [ $? -eq 0 ]; > then > entry_size=17; > else > entry_size=16; > fi > size=`ls -l $qconf | awk '{print $5}'` > (( size_new = size - entry_size )) > dd if=$qconf_bk of=$qconf bs=1 count=$size_new > gluster v quota v5 list > > In the unlikely case that there are multiple stale entries in the end of file > you may have to run it multiple times > to fix the issue (each time one stale entry at the end is removed) > > On Thu, Aug 3, 2017 at 1:17 PM, mabi wrote: > >> I tried to re-create manually my quotas but not even that works now. Running >> the "limit-usage" command as showed below returns success: >> $ sudo gluster volume quota myvolume limit-usage /userdirectory 50GB >> volume quota : success >> but when I list the quotas using "list" nothing appears. >> What can I do to fix that issue with the quotas? >> >>> Original Message >>> Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick >>> to replica 2 >>> >>> Local Time: August 2, 2017 2:35 PM >>> UTC Time: August 2, 2017 12:35 PM >>> From: m...@protonmail.ch >>> To: Sanoj Unnikrishnan >>> Gluster Users >>> Hi Sanoj, >>> I copied over the quota.conf file from the affected volume (node 1) and >>> opened it up with a hex editor but can not recognize anything really except >>> for the first few header/version bytes. I have attached it within this mail >>> (compressed with bzip2) as requested. >>> Should I recreate them manually? there where around 10 of them. Or is there >>> a hope of recovering these quotas? >>> Regards, >>> M. >>> >>>> Original Message >>>> Subject: Re: [Gluster-users] Quotas not working after adding arbiter brick >>>> to replica 2 >>>> Local Time: August 2, 2017 1:06 PM >>>> UTC Time: August 2, 2017 11:06 AM >>>> From: sunni...@redhat.com >>>> To: mabi >>>> Gluster Users >>>> >>>> Mabi, >>>> We have fixed a couple of issues in the quota list path. 
>>>> Could you also please attach the quota.conf file >>>> (/var/lib/glusterd/vols/patchy/quota.conf) >>>> (Ideally, the first few bytes would be ascii characters followed by 17 >>>> bytes per directory on which quota limit is set) >>>> Regards, >>>> Sanoj >>>> >>>> On Tue, Aug 1, 2017 at 1:36 PM, mabi wrote: >>>> >>>>> I also just noticed quite a few of the following warning messages in the >>>>> quotad.log log file: >>>>> [2017-08-01 07:59:27.834202] W [MSGID: 108027] >>>>> [afr-common.c:2496:afr_discover_done] 0-myvolume-replicate-0: no read >>>>> subvols for (null) >>>>> >>>>>> Original Message >>>>>> Subject: [Gluster-users] Quotas not working after adding arbiter brick >>>>>> to replica 2 >>>>>> Local Time: August 1, 2017 8:49 AM >>>>>> UTC Time: August 1, 2017 6:49 AM >>>>>> From: m...@protonmail.ch >>>>>> To: Gluster Users >>>>>> Hello, >>>>>> As you might have read in my previous post on the mailing list I have >>>>>> added an arbiter node to my GlusterFS 3.8.11 replica 2 volume. After >>&g
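Since the workaround quoted above still carries the hard-coded test volume name (v5) in two places, here is a cleaned-up sketch of the same idea that only uses the $vol variable; the 17/16-byte entry sizes are the ones Sanoj states, and the script is otherwise an untested sketch:

#!/bin/sh
vol=myvolume                                  # set to the affected volume
qconf=/var/lib/glusterd/vols/$vol/quota.conf
qconf_bk="$qconf".bk
cp "$qconf" "$qconf_bk"

# v1.2 quota.conf files store 17 bytes per limit entry, older versions 16
if grep -aq "GlusterFS Quota conf | version: v1.2" "$qconf"; then
    entry_size=17
else
    entry_size=16
fi

size=$(stat -c %s "$qconf")
new_size=$((size - entry_size))

# rewrite the file without its last (stale) entry, then re-check the listing
dd if="$qconf_bk" of="$qconf" bs=1 count="$new_size"
gluster volume quota "$vol" list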
[Gluster-users] State: Peer Rejected (Connected)
Hi, I have a 3 nodes replica (including arbiter) volume with GlusterFS 3.8.11 and this night one of my nodes (node1) had an out of memory for some unknown reason and as such the Linux OOM killer has killed the glusterd and glusterfs process. I restarted the glusterd process but now that node is in "Peer Rejected" state from the other nodes and from itself it rejects the two other nodes as you can see below from the output of "gluster peer status": Number of Peers: 2 Hostname: arbiternode.domain.tld Uuid: 60a03a81-ba92-4b84-90fe-7b6e35a10975 State: Peer Rejected (Connected) Hostname: node2.domain.tld Uuid: 4834dceb-4356-4efb-ad8d-8baba44b967c State: Peer Rejected (Connected) I also rebooted my node1 just in case but that did not help. I read here http://www.spinics.net/lists/gluster-users/msg25803.html that the problem could have to do something with the volume info file, in my case I checked the file: /var/lib/glusterd/vols/myvolume/info and they are the same on node1 and arbiternode but on node2 the order of the following volume parameters are different: features.quota-deem-statfs=on features.inode-quota=on nfs.disable=on performance.readdir-ahead=on Could that be the reason why the peer is in rejected status? can I simply edit this file on node2 to re-order the parameters like on the other 2 nodes? What else should I do to investigate the reason for this rejected peer state? Thank you in advance for the help. Best, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
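To narrow down which part of the volume definition the peers actually disagree on, a simple option is to checksum the files under /var/lib/glusterd/vols/myvolume/ on each node and compare the output, for example:

# run on node1, node2 and arbiternode and compare
md5sum /var/lib/glusterd/vols/myvolume/info /var/lib/glusterd/vols/myvolume/quota.conf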
Re: [Gluster-users] State: Peer Rejected (Connected)
Hi Ji-Hyeon, Thanks to your help I could find out the problematic file. This would be the quota file of my volume it has a different checksum on node1 whereas node2 and arbiternode have the same checksum. This is expected as I had issues which my quota file and had to fix it manually with a script (more details on this mailing list in a previous post) and I only did that on node1. So what I now did is to copy /var/lib/glusterd/vols/myvolume/quota.conf file from node1 to node2 and arbiternode and then restart the glusterd process on node1 but somehow this did not fix the issue. I suppose I am missing a step here and maybe you have an idea what? Here would be the relevant part of my glusterd.log file taken from node1: [2017-08-06 08:16:57.699131] E [MSGID: 106012] [glusterd-utils.c:2988:glusterd_compare_friend_volume] 0-management: Cksums of quota configuration of volume myvolume differ. local cksum = 3823389269, remote cksum = 733515336 on peer node2.domain.tld [2017-08-06 08:16:57.275558] E [MSGID: 106012] [glusterd-utils.c:2988:glusterd_compare_friend_volume] 0-management: Cksums of quota configuration of volume myvolume differ. local cksum = 3823389269, remote cksum = 733515336 on peer arbiternode.intra.oriented.ch Best regards, Mabi > Original Message > Subject: Re: [Gluster-users] State: Peer Rejected (Connected) > Local Time: August 6, 2017 9:31 AM > UTC Time: August 6, 2017 7:31 AM > From: potato...@potatogim.net > To: mabi > Gluster Users > On 2017년 08월 06일 15:59, mabi wrote: >> Hi, >> >> I have a 3 nodes replica (including arbiter) volume with GlusterFS >> 3.8.11 and this night one of my nodes (node1) had an out of memory for >> some unknown reason and as such the Linux OOM killer has killed the >> glusterd and glusterfs process. I restarted the glusterd process but >> now that node is in "Peer Rejected" state from the other nodes and >> from itself it rejects the two other nodes as you can see below from >> the output of "gluster peer status": >> >> Number of Peers: 2 >> >> Hostname: arbiternode.domain.tld >> Uuid: 60a03a81-ba92-4b84-90fe-7b6e35a10975 >> State: Peer Rejected (Connected) >> >> Hostname: node2.domain.tld >> Uuid: 4834dceb-4356-4efb-ad8d-8baba44b967c >> State: Peer Rejected (Connected) >> >> >> >> I also rebooted my node1 just in case but that did not help. >> >> I read here http://www.spinics.net/lists/gluster-users/msg25803.html >> that the problem could have to do something with the volume info file, >> in my case I checked the file: >> >> /var/lib/glusterd/vols/myvolume/info >> >> and they are the same on node1 and arbiternode but on node2 the order >> of the following volume parameters are different: >> >> features.quota-deem-statfs=on >> features.inode-quota=on >> nfs.disable=on >> performance.readdir-ahead=on >> >> Could that be the reason why the peer is in rejected status? can I >> simply edit this file on node2 to re-order the parameters like on the >> other 2 nodes? >> >> What else should I do to investigate the reason for this rejected peer >> state? >> >> Thank you in advance for the help. >> >> Best, >> Mabi >> >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users > Hi mabi. > In my opinion, It caused by some volfile/checksum mismatch. 
try to look > glusterd log file(/var/log/glusterfs/glusterd.log) in REJECTED node, and > find some log like below > [2014-06-17 04:21:11.266398] I > [glusterd-handler.c:2050:__glusterd_handle_incoming_friend_req] 0-glusterd: > Received probe from uuid: 81857e74-a726-4f48-8d1b-c2a4bdbc094f > [2014-06-17 04:21:11.266485] E > [glusterd-utils.c:2373:glusterd_compare_friend_volume] 0-management: Cksums > of volume supportgfs differ. local cksum = 52468988, remote cksum = > 2201279699 on peer 172.26.178.254 > [2014-06-17 04:21:11.266542] I > [glusterd-handler.c:3085:glusterd_xfer_friend_add_resp] 0-glusterd: Responded > to 172.26.178.254 (0), ret: 0 > [2014-06-17 04:21:11.272206] I > [glusterd-rpc-ops.c:356:__glusterd_friend_add_cbk] 0-glusterd: Received RJT > from uuid: 81857e74-a726-4f48-8d1b-c2a4bdbc094f, host: 172.26.178.254, port: 0 > if it is, you need to sync volfile files/directories under > /var/lib/glusterd/vols/ from one of GOOD nodes. > for details to resolve this problem, please show more information such > as glusterd log :) > -- > Best regards. > -- > Ji-Hyeon Gim > Research Engineer, Gluesys > Address. Gluesys R&D Cen
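One detail that may explain why copying quota.conf alone does not immediately clear the error: glusterd also stores a checksum of the quota configuration next to it (quota.cksum is the usual file name), and the peers only agree again once that checksum matches too, which normally happens after the file is synced as well or glusterd is restarted on the receiving nodes. A rough sketch, assuming node1 holds the corrected quota.conf:

# on node1, push both files to node2 (repeat for the arbiter), then restart glusterd on the receivers
scp /var/lib/glusterd/vols/myvolume/quota.conf \
    /var/lib/glusterd/vols/myvolume/quota.cksum \
    root@node2.domain.tld:/var/lib/glusterd/vols/myvolume/
# the service is called glusterfs-server on Debian and glusterd on RPM-based systems
ssh root@node2.domain.tld 'systemctl restart glusterfs-server'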
Re: [Gluster-users] State: Peer Rejected (Connected)
I now also restarted the glusterd daemon on node2 and arbiternode and it seems to work again. It's not healing some files and I hope all goes well. Thanks so far for your help. By the way I identified the process which suck up all the memory of my node1, it was that stupid mlocate script which runs in the early morning to index all files of a Linux server. I would recommend anyone using GlusterFS to uninstall the mlocate package to avoid this situation. > Original Message > Subject: Re: [Gluster-users] State: Peer Rejected (Connected) > Local Time: August 6, 2017 10:26 AM > UTC Time: August 6, 2017 8:26 AM > From: m...@protonmail.ch > To: Ji-Hyeon Gim > Gluster Users > Hi Ji-Hyeon, > Thanks to your help I could find out the problematic file. This would be the > quota file of my volume it has a different checksum on node1 whereas node2 > and arbiternode have the same checksum. This is expected as I had issues > which my quota file and had to fix it manually with a script (more details on > this mailing list in a previous post) and I only did that on node1. > So what I now did is to copy /var/lib/glusterd/vols/myvolume/quota.conf file > from node1 to node2 and arbiternode and then restart the glusterd process on > node1 but somehow this did not fix the issue. I suppose I am missing a step > here and maybe you have an idea what? > Here would be the relevant part of my glusterd.log file taken from node1: > [2017-08-06 08:16:57.699131] E [MSGID: 106012] > [glusterd-utils.c:2988:glusterd_compare_friend_volume] 0-management: Cksums > of quota configuration of volume myvolume differ. local cksum = 3823389269, > remote cksum = 733515336 on peer node2.domain.tld > [2017-08-06 08:16:57.275558] E [MSGID: 106012] > [glusterd-utils.c:2988:glusterd_compare_friend_volume] 0-management: Cksums > of quota configuration of volume myvolume differ. local cksum = 3823389269, > remote cksum = 733515336 on peer arbiternode.intra.oriented.ch > Best regards, > Mabi > >> Original Message >> Subject: Re: [Gluster-users] State: Peer Rejected (Connected) >> Local Time: August 6, 2017 9:31 AM >> UTC Time: August 6, 2017 7:31 AM >> From: potato...@potatogim.net >> To: mabi >> Gluster Users >> On 2017년 08월 06일 15:59, mabi wrote: >>> Hi, >>> >>> I have a 3 nodes replica (including arbiter) volume with GlusterFS >>> 3.8.11 and this night one of my nodes (node1) had an out of memory for >>> some unknown reason and as such the Linux OOM killer has killed the >>> glusterd and glusterfs process. I restarted the glusterd process but >>> now that node is in "Peer Rejected" state from the other nodes and >>> from itself it rejects the two other nodes as you can see below from >>> the output of "gluster peer status": >>> >>> Number of Peers: 2 >>> >>> Hostname: arbiternode.domain.tld >>> Uuid: 60a03a81-ba92-4b84-90fe-7b6e35a10975 >>> State: Peer Rejected (Connected) >>> >>> Hostname: node2.domain.tld >>> Uuid: 4834dceb-4356-4efb-ad8d-8baba44b967c >>> State: Peer Rejected (Connected) >>> >>> >>> >>> I also rebooted my node1 just in case but that did not help. 
>>> >>> I read here http://www.spinics.net/lists/gluster-users/msg25803.html >>> that the problem could have to do something with the volume info file, >>> in my case I checked the file: >>> >>> /var/lib/glusterd/vols/myvolume/info >>> >>> and they are the same on node1 and arbiternode but on node2 the order >>> of the following volume parameters are different: >>> >>> features.quota-deem-statfs=on >>> features.inode-quota=on >>> nfs.disable=on >>> performance.readdir-ahead=on >>> >>> Could that be the reason why the peer is in rejected status? can I >>> simply edit this file on node2 to re-order the parameters like on the >>> other 2 nodes? >>> >>> What else should I do to investigate the reason for this rejected peer >>> state? >>> >>> Thank you in advance for the help. >>> >>> Best, >>> Mabi >>> >>> >>> ___ >>> Gluster-users mailing list >>> Gluster-users@gluster.org >>> http://lists.gluster.org/mailman/listinfo/gluster-users >> Hi mabi. >> In my opinion, It caused by some volfile/checksum mismatch. try to look >> glusterd log file(/var/log/glusterfs/glusterd.log) in REJECTED node, and >> find some log like below >
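As a softer alternative to uninstalling mlocate, updatedb can simply be told to skip gluster mounts and the brick directories. On most distributions this is configured in /etc/updatedb.conf; the exact default values vary, so the lines below only show the additions (fuse.glusterfs is the filesystem type of a gluster fuse mount, and the brick path is the one from this thread):

# /etc/updatedb.conf
PRUNEFS="... fuse.glusterfs"
PRUNEPATHS="... /data/myvolume/brick"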
Re: [Gluster-users] How to delete geo-replication session?
Hi, I would really like to get rid of this geo-replication session as I am stuck with it right now. For example I can't even stop my volume as it complains about that geo-replcation... Can someone let me know how I can delete it? Thanks > Original Message > Subject: How to delete geo-replication session? > Local Time: August 1, 2017 12:15 PM > UTC Time: August 1, 2017 10:15 AM > From: m...@protonmail.ch > To: Gluster Users > Hi, > I would like to delete a geo-replication session on my GluterFS 3.8.11 > replicat 2 volume in order to re-create it. Unfortunately the "delete" > command does not work as you can see below: > $ sudo gluster volume geo-replication myvolume > gfs1geo.domain.tld::myvolume-geo delete > Staging failed on arbiternode.domain.tld. Error: Geo-replication session > between myvolume and arbiternode.domain.tld::myvolume-geo does not exist. > geo-replication command failed > I also tried with "force" but no luck here either: > $ sudo gluster volume geo-replication myvolume > gfs1geo.domain.tld::myvolume-geo delete force > Usage: volume geo-replication [] [] {create [[ssh-port n] > [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause > [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} > [options...] > So how can I delete my geo-replication session manually? > Mind that I do not want to reset-sync-time, I would like to delete it and > re-create it so that it continues to geo replicate where it left from. > Thanks, > M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] How to delete geo-replication session?
When I run the "gluster volume geo-replication status" I see my geo replication session correctly including the volume name under the "VOL" column. I see my two nodes (node1 and node2) but not arbiternode as I have added it later after setting up geo-replication. For more details have a quick look at my previous post here: http://lists.gluster.org/pipermail/gluster-users/2017-July/031911.html Sorry for repeating myself but again: how can I manually delete this problematic geo-replication session? It seems to me that when I added the arbiternode it broke geo-replication. Alternatively how can I fix this situation? but I think the easiest would be to delete the geo replication session. Regards, Mabi > Original Message > Subject: Re: [Gluster-users] How to delete geo-replication session? > Local Time: August 8, 2017 7:19 AM > UTC Time: August 8, 2017 5:19 AM > From: avish...@redhat.com > To: mabi , Gluster Users > Do you see any session listed when Geo-replication status command is > run(without any volume name) > > gluster volume geo-replication status > > Volume stop force should work even if Geo-replication session exists. From > the error it looks like node "arbiternode.domain.tld" in Master cluster is > down or not reachable. > > regards > Aravinda VK > > On 08/07/2017 10:01 PM, mabi wrote: > >> Hi, >> >> I would really like to get rid of this geo-replication session as I am stuck >> with it right now. For example I can't even stop my volume as it complains >> about that geo-replcation... >> Can someone let me know how I can delete it? >> Thanks >> >>> Original Message >>> Subject: How to delete geo-replication session? >>> Local Time: August 1, 2017 12:15 PM >>> UTC Time: August 1, 2017 10:15 AM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> Hi, >>> I would like to delete a geo-replication session on my GluterFS 3.8.11 >>> replicat 2 volume in order to re-create it. Unfortunately the "delete" >>> command does not work as you can see below: >>> $ sudo gluster volume geo-replication myvolume >>> gfs1geo.domain.tld::myvolume-geo delete >>> Staging failed on arbiternode.domain.tld. Error: Geo-replication session >>> between myvolume and arbiternode.domain.tld::myvolume-geo does not exist. >>> geo-replication command failed >>> I also tried with "force" but no luck here either: >>> $ sudo gluster volume geo-replication myvolume >>> gfs1geo.domain.tld::myvolume-geo delete force >>> Usage: volume geo-replication [] [] {create [[ssh-port >>> n] [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause >>> [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} >>> [options...] >>> So how can I delete my geo-replication session manually? >>> Mind that I do not want to reset-sync-time, I would like to delete it and >>> re-create it so that it continues to geo replicate where it left from. >>> Thanks, >>> M. >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] How to delete geo-replication session?
Thank you very much Aravinda. I used your instructions and I have geo-replication working again and I am very happy about that. That's great if you can add this process to the documentation. I am sure others will also benefit from that. Hopefully last question: as mentioned in a previous post on this mailing list (http://lists.gluster.org/pipermail/gluster-users/2017-July/031906.html) I have a count 272 under the FAILURES column of my geo-replication sessions and I was wondering what is the procedure to fix or deal with this? > Original Message > Subject: Re: [Gluster-users] How to delete geo-replication session? > Local Time: August 8, 2017 11:20 AM > UTC Time: August 8, 2017 9:20 AM > From: avish...@redhat.com > To: mabi > Gluster Users > Sorry I missed your previous mail. > > Please perform the following steps once a new node is added > > - Run gsec create command again > gluster system:: execute gsec_create > > - Run Geo-rep create command with force and run start force > > gluster volume geo-replication :: create > push-pem force > gluster volume geo-replication :: start force > With these steps you will be able to stop/delete the Geo-rep session. I will > add these steps in the documentation > page(http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Geo%20Replication/). > > regards > Aravinda VK > > On 08/08/2017 12:08 PM, mabi wrote: > >> When I run the "gluster volume geo-replication status" I see my geo >> replication session correctly including the volume name under the "VOL" >> column. I see my two nodes (node1 and node2) but not arbiternode as I have >> added it later after setting up geo-replication. For more details have a >> quick look at my previous post here: >> http://lists.gluster.org/pipermail/gluster-users/2017-July/031911.html >> Sorry for repeating myself but again: how can I manually delete this >> problematic geo-replication session? >> It seems to me that when I added the arbiternode it broke geo-replication. >> Alternatively how can I fix this situation? but I think the easiest would be >> to delete the geo replication session. >> Regards, >> Mabi >> >>> Original Message >>> Subject: Re: [Gluster-users] How to delete geo-replication session? >>> Local Time: August 8, 2017 7:19 AM >>> UTC Time: August 8, 2017 5:19 AM >>> From: avish...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch), Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> Do you see any session listed when Geo-replication status command is >>> run(without any volume name) >>> >>> gluster volume geo-replication status >>> >>> Volume stop force should work even if Geo-replication session exists. From >>> the error it looks like node "arbiternode.domain.tld" in Master cluster is >>> down or not reachable. >>> >>> regards >>> Aravinda VK >>> >>> On 08/07/2017 10:01 PM, mabi wrote: >>> >>>> Hi, >>>> >>>> I would really like to get rid of this geo-replication session as I am >>>> stuck with it right now. For example I can't even stop my volume as it >>>> complains about that geo-replcation... >>>> Can someone let me know how I can delete it? >>>> Thanks >>>> >>>>> Original Message >>>>> Subject: How to delete geo-replication session? >>>>> Local Time: August 1, 2017 12:15 PM >>>>> UTC Time: August 1, 2017 10:15 AM >>>>> From: m...@protonmail.ch >>>>> To: Gluster Users >>>>> [](mailto:gluster-users@gluster.org) >>>>> Hi, >>>>> I would like to delete a geo-replication session on my GluterFS 3.8.11 >>>>> replicat 2 volume in order to re-create it. 
Unfortunately the "delete" >>>>> command does not work as you can see below: >>>>> $ sudo gluster volume geo-replication myvolume >>>>> gfs1geo.domain.tld::myvolume-geo delete >>>>> Staging failed on arbiternode.domain.tld. Error: Geo-replication session >>>>> between myvolume and arbiternode.domain.tld::myvolume-geo does not exist. >>>>> geo-replication command failed >>>>> I also tried with "force" but no luck here either: >>>>> $ sudo gluster volume geo-replication myvolume >>>>> gfs1geo.domain.tld::myvolume-geo delete force >>>>> Usage: volume geo-replication [] [] {create >>>>> [[ssh-port n] [[no-verify]|[push-pem]]] [force]|start [force]|stop >>>>> [force]|pause [force]|resume [force]|config|status [detail]|delete >>>>> [reset-sync-time]} [options...] >>>>> So how can I delete my geo-replication session manually? >>>>> Mind that I do not want to reset-sync-time, I would like to delete it and >>>>> re-create it so that it continues to geo replicate where it left from. >>>>> Thanks, >>>>> M. >>>> >>>> ___ >>>> Gluster-users mailing list >>>> Gluster-users@gluster.org >>>> >>>> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
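Filled in with the master volume and slave used earlier in this thread, the steps listed above would look roughly like this when run from one of the master nodes:

gluster system:: execute gsec_create
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo create push-pem force
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo start force
# after which stop and delete behave normally again:
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo stop
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete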
[Gluster-users] Manually delete .glusterfs/changelogs directory ?
Hello, I just deleted (permanently) my geo-replication session using the following command: gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete and noticed that the .glusterfs/changelogs directory on my volume still exists. Is it safe to delete the whole directory myself with "rm -rf .glusterfs/changelogs" ? As far as I understand the CHANGELOG.* files are only needed for geo-replication, correct? Finally shouldn't the geo-replication delete command I used above delete these files automatically for me? Regards, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] gverify.sh purpose
Hi, When creating a geo-replication session, is the gverify.sh script actually run? Or is gverify.sh just an ad-hoc command to test manually whether creating a geo-replication session would succeed? Best, M.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] gverify.sh purpose
Thanks Saravana for your quick answer. I was wondering because I have an issue where on my master geo-rep cluster I run a self-compiled version of GlusterFS from git and on my slave geo-replication node I run an official version. As such the versions do not match and the gverify.sh script fails. I have posted a mail in the gluster-devel mailing list yesterday as I think this has more to do with compilation. > Original Message > Subject: Re: [Gluster-users] gverify.sh purpose > Local Time: August 21, 2017 10:39 AM > UTC Time: August 21, 2017 8:39 AM > From: sarum...@redhat.com > To: mabi , Gluster Users > > On Saturday 19 August 2017 02:05 AM, mabi wrote: > >> Hi, >> >> When creating a geo-replication session is the gverify.sh used or ran >> respectively? > > Yes, It is executed as part of geo-replication session creation. > >> or is gverify.sh just an ad-hoc command to test manually if creating a >> geo-replication creationg would succeed? > > No need to run separately > > ~ > Saravana___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] self-heal not working
Hi, I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is currently one file listed to be healed as you can see below but it never gets healed by the self-heal daemon: Brick node1.domain.tld:/data/myvolume/brick /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png Status: Connected Number of entries: 1 Brick node2.domain.tld:/data/myvolume/brick /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png Status: Connected Number of entries: 1 Brick node3.domain.tld:/srv/glusterfs/myvolume/brick /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png Status: Connected Number of entries: 1 As once recommended on this mailing list I have mounted that glusterfs volume temporarily through fuse/glusterfs and ran a "stat" on that file which is listed above but nothing happened. The file itself is available on all 3 nodes/bricks but on the last node it has a different date. By the way this file is 0 kBytes big. Is that maybe the reason why the self-heal does not work? And how can I now make this file heal? Thanks, Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
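For reference, the brick-side metadata of such a file is usually inspected with a stat plus a dump of its extended attributes on every brick; the paths below are taken from the heal info output above and the commands need to run as root:

stat /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
getfattr -d -m . /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
# on node3 the brick prefix is /srv/glusterfs/myvolume/brick instead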
Re: [Gluster-users] self-heal not working
Hi Ben, So it is really a 0 kBytes file everywhere (all nodes including the arbiter and from the client). Here below you will find the output you requested. Hopefully that will help to find out why this specific file is not healing... Let me know if you need any more information. Btw node3 is my arbiter node. NODE1: STAT: File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ Size: 0 Blocks: 38 IO Block: 131072 regular empty file Device: 24h/36d Inode: 10033884Links: 2 Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) Access: 2017-08-14 17:04:55.530681000 +0200 Modify: 2017-08-14 17:11:46.407404779 +0200 Change: 2017-08-14 17:11:46.407404779 +0200 Birth: - GETFATTR: trusted.afr.dirty=0sAQAA trusted.bit-rot.version=0sAgBZhuknAAlJAg== trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo= NODE2: STAT: File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ Size: 0 Blocks: 38 IO Block: 131072 regular empty file Device: 26h/38d Inode: 10031330Links: 2 Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) Access: 2017-08-14 17:04:55.530681000 +0200 Modify: 2017-08-14 17:11:46.403704181 +0200 Change: 2017-08-14 17:11:46.403704181 +0200 Birth: - GETFATTR: trusted.afr.dirty=0sAQAA trusted.bit-rot.version=0sAgBZhu6wAA8Hpw== trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE= NODE3: STAT: File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: ca11h/51729d Inode: 405208959 Links: 2 Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) Access: 2017-08-14 17:04:55.530681000 +0200 Modify: 2017-08-14 17:04:55.530681000 +0200 Change: 2017-08-14 17:11:46.604380051 +0200 Birth: - GETFATTR: trusted.afr.dirty=0sAQAA trusted.bit-rot.version=0sAgBZe6ejAAKPAg== trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4= CLIENT GLUSTER MOUNT: STAT: File: '/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png' Size: 0 Blocks: 0 IO Block: 131072 regular empty file Device: 1eh/30d Inode: 11897049013408443114 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) Access: 2017-08-14 17:04:55.530681000 +0200 Modify: 2017-08-14 17:11:46.407404779 +0200 Change: 2017-08-14 17:11:46.407404779 +0200 Birth: - > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 21, 2017 9:34 PM > UTC Time: August 21, 2017 7:34 PM > From: btur...@redhat.com > To: mabi > Gluster Users > > ----- Original Message - >> From: "mabi" >> To: "Gluster Users" >> Sent: Monday, August 21, 2017 9:28:24 AM >> Subject: [Gluster-users] self-heal not working >> >> Hi, >> >> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and there is >> currently one file listed to be healed as you can see below but never gets >> healed by the self-heal daemon: >> >> Brick node1.domain.tld:/data/myvolume/brick >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >> Status: Connected >> Number of entries: 1 >> >> Brick node2.domain.tld:/data/myvolume/brick >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >> Status: Connected >> Number of entries: 1 >> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >> Status: Connected >> Number 
of entries: 1 >> >> As once recommended on this mailing list I have mounted that glusterfs volume >> temporarily through fuse/glusterfs and ran a "stat" on that file which is >> listed above but nothing happened. >> >> The file itself is available on all 3 nodes/bricks but on the last node it >> has a different date. By the way this file is 0 kBytes big. Is that maybe >> the reason why the self-heal does not work? > > Is the file actually 0 bytes or is it just 0 bytes on the arbiter(0 bytes are > expected on the arbiter, it just stores metadata)? Can you send us the output > from stat on all 3 nodes: > > $ stat > $ getfattr -d -m - > $ stat > > Lets see what things look like on the back end, it should tell us why healing > is failing. > > -b > >> >> And how can I now make this file to heal? >> >> Thanks, >> Mabi >> >> >> >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
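For anyone hitting the same situation, a sketch of how the requested output can be gathered: run the commands on each node against the brick copy of the file (not the FUSE mount); the "-e hex" flag is optional but makes the xattr values easier to compare across bricks:

stat /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
getfattr -d -m . -e hex /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

On node3 the brick path is /srv/glusterfs/myvolume/brick instead.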
Re: [Gluster-users] self-heal not working
Sure, it doesn't look like a split brain based on the output: Brick node1.domain.tld:/data/myvolume/brick Status: Connected Number of entries in split-brain: 0 Brick node2.domain.tld:/data/myvolume/brick Status: Connected Number of entries in split-brain: 0 Brick node3.domain.tld:/srv/glusterfs/myvolume/brick Status: Connected Number of entries in split-brain: 0 > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 21, 2017 11:35 PM > UTC Time: August 21, 2017 9:35 PM > From: btur...@redhat.com > To: mabi > Gluster Users > > Can you also provide: > > gluster v heal info split-brain > > If it is split brain just delete the incorrect file from the brick and run > heal again. I haven"t tried this with arbiter but I assume the process is the > same. > > -b > > - Original Message - >> From: "mabi" >> To: "Ben Turner" >> Cc: "Gluster Users" >> Sent: Monday, August 21, 2017 4:55:59 PM >> Subject: Re: [Gluster-users] self-heal not working >> >> Hi Ben, >> >> So it is really a 0 kBytes file everywhere (all nodes including the arbiter >> and from the client). >> Here below you will find the output you requested. Hopefully that will help >> to find out why this specific file is not healing... Let me know if you need >> any more information. Btw node3 is my arbiter node. >> >> NODE1: >> >> STAT: >> File: >> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ >> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >> Device: 24h/36d Inode: 10033884 Links: 2 >> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >> Access: 2017-08-14 17:04:55.530681000 +0200 >> Modify: 2017-08-14 17:11:46.407404779 +0200 >> Change: 2017-08-14 17:11:46.407404779 +0200 >> Birth: - >> >> GETFATTR: >> trusted.afr.dirty=0sAQAA >> trusted.bit-rot.version=0sAgBZhuknAAlJAg== >> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== >> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo= >> >> NODE2: >> >> STAT: >> File: >> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ >> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >> Device: 26h/38d Inode: 10031330 Links: 2 >> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >> Access: 2017-08-14 17:04:55.530681000 +0200 >> Modify: 2017-08-14 17:11:46.403704181 +0200 >> Change: 2017-08-14 17:11:46.403704181 +0200 >> Birth: - >> >> GETFATTR: >> trusted.afr.dirty=0sAQAA >> trusted.bit-rot.version=0sAgBZhu6wAA8Hpw== >> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== >> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE= >> >> NODE3: >> STAT: >> File: >> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >> Device: ca11h/51729d Inode: 405208959 Links: 2 >> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >> Access: 2017-08-14 17:04:55.530681000 +0200 >> Modify: 2017-08-14 17:04:55.530681000 +0200 >> Change: 2017-08-14 17:11:46.604380051 +0200 >> Birth: - >> >> GETFATTR: >> trusted.afr.dirty=0sAQAA >> trusted.bit-rot.version=0sAgBZe6ejAAKPAg== >> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== >> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4= >> >> CLIENT GLUSTER MOUNT: >> STAT: >> File: >> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >> Access: (0644/-rw-r--r--) Uid: 
( 33/www-data) Gid: ( 33/www-data) >> Access: 2017-08-14 17:04:55.530681000 +0200 >> Modify: 2017-08-14 17:11:46.407404779 +0200 >> Change: 2017-08-14 17:11:46.407404779 +0200 >> Birth: - >> >> > Original Message >> > Subject: Re: [Gluster-users] self-heal not working >> > Local Time: August 21, 2017 9:34 PM >> > UTC Time: August 21, 2017 7:34 PM >> > From: btur...@redhat.com >> > To: mabi >> > Gluster Users >> > >> > - Original Message - >> >> From: "mabi" >> >> To: "Gluster Users" >> >> Sent: Monday, August 21, 2017 9:28:24 AM >> >> Subject: [Gluster-users] se
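The abbreviated command quoted above expands to the following for the volume in this thread:

gluster volume heal myvolume info split-brain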
Re: [Gluster-users] self-heal not working
Thanks for the additional hints, I have the following 2 questions first: - In order to launch the index heal is the following command correct: gluster volume heal myvolume - If I run a "volume start force" will it have any short disruptions on my clients which mount the volume through FUSE? If yes, how long? This is a production system that's why I am asking. > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 22, 2017 6:26 AM > UTC Time: August 22, 2017 4:26 AM > From: ravishan...@redhat.com > To: mabi , Ben Turner > Gluster Users > > Explore the following: > > - Launch index heal and look at the glustershd logs of all bricks for > possible errors > > - See if the glustershd in each node is connected to all bricks. > > - If not try to restart shd by `volume start force` > > - Launch index heal again and try. > > - Try debugging the shd log by setting client-log-level to DEBUG temporarily. > > On 08/22/2017 03:19 AM, mabi wrote: > >> Sure, it doesn't look like a split brain based on the output: >> >> Brick node1.domain.tld:/data/myvolume/brick >> Status: Connected >> Number of entries in split-brain: 0 >> >> Brick node2.domain.tld:/data/myvolume/brick >> Status: Connected >> Number of entries in split-brain: 0 >> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >> Status: Connected >> Number of entries in split-brain: 0 >> >>> Original Message ---- >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 21, 2017 11:35 PM >>> UTC Time: August 21, 2017 9:35 PM >>> From: btur...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> Can you also provide: >>> >>> gluster v heal info split-brain >>> >>> If it is split brain just delete the incorrect file from the brick and run >>> heal again. I haven"t tried this with arbiter but I assume the process is >>> the same. >>> >>> -b >>> >>> - Original Message - >>>> From: "mabi" [](mailto:m...@protonmail.ch) >>>> To: "Ben Turner" [](mailto:btur...@redhat.com) >>>> Cc: "Gluster Users" >>>> [](mailto:gluster-users@gluster.org) >>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>> Subject: Re: [Gluster-users] self-heal not working >>>> >>>> Hi Ben, >>>> >>>> So it is really a 0 kBytes file everywhere (all nodes including the arbiter >>>> and from the client). >>>> Here below you will find the output you requested. Hopefully that will help >>>> to find out why this specific file is not healing... Let me know if you >>>> need >>>> any more information. Btw node3 is my arbiter node. 
>>>> >>>> NODE1: >>>> >>>> STAT: >>>> File: >>>> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ >>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>> Birth: - >>>> >>>> GETFATTR: >>>> trusted.afr.dirty=0sAQAA >>>> trusted.bit-rot.version=0sAgBZhuknAAlJAg== >>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g== >>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo= >>>> >>>> NODE2: >>>> >>>> STAT: >>>> File: >>>> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’ >>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>> Birth: - >>>> >>>> GETFATTR: >>>> trusted.afr.dirty=0sAQAA >>>> trusted.bit-rot.version=0sAgBZhu6wAA8Hpw== >>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6
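For reference, the two commands being asked about, using the volume name from this thread; "start force" only starts volume processes that are not already running (for example a stopped glustershd) and leaves healthy bricks untouched:

gluster volume heal myvolume
gluster volume start myvolume force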
Re: [Gluster-users] self-heal not working
Yes, I have indeed a small test cluster with 3 Raspberry Pis but unfortunately I have other issues with that one. So I tried the "volume start force" which restarted glustershd on every nodes but nothing changed and running the heal does not do anything, there is still that one single file to be healed. Finally I would like to use your last suggestion to increase client-log-level to DEBUG. In order to do that would the following command be correct? gluster volume set myvolume diagnostics.client-log-level DEBUG > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 22, 2017 11:51 AM > UTC Time: August 22, 2017 9:51 AM > From: ravishan...@redhat.com > To: mabi > Ben Turner , Gluster Users > > On 08/22/2017 02:30 PM, mabi wrote: > >> Thanks for the additional hints, I have the following 2 questions first: >> >> - In order to launch the index heal is the following command correct: >> gluster volume heal myvolume > > Yes > >> - If I run a "volume start force" will it have any short disruptions on my >> clients which mount the volume through FUSE? If yes, how long? This is a >> production system that's why I am asking. > > No. You can actually create a test volume on your personal linux box to try > these kinds of things without needing multiple machines. This is how we > develop and test our patches :) > 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} force` > and so on. > > HTH, > Ravi > >>> Original Message >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 22, 2017 6:26 AM >>> UTC Time: August 22, 2017 4:26 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch), Ben Turner >>> [](mailto:btur...@redhat.com) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> Explore the following: >>> >>> - Launch index heal and look at the glustershd logs of all bricks for >>> possible errors >>> >>> - See if the glustershd in each node is connected to all bricks. >>> >>> - If not try to restart shd by `volume start force` >>> >>> - Launch index heal again and try. >>> >>> - Try debugging the shd log by setting client-log-level to DEBUG >>> temporarily. >>> >>> On 08/22/2017 03:19 AM, mabi wrote: >>> >>>> Sure, it doesn't look like a split brain based on the output: >>>> >>>> Brick node1.domain.tld:/data/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>> Brick node2.domain.tld:/data/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>>> Original Message >>>>> Subject: Re: [Gluster-users] self-heal not working >>>>> Local Time: August 21, 2017 11:35 PM >>>>> UTC Time: August 21, 2017 9:35 PM >>>>> From: btur...@redhat.com >>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>> Gluster Users >>>>> [](mailto:gluster-users@gluster.org) >>>>> >>>>> Can you also provide: >>>>> >>>>> gluster v heal info split-brain >>>>> >>>>> If it is split brain just delete the incorrect file from the brick and >>>>> run heal again. I haven"t tried this with arbiter but I assume the >>>>> process is the same. 
>>>>> >>>>> -b >>>>> >>>>> - Original Message - >>>>>> From: "mabi" [](mailto:m...@protonmail.ch) >>>>>> To: "Ben Turner" [](mailto:btur...@redhat.com) >>>>>> Cc: "Gluster Users" >>>>>> [](mailto:gluster-users@gluster.org) >>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> >>>>>> Hi Ben, >>>>>> >>>>>> So it is really a 0 kBytes file everywhere (all nodes including the >>>>>> arbiter >>>>>> and from the client). >>>>>> Here below you will find the output you requested. Hopefully tha
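A sketch of toggling the debug logging being discussed, assuming the default log level was in use beforehand; "reset" puts the option back to its default once the glustershd logs have been captured:

gluster volume set myvolume diagnostics.client-log-level DEBUG
gluster volume heal myvolume
gluster volume reset myvolume diagnostics.client-log-level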
Re: [Gluster-users] self-heal not working
I just saw the following bug which was fixed in 3.8.15: https://bugzilla.redhat.com/show_bug.cgi?id=1471613 Is it possible that the problem I described in this post is related to that bug? > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 22, 2017 11:51 AM > UTC Time: August 22, 2017 9:51 AM > From: ravishan...@redhat.com > To: mabi > Ben Turner , Gluster Users > > On 08/22/2017 02:30 PM, mabi wrote: > >> Thanks for the additional hints, I have the following 2 questions first: >> >> - In order to launch the index heal is the following command correct: >> gluster volume heal myvolume > > Yes > >> - If I run a "volume start force" will it have any short disruptions on my >> clients which mount the volume through FUSE? If yes, how long? This is a >> production system that's why I am asking. > > No. You can actually create a test volume on your personal linux box to try > these kinds of things without needing multiple machines. This is how we > develop and test our patches :) > 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} force` > and so on. > > HTH, > Ravi > >>> Original Message >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 22, 2017 6:26 AM >>> UTC Time: August 22, 2017 4:26 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch), Ben Turner >>> [](mailto:btur...@redhat.com) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> Explore the following: >>> >>> - Launch index heal and look at the glustershd logs of all bricks for >>> possible errors >>> >>> - See if the glustershd in each node is connected to all bricks. >>> >>> - If not try to restart shd by `volume start force` >>> >>> - Launch index heal again and try. >>> >>> - Try debugging the shd log by setting client-log-level to DEBUG >>> temporarily. >>> >>> On 08/22/2017 03:19 AM, mabi wrote: >>> >>>> Sure, it doesn't look like a split brain based on the output: >>>> >>>> Brick node1.domain.tld:/data/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>> Brick node2.domain.tld:/data/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>> Status: Connected >>>> Number of entries in split-brain: 0 >>>> >>>>> Original Message >>>>> Subject: Re: [Gluster-users] self-heal not working >>>>> Local Time: August 21, 2017 11:35 PM >>>>> UTC Time: August 21, 2017 9:35 PM >>>>> From: btur...@redhat.com >>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>> Gluster Users >>>>> [](mailto:gluster-users@gluster.org) >>>>> >>>>> Can you also provide: >>>>> >>>>> gluster v heal info split-brain >>>>> >>>>> If it is split brain just delete the incorrect file from the brick and >>>>> run heal again. I haven"t tried this with arbiter but I assume the >>>>> process is the same. >>>>> >>>>> -b >>>>> >>>>> - Original Message - >>>>>> From: "mabi" [](mailto:m...@protonmail.ch) >>>>>> To: "Ben Turner" [](mailto:btur...@redhat.com) >>>>>> Cc: "Gluster Users" >>>>>> [](mailto:gluster-users@gluster.org) >>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> >>>>>> Hi Ben, >>>>>> >>>>>> So it is really a 0 kBytes file everywhere (all nodes including the >>>>>> arbiter >>>>>> and from the client). >>>>>> Here below you will find the output you requested. Hopefully that will >>>>>> help >>>>>> to find out why this specific file is not healing... 
Let me know if you >>>>>> need >>>>>> any more information. Btw node3 is my arbiter node. >>>>>> >>>>>> NODE1: >>>>>> >
Re: [Gluster-users] self-heal not working
Hi Ravi, Did you get a chance to have a look at the log files I have attached in my last mail? Best, Mabi > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 24, 2017 12:08 PM > UTC Time: August 24, 2017 10:08 AM > From: m...@protonmail.ch > To: Ravishankar N > Ben Turner , Gluster Users > > Thanks for confirming the command. I have now enabled DEBUG client-log-level, > run a heal and then attached the glustershd log files of all 3 nodes in this > mail. > > The volume concerned is called myvol-pro, the other 3 volumes have no problem > so far. > > Also note that in the mean time it looks like the file has been deleted by > the user and as such the heal info command does not show the file name > anymore but just is GFID which is: > > gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea > > Hope that helps for debugging this issue. > >> Original Message >> Subject: Re: [Gluster-users] self-heal not working >> Local Time: August 24, 2017 5:58 AM >> UTC Time: August 24, 2017 3:58 AM >> From: ravishan...@redhat.com >> To: mabi >> Ben Turner , Gluster Users >> >> Unlikely. In your case only the afr.dirty is set, not the >> afr.volname-client-xx xattr. >> >> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is right. >> >> On 08/23/2017 10:31 PM, mabi wrote: >> >>> I just saw the following bug which was fixed in 3.8.15: >>> >>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>> >>> Is it possible that the problem I described in this post is related to that >>> bug? >>> >>>> Original Message >>>> Subject: Re: [Gluster-users] self-heal not working >>>> Local Time: August 22, 2017 11:51 AM >>>> UTC Time: August 22, 2017 9:51 AM >>>> From: ravishan...@redhat.com >>>> To: mabi [](mailto:m...@protonmail.ch) >>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>> Users [](mailto:gluster-users@gluster.org) >>>> >>>> On 08/22/2017 02:30 PM, mabi wrote: >>>> >>>>> Thanks for the additional hints, I have the following 2 questions first: >>>>> >>>>> - In order to launch the index heal is the following command correct: >>>>> gluster volume heal myvolume >>>> >>>> Yes >>>> >>>>> - If I run a "volume start force" will it have any short disruptions on >>>>> my clients which mount the volume through FUSE? If yes, how long? This is >>>>> a production system that's why I am asking. >>>> >>>> No. You can actually create a test volume on your personal linux box to >>>> try these kinds of things without needing multiple machines. This is how >>>> we develop and test our patches :) >>>> 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} >>>> force` and so on. >>>> >>>> HTH, >>>> Ravi >>>> >>>>>> Original Message >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 22, 2017 6:26 AM >>>>>> UTC Time: August 22, 2017 4:26 AM >>>>>> From: ravishan...@redhat.com >>>>>> To: mabi [](mailto:m...@protonmail.ch), Ben Turner >>>>>> [](mailto:btur...@redhat.com) >>>>>> Gluster Users >>>>>> [](mailto:gluster-users@gluster.org) >>>>>> >>>>>> Explore the following: >>>>>> >>>>>> - Launch index heal and look at the glustershd logs of all bricks for >>>>>> possible errors >>>>>> >>>>>> - See if the glustershd in each node is connected to all bricks. >>>>>> >>>>>> - If not try to restart shd by `volume start force` >>>>>> >>>>>> - Launch index heal again and try. >>>>>> >>>>>> - Try debugging the shd log by setting client-log-level to DEBUG >>>>>> temporarily. 
>>>>>> >>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>> >>>>>>> Sure, it doesn't look like a split brain based on the output: >>>>>>> >>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>> Status: Connected >>>>
Re: [Gluster-users] self-heal not working
Thanks Ravi for your analysis. So as far as I understand nothing to worry about but my question now would be: how do I get rid of this file from the heal info? > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 27, 2017 3:45 PM > UTC Time: August 27, 2017 1:45 PM > From: ravishan...@redhat.com > To: mabi > Ben Turner , Gluster Users > > Yes, the shds did pick up the file for healing (I saw messages like " got > entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards. > > Anyway I reproduced it by manually setting the afr.dirty bit for a zero byte > file on all 3 bricks. Since there are no afr pending xattrs indicating > good/bad copies and all files are zero bytes, the data self-heal algorithm > just picks the file with the latest ctime as source. In your case that was > the arbiter brick. In the code, there is a check to prevent data heals if > arbiter is the source. So heal was not happening and the entries were not > removed from heal-info output. > > Perhaps we should add a check in the code to just remove the entries from > heal-info if size is zero bytes in all bricks. > > -Ravi > > On 08/25/2017 06:33 PM, mabi wrote: > >> Hi Ravi, >> >> Did you get a chance to have a look at the log files I have attached in my >> last mail? >> >> Best, >> Mabi >> >>> Original Message >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 24, 2017 12:08 PM >>> UTC Time: August 24, 2017 10:08 AM >>> From: m...@protonmail.ch >>> To: Ravishankar N [](mailto:ravishan...@redhat.com) >>> Ben Turner [](mailto:btur...@redhat.com), Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> Thanks for confirming the command. I have now enabled DEBUG >>> client-log-level, run a heal and then attached the glustershd log files of >>> all 3 nodes in this mail. >>> >>> The volume concerned is called myvol-pro, the other 3 volumes have no >>> problem so far. >>> >>> Also note that in the mean time it looks like the file has been deleted by >>> the user and as such the heal info command does not show the file name >>> anymore but just is GFID which is: >>> >>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea >>> >>> Hope that helps for debugging this issue. >>> >>>> Original Message >>>> Subject: Re: [Gluster-users] self-heal not working >>>> Local Time: August 24, 2017 5:58 AM >>>> UTC Time: August 24, 2017 3:58 AM >>>> From: ravishan...@redhat.com >>>> To: mabi [](mailto:m...@protonmail.ch) >>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>> Users [](mailto:gluster-users@gluster.org) >>>> >>>> Unlikely. In your case only the afr.dirty is set, not the >>>> afr.volname-client-xx xattr. >>>> >>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is right. >>>> >>>> On 08/23/2017 10:31 PM, mabi wrote: >>>> >>>>> I just saw the following bug which was fixed in 3.8.15: >>>>> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>> >>>>> Is it possible that the problem I described in this post is related to >>>>> that bug? 
>>>>> >>>>>> Original Message >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 22, 2017 11:51 AM >>>>>> UTC Time: August 22, 2017 9:51 AM >>>>>> From: ravishan...@redhat.com >>>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>>>> Users [](mailto:gluster-users@gluster.org) >>>>>> >>>>>> On 08/22/2017 02:30 PM, mabi wrote: >>>>>> >>>>>>> Thanks for the additional hints, I have the following 2 questions first: >>>>>>> >>>>>>> - In order to launch the index heal is the following command correct: >>>>>>> gluster volume heal myvolume >>>>>> >>>>>> Yes >>>>>> >>>>>>> - If I run a "volume start force" will it have any short disruptions on >>>>>>> my clients which mount the volume through FUSE? If yes, how long? This >>
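Ravi's explanation can be checked directly on the bricks: with only the dirty flag set and no per-client pending xattrs, the heal source is picked by ctime, which here was newest on the arbiter. A sketch of the inspection commands, using the brick paths from earlier in the thread (the file has since been deleted, so this is only illustrative):

getfattr -d -m trusted.afr -e hex /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
stat -c '%n ctime=%z size=%s' /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png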
Re: [Gluster-users] self-heal not working
Excuse me for my naive questions but how do I reset the afr.dirty xattr on the file to be healed? and do I need to do that through a FUSE mount? or simply on every bricks directly? > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 28, 2017 5:58 AM > UTC Time: August 28, 2017 3:58 AM > From: ravishan...@redhat.com > To: Ben Turner , mabi > Gluster Users > > On 08/28/2017 01:57 AM, Ben Turner wrote: >> - Original Message - >>> From: "mabi" >>> To: "Ravishankar N" >>> Cc: "Ben Turner" , "Gluster Users" >>> >>> Sent: Sunday, August 27, 2017 3:15:33 PM >>> Subject: Re: [Gluster-users] self-heal not working >>> >>> Thanks Ravi for your analysis. So as far as I understand nothing to worry >>> about but my question now would be: how do I get rid of this file from the >>> heal info? >> Correct me if I am wrong but clearing this is just a matter of resetting the >> afr.dirty xattr? @Ravi - Is this correct? > > Yes resetting the xattr and launching index heal or running heal-info > command should serve as a workaround. > -Ravi > >> >> -b >> >>>> Original Message >>>> Subject: Re: [Gluster-users] self-heal not working >>>> Local Time: August 27, 2017 3:45 PM >>>> UTC Time: August 27, 2017 1:45 PM >>>> From: ravishan...@redhat.com >>>> To: mabi >>>> Ben Turner , Gluster Users >>>> >>>> Yes, the shds did pick up the file for healing (I saw messages like " got >>>> entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards. >>>> >>>> Anyway I reproduced it by manually setting the afr.dirty bit for a zero >>>> byte file on all 3 bricks. Since there are no afr pending xattrs >>>> indicating good/bad copies and all files are zero bytes, the data >>>> self-heal algorithm just picks the file with the latest ctime as source. >>>> In your case that was the arbiter brick. In the code, there is a check to >>>> prevent data heals if arbiter is the source. So heal was not happening and >>>> the entries were not removed from heal-info output. >>>> >>>> Perhaps we should add a check in the code to just remove the entries from >>>> heal-info if size is zero bytes in all bricks. >>>> >>>> -Ravi >>>> >>>> On 08/25/2017 06:33 PM, mabi wrote: >>>> >>>>> Hi Ravi, >>>>> >>>>> Did you get a chance to have a look at the log files I have attached in my >>>>> last mail? >>>>> >>>>> Best, >>>>> Mabi >>>>> >>>>>> Original Message >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 24, 2017 12:08 PM >>>>>> UTC Time: August 24, 2017 10:08 AM >>>>>> From: m...@protonmail.ch >>>>>> To: Ravishankar N >>>>>> [](mailto:ravishan...@redhat.com) >>>>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>>>> Users [](mailto:gluster-users@gluster.org) >>>>>> >>>>>> Thanks for confirming the command. I have now enabled DEBUG >>>>>> client-log-level, run a heal and then attached the glustershd log files >>>>>> of all 3 nodes in this mail. >>>>>> >>>>>> The volume concerned is called myvol-pro, the other 3 volumes have no >>>>>> problem so far. >>>>>> >>>>>> Also note that in the mean time it looks like the file has been deleted >>>>>> by the user and as such the heal info command does not show the file >>>>>> name anymore but just is GFID which is: >>>>>> >>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea >>>>>> >>>>>> Hope that helps for debugging this issue. 
>>>>>> >>>>>>> Original Message ---- >>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>> Local Time: August 24, 2017 5:58 AM >>>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>>> From: ravishan...@redhat.com >>>>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>>>> Ben Turner [](mailto:btur
Re: [Gluster-users] self-heal not working
Thank you for the command. I ran it on all my nodes and now finally the the self-heal daemon does not report any files to be healed. Hopefully this scenario can get handled properly in newer versions of GlusterFS. > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 28, 2017 10:41 AM > UTC Time: August 28, 2017 8:41 AM > From: ravishan...@redhat.com > To: mabi > Ben Turner , Gluster Users > > On 08/28/2017 01:29 PM, mabi wrote: > >> Excuse me for my naive questions but how do I reset the afr.dirty xattr on >> the file to be healed? and do I need to do that through a FUSE mount? or >> simply on every bricks directly? > > Directly on the bricks: `setfattr -n trusted.afr.dirty -v > 0x > /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png` > -Ravi > >>> Original Message >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 28, 2017 5:58 AM >>> UTC Time: August 28, 2017 3:58 AM >>> From: ravishan...@redhat.com >>> To: Ben Turner [](mailto:btur...@redhat.com), mabi >>> [](mailto:m...@protonmail.ch) >>> Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> On 08/28/2017 01:57 AM, Ben Turner wrote: >>>> - Original Message - >>>>> From: "mabi" [](mailto:m...@protonmail.ch) >>>>> To: "Ravishankar N" >>>>> [](mailto:ravishan...@redhat.com) >>>>> Cc: "Ben Turner" [](mailto:btur...@redhat.com), >>>>> "Gluster Users" >>>>> [](mailto:gluster-users@gluster.org) >>>>> Sent: Sunday, August 27, 2017 3:15:33 PM >>>>> Subject: Re: [Gluster-users] self-heal not working >>>>> >>>>> Thanks Ravi for your analysis. So as far as I understand nothing to worry >>>>> about but my question now would be: how do I get rid of this file from the >>>>> heal info? >>>> Correct me if I am wrong but clearing this is just a matter of resetting >>>> the afr.dirty xattr? @Ravi - Is this correct? >>> >>> Yes resetting the xattr and launching index heal or running heal-info >>> command should serve as a workaround. >>> -Ravi >>> >>>> >>>> -b >>>> >>>>>> Original Message >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 27, 2017 3:45 PM >>>>>> UTC Time: August 27, 2017 1:45 PM >>>>>> From: ravishan...@redhat.com >>>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>>>> Users [](mailto:gluster-users@gluster.org) >>>>>> >>>>>> Yes, the shds did pick up the file for healing (I saw messages like " got >>>>>> entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards. >>>>>> >>>>>> Anyway I reproduced it by manually setting the afr.dirty bit for a zero >>>>>> byte file on all 3 bricks. Since there are no afr pending xattrs >>>>>> indicating good/bad copies and all files are zero bytes, the data >>>>>> self-heal algorithm just picks the file with the latest ctime as source. >>>>>> In your case that was the arbiter brick. In the code, there is a check to >>>>>> prevent data heals if arbiter is the source. So heal was not happening >>>>>> and >>>>>> the entries were not removed from heal-info output. >>>>>> >>>>>> Perhaps we should add a check in the code to just remove the entries from >>>>>> heal-info if size is zero bytes in all bricks. >>>>>> >>>>>> -Ravi >>>>>> >>>>>> On 08/25/2017 06:33 PM, mabi wrote: >>>>>> >>>>>>> Hi Ravi, >>>>>>> >>>>>>> Did you get a chance to have a look at the log files I have attached in >>>>>>> my >>>>>>> last mail? 
>>>>>>> >>>>>>> Best, >>>>>>> Mabi >>>>>>> >>>>>>>> Original Message >>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>
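Putting the workaround together, a minimal sketch: clear the dirty flag on every brick that still carries it, then trigger an index heal. The 24-zero value is the usual all-zero 12-byte AFR changelog value (an assumption based on the standard xattr format); paths and volume name are the ones used earlier in this thread:

setfattr -n trusted.afr.dirty -v 0x000000000000000000000000 /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
gluster volume heal myvolume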
Re: [Gluster-users] self-heal not working
As suggested I have now opened a bug on bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1486063 > Original Message > Subject: Re: [Gluster-users] self-heal not working > Local Time: August 28, 2017 4:29 PM > UTC Time: August 28, 2017 2:29 PM > From: ravishan...@redhat.com > To: mabi > Ben Turner , Gluster Users > > Great, can you raise a bug for the issue so that it is easier to keep track > (plus you'll be notified if the patch is posted) of it? The general > guidelines are @ > https://gluster.readthedocs.io/en/latest/Contributors-Guide/Bug-Reporting-Guidelines > but you just need to provide whatever you described in this email thread in > the bug: > > i.e. volume info, heal info, getfattr and stat output of the file in question. > > Thanks! > Ravi > > On 08/28/2017 07:49 PM, mabi wrote: > >> Thank you for the command. I ran it on all my nodes and now finally the the >> self-heal daemon does not report any files to be healed. Hopefully this >> scenario can get handled properly in newer versions of GlusterFS. >> >>> Original Message >>> Subject: Re: [Gluster-users] self-heal not working >>> Local Time: August 28, 2017 10:41 AM >>> UTC Time: August 28, 2017 8:41 AM >>> From: ravishan...@redhat.com >>> To: mabi [](mailto:m...@protonmail.ch) >>> Ben Turner [](mailto:btur...@redhat.com), Gluster Users >>> [](mailto:gluster-users@gluster.org) >>> >>> On 08/28/2017 01:29 PM, mabi wrote: >>> >>>> Excuse me for my naive questions but how do I reset the afr.dirty xattr on >>>> the file to be healed? and do I need to do that through a FUSE mount? or >>>> simply on every bricks directly? >>> >>> Directly on the bricks: `setfattr -n trusted.afr.dirty -v >>> 0x >>> /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png` >>> -Ravi >>> >>>>> Original Message >>>>> Subject: Re: [Gluster-users] self-heal not working >>>>> Local Time: August 28, 2017 5:58 AM >>>>> UTC Time: August 28, 2017 3:58 AM >>>>> From: ravishan...@redhat.com >>>>> To: Ben Turner [](mailto:btur...@redhat.com), mabi >>>>> [](mailto:m...@protonmail.ch) >>>>> Gluster Users >>>>> [](mailto:gluster-users@gluster.org) >>>>> >>>>> On 08/28/2017 01:57 AM, Ben Turner wrote: >>>>>> - Original Message - >>>>>>> From: "mabi" [](mailto:m...@protonmail.ch) >>>>>>> To: "Ravishankar N" >>>>>>> [](mailto:ravishan...@redhat.com) >>>>>>> Cc: "Ben Turner" [](mailto:btur...@redhat.com), >>>>>>> "Gluster Users" >>>>>>> [](mailto:gluster-users@gluster.org) >>>>>>> Sent: Sunday, August 27, 2017 3:15:33 PM >>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>> >>>>>>> Thanks Ravi for your analysis. So as far as I understand nothing to >>>>>>> worry >>>>>>> about but my question now would be: how do I get rid of this file from >>>>>>> the >>>>>>> heal info? >>>>>> Correct me if I am wrong but clearing this is just a matter of resetting >>>>>> the afr.dirty xattr? @Ravi - Is this correct? >>>>> >>>>> Yes resetting the xattr and launching index heal or running heal-info >>>>> command should serve as a workaround. 
>>>>> -Ravi >>>>> >>>>>> >>>>>> -b >>>>>> >>>>>>>> Original Message >>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>> Local Time: August 27, 2017 3:45 PM >>>>>>>> UTC Time: August 27, 2017 1:45 PM >>>>>>>> From: ravishan...@redhat.com >>>>>>>> To: mabi [](mailto:m...@protonmail.ch) >>>>>>>> Ben Turner [](mailto:btur...@redhat.com), Gluster >>>>>>>> Users [](mailto:gluster-users@gluster.org) >>>>>>>> >>>>>>>> Yes, the shds did pick up the file for healing (I saw messages like " >>>>>>>> got >>>>>>>> entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea")
Re: [Gluster-users] Manually delete .glusterfs/changelogs directory ?
Hi, has anyone any advice to give about my question below? Thanks! > Original Message > Subject: Manually delete .glusterfs/changelogs directory ? > Local Time: August 16, 2017 5:59 PM > UTC Time: August 16, 2017 3:59 PM > From: m...@protonmail.ch > To: Gluster Users > > Hello, > > I just deleted (permanently) my geo-replication session using the following > command: > > gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo > delete > > and noticed that the .glusterfs/changelogs on my volume still exists. Is it > safe to delete the whole directly myself with "rm -rf .glusterfs/changelogs" > ? As far as I understand the CHANGELOG.* files are only needed for > geo-replication, correct? > > Finally shouldn't the geo-replication delete command I used above delete > these files automatically for me? > > Regards, > Mabi___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Manually delete .glusterfs/changelogs directory ?
Hi Everton, Thanks for your tip regarding the "reset-sync-time". I understand now that I should have used this additional parameter in order to get rid of the CHANGELOG files. I will now manually delete them from all bricks. Also I have noticed the following 3 geo-replication related volume parameters are still set on my volume: changelog.changelog: on geo-replication.ignore-pid-check: on geo-replication.indexing: on I will also remove them manually. Best, M. > Original Message > Subject: Re: [Gluster-users] Manually delete .glusterfs/changelogs directory ? > Local Time: August 31, 2017 8:56 AM > UTC Time: August 31, 2017 6:56 AM > From: broglia...@gmail.com > To: mabi > Gluster Users > > Hi Mabi, > If you will not use that geo-replication volume session again, I believe it > is safe to manually delete the files in the brick directory using rm -rf. > > However, the gluster documentation specifies that if the session is to be > permanently deleted, this is the command to use: > gluster volume geo-replication gv1 snode1::gv2 delete reset-sync-time > > https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Geo%20Replication/#deleting-the-session > > Regards, > Everton Brogliatto > > On Thu, Aug 31, 2017 at 12:15 AM, mabi wrote: > >> Hi, has anyone any advice to give about my question below? Thanks! >> >>> Original Message >>> Subject: Manually delete .glusterfs/changelogs directory ? >>> Local Time: August 16, 2017 5:59 PM >>> UTC Time: August 16, 2017 3:59 PM >>> From: m...@protonmail.ch >>> To: Gluster Users >>> >>> Hello, >>> >>> I just deleted (permanently) my geo-replication session using the following >>> command: >>> >>> gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo >>> delete >>> >>> and noticed that the .glusterfs/changelogs on my volume still exists. Is it >>> safe to delete the whole directly myself with "rm -rf >>> .glusterfs/changelogs" ? As far as I understand the CHANGELOG.* files are >>> only needed for geo-replication, correct? >>> >>> Finally shouldn't the geo-replication delete command I used above delete >>> these files automatically for me? >>> >>> Regards, >>> Mabi >> >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
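For future reference, a sketch of the full cleanup for a permanently deleted session, based on the documentation link above and the leftover options listed in this mail; whether all three options can simply be reset once the session is gone is an assumption:

gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete reset-sync-time
gluster volume reset myvolume changelog.changelog
gluster volume reset myvolume geo-replication.ignore-pid-check
gluster volume reset myvolume geo-replication.indexing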
Re: [Gluster-users] [Gluster-devel] Gluster Health Report tool
Hi Aravinda, Very nice initiative, thank you very much! As a small recommendation it would be nice to have a "nagios/icinga" mode, maybe through a "-n" parameter which will do the health check and output the status in a nagios/icinga-compatible format. As such this tool could be directly used by nagios for monitoring. Best, M. > Original Message > Subject: [Gluster-devel] Gluster Health Report tool > Local Time: October 25, 2017 2:11 PM > UTC Time: October 25, 2017 12:11 PM > From: avish...@redhat.com > To: Gluster Devel , gluster-users > > > Hi, > > We started a new project to identify issues/misconfigurations in > Gluster nodes. This project is very young and not yet ready for > Production use, Feedback on the existing reports and ideas for more > Reports are welcome. > > This tool needs to run in every Gluster node to detect the local > issues (Example: Parsing log files, checking disk space etc) in each > Nodes. But some of the reports use Gluster CLI to identify the issues > Which can be run in any one node.(For example > gluster-health-report --run-only glusterd-peer-disconnect) > > Install > > sudo pip install gluster-health-report > > Usage > > Run gluster-health-report --help for help > > gluster-health-report > > Example output is available here > https://github.com/aravindavk/gluster-health-report > > Project Details > > - Issue page: https://github.com/gluster/glusterfs/issues/313 > > - Project page: https://github.com/aravindavk/gluster-health-report > > - Open new issue if you have new report suggestion or found issue with > existing report > https://github.com/aravindavk/gluster-health-report/issues > > -- > > regards > Aravinda VK > > --- > > Gluster-devel mailing list > gluster-de...@gluster.org > http://lists.gluster.org/mailman/listinfo/gluster-devel___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] GlusterFS 4.1.9 Debian stretch packages missing
Hello, I would like to upgrade my GlusterFS 4.1.8 cluster to 4.1.9 on my Debian stretch nodes. Unfortunately the packages are missing as you can see here: https://download.gluster.org/pub/gluster/glusterfs/4.1/4.1.9/Debian/stretch/amd64/apt/ As far as I know GlusterFS 4.1 is not yet EOL so I don't understand why the packages are missing... Maybe an error? Could someone please check? Thank you very much in advance. Best, M. ___ Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] GlusterFS FUSE client on BSD
Hello, Is there a way to mount a GlusterFS volume using FUSE on a BSD machine such as OpenBSD? If not, what is the alternative? I guess NFS? Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
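If NFS turns out to be the way to go, a rough sketch, assuming Gluster's built-in NFSv3 server (or NFS-Ganesha) is enabled for the volume and reachable from the BSD host; the OpenBSD mount_nfs flags are an assumption and may need adjusting:

On a Gluster node, make sure NFS is enabled for the volume:
gluster volume set myvolume nfs.disable off

On the OpenBSD client, mount over NFSv3/TCP from any Gluster node:
mount_nfs -T gfs1.domain.tld:/myvolume /mnt/myvolume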
Re: [Gluster-users] Announcing Gluster release 5.11
Dear Hari, Nearly 10 days after your announcement unfortunately the 5.11 Debian stretch packages are still missing: https://download.gluster.org/pub/gluster/glusterfs/5/5.11/Debian/stretch/amd64/apt/pool/main/g/glusterfs/ Do you know when they will be available? or has this maybe been forgotten? Thank you very much in advance. Best regards, Mabi ‐‐‐ Original Message ‐‐‐ On Wednesday, December 18, 2019 4:56 AM, Hari Gowtham wrote: > Hi, > > The Gluster community is pleased to announce the release of Gluster > 5.11 (packages available at [1]). > > Release notes for the release can be found at [2]. > > Major changes, features and limitations addressed in this release: > None > > Thanks, > Gluster community > > [1] Packages for 5.11: > https://download.gluster.org/pub/gluster/glusterfs/5/5.11/ > > [2] Release notes for 5.11: > https://docs.gluster.org/en/latest/release-notes/5.11/ > > -- > Regards, > Hari Gowtham. Community Meeting Calendar: APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST Bridge: https://bluejeans.com/441850968 NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Announcing Gluster release 5.11
Thank you very much for your fast response and for adding the missing Debian packages. ‐‐‐ Original Message ‐‐‐ On Friday, December 27, 2019 10:36 AM, Shwetha Acharya wrote: > Hi Mabi, > > Glusterfs 5.11 Debian amd64 stretch packages are now available. > > Regards, > Shwetha > > On Fri, Dec 27, 2019 at 1:37 PM mabi wrote: > >> Dear Hari, >> >> Nearly 10 days after your announcement unfortunately the 5.11 Debian stretch >> packages are still missing: >> >> https://download.gluster.org/pub/gluster/glusterfs/5/5.11/Debian/stretch/amd64/apt/pool/main/g/glusterfs/ >> >> Do you know when they will be available? or has this maybe been forgotten? >> >> Thank you very much in advance. >> >> Best regards, >> Mabi >> >> ‐‐‐ Original Message ‐‐‐ >> On Wednesday, December 18, 2019 4:56 AM, Hari Gowtham >> wrote: >> >>> Hi, >>> >>> The Gluster community is pleased to announce the release of Gluster >>> 5.11 (packages available at [1]). >>> >>> Release notes for the release can be found at [2]. >>> >>> Major changes, features and limitations addressed in this release: >>> None >>> >>> Thanks, >>> Gluster community >>> >>> [1] Packages for 5.11: >>> https://download.gluster.org/pub/gluster/glusterfs/5/5.11/ >>> >>> [2] Release notes for 5.11: >>> https://docs.gluster.org/en/latest/release-notes/5.11/ >>> >>> -- >>> Regards, >>> Hari Gowtham. >> >> >> >> Community Meeting Calendar: >> >> APAC Schedule - >> Every 2nd and 4th Tuesday at 11:30 AM IST >> Bridge: https://bluejeans.com/441850968 >> >> NA/EMEA Schedule - >> Every 1st and 3rd Tuesday at 01:00 PM EDT >> Bridge: https://bluejeans.com/441850968 >> >> Gluster-users mailing list >> Gluster-users@gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users Community Meeting Calendar: APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST Bridge: https://bluejeans.com/441850968 NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] writing to fuse device failed: No such file or directory
Hello, On the FUSE clients of my GlusterFS 5.11 two-node replica + arbiter setup I see quite a lot of the following error message, repeatedly: [2020-03-02 14:12:40.297690] E [fuse-bridge.c:219:check_and_dump_fuse_W] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f93d5c13cfe] (--> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x789a)[0x7f93d331989a] (--> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x7c33)[0x7f93d3319c33] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7f93d4e8f4a4] (--> /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f93d46ead0f] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory Both the server and clients are Debian 9. What exactly does this error message mean? And is it normal, or what should I do to fix it? Regards, Mabi Community Meeting Calendar: Schedule - Every Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] writing to fuse device failed: No such file or directory
‐‐‐ Original Message ‐‐‐ On Tuesday, March 3, 2020 6:11 AM, Hari Gowtham wrote: > I checked on the backport and found that this patch hasn't yet been > backported to any of the release branches. > If this is the fix, it would be great to have them backported for the next > release. Thanks to everyone who responded to my post. Now I wanted to ask if the fix to this bug will also be backported to GlusterFS 5? and if yes, will it be available in the next GlusterFS version 5.13? Community Meeting Calendar: Schedule - Every Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] writing to fuse device failed: No such file or directory
Hello, Now that GlusterFS 5.13 has been released, could someone let me know if this issue (see mail below) has been fixed in 5.13? Thanks and regards, Mabi ‐‐‐ Original Message ‐‐‐ On Monday, March 2, 2020 3:17 PM, mabi wrote: > Hello, > > On the FUSE clients of my GlusterFS 5.11 two-node replica+arbitrer I see > quite a lot of the following error message repeatedly: > > [2020-03-02 14:12:40.297690] E [fuse-bridge.c:219:check_and_dump_fuse_W] (--> > /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f93d5c13cfe] > (--> > /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x789a)[0x7f93d331989a] > (--> > /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x7c33)[0x7f93d3319c33] > (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7f93d4e8f4a4] (--> > /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f93d46ead0f] ) > 0-glusterfs-fuse: writing to fuse device failed: No such file or directory > > Both the server and clients are Debian 9. > > What exactly does this error message mean? And is it normal? or what should I > do to fix that? > > Regards, > Mabi Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] writing to fuse device failed: No such file or directory
Dear Artem, Thank you for your answer. If you still see these errors messages with GlusterFS 5.13 I suppose then that this bug fix has not been backported to 5.x. Could someone of the dev team please confirm? It was said on this list that this bug fix would be back ported to 5.x, so I am a bit surprised. Best regards, Mabi ‐‐‐ Original Message ‐‐‐ On Monday, May 4, 2020 9:57 PM, Artem Russakovskii wrote: > I'm on 5.13, and these are the only error messages I'm still seeing (after > downgrading from the failed v7 update): > > [2020-05-04 19:56:29.391121] E [fuse-bridge.c:219:check_and_dump_fuse_W] (--> > /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] (--> > /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] (--> > /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] (--> > /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> > /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ) 0-glusterfs-fuse: writing > to fuse device failed: No such file or directory > [2020-05-04 19:56:29.400541] E [fuse-bridge.c:219:check_and_dump_fuse_W] (--> > /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] (--> > /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] (--> > /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] (--> > /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> > /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ) 0-glusterfs-fuse: writing > to fuse device failed: No such file or directory > > Sincerely, > Artem > > -- > Founder, [Android Police](http://www.androidpolice.com), [APK > Mirror](http://www.apkmirror.com/), Illogical Robot LLC > [beerpla.net](http://beerpla.net/) | [@ArtemR](http://twitter.com/ArtemR) > > On Mon, May 4, 2020 at 5:46 AM mabi wrote: > >> Hello, >> >> Now that GlusterFS 5.13 has been released, could someone let me know if this >> issue (see mail below) has been fixed in 5.13? >> >> Thanks and regards, >> Mabi >> >> ‐‐‐ Original Message ‐‐‐ >> On Monday, March 2, 2020 3:17 PM, mabi wrote: >> >>> Hello, >>> >>> On the FUSE clients of my GlusterFS 5.11 two-node replica+arbitrer I see >>> quite a lot of the following error message repeatedly: >>> >>> [2020-03-02 14:12:40.297690] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>> (--> >>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f93d5c13cfe] >>> (--> >>> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x789a)[0x7f93d331989a] >>> (--> >>> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x7c33)[0x7f93d3319c33] >>> (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7f93d4e8f4a4] (--> >>> /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f93d46ead0f] ) >>> 0-glusterfs-fuse: writing to fuse device failed: No such file or directory >>> >>> Both the server and clients are Debian 9. >>> >>> What exactly does this error message mean? And is it normal? or what should >>> I do to fix that? >>> >>> Regards, >>> Mabi >> >> >> >> Community Meeting Calendar: >> >> Schedule - >> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC >> Bridge: https://bluejeans.com/441850968 >> >> Gluster-users mailing list >> Gluster-users@gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] writing to fuse device failed: No such file or directory
Dear Hari, Thank you for your answer. A few months ago when I reported this issue initially I was told that the fix would be backported to 5.x, at that time 5.x was not EOL. So I guess I should upgrade to 7 but reading this list it seems that version 7 has a few other open issues. Is it safe the use version 7 in production or should I better use version 6? And is it possible to upgrade from 5.11 directly to 7.5? Regards, Mabi ‐‐‐ Original Message ‐‐‐ On Tuesday, May 5, 2020 1:40 PM, Hari Gowtham wrote: > Hi, > > I don't see the above mentioned fix to be backported to any branch. > I have just cherry picked them for the release-6 and 7. > Release-5 has reached EOL and so, it won't have the fix. > Note: release 6 will have one more release and will be EOLed as well. > Release-8 is being worked on and it will have the fix as a part of the way > it's branched. > Once it gets merged, it should be available in the release-6 and 7. but I do > recommend switching from > the older branches to the newer ones (at least release-7 in this case). > > https://review.gluster.org/#/q/change:I510158843e4b1d482bdc496c2e97b1860dc1ba93 > > On Tue, May 5, 2020 at 11:52 AM mabi wrote: > >> Dear Artem, >> >> Thank you for your answer. If you still see these errors messages with >> GlusterFS 5.13 I suppose then that this bug fix has not been backported to >> 5.x. >> >> Could someone of the dev team please confirm? It was said on this list that >> this bug fix would be back ported to 5.x, so I am a bit surprised. >> >> Best regards, >> Mabi >> >> ‐‐‐ Original Message ‐‐‐ >> On Monday, May 4, 2020 9:57 PM, Artem Russakovskii >> wrote: >> >>> I'm on 5.13, and these are the only error messages I'm still seeing (after >>> downgrading from the failed v7 update): >>> >>> [2020-05-04 19:56:29.391121] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] >>> (--> >>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] >>> (--> >>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] >>> (--> /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> >>> /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ) 0-glusterfs-fuse: >>> writing to fuse device failed: No such file or directory >>> [2020-05-04 19:56:29.400541] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] >>> (--> >>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] >>> (--> >>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] >>> (--> /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> >>> /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ) 0-glusterfs-fuse: >>> writing to fuse device failed: No such file or directory >>> >>> Sincerely, >>> Artem >>> >>> -- >>> Founder, [Android Police](http://www.androidpolice.com), [APK >>> Mirror](http://www.apkmirror.com/), Illogical Robot LLC >>> [beerpla.net](http://beerpla.net/) | [@ArtemR](http://twitter.com/ArtemR) >>> >>> On Mon, May 4, 2020 at 5:46 AM mabi wrote: >>> >>>> Hello, >>>> >>>> Now that GlusterFS 5.13 has been released, could someone let me know if >>>> this issue (see mail below) has been fixed in 5.13? 
>>>> >>>> Thanks and regards, >>>> Mabi >>>> >>>> ‐‐‐ Original Message ‐‐‐ >>>> On Monday, March 2, 2020 3:17 PM, mabi wrote: >>>> >>>>> Hello, >>>>> >>>>> On the FUSE clients of my GlusterFS 5.11 two-node replica+arbitrer I see >>>>> quite a lot of the following error message repeatedly: >>>>> >>>>> [2020-03-02 14:12:40.297690] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>>>> (--> >>>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f93d5c13cfe] >>>>> (--> >>>>> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x789a)[0x7f93d331989a] >>>>> (--> >>>>> /usr/lib/x86_64-linux-gnu/glusterfs/5.11/xlator/mount/fuse.so(+0x7c33)[0x7f93d3319c33] >>>>> (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7f93d4e8f4a4] (--> >>>>> /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0
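For anyone following this thread: since the Gerrit change-Id is known, it is possible to check on which release branches the fix has actually landed by searching the commit messages in a clone of the source tree. A sketch; the GitHub mirror URL and branch names are assumptions based on the usual GlusterFS layout:

# clone the source tree (or use an existing checkout)
git clone https://github.com/gluster/glusterfs.git
cd glusterfs

# Gerrit keeps the Change-Id in the commit message, so grep for it on the
# branches of interest; no output for a branch means the fix is not there yet
for branch in origin/release-6 origin/release-7; do
    echo "== $branch"
    git log --oneline "$branch" --grep='I510158843e4b1d482bdc496c2e97b1860dc1ba93'
done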
Re: [Gluster-users] writing to fuse device failed: No such file or directory
Hi everyone, So because upgrading introduces additional problems, does this means I should stick with 5.x even if it is EOL? Or what is a "safe" version to upgrade to? Regards, Mabi ‐‐‐ Original Message ‐‐‐ On Wednesday, May 6, 2020 2:44 AM, Artem Russakovskii wrote: > Hi Hari, > > Hmm, given how poorly our migration from 5.13 to 7.5 went, I am not sure how > I'd move forward with what you suggested at this point. > > Sincerely, > Artem > > -- > Founder, [Android Police](http://www.androidpolice.com), [APK > Mirror](http://www.apkmirror.com/), Illogical Robot LLC > [beerpla.net](http://beerpla.net/) | [@ArtemR](http://twitter.com/ArtemR) > > On Tue, May 5, 2020 at 4:41 AM Hari Gowtham wrote: > >> Hi, >> >> I don't see the above mentioned fix to be backported to any branch. >> I have just cherry picked them for the release-6 and 7. >> Release-5 has reached EOL and so, it won't have the fix. >> Note: release 6 will have one more release and will be EOLed as well. >> Release-8 is being worked on and it will have the fix as a part of the way >> it's branched. >> Once it gets merged, it should be available in the release-6 and 7. but I do >> recommend switching from >> the older branches to the newer ones (at least release-7 in this case). >> >> https://review.gluster.org/#/q/change:I510158843e4b1d482bdc496c2e97b1860dc1ba93 >> >> On Tue, May 5, 2020 at 11:52 AM mabi wrote: >> >>> Dear Artem, >>> >>> Thank you for your answer. If you still see these errors messages with >>> GlusterFS 5.13 I suppose then that this bug fix has not been backported to >>> 5.x. >>> >>> Could someone of the dev team please confirm? It was said on this list that >>> this bug fix would be back ported to 5.x, so I am a bit surprised. >>> >>> Best regards, >>> Mabi >>> >>> ‐‐‐ Original Message ‐‐‐ >>> On Monday, May 4, 2020 9:57 PM, Artem Russakovskii >>> wrote: >>> >>>> I'm on 5.13, and these are the only error messages I'm still seeing (after >>>> downgrading from the failed v7 update): >>>> >>>> [2020-05-04 19:56:29.391121] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] >>>> (--> >>>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] >>>> (--> >>>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] >>>> (--> /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> >>>> /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ) 0-glusterfs-fuse: >>>> writing to fuse device failed: No such file or directory >>>> [2020-05-04 19:56:29.400541] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f0f9a5f324d] >>>> (--> >>>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x849a)[0x7f0f969d649a] >>>> (--> >>>> /usr/lib64/glusterfs/5.13/xlator/mount/fuse.so(+0x87bb)[0x7f0f969d67bb] >>>> (--> /lib64/libpthread.so.0(+0x84f9)[0x7f0f99b434f9] (--> >>>> /lib64/libc.so.6(clone+0x3f)[0x7f0f9987bf2f] ))))) 0-glusterfs-fuse: >>>> writing to fuse device failed: No such file or directory >>>> >>>> Sincerely, >>>> Artem >>>> >>>> -- >>>> Founder, [Android Police](http://www.androidpolice.com), [APK >>>> Mirror](http://www.apkmirror.com/), Illogical Robot LLC >>>> [beerpla.net](http://beerpla.net/) | [@ArtemR](http://twitter.com/ArtemR) >>>> >>>> On Mon, May 4, 2020 at 5:46 AM mabi wrote: >>>> >>>>> Hello, >>>>> >>>>> Now that GlusterFS 5.13 has been released, could someone let me know if >>>>> this issue (see mail below) has been fixed in 5.13? 
>>>>> >>>>> Thanks and regards, >>>>> Mabi >>>>> >>>>> ‐‐‐ Original Message ‐‐‐ >>>>> On Monday, March 2, 2020 3:17 PM, mabi wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> On the FUSE clients of my GlusterFS 5.11 two-node replica+arbitrer I see >>>>>> quite a lot of the following error message repeatedly: >>>>>> >>>>>> [2020-03-02 14:12:40.297690] E [fuse-bridge.c:219:check_and_dump_fuse_W] >>
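For reference, whichever release is chosen in the end, the usual step after upgrading all nodes is to raise the cluster op-version so that features introduced by the new version can actually be used. A sketch; the final value is illustrative, use whatever max-op-version reports on your cluster:

# current and highest supported op-version of the cluster
gluster volume get all cluster.op-version
gluster volume get all cluster.max-op-version

# once ALL servers (and ideally all clients) run the new version, bump it, e.g.:
gluster volume set all cluster.op-version 70200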
[Gluster-users] Failed to get quota limits
Hello, I am running GlusterFS 3.10.7 and just noticed by doing a "gluster volume quota list" that my quotas on that volume are broken. The command returns no output and no errors but by looking in /var/log/glusterfs.cli I found the following errors: [2018-02-09 19:31:24.242324] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for 3df709ee-641d-46a2-bd61-889583e3033c [2018-02-09 19:31:24.249790] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for a27818fe-0248-40fe-bb23-d43d61010478 [2018-02-09 19:31:24.252378] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for daf97388-bcec-4cc0-a8ef-5b93f05b30f6 [2018-02-09 19:31:24.256775] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for 3c768b36-2625-4509-87ef-fe5214cb9b01 [2018-02-09 19:31:24.257434] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for f8cf47d4-4f54-43c5-ab0d-75b45b4677a3 [2018-02-09 19:31:24.259126] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for b4c81a39-2152-45c5-95d3-b796d88226fe [2018-02-09 19:31:24.261664] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for 16ac4cde-a5d4-451f-adcc-422a542fea24 [2018-02-09 19:31:24.261719] I [input.c:31:cli_batch] 0-: Exiting with: 0 How can I fix my quota on that volume again? I had around 30 quotas set on different directories of that volume. Thanks in advance. Regards, M. ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
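For reference, the configured limits can also be cross-checked directly on a brick, since with quota enabled each limited directory carries quota-related extended attributes. A sketch, using the brick path and directory name that appear later in this thread; the exact attribute names can differ between releases, so treat them as an assumption:

# run as root on one of the brick servers; dump all trusted.* xattrs of a limited directory
getfattr -d -m . -e hex /data/myvolume/brick/directory

# the interesting entries are the trusted.glusterfs.quota.* attributes,
# e.g. the configured limit and the accounted size of the directory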
Re: [Gluster-users] Failed to get quota limits
Would anyone be able to help me fix my quotas again? Thanks Original Message On February 9, 2018 8:35 PM, mabi wrote: >Hello, > > I am running GlusterFS 3.10.7 and just noticed by doing a "gluster volume > quota list" that my quotas on that volume are broken. The command > returns no output and no errors but by looking in /var/log/glusterfs.cli I > found the following errors: > > [2018-02-09 19:31:24.242324] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for 3df709ee-641d-46a2-bd61-889583e3033c > [2018-02-09 19:31:24.249790] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for a27818fe-0248-40fe-bb23-d43d61010478 > [2018-02-09 19:31:24.252378] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for daf97388-bcec-4cc0-a8ef-5b93f05b30f6 > [2018-02-09 19:31:24.256775] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for 3c768b36-2625-4509-87ef-fe5214cb9b01 > [2018-02-09 19:31:24.257434] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for f8cf47d4-4f54-43c5-ab0d-75b45b4677a3 > [2018-02-09 19:31:24.259126] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for b4c81a39-2152-45c5-95d3-b796d88226fe > [2018-02-09 19:31:24.261664] E > [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get > quota limits for 16ac4cde-a5d4-451f-adcc-422a542fea24 > [2018-02-09 19:31:24.261719] I [input.c:31:cli_batch] 0-: Exiting with: 0 > > How can I fix my quota on that volume again? I had around 30 quotas set on > different directories of that volume. > > Thanks in advance. > > Regards, > M. > >Gluster-users mailing list >Gluster-users@gluster.org >http://lists.gluster.org/mailman/listinfo/gluster-users > ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Failed to get quota limits
li_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for a27818fe-0248-40fe-bb23-d43d61010478 [2018-02-13 08:16:14.082067] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for daf97388-bcec-4cc0-a8ef-5b93f05b30f6 [2018-02-13 08:16:14.086929] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for 3c768b36-2625-4509-87ef-fe5214cb9b01 [2018-02-13 08:16:14.087905] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for f8cf47d4-4f54-43c5-ab0d-75b45b4677a3 [2018-02-13 08:16:14.089788] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for b4c81a39-2152-45c5-95d3-b796d88226fe [2018-02-13 08:16:14.092919] E [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get quota limits for 16ac4cde-a5d4-451f-adcc-422a542fea24 [2018-02-13 08:16:14.092980] I [input.c:31:cli_batch] 0-: Exiting with: 0 *** /var/log/glusterfs/bricks/data-myvolume-brick.log *** [2018-02-13 08:16:13.948065] I [addr.c:182:gf_auth] 0-/data/myvolume/brick: allowed = "*", received addr = "127.0.0.1" [2018-02-13 08:16:13.948105] I [login.c:76:gf_auth] 0-auth/login: allowed user names: bea3e634-e174-4bb3-a1d6-25b09d03b536 [2018-02-13 08:16:13.948125] I [MSGID: 115029] [server-handshake.c:695:server_setvolume] 0-myvolume-server: accepted client from gfs1a-14348-2018/02/13-08:16:09:933625-myvolume-client-0-0-0 (version: 3.10.7) [2018-02-13 08:16:14.022257] I [MSGID: 115036] [server.c:559:server_rpc_notify] 0-myvolume-server: disconnecting connection from gfs1a-14348-2018/02/13-08:16:09:933625-myvolume-client-0-0-0 [2018-02-13 08:16:14.022465] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-myvolume-server: Shutting down connection gfs1a-14348-2018/02/13-08:16:09:933625-myvolume-client-0-0-0 Original Message On February 13, 2018 12:47 AM, Hari Gowtham wrote: > Hi, > > Can you provide more information like, the volume configuration, quota.conf > file and the log files. > > On Sat, Feb 10, 2018 at 1:05 AM, mabi wrote: >> Hello, >> >> I am running GlusterFS 3.10.7 and just noticed by doing a "gluster volume >> quota list" that my quotas on that volume are broken. 
The command >> returns no output and no errors but by looking in /var/log/glusterfs.cli I >> found the following errors: >> >> [2018-02-09 19:31:24.242324] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for 3df709ee-641d-46a2-bd61-889583e3033c >> [2018-02-09 19:31:24.249790] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for a27818fe-0248-40fe-bb23-d43d61010478 >> [2018-02-09 19:31:24.252378] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for daf97388-bcec-4cc0-a8ef-5b93f05b30f6 >> [2018-02-09 19:31:24.256775] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for 3c768b36-2625-4509-87ef-fe5214cb9b01 >> [2018-02-09 19:31:24.257434] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for f8cf47d4-4f54-43c5-ab0d-75b45b4677a3 >> [2018-02-09 19:31:24.259126] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for b4c81a39-2152-45c5-95d3-b796d88226fe >> [2018-02-09 19:31:24.261664] E >> [cli-cmd-volume.c:1674:cli_cmd_quota_handle_list_all] 0-cli: Failed to get >> quota limits for 16ac4cde-a5d4-451f-adcc-422a542fea24 >> [2018-02-09 19:31:24.261719] I [input.c:31:cli_batch] 0-: Exiting with: 0 >> >> How can I fix my quota on that volume again? I had around 30 quotas set on >> different directories of that volume. >> >> Thanks in advance. >> >> Regards, >> M. >> ___ >> Gluster-users mailing list >> Gluster-users@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users > > -- > Regards, > Hari Gowtham.___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
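For anyone who needs to provide the same kind of information, the pieces Hari asked for (volume configuration, quota.conf and the logs) can be collected in one go. A sketch; the quota.conf location is the usual /var/lib/glusterd path, but verify it on your installation:

# volume configuration and status
gluster volume info myvolume > volume-info.txt
gluster volume status myvolume detail > volume-status.txt

# full logs plus the volume's (binary) quota.conf, bundled for attachment
tar czf gluster-quota-debug.tar.gz /var/log/glusterfs /var/lib/glusterd/vols/myvolume/quota.conf volume-info.txt volume-status.txt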
Re: [Gluster-users] Failed to get quota limits
Hi Hari, Sure no problem, I will send you in a minute another mail where you can download all the relevant log files including the quota.conf binary file. Let me know if you need anything else. In the mean time here below is the output of a volume status. Best regards, M. Status of volume: myvolume Gluster process TCP Port RDMA Port Online Pid -- Brick gfs1a.domain.local:/data/myvolume /brick 49153 0 Y 3214 Brick gfs1b.domain.local:/data/myvolume /brick 49154 0 Y 3256 Brick gfs1c.domain.local:/srv/glusterf s/myvolume/brick 49153 0 Y 515 Self-heal Daemon on localhost N/A N/AY 3186 Quota Daemon on localhost N/A N/AY 3195 Self-heal Daemon on gfs1b.domain.local N/A N/AY 3217 Quota Daemon on gfs1b.domain.local N/A N/AY 3229 Self-heal Daemon on gfs1c.domain.local N/A N/AY 486 Quota Daemon on gfs1c.domain.local N/A N/AY 495 Task Status of Volume myvolume -- There are no active volume tasks Original Message On February 13, 2018 10:09 AM, Hari Gowtham wrote: >Hi, > > A part of the log won't be enough to debug the issue. > Need the whole log messages till date. > You can send it as attachments. > > Yes the quota.conf is a binary file. > > And I need the volume status output too. > > On Tue, Feb 13, 2018 at 1:56 PM, mabi m...@protonmail.ch wrote: >>Hi Hari, >> Sorry for not providing you more details from the start. Here below you will >> find all the relevant log entries and info. Regarding the quota.conf file I >> have found one for my volume but it is a binary file. Is it supposed to be >> binary or text? >> Regards, >> M. >>*** gluster volume info myvolume *** >>Volume Name: myvolume >> Type: Replicate >> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 1 x (2 + 1) = 3 >> Transport-type: tcp >> Bricks: >> Brick1: gfs1a.domain.local:/data/myvolume/brick >> Brick2: gfs1b.domain.local:/data/myvolume/brick >> Brick3: gfs1c.domain.local:/srv/glusterfs/myvolume/brick (arbiter) >> Options Reconfigured: >> server.event-threads: 4 >> client.event-threads: 4 >> performance.readdir-ahead: on >> nfs.disable: on >> features.quota: on >> features.inode-quota: on >> features.quota-deem-statfs: on >> transport.address-family: inet >>*** /var/log/glusterfs/glusterd.log *** >>[2018-02-13 08:16:09.929568] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] >> 0-management: Lazy unmount of /var/run/gluster/myvolume_quota_list/ >> [2018-02-13 08:16:28.596527] I [MSGID: 106499] >> [glusterd-handler.c:4363:__glusterd_handle_status_volume] 0-management: >> Received status volume req for volume myvolume >> [2018-02-13 08:16:28.601097] I [MSGID: 106419] >> [glusterd-utils.c:6110:glusterd_add_inode_size_to_dict] 0-management: the >> brick on data/myvolume (zfs) uses dynamic inode sizes >>*** /var/log/glusterfs/cmd_history.log *** >>[2018-02-13 08:16:28.605478] : volume status myvolume detail : SUCCESS >>*** /var/log/glusterfs/quota-mount-myvolume.log *** >>[2018-02-13 08:16:09.934117] I [MSGID: 100030] [glusterfsd.c:2503:main] >> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10.7 >> (args: /usr/sbin/glusterfs --volfile-server localhost --volfile-id myvolume >> -l /var/log/glusterfs/quota-mount-myvolume.log -p >> /var/run/gluster/myvolume_quota_list.pid --client-pid -5 >> /var/run/gluster/myvolume_quota_list/) >> [2018-02-13 08:16:09.940432] I [MSGID: 101190] >> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with >> index 1 >> [2018-02-13 08:16:09.940491] E [socket.c:2327:socket_connect_finish] >> 0-glusterfs: connection 
to ::1:24007 failed (Connection refused); >> disconnecting socket >> [2018-02-13 08:16:09.940519] I [glusterfsd-mgmt.c:2134:mgmt_rpc_notify] >> 0-glusterfsd-mgmt: disconnected from remote-host: localhost >> [2018-02-13 08:16:13.943827] I [afr.c:94:fix_quorum_options] >> 0-myvolume-replicate-0: reindeer: incoming qtype = none >> [2018-02-13 08:16:13.943857] I [afr.c:116:fix_quorum_options] >> 0-myvolume-replicate-0: reindeer: quorum_count = 2147483647 >> [2018-02-13 08:16:13.945194] I [MSGID: 101190] >> [event-epoll
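Regarding the binary quota.conf mentioned above: it cannot be read as text, but it can at least be inspected to see whether it holds roughly the expected number of entries. A sketch; my understanding is that after a version header each limit is stored as a 16-byte GFID (plus a type byte on newer releases), but treat that layout as an assumption:

# size and a quick hex view of the quota configuration file
ls -l /var/lib/glusterd/vols/myvolume/quota.conf
hexdump -C /var/lib/glusterd/vols/myvolume/quota.conf | head -n 20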
Re: [Gluster-users] Failed to get quota limits
Thank you for your answer. This problem seem to have started since last week, so should I also send you the same log files but for last week? I think logrotate rotates them on a weekly basis. The only two quota commands we use are the following: gluster volume quota myvolume limit-usage /directory 10GB gluster volume quota myvolume list basically to set a new quota or to list the current quotas. The quota list was working in the past yes but we already had a similar issue where the quotas disappeared last August 2017: http://lists.gluster.org/pipermail/gluster-users/2017-August/031946.html In the mean time the only thing we did is to upgrade from 3.8 to 3.10. There are actually no errors to be seen using any gluster commands. The "quota myvolume list" returns simply nothing. In order to lookup the directories should I run a "stat" on them? and if yes should I do that on a client through the fuse mount? Original Message On February 13, 2018 10:58 AM, Hari Gowtham wrote: >The log provided are from 11th, you have seen the issue a while before > that itself. > > The logs help us to know if something has actually went wrong. > once something goes wrong the output might get affected and i need to know > what > went wrong. Which means i need the log from the beginning. > > and i need to know a few more things, > Was the quota list command was working as expected at the beginning? > If yes, what were the commands issued, before you noticed this problem. > Is there any other error that you see other than this? > > And can you try looking up the directories the limits are set on and > check if that fixes the error? > >> Original Message >> On February 13, 2018 10:44 AM, mabi m...@protonmail.ch wrote: >>>Hi Hari, >>>Sure no problem, I will send you in a minute another mail where you can >>>download all the relevant log files including the quota.conf binary file. >>>Let me know if you need anything else. In the mean time here below is the >>>output of a volume status. >>>Best regards, >>> M. >>>Status of volume: myvolume >>> Gluster process TCP Port RDMA Port Online Pid >>>Brick gfs1a.domain.local:/data/myvolume >>> /brick 49153 0 Y 3214 >>> Brick gfs1b.domain.local:/data/myvolume >>> /brick 49154 0 Y 3256 >>> Brick gfs1c.domain.local:/srv/glusterf >>> s/myvolume/brick 49153 0 Y 515 >>> Self-heal Daemon on localhost N/A N/AY >>> 3186 >>> Quota Daemon on localhost N/A N/AY >>> 3195 >>> Self-heal Daemon on gfs1b.domain.local N/A N/AY 3217 >>> Quota Daemon on gfs1b.domain.local N/A N/AY 3229 >>> Self-heal Daemon on gfs1c.domain.local N/A N/AY 486 >>> Quota Daemon on gfs1c.domain.local N/A N/AY 495 >>>Task Status of Volume myvolume >>>There are no active volume tasks >>> Original Message >>> On February 13, 2018 10:09 AM, Hari Gowtham hgowt...@redhat.com wrote: >>>>Hi, >>>> A part of the log won't be enough to debug the issue. >>>> Need the whole log messages till date. >>>> You can send it as attachments. >>>> Yes the quota.conf is a binary file. >>>> And I need the volume status output too. >>>> On Tue, Feb 13, 2018 at 1:56 PM, mabi m...@protonmail.ch wrote: >>>>>Hi Hari, >>>>> Sorry for not providing you more details from the start. Here below you >>>>> will >>>>> find all the relevant log entries and info. Regarding the quota.conf file >>>>> I >>>>> have found one for my volume but it is a binary file. Is it supposed to be >>>>> binary or text? >>>>> Regards, >>>>> M. 
>>>>> *** gluster volume info myvolume *** >>>>> Volume Name: myvolume >>>>> Type: Replicate >>>>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5 >>>>> Status: Started >>>>> Snapshot Count: 0 >>>>> Number of Bricks: 1 x (2 + 1) = 3 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: gfs1a.domain.local:/data/myvolume/brick >>>>> Brick2: gfs1b.domain.local:/data/myvolume/brick >>>>> Brick3: gfs1c.domain.local:/srv/glusterfs/myvolume/br
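A sketch of the directory lookup being discussed here, done from a client through a FUSE mount; the mount point and the directory names are illustrative:

# mount the volume over FUSE on a client, if it is not mounted already
mount -t glusterfs gfs1a.domain.local:/myvolume /mnt/myvolume

# trigger a lookup on every directory that has a quota limit set
for d in /mnt/myvolume/directory1 /mnt/myvolume/directory2; do
    stat "$d" > /dev/null && echo "looked up $d"
done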
Re: [Gluster-users] Failed to get quota limits
I tried to set the limits as you suggest by running the following command. $ sudo gluster volume quota myvolume limit-usage /directory 200GB volume quota : success but then when I list the quotas there is still nothing, so nothing really happened. I also tried to run stat on all directories which have a quota but nothing happened either. I will send you tomorrow all the other logfiles as requested. Original Message On February 13, 2018 12:20 PM, Hari Gowtham wrote: >Were you able to set new limits after seeing this error? > > On Tue, Feb 13, 2018 at 4:19 PM, Hari Gowtham hgowt...@redhat.com wrote: >>Yes, I need the log files in that duration, the log rotated file after >> hitting the >> issue aren't necessary, but the ones before hitting the issues are needed >> (not just when you hit it, the ones even before you hit it). >>Yes, you have to do a stat from the client through fuse mount. >>On Tue, Feb 13, 2018 at 3:56 PM, mabi m...@protonmail.ch wrote: >>>Thank you for your answer. This problem seem to have started since last >>>week, so should I also send you the same log files but for last week? I >>>think logrotate rotates them on a weekly basis. >>>The only two quota commands we use are the following: >>>gluster volume quota myvolume limit-usage /directory 10GB >>> gluster volume quota myvolume list >>>basically to set a new quota or to list the current quotas. The quota list >>>was working in the past yes but we already had a similar issue where the >>>quotas disappeared last August 2017: >>>http://lists.gluster.org/pipermail/gluster-users/2017-August/031946.html >>>In the mean time the only thing we did is to upgrade from 3.8 to 3.10. >>>There are actually no errors to be seen using any gluster commands. The >>>"quota myvolume list" returns simply nothing. >>>In order to lookup the directories should I run a "stat" on them? and if yes >>>should I do that on a client through the fuse mount? >>> Original Message >>> On February 13, 2018 10:58 AM, Hari Gowtham hgowt...@redhat.com wrote: >>>>The log provided are from 11th, you have seen the issue a while before >>>> that itself. >>>>The logs help us to know if something has actually went wrong. >>>> once something goes wrong the output might get affected and i need to know >>>> what >>>> went wrong. Which means i need the log from the beginning. >>>>and i need to know a few more things, >>>> Was the quota list command was working as expected at the beginning? >>>> If yes, what were the commands issued, before you noticed this problem. >>>> Is there any other error that you see other than this? >>>>And can you try looking up the directories the limits are set on and >>>> check if that fixes the error? >>>>> Original Message >>>>> On February 13, 2018 10:44 AM, mabi m...@protonmail.ch wrote: >>>>>>Hi Hari, >>>>>> Sure no problem, I will send you in a minute another mail where you can >>>>>> download all the relevant log files including the quota.conf binary >>>>>> file. Let me know if you need anything else. In the mean time here below >>>>>> is the output of a volume status. >>>>>> Best regards, >>>>>> M. 
>>>>>> Status of volume: myvolume >>>>>> Gluster process TCP Port RDMA Port Online >>>>>> Pid >>>>>> Brick gfs1a.domain.local:/data/myvolume >>>>>> /brick 49153 0 Y 3214 >>>>>> Brick gfs1b.domain.local:/data/myvolume >>>>>> /brick 49154 0 Y 3256 >>>>>> Brick gfs1c.domain.local:/srv/glusterf >>>>>> s/myvolume/brick 49153 0 Y 515 >>>>>> Self-heal Daemon on localhost N/A N/AY >>>>>> 3186 >>>>>> Quota Daemon on localhost N/A N/AY >>>>>> 3195 >>>>>> Self-heal Daemon on gfs1b.domain.local N/A N/AY 3217 >>>>>> Quota Daemon on gfs1b.domain.local N/A N/AY 3229 >>>>>> Self-heal Daemon on gfs1c.domain.local N/A N/AY 486 >>>>>> Quota Daemon on gfs1c.domain.local N/A N/A
Re: [Gluster-users] Failed to get quota limits
Dear Hari, Thank you for getting back to me after having analysed the problem. As you said I tried to run "gluster volume quota list " for all of my directories which have a quota and found out that there was one directory quota which was missing (stale) as you can see below:

$ gluster volume quota myvolume list /demo.domain.tld
Path                Hard-limit  Soft-limit  Used   Available  Soft-limit exceeded?  Hard-limit exceeded?
---------------------------------------------------------------------------------------------------------
/demo.domain.tld    N/A         N/A         8.0MB  N/A        N/A                   N/A

So, as you suggested, I added the quota on that directory again and now the "list" finally works again and shows the quotas for every directory as I defined them. That did the trick! Now do you know if this bug is already corrected in a new release of GlusterFS? If not, do you know when it will be fixed? Again many thanks for your help here! Best regards, M. ‐‐‐ Original Message ‐‐‐ On February 23, 2018 7:45 AM, Hari Gowtham wrote: > > > Hi, > > There is a bug in 3.10 which doesn't allow the quota list command to > > output, if the last entry on the conf file is a stale entry. > > The workaround for this is to remove the stale entry at the end. (If > > the last two entries are stale then both have to be removed and so on > > until the last entry on the conf file is a valid entry). > > This can be avoided by adding a new limit. As the new limit you added > > didn't work there is another way to check this. > > Try quota list command with a specific limit mentioned in the command. > > gluster volume quota list > > Make sure this path and the limit are set. > > If this works then you need to clean up the last stale entry. > > If this doesn't work we need to look further. > > Thanks Sanoj for the guidance. > > On Wed, Feb 14, 2018 at 1:36 AM, mabi m...@protonmail.ch wrote: > > > I tried to set the limits as you suggest by running the following command. > > > > $ sudo gluster volume quota myvolume limit-usage /directory 200GB > > > > volume quota : success > > > > but then when I list the quotas there is still nothing, so nothing really > > happened. > > > > I also tried to run stat on all directories which have a quota but nothing > > happened either. > > > > I will send you tomorrow all the other logfiles as requested. > > > > -------- Original Message > > > > On February 13, 2018 12:20 PM, Hari Gowtham hgowt...@redhat.com wrote: > > > > > Were you able to set new limits after seeing this error? > > > > > > On Tue, Feb 13, 2018 at 4:19 PM, Hari Gowtham hgowt...@redhat.com wrote: > > > > > > > Yes, I need the log files in that duration, the log rotated file after > > > > > > > > hitting the > > > > > > > > issue aren't necessary, but the ones before hitting the issues are > > > > needed > > > > > > > > (not just when you hit it, the ones even before you hit it). > > > > > > > > Yes, you have to do a stat from the client through fuse mount. > > > > > > > > On Tue, Feb 13, 2018 at 3:56 PM, mabi m...@protonmail.ch wrote: > > > > > > > > > Thank you for your answer. This problem seem to have started since > > > > > last week, so should I also send you the same log files but for last > > > > > week? I think logrotate rotates them on a weekly basis. > > > > > > > > > > The only two quota commands we use are the following: > > > > > > > > > > gluster volume quota myvolume limit-usage /directory 10GB > > > > > > > > > > gluster volume quota myvolume list > > > > > > > > > > basically to set a new quota or to list the current quotas. 
The quota > > > > > list was working in the past yes but we already had a similar issue > > > > > where the quotas disappeared last August 2017: > > > > > > > > > > http://lists.gluster.org/pipermail/gluster-users/2017-August/031946.html > > > > > > > > > > In the mean time the only thing we did is to upgrade from 3.8 to 3.10. > > > > > > > > > > There are actually no errors to be seen using any gluster commands. > > > > > The "quota myvolume list" returns simply noth
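To spell out the workaround that solved this as a runnable sketch (the volume name and paths are the ones used in this thread; the 10GB value is illustrative):

# list each limited directory individually -- per-path listing still works even
# when the bare "list" prints nothing; the stale entry is the one that comes
# back with N/A in every column
for d in /demo.domain.tld /directory; do
    echo "== $d"
    gluster volume quota myvolume list "$d"
done

# re-adding the limit on the stale directory, as was done above, made the
# plain "gluster volume quota myvolume list" work again
gluster volume quota myvolume limit-usage /demo.domain.tld 10GB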
Re: [Gluster-users] Failed to get quota limits
Hi, Thanks for the link to the bug. We should be hopefully moving soon onto 3.12 so I guess this bug is also fixed there. Best regards, M. ‐‐‐ Original Message ‐‐‐ On February 27, 2018 9:38 AM, Hari Gowtham wrote: > > > Hi Mabi, > > The bugs is fixed from 3.11. For 3.10 it is yet to be backported and > > made available. > > The bug is https://bugzilla.redhat.com/show_bug.cgi?id=1418259. > > On Sat, Feb 24, 2018 at 4:05 PM, mabi m...@protonmail.ch wrote: > > > Dear Hari, > > > > Thank you for getting back to me after having analysed the problem. > > > > As you said I tried to run "gluster volume quota list " for > > all of my directories which have a quota and found out that there was one > > directory quota which was missing (stale) as you can see below: > > > > $ gluster volume quota myvolume list /demo.domain.tld > > > > Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit > > exceeded? > > > > > > -- > > > > /demo.domain.tld N/A N/A 8.0MB N/A N/A N/A > > > > So as you suggest I added again the quota on that directory and now the > > "list" finally works again and show the quotas for every directories as I > > defined them. That did the trick! > > > > Now do you know if this bug is already corrected in a new release of > > GlusterFS? if not do you know when it will be fixed? > > > > Again many thanks for your help here! > > > > Best regards, > > > > M. > > > > ‐‐‐ Original Message ‐‐‐ > > > > On February 23, 2018 7:45 AM, Hari Gowtham hgowt...@redhat.com wrote: > > > > > Hi, > > > > > > There is a bug in 3.10 which doesn't allow the quota list command to > > > > > > output, if the last entry on the conf file is a stale entry. > > > > > > The workaround for this is to remove the stale entry at the end. (If > > > > > > the last two entries are stale then both have to be removed and so on > > > > > > until the last entry on the conf file is a valid entry). > > > > > > This can be avoided by adding a new limit. As the new limit you added > > > > > > didn't work there is another way to check this. > > > > > > Try quota list command with a specific limit mentioned in the command. > > > > > > gluster volume quota list > > > > > > Make sure this path and the limit are set. > > > > > > If this works then you need to clean up the last stale entry. > > > > > > If this doesn't work we need to look further. > > > > > > Thanks Sanoj for the guidance. > > > > > > On Wed, Feb 14, 2018 at 1:36 AM, mabi m...@protonmail.ch wrote: > > > > > > > I tried to set the limits as you suggest by running the following > > > > command. > > > > > > > > $ sudo gluster volume quota myvolume limit-usage /directory 200GB > > > > > > > > volume quota : success > > > > > > > > but then when I list the quotas there is still nothing, so nothing > > > > really happened. > > > > > > > > I also tried to run stat on all directories which have a quota but > > > > nothing happened either. > > > > > > > > I will send you tomorrow all the other logfiles as requested. > > > > > > > > \-\-\-\-\-\-\-\- Original Message > > > > > > > > On February 13, 2018 12:20 PM, Hari Gowtham hgowt...@redhat.com wrote: > > > > > > > > > Were you able to set new limits after seeing this error? 
> > > > > > > > > > On Tue, Feb 13, 2018 at 4:19 PM, Hari Gowtham hgowt...@redhat.com > > > > > wrote: > > > > > > > > > > > Yes, I need the log files in that duration, the log rotated file > > > > > > after > > > > > > > > > > > > hitting the > > > > > > > > > > > > issue aren't necessary, but the ones before hitting the issues are > > > > > > needed > > > > > > > > > > > > (not just when you hit it, the ones even before you hit it). > > > > > > > > > > > > Yes, you have to