Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-16 Thread Kotresh Hiremath Ravishankar
Answers inline.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Gluster Users" <gluster-users@gluster.org>
> Sent: Thursday, April 13, 2017 8:51:29 PM
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
> 
> Hi Kotresh,
> 
> Thanks for your feedback.
> 
> So do you mean I can simply log in to the geo-replication slave node, mount
> the volume with FUSE, delete the problematic directory, and finally
> restart geo-replication?
> 
   Trying to delete the problematic directory on the slave might still fail with
   the same ENOTEMPTY error. Try it out; if it does not work, the directory needs
   to be deleted directly from the backend bricks on all the nodes.
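   As a rough illustration of what such a backend cleanup involves (all paths
   below are made up, not taken from this thread): stale DHT linkto files are
   zero-byte files whose only permission bit is the sticky bit (mode ---------T)
   and which carry the trusted.glusterfs.dht.linkto xattr. A sketch:

```shell
# Scratch directory standing in for a brick path such as
# /data/private-geo/brick (hypothetical; adjust to your layout).
BRICK=$(mktemp -d)

# A regular file, and a fake DHT linkto file: zero bytes, mode 1000 (---------T).
echo "data" > "$BRICK/Workhours_2017 empty.xls"
touch "$BRICK/stale-linkto.xls"
chmod 1000 "$BRICK/stale-linkto.xls"

# Candidate linkto files: zero-byte regular files with exactly the sticky bit set.
find "$BRICK" -type f -size 0 -perm 1000
```

   On a real brick, confirm the trusted.glusterfs.dht.linkto xattr with
   getfattr -d -m . -e hex <file> before removing anything, repeat on every
   brick of the slave volume, then remove the directory itself and restart
   geo-replication.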

> I am planning to migrate to 3.8 as soon as I have a backup (geo-replication).
> Is this issue with DHT fixed in the latest 3.8.x release?
>
   Most of the issues are addressed.

> Regards,
> M.
> 
> ---- Original Message ----
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> "(unreachable)")
> Local Time: April 13, 2017 7:57 AM
> UTC Time: April 13, 2017 5:57 AM
> From: khire...@redhat.com
> To: mabi <m...@protonmail.ch>
> Gluster Users <gluster-users@gluster.org>
> 
> Hi,
> 
> I think the directory Workhours_2017 was deleted on the master, and the
> deletion is failing on the slave because there might be stale linkto files
> on the backend. These issues are fixed in DHT in the latest versions, so
> upgrading would resolve them.
> 
> To work around the issue, you might need to clean up the problematic
> directory on the slave from the backend.
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "mabi" <m...@protonmail.ch>
> > To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> > Cc: "Gluster Users" <gluster-users@gluster.org>
> > Sent: Thursday, April 13, 2017 12:28:50 AM
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> >
> > Hi Kotresh,
> >
> > Thanks for your hint: adding the "--ignore-missing-args" option to rsync
> > and restarting geo-replication worked, but it only managed to sync
> > approximately 1/3 of the data before geo-replication went into the "Failed"
> > status this time. Now I have a different type of error, as you can see from
> > the log extract below from my geo-replication slave node:
> >
> > [2017-04-12 18:01:55.268923] I [MSGID: 109066]
> > [dht-rename.c:1574:dht_rename]
> > 0-myvol-private-geo-dht: renaming
> > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> > empty.xls.ocTransferId2118183895.part
> > (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) =>
> > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls
> > (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0)
> > [2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk]
> > 0-glusterfs-fuse: 4786:
> > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> > empty.xls.ocTransferId2118183895.part ->
> > /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1
> > (Directory not empty)
> > [2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc]
> > 0-fuse:
> > unmounting /tmp/gsyncd-aux-mount-PNSR8s
> > [2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit]
> > (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064]
> > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725]
> > -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-:
> > received signum (15), shutting down
> > [2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting
> > '/tmp/gsyncd-aux-mount-PNSR8s'.
> >
> > How can I now fix this issue and have geo-replication continue
> > synchronising?
> >
> > Best regards,
> > M.
> >
> > ---- Original Message ----
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> > Local Time: April 11, 2017 9:18 AM
> > UTC Time: April 11, 2017 7:18 AM
> > From: khire...@redhat.com
> > To: mabi <m...@protonmail.ch>
> > Gluster Users <gluster-users@gluster.org>
> >
> > Hi,
> >
> >

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-13 Thread mabi
Hi Kotresh,

Thanks for your feedback.

So do you mean I can simply log in to the geo-replication slave node, mount
the volume with FUSE, delete the problematic directory, and finally restart
geo-replication?

I am planning to migrate to 3.8 as soon as I have a backup (geo-replication). 
Is this issue with DHT fixed in the latest 3.8.x release?

Regards,
M.

---- Original Message ----
Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
"(unreachable)")
Local Time: April 13, 2017 7:57 AM
UTC Time: April 13, 2017 5:57 AM
From: khire...@redhat.com
To: mabi <m...@protonmail.ch>
Gluster Users <gluster-users@gluster.org>

Hi,

I think the directory Workhours_2017 was deleted on the master, and the
deletion is failing on the slave because there might be stale linkto files
on the backend. These issues are fixed in DHT in the latest versions, so
upgrading would resolve them.

To work around the issue, you might need to clean up the problematic
directory on the slave from the backend.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Gluster Users" <gluster-users@gluster.org>
> Sent: Thursday, April 13, 2017 12:28:50 AM
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
>
> Hi Kotresh,
>
> Thanks for your hint: adding the "--ignore-missing-args" option to rsync and
> restarting geo-replication worked, but it only managed to sync approximately
> 1/3 of the data before geo-replication went into the "Failed" status this
> time. Now I have a different type of error, as you can see from the log
> extract below from my geo-replication slave node:
>
> [2017-04-12 18:01:55.268923] I [MSGID: 109066] [dht-rename.c:1574:dht_rename]
> 0-myvol-private-geo-dht: renaming
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> empty.xls.ocTransferId2118183895.part
> (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) =>
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls
> (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0)
> [2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk]
> 0-glusterfs-fuse: 4786:
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> empty.xls.ocTransferId2118183895.part ->
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1
> (Directory not empty)
> [2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc] 0-fuse:
> unmounting /tmp/gsyncd-aux-mount-PNSR8s
> [2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit]
> (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064]
> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725]
> -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-:
> received signum (15), shutting down
> [2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting
> '/tmp/gsyncd-aux-mount-PNSR8s'.
>
> How can I now fix this issue and have geo-replication continue
> synchronising?
>
> Best regards,
> M.
>
> ---- Original Message ----
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> "(unreachable)")
> Local Time: April 11, 2017 9:18 AM
> UTC Time: April 11, 2017 7:18 AM
> From: khire...@redhat.com
> To: mabi <m...@protonmail.ch>
> Gluster Users <gluster-users@gluster.org>
>
> Hi,
>
> Then please set the following rsync config and let us know if it helps.
> 
> gluster vol geo-rep <mastervol> <slavehost>::<slavevol> config rsync-options
> "--ignore-missing-args"
>
> Thanks and Regards,
> Kotresh H R
>
> - Original Message -
> > From: "mabi" <m...@protonmail.ch>
> > To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> > Cc: "Gluster Users" <gluster-users@gluster.org>
> > Sent: Tuesday, April 11, 2017 2:15:54 AM
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> >
> > Hi Kotresh,
> >
> > I am using the official Debian 8 (jessie) package which has rsync version
> > 3.1.1.
> >
> > Regards,
> > M.
> >
> > ---- Original Message ----
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> > Local Time: April 10, 2017 6:33 AM
> > UTC Time: April 10, 2017 4:33 AM
> > From: khire...@redhat.com
> > To: mabi <m...@protonmail.ch>
> > Gluster Users <gluster-users@gluster.org>

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-12 Thread Kotresh Hiremath Ravishankar
Hi,

I think the directory Workhours_2017 was deleted on the master, and the
deletion is failing on the slave because there might be stale linkto files
on the backend. These issues are fixed in DHT in the latest versions, so
upgrading would resolve them.

To work around the issue, you might need to clean up the problematic
directory on the slave from the backend.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Gluster Users" <gluster-users@gluster.org>
> Sent: Thursday, April 13, 2017 12:28:50 AM
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
> 
> Hi Kotresh,
> 
> Thanks for your hint: adding the "--ignore-missing-args" option to rsync and
> restarting geo-replication worked, but it only managed to sync approximately
> 1/3 of the data before geo-replication went into the "Failed" status this
> time. Now I have a different type of error, as you can see from the log
> extract below from my geo-replication slave node:
> 
> [2017-04-12 18:01:55.268923] I [MSGID: 109066] [dht-rename.c:1574:dht_rename]
> 0-myvol-private-geo-dht: renaming
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> empty.xls.ocTransferId2118183895.part
> (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) =>
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls
> (hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0)
> [2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk]
> 0-glusterfs-fuse: 4786:
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017
> empty.xls.ocTransferId2118183895.part ->
> /.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1
> (Directory not empty)
> [2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc] 0-fuse:
> unmounting /tmp/gsyncd-aux-mount-PNSR8s
> [2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit]
> (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064]
> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725]
> -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-:
> received signum (15), shutting down
> [2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting
> '/tmp/gsyncd-aux-mount-PNSR8s'.
> 
> How can I now fix this issue and have geo-replication continue
> synchronising?
> 
> Best regards,
> M.
> 
> ---- Original Message ----
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> "(unreachable)")
> Local Time: April 11, 2017 9:18 AM
> UTC Time: April 11, 2017 7:18 AM
> From: khire...@redhat.com
> To: mabi <m...@protonmail.ch>
> Gluster Users <gluster-users@gluster.org>
> 
> Hi,
> 
> Then please set the following rsync config and let us know if it helps.
> 
> gluster vol geo-rep <mastervol> <slavehost>::<slavevol> config rsync-options
> "--ignore-missing-args"
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "mabi" <m...@protonmail.ch>
> > To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> > Cc: "Gluster Users" <gluster-users@gluster.org>
> > Sent: Tuesday, April 11, 2017 2:15:54 AM
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> >
> > Hi Kotresh,
> >
> > I am using the official Debian 8 (jessie) package which has rsync version
> > 3.1.1.
> >
> > Regards,
> > M.
> >
> > ---- Original Message ----
> > Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> > Local Time: April 10, 2017 6:33 AM
> > UTC Time: April 10, 2017 4:33 AM
> > From: khire...@redhat.com
> > To: mabi <m...@protonmail.ch>
> > Gluster Users <gluster-users@gluster.org>
> >
> > Hi Mabi,
> >
> > What's the rsync version being used?
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> > > From: "mabi" <m...@protonmail.ch>
> > > To: "Gluster Users" <gluster-users@gluster.org>
> > > Sent: Saturday, April 8, 2017 4:20:25 PM
> > > Subject: [Gluster-users] Geo replication stuck (rsync: link_stat
> > > "(unreachable)")
> > >
> > > Hello,
> > >
> > > I am using distributed geo replication with two of my GlusterFS 3.7.20
> > > r

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-12 Thread mabi
Hi Kotresh,

Thanks for your hint: adding the "--ignore-missing-args" option to rsync and
restarting geo-replication worked, but it only managed to sync approximately
1/3 of the data before geo-replication went into the "Failed" status this
time. Now I have a different type of error, as you can see from the log
extract below from my geo-replication slave node:

[2017-04-12 18:01:55.268923] I [MSGID: 109066] [dht-rename.c:1574:dht_rename] 
0-myvol-private-geo-dht: renaming 
/.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 
empty.xls.ocTransferId2118183895.part 
(hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0) => 
/.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls 
(hash=myvol-private-geo-client-0/cache=myvol-private-geo-client-0)
[2017-04-12 18:01:55.269842] W [fuse-bridge.c:1787:fuse_rename_cbk] 
0-glusterfs-fuse: 4786: 
/.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 
empty.xls.ocTransferId2118183895.part -> 
/.gfid/1678ff37-f708-4197-bed0-3ecd87ae1314/Workhours_2017 empty.xls => -1 
(Directory not empty)
[2017-04-12 18:01:55.314062] I [fuse-bridge.c:5016:fuse_thread_proc] 0-fuse: 
unmounting /tmp/gsyncd-aux-mount-PNSR8s
[2017-04-12 18:01:55.314311] W [glusterfsd.c:1251:cleanup_and_exit] 
(-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7f97d3129064] 
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f97d438a725] 
-->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x7f97d438a5a7] ) 0-: received 
signum (15), shutting down
[2017-04-12 18:01:55.314335] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting 
'/tmp/gsyncd-aux-mount-PNSR8s'.

How can I now fix this issue and have geo-replication continue
synchronising?

Best regards,
M.

---- Original Message ----
Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
"(unreachable)")
Local Time: April 11, 2017 9:18 AM
UTC Time: April 11, 2017 7:18 AM
From: khire...@redhat.com
To: mabi <m...@protonmail.ch>
Gluster Users <gluster-users@gluster.org>

Hi,

Then please set the following rsync config and let us know if it helps.

gluster vol geo-rep <mastervol> <slavehost>::<slavevol> config rsync-options
"--ignore-missing-args"

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Gluster Users" <gluster-users@gluster.org>
> Sent: Tuesday, April 11, 2017 2:15:54 AM
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
>
> Hi Kotresh,
>
> I am using the official Debian 8 (jessie) package which has rsync version
> 3.1.1.
>
> Regards,
> M.
>
> ---- Original Message ----
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> "(unreachable)")
> Local Time: April 10, 2017 6:33 AM
> UTC Time: April 10, 2017 4:33 AM
> From: khire...@redhat.com
> To: mabi <m...@protonmail.ch>
> Gluster Users <gluster-users@gluster.org>
>
> Hi Mabi,
>
> What's the rsync version being used?
>
> Thanks and Regards,
> Kotresh H R
>
> - Original Message -
> > From: "mabi" <m...@protonmail.ch>
> > To: "Gluster Users" <gluster-users@gluster.org>
> > Sent: Saturday, April 8, 2017 4:20:25 PM
> > Subject: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> >
> > Hello,
> >
> > I am using distributed geo replication with two of my GlusterFS 3.7.20
> > replicated volumes and just noticed that the geo replication for one volume
> > is not working anymore. It has been stuck since 2017-02-23 22:39, and I
> > tried to stop and restart geo replication, but it still stays stuck at that
> > specific date and time. Under the DATA field of the geo replication "status
> > detail" command I can see 3879 and the STATUS is "Active", but still
> > nothing happens. I noticed that the rsync process is running but does not
> > do anything, so I ran strace on the PID of rsync and saw the following:
> >
> > write(2, "rsync: link_stat \"(unreachable)/"..., 114
> >
> > It looks like rsync can't read or find a file and stays stuck on it. In the
> > geo-replication log files of the GlusterFS master I can't find any error
> > messages, just informational messages. For example, when I restart geo
> > replication I see the following log entries:
> >
> > [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] :
> > slave
> > bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}]
> > [2017-04-07 21:43:05.666435] I [monitor(monitor):468:d

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-11 Thread Kotresh Hiremath Ravishankar
Hi,

Then please set the following rsync config and let us know if it helps.

gluster vol geo-rep <mastervol> <slavehost>::<slavevol> config rsync-options
"--ignore-missing-args"

Thanks and Regards,
Kotresh H R
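
P.S. Spelling the above out with hypothetical names (mastervol for the master
volume, slavehost::slavevol for the slave session; substitute your own), the
full cycle would look roughly like this, since the new rsync option only takes
effect once the session is restarted:

```shell
# Hypothetical volume/host names; substitute your own.
gluster volume geo-replication mastervol slavehost::slavevol stop

# --ignore-missing-args makes rsync skip source files that vanished
# between the changelog scan and the transfer instead of aborting.
gluster volume geo-replication mastervol slavehost::slavevol config rsync-options "--ignore-missing-args"

gluster volume geo-replication mastervol slavehost::slavevol start

# Watch the session come back:
gluster volume geo-replication mastervol slavehost::slavevol status detail
```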

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Gluster Users" <gluster-users@gluster.org>
> Sent: Tuesday, April 11, 2017 2:15:54 AM
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
> 
> Hi Kotresh,
> 
> I am using the official Debian 8 (jessie) package which has rsync version
> 3.1.1.
> 
> Regards,
> M.
> 
> ---- Original Message ----
> Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat
> "(unreachable)")
> Local Time: April 10, 2017 6:33 AM
> UTC Time: April 10, 2017 4:33 AM
> From: khire...@redhat.com
> To: mabi <m...@protonmail.ch>
> Gluster Users <gluster-users@gluster.org>
> 
> Hi Mabi,
> 
> What's the rsync version being used?
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "mabi" <m...@protonmail.ch>
> > To: "Gluster Users" <gluster-users@gluster.org>
> > Sent: Saturday, April 8, 2017 4:20:25 PM
> > Subject: [Gluster-users] Geo replication stuck (rsync: link_stat
> > "(unreachable)")
> >
> > Hello,
> >
> > I am using distributed geo replication with two of my GlusterFS 3.7.20
> > replicated volumes and just noticed that the geo replication for one volume
> > is not working anymore. It has been stuck since 2017-02-23 22:39, and I
> > tried to stop and restart geo replication, but it still stays stuck at that
> > specific date and time. Under the DATA field of the geo replication "status
> > detail" command I can see 3879 and the STATUS is "Active", but still
> > nothing happens. I noticed that the rsync process is running but does not
> > do anything, so I ran strace on the PID of rsync and saw the following:
> >
> > write(2, "rsync: link_stat \"(unreachable)/"..., 114
> >
> > It looks like rsync can't read or find a file and stays stuck on it. In the
> > geo-replication log files of the GlusterFS master I can't find any error
> > messages, just informational messages. For example, when I restart geo
> > replication I see the following log entries:
> >
> > [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] :
> > slave
> > bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}]
> > [2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] :
> > worker specs: [('/data/private/brick', 'ssh:// root@gfs1geo.domain
> > :gluster://localhost:private-geo', '1', False)]
> > [2017-04-07 21:43:05.823931] I [monitor(monitor):267:monitor] Monitor:
> > 
> > [2017-04-07 21:43:05.824204] I [monitor(monitor):268:monitor] Monitor:
> > starting gsyncd worker
> > [2017-04-07 21:43:05.930124] I [gsyncd(/data/private/brick):733:main_i]
> > : syncing: gluster://localhost:private -> ssh:// root@gfs1geo.domain
> > :gluster://localhost:private-geo
> > [2017-04-07 21:43:05.931169] I [changelogagent(agent):73:__init__]
> > ChangelogAgent: Agent listining...
> > [2017-04-07 21:43:08.558648] I
> > [master(/data/private/brick):83:gmaster_builder] : setting up xsync
> > change detection mode
> > [2017-04-07 21:43:08.559071] I [master(/data/private/brick):367:__init__]
> > _GMaster: using 'rsync' as the sync engine
> > [2017-04-07 21:43:08.560163] I
> > [master(/data/private/brick):83:gmaster_builder] : setting up
> > changelog
> > change detection mode
> > [2017-04-07 21:43:08.560431] I [master(/data/private/brick):367:__init__]
> > _GMaster: using 'rsync' as the sync engine
> > [2017-04-07 21:43:08.561105] I
> > [master(/data/private/brick):83:gmaster_builder] : setting up
> > changeloghistory change detection mode
> > [2017-04-07 21:43:08.561391] I [master(/data/private/brick):367:__init__]
> > _GMaster: using 'rsync' as the sync engine
> > [2017-04-07 21:43:11.354417] I [master(/data/private/brick):1249:register]
> > _GMaster: xsync temp directory:
> > /var/lib/misc/glusterfsd/private/ssh%3A%2F%2Froot%40192.168.20.107%3Agluster%3A%2F%2F127.0.0.1%3Aprivate-geo/616931ac8f39da5dc5834f9d47fc7b1a/xsync
> > [2017-04-07 21:43:11.354751] I
> > [resource(/data/private/brick):1528:service_loop] GLUSTER: Register time:
> > 149160

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-10 Thread mabi
Hi Kotresh,

I am using the official Debian 8 (jessie) package which has rsync version 3.1.1.

Regards,
M.

---- Original Message ----
Subject: Re: [Gluster-users] Geo replication stuck (rsync: link_stat 
"(unreachable)")
Local Time: April 10, 2017 6:33 AM
UTC Time: April 10, 2017 4:33 AM
From: khire...@redhat.com
To: mabi <m...@protonmail.ch>
Gluster Users <gluster-users@gluster.org>

Hi Mabi,

What's the rsync version being used?

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Gluster Users" <gluster-users@gluster.org>
> Sent: Saturday, April 8, 2017 4:20:25 PM
> Subject: [Gluster-users] Geo replication stuck (rsync: link_stat 
> "(unreachable)")
>
> Hello,
>
> I am using distributed geo replication with two of my GlusterFS 3.7.20
> replicated volumes and just noticed that the geo replication for one volume
> is not working anymore. It has been stuck since 2017-02-23 22:39, and I
> tried to stop and restart geo replication, but it still stays stuck at that
> specific date and time. Under the DATA field of the geo replication "status
> detail" command I can see 3879 and the STATUS is "Active", but still
> nothing happens. I noticed that the rsync process is running but does not
> do anything, so I ran strace on the PID of rsync and saw the following:
>
> write(2, "rsync: link_stat \"(unreachable)/"..., 114
>
> It looks like rsync can't read or find a file and stays stuck on it. In the
> geo-replication log files of the GlusterFS master I can't find any error
> messages, just informational messages. For example, when I restart geo
> replication I see the following log entries:
>
> [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] : slave
> bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}]
> [2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] :
> worker specs: [('/data/private/brick', 'ssh:// root@gfs1geo.domain
> :gluster://localhost:private-geo', '1', False)]
> [2017-04-07 21:43:05.823931] I [monitor(monitor):267:monitor] Monitor:
> 
> [2017-04-07 21:43:05.824204] I [monitor(monitor):268:monitor] Monitor:
> starting gsyncd worker
> [2017-04-07 21:43:05.930124] I [gsyncd(/data/private/brick):733:main_i]
> : syncing: gluster://localhost:private -> ssh:// root@gfs1geo.domain
> :gluster://localhost:private-geo
> [2017-04-07 21:43:05.931169] I [changelogagent(agent):73:__init__]
> ChangelogAgent: Agent listining...
> [2017-04-07 21:43:08.558648] I
> [master(/data/private/brick):83:gmaster_builder] : setting up xsync
> change detection mode
> [2017-04-07 21:43:08.559071] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:08.560163] I
> [master(/data/private/brick):83:gmaster_builder] : setting up changelog
> change detection mode
> [2017-04-07 21:43:08.560431] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:08.561105] I
> [master(/data/private/brick):83:gmaster_builder] : setting up
> changeloghistory change detection mode
> [2017-04-07 21:43:08.561391] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:11.354417] I [master(/data/private/brick):1249:register]
> _GMaster: xsync temp directory:
> /var/lib/misc/glusterfsd/private/ssh%3A%2F%2Froot%40192.168.20.107%3Agluster%3A%2F%2F127.0.0.1%3Aprivate-geo/616931ac8f39da5dc5834f9d47fc7b1a/xsync
> [2017-04-07 21:43:11.354751] I
> [resource(/data/private/brick):1528:service_loop] GLUSTER: Register time:
> 1491601391
> [2017-04-07 21:43:11.357630] I [master(/data/private/brick):510:crawlwrap]
> _GMaster: primary master with volume id e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
> ...
> [2017-04-07 21:43:11.489355] I [master(/data/private/brick):519:crawlwrap]
> _GMaster: crawl interval: 1 seconds
> [2017-04-07 21:43:11.516710] I [master(/data/private/brick):1163:crawl]
> _GMaster: starting history crawl... turns: 1, stime: (1487885974, 0), etime:
> 1491601391
> [2017-04-07 21:43:12.607836] I [master(/data/private/brick):1192:crawl]
> _GMaster: slave's time: (1487885974, 0)
>
> Does anyone know how I can find out the root cause of this problem and make
> geo replication work again from the time point it got stuck?
>
> Many thanks in advance for your help.
>
> Best regards,
> Mabi
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Geo replication stuck (rsync: link_stat "(unreachable)")

2017-04-09 Thread Kotresh Hiremath Ravishankar
Hi Mabi,

What's the rsync version being used?

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "mabi" <m...@protonmail.ch>
> To: "Gluster Users" <gluster-users@gluster.org>
> Sent: Saturday, April 8, 2017 4:20:25 PM
> Subject: [Gluster-users] Geo replication stuck (rsync: link_stat  
> "(unreachable)")
> 
> Hello,
> 
> I am using distributed geo replication with two of my GlusterFS 3.7.20
> replicated volumes and just noticed that the geo replication for one volume
> is not working anymore. It has been stuck since 2017-02-23 22:39, and I
> tried to stop and restart geo replication, but it still stays stuck at that
> specific date and time. Under the DATA field of the geo replication "status
> detail" command I can see 3879 and the STATUS is "Active", but still
> nothing happens. I noticed that the rsync process is running but does not
> do anything, so I ran strace on the PID of rsync and saw the following:
> 
> write(2, "rsync: link_stat \"(unreachable)/"..., 114
> 
> It looks like rsync can't read or find a file and stays stuck on it. In the
> geo-replication log files of the GlusterFS master I can't find any error
> messages, just informational messages. For example, when I restart geo
> replication I see the following log entries:
> 
> [2017-04-07 21:43:05.664541] I [monitor(monitor):443:distribute] : slave
> bricks: [{'host': 'gfs1geo.domain', 'dir': '/data/private-geo/brick'}]
> [2017-04-07 21:43:05.666435] I [monitor(monitor):468:distribute] :
> worker specs: [('/data/private/brick', 'ssh:// root@gfs1geo.domain
> :gluster://localhost:private-geo', '1', False)]
> [2017-04-07 21:43:05.823931] I [monitor(monitor):267:monitor] Monitor:
> 
> [2017-04-07 21:43:05.824204] I [monitor(monitor):268:monitor] Monitor:
> starting gsyncd worker
> [2017-04-07 21:43:05.930124] I [gsyncd(/data/private/brick):733:main_i]
> : syncing: gluster://localhost:private -> ssh:// root@gfs1geo.domain
> :gluster://localhost:private-geo
> [2017-04-07 21:43:05.931169] I [changelogagent(agent):73:__init__]
> ChangelogAgent: Agent listining...
> [2017-04-07 21:43:08.558648] I
> [master(/data/private/brick):83:gmaster_builder] : setting up xsync
> change detection mode
> [2017-04-07 21:43:08.559071] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:08.560163] I
> [master(/data/private/brick):83:gmaster_builder] : setting up changelog
> change detection mode
> [2017-04-07 21:43:08.560431] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:08.561105] I
> [master(/data/private/brick):83:gmaster_builder] : setting up
> changeloghistory change detection mode
> [2017-04-07 21:43:08.561391] I [master(/data/private/brick):367:__init__]
> _GMaster: using 'rsync' as the sync engine
> [2017-04-07 21:43:11.354417] I [master(/data/private/brick):1249:register]
> _GMaster: xsync temp directory:
> /var/lib/misc/glusterfsd/private/ssh%3A%2F%2Froot%40192.168.20.107%3Agluster%3A%2F%2F127.0.0.1%3Aprivate-geo/616931ac8f39da5dc5834f9d47fc7b1a/xsync
> [2017-04-07 21:43:11.354751] I
> [resource(/data/private/brick):1528:service_loop] GLUSTER: Register time:
> 1491601391
> [2017-04-07 21:43:11.357630] I [master(/data/private/brick):510:crawlwrap]
> _GMaster: primary master with volume id e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
> ...
> [2017-04-07 21:43:11.489355] I [master(/data/private/brick):519:crawlwrap]
> _GMaster: crawl interval: 1 seconds
> [2017-04-07 21:43:11.516710] I [master(/data/private/brick):1163:crawl]
> _GMaster: starting history crawl... turns: 1, stime: (1487885974, 0), etime:
> 1491601391
> [2017-04-07 21:43:12.607836] I [master(/data/private/brick):1192:crawl]
> _GMaster: slave's time: (1487885974, 0)
> 
> Does anyone know how I can find out the root cause of this problem and make
> geo replication work again from the time point it got stuck?
> 
> Many thanks in advance for your help.
> 
> Best regards,
> Mabi
> 
> 
> 
> 