I have also tried enabling geo-replication to just a directory on the slave server rather than a gluster volume, and it fails in the same way.
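For reference, the sessions were created roughly as below. The names are placeholders and the slave-spec syntax is written from memory, so treat it as an approximation rather than the exact form 3.3 expects:

    # slave is a gluster volume (the usual setup)
    gluster volume geo-replication <master-vol> <geo-user>@<slave-host>::<slave-vol> start

    # slave is a plain directory reached over ssh (the variant that fails the same way)
    gluster volume geo-replication <master-vol> <geo-user>@<slave-host>:/path/on/slave start

    # check the session state afterwards
    gluster volume geo-replication <master-vol> <slave> status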
I've noticed that every time it fails, the following is logged on the master:

[2014-03-14 11:51:43.155292] I [fuse-bridge.c:3376:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2014-03-14 11:51:43.155959] I [afr-common.c:2022:afr_set_root_inode_on_first_lookup] 0-volname-replicate-0: added root inode
[2014-03-14 11:51:53.065063] I [fuse-bridge.c:4091:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-mlNTEe
[2014-03-14 11:51:53.065631] W [glusterfsd.c:838:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x30aeee894d] (-->/lib64/libpthread.so.0() [0x30af207851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405d8d]))) 0-: received signum (15), shutting down
[2014-03-14 11:51:53.065683] I [fuse-bridge.c:4655:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-mlNTEe'.

On Thu, Mar 13, 2014 at 10:33 AM, John Ewing <[email protected]> wrote:

> Hi,
>
> Thanks for the advice; I finally have time to go back to this issue now.
>
> It doesn't seem to be sticking on any particular part of the file system as far as I can tell.
>
> One thing I've noticed is that I always get an error about missing 'option transport-type':
>
> [2014-03-13 09:57:00.902189] E [resource:194:logerr] Popen: ssh> [2014-03-13 09:56:50.093951] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
>
> On the master I have the following in glusterd.vol:
>
> volume management
>     type mgmt/glusterd
>     option working-directory /var/lib/glusterd
>     option transport-type socket,rdma
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
>     option transport.socket.read-fail-log off
> end-volume
>
> On the slave I have:
>
> volume management
>     type mgmt/glusterd
>     option working-directory /var/lib/glusterd
>     option transport-type socket,rdma
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
>     option transport.socket.read-fail-log off
>
>     option mountbroker-root /var/mountbroker-root
>     option mountbroker-geo-replication.gluster-async geo-ftb-vol,geo-bak-vol,geo-j1h-vol
>     option geo-replication-log-group gluster-async
> end-volume
>
> What should I change to fix this error?
>
> Master log:
>
> [2014-03-13 09:56:47.888899] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
> [2014-03-13 09:56:47.889317] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
> [2014-03-13 09:56:47.995637] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:volname -> ssh://[email protected]:gluster://localhost:geo-ftb-vol
> [2014-03-13 09:56:48.22799] D [repce:175:push] RepceClient: call 14516:140653524453120:1394704608.02 __repce_version__() ...
> [2014-03-13 09:57:00.898520] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
> [2014-03-13 09:57:00.901844] E [resource:191:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-_wRYS3/gsycnd-ssh-%r@%h:%p [email protected] /nonexistent/gsyncd --session-owner acfda6fc-d995-4bf0-b13e-da789afb28c7 -N --listen --timeout 120 gluster://localhost:geo-ftb-vol" returned with 1, saying:
> [2014-03-13 09:57:00.902189] E [resource:194:logerr] Popen: ssh> [2014-03-13 09:56:50.093951] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
> [2014-03-13 09:57:00.902648] E [resource:194:logerr] Popen: ssh> [2014-03-13 09:56:52.136564] I [cli-rpc-ops.c:4318:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
> [2014-03-13 09:57:00.902940] E [resource:194:logerr] Popen: ssh> [2014-03-13 09:56:52.136782] I [input.c:46:cli_batch] 0-: Exiting with: 0
> [2014-03-13 09:57:00.903209] E [resource:194:logerr] Popen: ssh> failed with error.
> [2014-03-13 09:57:00.903844] I [syncdutils:142:finalize] <top>: exiting.
> [2014-03-13 09:57:00.906152] D [monitor(monitor):96:monitor] Monitor: worker seems to be connected (?? racy check)
> [2014-03-13 09:57:01.907625] D [monitor(monitor):100:monitor] Monitor: worker died in startup phase
> [2014-03-13 09:57:11.918355] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
> [2014-03-13 09:57:11.918920] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
> [2014-03-13 09:57:12.29169] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:volname -> ssh://[email protected]:gluster://localhost:geo-ftb-vol
>
> -- lots of entries about syncing files --
>
> [2014-03-13 10:10:20.670299] E [syncdutils:190:log_raise_exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 216, in twrap
>     tf(*aa)
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 123, in tailer
>     poe, _ ,_ = select([po.stderr for po in errstore], [], [], 1)
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 276, in select
>     return eintr_wrap(oselect.select, oselect.error, *a)
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 269, in eintr_wrap
>     return func(*a)
> error: (9, 'Bad file descriptor')
> [2014-03-13 10:10:20.671988] I [syncdutils:142:finalize] <top>: exiting.
> [2014-03-13 10:10:21.624923] D [monitor(monitor):100:monitor] Monitor: worker died in startup phase
>
> Slave log:
>
> [2014-03-13 10:08:44.478434] I [gsyncd(slave):354:main_i] <top>: syncing: gluster://localhost:geo-ftb-vol
> [2014-03-13 10:08:55.6546] I [resource(slave):453:service_loop] GLUSTER: slave listening
> [2014-03-13 10:09:31.698591] I [repce(slave):78:service_loop] RepceServer: terminating on reaching EOF.
> [2014-03-13 10:09:31.699101] I [syncdutils(slave):142:finalize] <top>: exiting.
> [2014-03-13 10:09:49.26217] I [gsyncd(slave):354:main_i] <top>: syncing: gluster://localhost:geo-ftb-vol
> [2014-03-13 10:10:00.252576] I [resource(slave):453:service_loop] GLUSTER: slave listening
> [2014-03-13 10:10:20.783905] I [repce(slave):78:service_loop] RepceServer: terminating on reaching EOF.
> [2014-03-13 10:10:20.784468] I [syncdutils(slave):142:finalize] <top>: exiting.
> [2014-03-13 10:10:37.405524] I [gsyncd(slave):354:main_i] <top>: syncing: gluster://localhost:geo-ftb-vol
> [2014-03-13 10:10:46.988630] I [resource(slave):453:service_loop] GLUSTER: slave listening
>
> Thanks
>
> J.
>
> On Fri, Feb 14, 2014 at 1:51 PM, Venky Shankar <[email protected]> wrote:
>
>> Could you try again after changing the log-level to DEBUG using:
>>
>> # gluster volume geo-replication <master> <slave> config log-level DEBUG
>>
>> Also, logs from both master and slave would help.
>>
>> Thanks,
>> -venky
>>
>> On Wed, Feb 12, 2014 at 4:44 PM, John Ewing <[email protected]> wrote:
>>
>>> No, it's the latest 3.3 series release.
>>>
>>> 3.3.2 on both master and slave.
>>> CentOS 6 on master, Amazon Linux on slave.
>>> rsync 3.0.6 on both.
>>>
>>> Using an unprivileged ssh user set up with mountbroker.
>>>
>>> One thing I noticed was that the 3.3 manual says the base requirement is rsync 3.0.0 and higher, and the webpage now says 3.0.7. Is this relevant?
>>>
>>> On Wed, Feb 12, 2014 at 2:12 AM, Venky Shankar <[email protected]> wrote:
>>>
>>>> Is this from the latest master branch?
>>>>
>>>> On Tue, Feb 11, 2014 at 4:35 PM, John Ewing <[email protected]> wrote:
>>>>
>>>>> I am trying to use geo-replication but it is running slowly and I keep getting the following logged in the geo-replication log.
>>>>>
>>>>> [2014-02-11 10:56:42.831517] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
>>>>> [2014-02-11 10:56:42.832226] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
>>>>> [2014-02-11 10:56:42.951199] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:xxxxxxx -> ssh://[email protected]:gluster://localhost:xxxxx
>>>>> [2014-02-11 10:56:53.79632] I [master:284:crawl] GMaster: new master is acfda6fc-d995-4bf0-b13e-da789afb28c7
>>>>> [2014-02-11 10:56:53.80282] I [master:288:crawl] GMaster: primary master with volume id acfda6fc-d995-4bf0-b13e-da789afb28c7 ...
>>>>> [2014-02-11 10:56:57.453376] E [syncdutils:190:log_raise_exception] <top>: FAIL:
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 216, in twrap
>>>>>     tf(*aa)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 123, in tailer
>>>>>     poe, _ ,_ = select([po.stderr for po in errstore], [], [], 1)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 276, in select
>>>>>     return eintr_wrap(oselect.select, oselect.error, *a)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 269, in eintr_wrap
>>>>>     return func(*a)
>>>>> error: (9, 'Bad file descriptor')
>>>>> [2014-02-11 10:56:57.462110] I [syncdutils:142:finalize] <top>: exiting.
>>>>>
>>>>> I'm unsure what to do to debug and fix this.
>>>>>
>>>>> Thanks
>>>>>
>>>>> John.
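Two things that seem worth double-checking against the rsync requirement and the slave glusterd.vol quoted above. This is only a checklist sketch using the names from that config, not a known fix:

    # rsync: the webpage now asks for 3.0.7+, and we are on 3.0.6 at both ends
    rsync --version | head -n 1

    # mountbroker: the directory named by mountbroker-root must exist on the
    # slave, be owned by root, and have mode 0711
    ls -ld /var/mountbroker-root
    mkdir -p /var/mountbroker-root && chmod 0711 /var/mountbroker-root

    # the unprivileged geo-rep user from the config (gluster-async) should be in
    # the geo-replication-log-group named there, and glusterd needs a restart
    # after any glusterd.vol change
    id gluster-async
    service glusterd restart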
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
