On 07/07/2011 15:25, Kaushik BV wrote:
Hi Chaica,

This primarily means that the RPC communication between the master
gsyncd module and the slave gsyncd module is broken, which can happen
for various reasons. Check that all the prerequisites are satisfied:

- FUSE is installed on the machine, since the geo-replication module
mounts the GlusterFS volume using FUSE to sync data.
- If the slave is a volume, the volume is started.
- If the slave is a plain directory, the directory has already been
created with the desired permissions (not applicable in your case).
- If GlusterFS 3.2 is not installed in the default location (on the master)
but under a custom prefix, configure the *gluster-command*  option to
point to the exact location.
- If GlusterFS 3.2 is not installed in the default location (on the slave)
but under a custom prefix, configure the *remote-gsyncd-command*  option
to point to the exact place where gsyncd is located.
- Locate the slave log and see if it has any anomalies.
- Passwordless SSH is set up properly between the host and the remote
machine (not applicable in your case).
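For the two custom-prefix cases above, the options can be set through the geo-replication config interface. A sketch, reusing the volume names from this thread; the /opt install paths are examples only and must be adjusted to your actual prefix:

```shell
# Example paths only: adjust /opt/glusterfs/... to wherever you installed.
# On the master, point geo-replication at the custom glusterfs binary:
gluster volume geo-replication test-volume 192.168.1.32::test-volume \
    config gluster-command /opt/glusterfs/sbin/glusterfs

# Tell the master where gsyncd is located on the slave:
gluster volume geo-replication test-volume 192.168.1.32::test-volume \
    config remote-gsyncd-command /opt/glusterfs/libexec/gsyncd
```

Running `config` with no option name prints the current settings, which is a quick way to verify what the session will actually use.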

OK, the situation has evolved slightly. I now have a slave log and a clearer error message on the master:


[2011-07-07 19:53:16.258866] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
[2011-07-07 19:53:16.259073] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
[2011-07-07 19:53:16.332720] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:test-volume -> ssh://192.168.1.32::test-volume
[2011-07-07 19:53:16.343554] D [repce:131:push] RepceClient: call 6302:140305661662976:1310061196.34 __repce_version__() ...
[2011-07-07 19:53:20.931523] D [repce:141:__call__] RepceClient: call 6302:140305661662976:1310061196.34 __repce_version__ -> 1.0
[2011-07-07 19:53:20.932172] D [repce:131:push] RepceClient: call 6302:140305661662976:1310061200.93 version() ...
[2011-07-07 19:53:20.933662] D [repce:141:__call__] RepceClient: call 6302:140305661662976:1310061200.93 version -> 1.0
[2011-07-07 19:53:20.933861] D [repce:131:push] RepceClient: call 6302:140305661662976:1310061200.93 pid() ...
[2011-07-07 19:53:20.934525] D [repce:141:__call__] RepceClient: call 6302:140305661662976:1310061200.93 pid -> 10075
[2011-07-07 19:53:20.957355] E [syncdutils:131:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/gsyncd.py", line 102, in main
    main_i()
File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/gsyncd.py", line 293, in main_i
    local.connect()
File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/resource.py", line 379, in connect
    raise RuntimeError("command failed: " + " ".join(argv))
RuntimeError: command failed: /usr/sbin/glusterfs --xlator-option *-dht.assert-no-child-down=true -L DEBUG -l /var/log/glusterfs/geo-replication/test-volume/ssh%3A%2F%2Froot%40192.168.1.32%3Agluster%3A%2F%2F127.0.0.1%3Atest-volume.gluster.log -s localhost --volfile-id test-volume --client-pid=-1 /tmp/gsyncd-aux-mount-hy6T_w
[2011-07-07 19:53:20.960621] D [monitor(monitor):58:monitor] Monitor: worker seems to be connected (?? racy check)
[2011-07-07 19:53:21.962501] D [monitor(monitor):62:monitor] Monitor: worker died in startup phase

The command launched by glusterfs returns shell exit code 255, which I believe means the command was terminated by a signal. In the slave log I have:
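A side note on interpreting that status: POSIX shells report a process killed by signal N as exit status 128+N, so SIGTERM (15) shows up as 143, whereas a plain 255 usually comes from the program itself exiting with that code (ssh, for instance, exits 255 on connection errors). A quick sketch of the signal encoding:

```shell
#!/bin/sh
# A background process killed by SIGTERM: the shell reports 128 + 15 = 143.
sleep 30 &
pid=$!
kill -TERM "$pid"
wait "$pid"
echo "exit status: $?"   # prints "exit status: 143"
```

So a 255 from the mount command is more consistent with glusterfs failing on its own than with it being signalled, which fits the "worker died in startup phase" line above.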

[2011-07-07 19:54:49.571549] I [fuse-bridge.c:3218:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-z2Q2Hg
[2011-07-07 19:54:49.572459] W [glusterfsd.c:712:cleanup_and_exit] (-->/lib/libc.so.6(clone+0x6d) [0x7f2c8998b02d] (-->/lib/libpthread.so.0(+0x68ba) [0x7f2c89c238ba] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x7f2c8a8f51b5]))) 0-: received signum (15), shutting down
[2011-07-07 19:54:51.280207] W [write-behind.c:3029:init] 0-test-volume-write-behind: disabling write-behind for first 0 bytes
[2011-07-07 19:54:51.291669] I [client.c:1935:notify] 0-test-volume-client-0: parent translators are ready, attempting connect on transport
[2011-07-07 19:54:51.292329] I [client.c:1935:notify] 0-test-volume-client-1: parent translators are ready, attempting connect on transport
[2011-07-07 19:55:38.582926] I [rpc-clnt.c:1531:rpc_clnt_reconfig] 0-test-volume-client-0: changing port to 24009 (from 0)
[2011-07-07 19:55:38.583456] I [rpc-clnt.c:1531:rpc_clnt_reconfig] 0-test-volume-client-1: changing port to 24009 (from 0)

Bye,
Carl Chenet
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
