On 05/17/11 13:04, anthony garnier wrote:
Hi,
I've put the client log in debug mode:
# gluster volume geo-replication /soft/venus config log-level DEBUG
geo-replication config updated successfully
# gluster volume geo-replication /soft/venus config log-file
/usr/local/var/log/glusterfs/geo-replication-slaves/${session_owner}:file%3A%2F%2F%2Fsoft%2Fvenus.log
# gluster volume geo-replication athena /soft/venus config session-owner
28cbd261-3a3e-4a5a-b300-ea468483c944
# gluster volume geo-replication athena /soft/venus start
Starting geo-replication session between athena & /soft/venus has been
successful
# gluster volume geo-replication athena /soft/venus status
MASTER               SLAVE                STATUS
--------------------------------------------------------------------------------
athena               /soft/venus          starting...
and then:
# gluster volume geo-replication athena /soft/venus status
MASTER               SLAVE                STATUS
--------------------------------------------------------------------------------
athena               /soft/venus          faulty
Is this output edited? I would expect to see the full slave URL,
i.e. file:///soft/venus, in the status output.
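For reference, the log file names in this thread look like the percent-encoded form of that slave URL. A minimal Python sketch of the mapping, assuming plain URL quoting is all that is involved (the helper names are hypothetical, not taken from gsyncd):

```python
from urllib.parse import quote, unquote

def encode_slave_url(url):
    # Percent-encode every reserved character, including ':' and '/'.
    return quote(url, safe="")

def decode_log_name(name):
    # Reverse step: recover the slave URL from the encoded file name stem.
    return unquote(name)

encoded = encode_slave_url("file:///soft/venus")
print(encoded)
print(decode_log_name(encoded))
```

This makes it easy to check which slave URL a given log file belongs to.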
For the client:
# cat /usr/local/var/log/glusterfs/geo-replication-slaves/28cbd261-3a3e-4a5a-b300-ea468483c944:file%3A%2F%2F%2Fsoft%2Fvenus.log
[2011-05-17 09:20:40.519731] I [gsyncd(slave):287:main_i] <top>: syncing: file:///soft/venus
[2011-05-17 09:20:40.520587] I [resource(slave):200:service_loop] FILE: slave listening
[2011-05-17 09:20:40.532951] I [repce(slave):61:service_loop] RepceServer: terminating on reaching EOF.
[2011-05-17 09:21:50.528803] I [gsyncd(slave):287:main_i] <top>: syncing: file:///soft/venus
[2011-05-17 09:21:50.529666] I [resource(slave):200:service_loop] FILE: slave listening
[2011-05-17 09:21:50.542349] I [repce(slave):61:service_loop] RepceServer: terminating on reaching EOF.
For the server:
# cat /usr/local/var/log/glusterfs/geo-replication/athena/file%3A%2F%2F%2Fsoft%2Fvenus.log
[2011-05-17 09:30:04.431369] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
[2011-05-17 09:30:04.431669] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
[2011-05-17 09:30:04.486852] I [gsyncd:287:main_i] <top>: syncing: gluster://localhost:athena -> file:///soft/venus
[...]
raise RuntimeError("command failed: " + " ".join(argv))
RuntimeError: command failed: /usr/local/sbin/glusterfs --xlator-option
*-dht.assert-no-child-down=true -l
/usr/local/var/log/glusterfs/geo-replication/athena/file%3A%2F%2F%2Fsoft%2Fvenus.gluster.log
-s localhost --volfile-id athena --client-pid=-1
/tmp/gsyncd-aux-mount-TEqjwY
[2011-05-17 09:30:04.647973] D [monitor(monitor):57:monitor] Monitor: worker got connected in 0 sec, waiting 59 more to make sure it's fine
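The RuntimeError in the trace above is the kind of error a wrapper raises when a command it spawned exits with a non-zero status. A minimal Python sketch of that pattern, assuming a hypothetical helper run_or_raise (this is an illustration of the failure mode only, not the actual gsyncd code):

```python
import subprocess
import sys

def run_or_raise(argv):
    # Run the command and wait for it; treat a non-zero exit status as an error.
    ret = subprocess.call(argv)
    if ret != 0:
        raise RuntimeError("command failed: " + " ".join(argv))

# Stand-in for the glusterfs invocation: a child process that exits with status 1.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
try:
    run_or_raise(failing)
except RuntimeError as exc:
    print(exc)
```

The exception only says that the child failed, not why, which is why the child's own log file is needed to diagnose it.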
This is interesting in that the error you get now is not the same as in
your first post. Better said, the _symptoms_ are different; the
underlying error might well be the same. I can imagine that there is a
race between several exceptional events, and it is a matter of chance
which one interrupts the event flow first.
So it seems that the auxiliary glusterfs instance used by the master
gsyncd fails. (Side note: if you prefer client/server terminology over
master/slave, that's fine, but then the master should be called the
client and the slave the server, i.e. the reverse of what you do :) )
To see what's wrong with it, I again ask for the respective logs:
## setting DEBUG loglevel for master's aux glusterfs
# gluster volume geo-replication athena /soft/venus config \
gluster-log-level DEBUG
## getting the path of the logfile of aux glusterfs
# gluster volume geo-replication athena /soft/venus config \
gluster-log-file
So please post the log file that the latter command points to.
Csaba
_______________________________________________
Gluster-users mailing list
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users