On 2011-05-12, Cedric Lagneau <[email protected]> wrote:
> My initial problem on the testing platform is not solved: the glusterd 
> geo-replication command stops working after about one day.
>
> On Master:
> #cat ssh%3A%2F%2Froot%40slave.mydomain.com%3Afile%3A%2F%2F%2Fdata%2Ftest2.log 
> [2011-05-12 10:50:53.451495] I [monitor(monitor):19:set_state] Monitor: new 
> state: starting...
> [2011-05-12 10:50:53.465759] I [monitor(monitor):42:monitor] Monitor: 
> ------------------------------------------------------------
> [2011-05-12 10:50:53.466232] I [monitor(monitor):43:monitor] Monitor: 
> starting gsyncd worker
> [2011-05-12 10:50:53.596132] I [gsyncd:287:main_i] <top>: syncing: 
> gluster://localhost:test2 -> ssh://slave.mydomain.com:/data/test2
> [2011-05-12 10:50:53.641566] D [repce:131:push] RepceClient: call 
> 1879:140148091115264:1305190253.64 __repce_version__() ...
> [2011-05-12 10:50:53.751271] E [syncdutils:131:log_raise_exception] <top>: 
> FAIL: 
> Traceback (most recent call last):
>   File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/syncdutils.py", line 
> 152, in twrap
>     tf(*aa)
>   File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/repce.py", line 118, 
> in listen
>     rid, exc, res = recv(self.inf)
>   File "/usr/lib/glusterfs/glusterfs/python/syncdaemon/repce.py", line 42, in 
> recv
>     return pickle.load(inf)
> EOFError
> [2011-05-12 10:50:53.759484] D [monitor(monitor):57:monitor] Monitor: worker 
> got connected in 0 sec, waiting 59 more to make sure it's fine
> [2011-05-12 10:51:53.535005] I [monitor(monitor):19:set_state] Monitor: new 
> state: faulty
>
> There is no test2-gluster.log.
>
> On Slave:
> no log (in debug mode) and no process /usr/bin/python 
> /usr/lib/glusterfs/glusterfs/python/syncdaemon/gsyncd.py
>
>
> tcpdump on the slave shows some ssh traffic with the master server when I 
> start geo-replication.
>
> glusterd strace on master with a starting geo-replication with status faulty:

It would be more interesting to strace the execution of the remote gsyncd.
That can be done by smuggling strace into the remote-gsyncd command:

# gluster volume geo-replication test2 slave.mydomain.com::/data/test2 config remote-gsyncd \
    "strace -f -s512 -o /tmp/gsyncd-test2.slog `gluster volume geo-replication test2 slave.mydomain.com::/data/test2 config remote-gsyncd`"
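
The same change can also be made in two steps, which keeps the original
remote-gsyncd value around so it can be put back once debugging is done. This
is only a sketch: wrap_with_strace and enable_remote_strace are helper names
made up here, /tmp/remote-gsyncd.orig is an arbitrary scratch file, and the
only gluster invocations used are the two "config remote-gsyncd" forms from the
command above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of GlusterFS.

# Print a command line that runs the given command under strace.
# $1 is the trace output file; the remaining arguments are the original command.
wrap_with_strace() {
    trace_file=$1
    shift
    printf '%s\n' "strace -f -s512 -o $trace_file $*"
}

# Read the current remote-gsyncd setting, save it, then set the wrapped version.
# $1 = master volume, $2 = slave specification (as in the command above).
enable_remote_strace() {
    orig=$(gluster volume geo-replication "$1" "$2" config remote-gsyncd) || return 1
    # Assumed scratch location for the original value.
    printf '%s\n' "$orig" > /tmp/remote-gsyncd.orig
    gluster volume geo-replication "$1" "$2" config remote-gsyncd \
        "$(wrap_with_strace /tmp/gsyncd-test2.slog "$orig")"
}
```

Calling "enable_remote_strace test2 slave.mydomain.com::/data/test2" should
have the same effect as the one-liner. The value saved in
/tmp/remote-gsyncd.orig can later be passed back to "config remote-gsyncd" to
drop strace again.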

From that we can read out why the remote gsyncd invocation/initialization fails.
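
A quick way to skim such a trace is to pull out the exec calls and the system
calls that returned an error, since strace prints failures as
"= -1 ENAME (description)". The helper name below is made up, and the trace
file is presumably to be looked for on the slave, where the remote gsyncd runs.

```shell
#!/bin/sh
# Hypothetical helper: list exec attempts and failed system calls from a
# strace output file ($1), prefixed with their line numbers in that file.
trace_failures() {
    grep -nE 'execve\(|= -1 E[A-Z]+' "$1"
}

# Example: trace_failures /tmp/gsyncd-test2.slog
```

Not every failed call is a problem (programs routinely probe for files that do
not exist), so the lines just before the process exits are usually the ones
worth reading first.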

Csaba

