Thanks Aravinda! The problem was a duplicate key in the authorized_keys file: one entry with the prefix "command=", and another with exactly the same key but starting with ssh-rsa. I’ve removed the entry that starts with ssh-rsa, and the session is now working fine :D
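For anyone hitting the same thing, here is a minimal sketch of how the duplicate could be spotted before editing the file by hand. It is only an illustration, not part of Gluster: the function name `find_duplicate_keys` is made up, the regular expression only covers the common key types, and it assumes the restricted entry begins with "command=" as described above.

```python
import re
from collections import defaultdict

# Key type followed by the base64 key blob; any options (such as command="...")
# come before the key type on the same line.
KEY_RE = re.compile(
    r'(?:^|\s)(ssh-rsa|ssh-dss|ssh-ed25519|ecdsa-sha2-\S+)\s+([A-Za-z0-9+/=]+)'
)


def find_duplicate_keys(text):
    """Return {key_blob: [(line_no, starts_with_command), ...]} for every
    key blob that appears on more than one line of an authorized_keys file."""
    seen = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith('#'):
            continue
        match = KEY_RE.search(stripped)
        if match is None:
            continue
        seen[match.group(2)].append((line_no, stripped.startswith('command=')))
    # Keep only the keys that are listed more than once.
    return {blob: entries for blob, entries in seen.items() if len(entries) > 1}
```

Calling it on the contents of /home/guser/.ssh/authorized_keys on each slave node should list any key that is present both with and without the "command=" prefix, together with the line numbers to look at. It only reports; deleting the extra line is still a manual step.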
I’ll run some failure tests and then update you with the results.

—Bishoy

> On Mar 31, 2016, at 1:22 AM, Aravinda <[email protected]> wrote:
>
> Hi,
>
> From the error I understood that SSH connection is failing. In slave-host02
> extra entries present in /home/guser/.ssh/authorized_keys.
>
> In /home/guser/.ssh/authorized_keys Please delete extra lines which does not
> start with "command=". Then stop and start the Geo-replication.
>
> regards
> Aravinda
>
> On 03/31/2016 04:00 AM, Gmail wrote:
>> I’ve rebuilt the cluster again, making a fresh installation. And now the
>> error is different.
>>
>> MASTER NODE           MASTER VOL  MASTER BRICK            SLAVE USER  SLAVE                          SLAVE NODE    STATUS   CRAWL STATUS  LAST_SYNCED
>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>> master-host01.me.com  geotest     /gpool/brick03/geotest  guser       guser@slave-host01::geotestdr  N/A           Faulty   N/A           N/A
>> master-host02.me.com  geotest     /gpool/brick03/geotest  guser       guser@slave-host01::geotestdr  slave-host01  Passive  N/A           N/A
>> master-host03.me.com  geotest     /gpool/brick03/geotest  guser       guser@slave-host01::geotestdr  slave-host03  Passive  N/A           N/A
>>
>> [2016-03-30 22:09:31.326898] I [monitor(monitor):221:monitor] Monitor: ------------------------------------------------------------
>> [2016-03-30 22:09:31.327461] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker
>> [2016-03-30 22:09:31.544631] I [gsyncd(/gpool/brick03/geotest):649:main_i] <top>: syncing: gluster://localhost:geotest -> ssh://guser@slave-host02:gluster://localhost:geotestdr
>> [2016-03-30 22:09:31.547542] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining...
>> [2016-03-30 22:09:31.830554] E >> [syncdutils(/gpool/brick03/geotest):252:log_raise_exception] <top>: >> connection to peer is broken >> [2016-03-30 22:09:31.831017] W >> [syncdutils(/gpool/brick03/geotest):256:log_raise_exception] <top>: >> !!!!!!!!!!!!! >> [2016-03-30 22:09:31.831258] W >> [syncdutils(/gpool/brick03/geotest):257:log_raise_exception] <top>: !!! >> getting "No such file or directory" errors is most likely due >> to MISCONFIGURATION, please consult >> https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html >> >> <https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html> >> [2016-03-30 22:09:31.831502] W >> [syncdutils(/gpool/brick03/geotest):265:log_raise_exception] <top>: >> !!!!!!!!!!!!! >> [2016-03-30 22:09:31.836395] E [resource(/gpool/brick03/geotest):222:errlog] >> Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no >> -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S >> /tmp/gsyncd-aux-ssh-SfXvbB/de372ce5774b5d259c58c5c9522ffc8f.sock >> guser@slave-host02 /nonexistent/gsyncd --session-owner >> ec473e17-b933-4bf7-9eed-4c393f7aaf5d -N --listen --timeout 120 >> gluster://localhost:geotestdr <gluster://localhost:geotestdr>" returned with >> 127, saying: >> [2016-03-30 22:09:31.836694] E [resource(/gpool/brick03/geotest):226:logerr] >> Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory >> [2016-03-30 22:09:31.837193] I >> [syncdutils(/gpool/brick03/geotest):220:finalize] <top>: exiting. >> [2016-03-30 22:09:31.840569] I [repce(agent):92:service_loop] RepceServer: >> terminating on reaching EOF. >> [2016-03-30 22:09:31.840993] I [syncdutils(agent):220:finalize] <top>: >> exiting. 
>> [2016-03-30 22:09:31.840742] I [monitor(monitor):274:monitor] Monitor: >> worker(/gpool/brick03/geotest) died before establishing connection >> [2016-03-30 22:09:42.130866] I [monitor(monitor):221:monitor] Monitor: >> ------------------------------------------------------------ >> [2016-03-30 22:09:42.131448] I [monitor(monitor):222:monitor] Monitor: >> starting gsyncd worker >> [2016-03-30 22:09:42.348165] I [gsyncd(/gpool/brick03/geotest):649:main_i] >> <top>: syncing: gluster://localhost:geotest <gluster://localhost:geotest> -> >> ssh://guser@slave-host02:gluster://localhost:geotestdr >> <ssh://guser@slave-host02:gluster://localhost:geotestdr> >> [2016-03-30 22:09:42.349118] I [changelogagent(agent):75:__init__] >> ChangelogAgent: Agent listining... >> [2016-03-30 22:09:42.653141] E >> [syncdutils(/gpool/brick03/geotest):252:log_raise_exception] <top>: >> connection to peer is broken >> [2016-03-30 22:09:42.653656] W >> [syncdutils(/gpool/brick03/geotest):256:log_raise_exception] <top>: >> !!!!!!!!!!!!! >> [2016-03-30 22:09:42.653898] W >> [syncdutils(/gpool/brick03/geotest):257:log_raise_exception] <top>: !!! >> getting "No such file or directory" errors is most likely due >> to MISCONFIGURATION, please consult >> https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html >> >> <https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html> >> [2016-03-30 22:09:42.654129] W >> [syncdutils(/gpool/brick03/geotest):265:log_raise_exception] <top>: >> !!!!!!!!!!!!! 
>> [2016-03-30 22:09:42.659329] E [resource(/gpool/brick03/geotest):222:errlog] >> Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no >> -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S >> /tmp/gsyncd-aux-ssh-6r8rxx/de372ce5774b5d259c58c5c9522ffc8f.sock >> guser@slave-host02 /nonexistent/gsyncd --session-owner >> ec473e17-b933-4bf7-9eed-4c393f7aaf5d -N --listen --timeout 120 >> gluster://localhost:geotestdr <gluster://localhost:geotestdr>" returned with >> 127, saying: >> [2016-03-30 22:09:42.659626] E [resource(/gpool/brick03/geotest):226:logerr] >> Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory >> [2016-03-30 22:09:42.660140] I >> [syncdutils(/gpool/brick03/geotest):220:finalize] <top>: exiting. >> [2016-03-30 22:09:42.662802] I [repce(agent):92:service_loop] RepceServer: >> terminating on reaching EOF. >> [2016-03-30 22:09:42.663197] I [syncdutils(agent):220:finalize] <top>: >> exiting. >> [2016-03-30 22:09:42.663024] I [monitor(monitor):274:monitor] Monitor: >> worker(/gpool/brick03/geotest) died before establishing connection >> >> >> —Bishoy >> >>> On Mar 30, 2016, at 10:50 AM, Gmail < >>> <mailto:[email protected]>[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> I’ve tried changing the permissions to 777 on /var/log/glusterfs on all the >>> slave nodes, but still no luck :( >>> >>> here is the log from the master node where I created and started the >>> geo-replication session. >>> >>> [2016-03-30 17:14:53.463150] I [monitor(monitor):221:monitor] Monitor: >>> ------------------------------------------------------------ >>> [2016-03-30 17:14:53.463669] I [monitor(monitor):222:monitor] Monitor: >>> starting gsyncd worker >>> [2016-03-30 17:14:53.603774] I [changelogagent(agent):75:__init__] >>> ChangelogAgent: Agent listining... 
>>> [2016-03-30 17:14:53.604080] I [gsyncd(/mnt/brick10/xfsvol2):649:main_i] >>> <top>: syncing: gluster://localhost:xfsvol2 <gluster://localhost:xfsvol2> >>> -> >>> <ssh://guser@slave-host01:gluster://localhost:xfsvol2dr>ssh://guser@slave-host01:gluster://localhost:xfsvol2dr >>> <ssh://guser@slave-host01:gluster://localhost:xfsvol2dr> >>> [2016-03-30 17:14:54.210602] E >>> [syncdutils(/mnt/brick10/xfsvol2):252:log_raise_exception] <top>: >>> connection to peer is broken >>> [2016-03-30 17:14:54.211117] E [resource(/mnt/brick10/xfsvol2):222:errlog] >>> Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no >>> -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S >>> /tmp/gsyncd-aux-ssh-evONxc/3bda60dc6e900c0833fed4e4fdfbd480.sock >>> guser@slave-host01 /nonexistent/gsyncd --session-owner >>> ef9ccae5-0def-4a47-9a96-881a1896755c -N --listen --timeout 120 >>> gluster://localhost:xfsvol2dr <gluster://localhost:xfsvol2dr>" returned >>> with 1, saying: >>> [2016-03-30 17:14:54.211376] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:53.933174] I [cli.c:720:main] 0-cli: Started >>> running /usr/sbin/gluster with version 3.7.3 >>> [2016-03-30 17:14:54.211631] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:53.933225] I [cli.c:608:cli_rpc_init] 0-cli: >>> Connecting to remote glusterd at localhost >>> [2016-03-30 17:14:54.211828] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:54.074207] I [MSGID: 101190] >>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread >>> with index 1 >>> [2016-03-30 17:14:54.212017] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:54.074302] I >>> [socket.c:2409:socket_event_handler] 0-transport: disconnecting now >>> [2016-03-30 17:14:54.212199] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:54.077207] I >>> 
[cli-rpc-ops.c:6230:gf_cli_getwd_cbk] 0-cli: Received resp to getwd >>> [2016-03-30 17:14:54.212380] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> [2016-03-30 17:14:54.077269] I [input.c:36:cli_batch] 0-: >>> Exiting with: 0 >>> [2016-03-30 17:14:54.212584] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> ERROR:root:FAIL: >>> [2016-03-30 17:14:54.212774] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> Traceback (most recent call last): >>> [2016-03-30 17:14:54.212954] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", >>> line 165, in main >>> [2016-03-30 17:14:54.213131] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> main_i() >>> [2016-03-30 17:14:54.213308] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", >>> line 638, in main_i >>> [2016-03-30 17:14:54.213500] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> startup(go_daemon=go_daemon, log_file=log_file, label=label) >>> [2016-03-30 17:14:54.213690] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", >>> line 144, in startup >>> [2016-03-30 17:14:54.213890] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> GLogger._gsyncd_loginit(**kw) >>> [2016-03-30 17:14:54.214068] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", >>> line 107, in _gsyncd_loginit >>> [2016-03-30 17:14:54.214246] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> cls.setup(label=kw.get('label'), **lkw) >>> [2016-03-30 17:14:54.214422] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", >>> line 79, in setup >>> [2016-03-30 17:14:54.214622] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> 
logging_handler = >>> handlers.WatchedFileHandler(lprm['filename']) >>> [2016-03-30 17:14:54.214802] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/lib64/python2.6/logging/handlers.py", line 377, in >>> __init__ >>> [2016-03-30 17:14:54.214977] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> logging.FileHandler.__init__(self, filename, mode, >>> encoding, delay) >>> [2016-03-30 17:14:54.215152] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/lib64/python2.6/logging/__init__.py", line 835, in >>> __init__ >>> [2016-03-30 17:14:54.215327] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> StreamHandler.__init__(self, self._open()) >>> [2016-03-30 17:14:54.215523] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> File "/usr/lib64/python2.6/logging/__init__.py", line 854, in >>> _open >>> [2016-03-30 17:14:54.215703] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> stream = open(self.baseFilename, self.mode) >>> [2016-03-30 17:14:54.215883] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> IOError: [Errno 13] Permission denied: >>> '/var/log/glusterfs/geo-replication-slaves/mbr/ef9ccae5-0def-4a47-9a96-881a1896755c:gluster%3A%2F%2F127.0.0.1%3Axfsvol2dr.log' >>> [2016-03-30 17:14:54.216063] E [resource(/mnt/brick10/xfsvol2):226:logerr] >>> Popen: ssh> failed with IOError. >>> [2016-03-30 17:14:54.216500] I >>> [syncdutils(/mnt/brick10/xfsvol2):220:finalize] <top>: exiting. >>> [2016-03-30 17:14:54.218672] I [repce(agent):92:service_loop] RepceServer: >>> terminating on reaching EOF. >>> [2016-03-30 17:14:54.219063] I [syncdutils(agent):220:finalize] <top>: >>> exiting. 
>>> [2016-03-30 17:14:54.218930] I [monitor(monitor):274:monitor] Monitor: >>> worker(/mnt/brick10/xfsvol2) died before establishing connection >>> >>> —Bishoy >>> >>>> On Mar 29, 2016, at 1:05 AM, Aravinda <[email protected] >>>> <mailto:[email protected]>> wrote: >>>> >>>> Geo-replication command should be run as privileged user itself. >>>> >>>> gluster volume geo-replication <MASTERVOL> <SLAVEUSER>@<SLAVEHOST> start >>>> >>>> and then check the status, if it shows Faulty then please share the log >>>> files present in /var/log/glusterfs/geo-replication/<MASTERVOL>/*.log >>>> >>>> regards >>>> Aravinda >>>> On 03/29/2016 12:51 PM, Gmail wrote: >>>>> I’ve been trying to setup geo-replication using Gluster 3.7.3 on OEL 6.5 >>>>> It keeps giving me faulty session. >>>>> I’ve tried to use root user instead, it works fine! >>>>> >>>>> I’ve followed literally the documentation but no luck getting the >>>>> unprivileged user working. >>>>> >>>>> I’ve tried running /usr/libexec/glusterfs/gsyncd on the slave node using >>>>> the unprivileged user, and that’s what I get. 
>>>>> >>>>> /usr/libexec/glusterfs/gsyncd --session-owner >>>>> ef9ccae5-0def-4a47-9a96-881a1896755c -N --listen --timeout 120 >>>>> gluster://localhost:vol01dr <gluster://localhost:vol01dr> >>>>> [2016-03-29 00:52:49.058244] I [cli.c:720:main] 0-cli: Started running >>>>> /usr/sbin/gluster with version 3.7.3 >>>>> [2016-03-29 00:52:49.058297] I [cli.c:608:cli_rpc_init] 0-cli: Connecting >>>>> to remote glusterd at localhost >>>>> [2016-03-29 00:52:49.174686] I [MSGID: 101190] >>>>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>> with index 1 >>>>> [2016-03-29 00:52:49.174768] I [socket.c:2409:socket_event_handler] >>>>> 0-transport: disconnecting now >>>>> [2016-03-29 00:52:49.177482] I [cli-rpc-ops.c:6230:gf_cli_getwd_cbk] >>>>> 0-cli: Received resp to getwd >>>>> [2016-03-29 00:52:49.177545] I [input.c:36:cli_batch] 0-: Exiting with: 0 >>>>> ERROR:root:FAIL: >>>>> Traceback (most recent call last): >>>>> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in >>>>> main >>>>> main_i() >>>>> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 638, in >>>>> main_i >>>>> startup(go_daemon=go_daemon, log_file=log_file, label=label) >>>>> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 144, in >>>>> startup >>>>> GLogger._gsyncd_loginit(**kw) >>>>> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 107, in >>>>> _gsyncd_loginit >>>>> cls.setup(label=kw.get('label'), **lkw) >>>>> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 79, in >>>>> setup >>>>> logging_handler = handlers.WatchedFileHandler(lprm['filename']) >>>>> File "/usr/lib64/python2.6/logging/handlers.py", line 377, in __init__ >>>>> logging.FileHandler.__init__(self, filename, mode, encoding, delay) >>>>> File "/usr/lib64/python2.6/logging/__init__.py", line 835, in __init__ >>>>> StreamHandler.__init__(self, self._open()) >>>>> File "/usr/lib64/python2.6/logging/__init__.py", line 854, in _open >>>>> 
stream = open(self.baseFilename, self.mode) >>>>> IOError: [Errno 13] Permission denied: >>>>> '/var/log/glusterfs/geo-replication-slaves/mbr/ef9ccae5-0def-4a47-9a96-881a1896755c:gluster%3A%2F%2F127.0.0.1%3Avol01dr.log' >>>>> failed with IOError. >>>>> >>>>> >>>>> — Bishoy >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> [email protected] <mailto:[email protected]> >>>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>>> <http://www.gluster.org/mailman/listinfo/gluster-users> >>> >> >
