Hi Marcus,

I think the fix [1] is needed in 4.1. Could you please try this out and let us know if that works for you?
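For context: the repeated "invalid choice" errors in your log show that the master is still invoking the slave's gsyncd with the old 3.x-style arguments (--session-owner <uuid> ...), while the 4.1 gsyncd expects a positional subcommand, so the UUID gets read as the subcommand name. Here is a minimal sketch of that argparse behaviour (illustrative only, not the actual gsyncd source):

    import argparse

    # Illustrative stand-in for the 4.1 gsyncd command-line parser,
    # which dispatches on a positional subcommand.
    parser = argparse.ArgumentParser(prog="gsyncd.py")
    subparsers = parser.add_subparsers(dest="subcmd")
    for name in ("monitor-status", "monitor", "worker", "agent", "slave",
                 "status", "config-check", "config-get", "config-set",
                 "config-reset", "voluuidget", "delete"):
        subparsers.add_parser(name)

    # Old-style (3.x) arguments, as seen in the failing ssh command in the
    # log: argparse takes the UUID as the positional subcommand and exits
    # with "gsyncd.py: error: argument subcmd: invalid choice: ..."
    try:
        parser.parse_args(["--session-owner",
                           "5e94eb7d-219f-4741-a179-d4ae6b50c7ee"])
    except SystemExit:
        pass  # argparse prints the same usage/error text seen in the ssh> lines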
[1] https://review.gluster.org/#/c/20207/

Thanks,
Kotresh HR

On Thu, Jul 12, 2018 at 1:49 AM, Marcus Pedersén <[email protected]> wrote:
> Hi all,
>
> I have upgraded from 3.12.9 to 4.1.1 and have been following the upgrade
> instructions for an offline upgrade.
>
> I upgraded the geo-replication side first, 1 x (2+1), and the master side
> after that, 2 x (2+1).
>
> Both clusters work the way they should on their own.
>
> After the upgrade, the status on the master side for all geo-replication
> nodes is Stopped.
>
> I tried to start geo-replication from the master node and the response
> was "started successfully".
>
> Status again ... Stopped
>
> Tried to start again and got the response "started successfully"; after
> that, glusterd crashed on all master nodes.
>
> After a restart of glusterd on all nodes, the master cluster was up again.
>
> The status for geo-replication is still Stopped, and every attempt to
> start it after this gives the response "successful" but the status stays
> Stopped.
>
> Please help me get geo-replication up and running again.
>
> Best regards
>
> Marcus Pedersén
>
> Part of the geo-replication log from the master node:
>
> [2018-07-11 18:42:48.941760] I [changelogagent(/urd-gds/gluster):73:__init__]
> ChangelogAgent: Agent listining...
> [2018-07-11 18:42:48.947567] I [resource(/urd-gds/gluster):1780:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:49.363514] E [syncdutils(/urd-gds/gluster):304:log_raise_exception]
> <top>: connection to peer is broken
> [2018-07-11 18:42:49.364279] E [resource(/urd-gds/gluster):210:errlog]
> Popen: command returned error cmd=ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> -p 22 -oControlMaster=auto
> -S /tmp/gsyncd-aux-ssh-hjRhBo/7e5534547f3675a710a107722317484f.sock
> geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner
> 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster
> --local-node urd-gds-001 -N --listen --timeout 120
> gluster://localhost:urd-gds-volume error=2
> [2018-07-11 18:42:49.364586] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:49.364799] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>
> [2018-07-11 18:42:49.364989] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:49.365210] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> ...
> [2018-07-11 18:42:49.365408] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice:
> '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status',
> 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check',
> 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:49.365919] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:49.369316] I [repce(/urd-gds/gluster):92:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:49.369921] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:49.369694] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:42:59.492762] I [monitor(monitor):280:monitor] Monitor:
> starting gsyncd worker brick=/urd-gds/gluster
> slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:42:59.558491] I [resource(/urd-gds/gluster):1780:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:59.559056] I [changelogagent(/urd-gds/gluster):73:__init__]
> ChangelogAgent: Agent listining...
> [2018-07-11 18:42:59.945693] E [syncdutils(/urd-gds/gluster):304:log_raise_exception]
> <top>: connection to peer is broken
> [2018-07-11 18:42:59.946439] E [resource(/urd-gds/gluster):210:errlog]
> Popen: command returned error cmd=ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> -p 22 -oControlMaster=auto
> -S /tmp/gsyncd-aux-ssh-992bk7/7e5534547f3675a710a107722317484f.sock
> geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner
> 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster
> --local-node urd-gds-001 -N --listen --timeout 120
> gluster://localhost:urd-gds-volume error=2
> [2018-07-11 18:42:59.946748] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:59.946962] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>
> [2018-07-11 18:42:59.947150] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:59.947369] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> ...
> [2018-07-11 18:42:59.947552] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice:
> '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status',
> 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check',
> 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:59.948046] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:59.951392] I [repce(/urd-gds/gluster):92:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:59.951760] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:59.951817] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:43:10.54580] I [monitor(monitor):280:monitor] Monitor:
> starting gsyncd worker brick=/urd-gds/gluster
> slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:43:10.88356] I [monitor(monitor):345:monitor] Monitor:
> Changelog Agent died, Aborting Worker brick=/urd-gds/gluster
> [2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:43:20.112435] I [gsyncdstatus(monitor):242:set_worker_status]
> GeorepStatus: Worker Status Change status=inconsistent
> [2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception]
> <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
>     except:
>   File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 428, in wmon
>     sys.exit()
> TypeError: 'int' object is not iterable
> [2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>:
> exiting.
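The TypeError at the end of the traceback looks like a secondary failure in the monitor's error handling rather than the root cause: sys.exit() raises SystemExit, a bare except: catches that as well, and a bug inside the handler then surfaces as an unrelated error. A rough sketch of the pattern (my illustration, not the actual syncdaemon code):

    import sys

    def wmon():
        # The worker gives up and tries to stop the process.
        sys.exit()  # raises SystemExit (a BaseException subclass)

    def twrap(fn):
        try:
            fn()
        except:  # a bare except catches SystemExit as well
            status = 1
            # A handler bug, e.g. iterating over an int, then masks the
            # original exit with: TypeError: 'int' object is not iterable
            for _ in status:
                pass

    twrap(wmon)

So once the gsyncd invocation problem above is fixed, this traceback should go away as well.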
--
Thanks and Regards,
Kotresh H R

_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
