Hi Marcus, Geo-rep had few important fixes in 4.1.3. Is it possible to upgrade and check whether the issue is still seen?
Thanks, Kotresh HR On Sat, Sep 1, 2018 at 5:08 PM, Marcus Pedersén <[email protected]> wrote: > Hi again, > > I found another problem on the other master node. > > The node toggles Active/Faulty and it is the same error over and over > again. > > > [2018-09-01 11:23:02.94080] E [repce(worker /urd-gds/gluster):197:__call__] > RepceClient: call failed call=1226:139955262510912:1535800981.24 > method=entry_ops error=GsyncdError > [2018-09-01 11:23:02.94214] E [syncdutils(worker > /urd-gds/gluster):300:log_raise_exception] > <top>: execution of "gluster" failed with ENOENT (No such file or directory) > [2018-09-01 11:23:02.106194] I [repce(agent /urd-gds/gluster):80:service_loop] > RepceServer: terminating on reaching EOF. > [2018-09-01 11:23:02.124444] I [gsyncdstatus(monitor):244:set_worker_status] > GeorepStatus: Worker Status Change status=Faulty > > > I have also found a python error as well, I have only seen this once > though. > > > [2018-09-01 11:16:45.907660] I [master(worker > /urd-gds/gluster):1536:crawl] _GMaster: slave's time > stime=(1524101534, 0) > [2018-09-01 11:16:47.364109] E [syncdutils(worker > /urd-gds/gluster):332:log_raise_exception] <top>: FAIL: > Traceback (most recent call last): > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line > 362, in twrap > tf(*aargs) > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1939, > in syncjob > po = self.sync_engine(pb, self.log_err) > File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1442, > in rsync > rconf.ssh_ctl_args + \ > AttributeError: 'NoneType' object has no attribute 'split' > [2018-09-01 11:16:47.384531] I [repce(agent /urd-gds/gluster):80:service_loop] > RepceServer: terminating on reaching EOF. > [2018-09-01 11:16:48.362987] I [monitor(monitor):279:monitor] Monitor: > worker died in startup phase brick=/urd-gds/gluster > [2018-09-01 11:16:48.370701] I [gsyncdstatus(monitor):244:set_worker_status] > GeorepStatus: Worker Status Change status=Faulty > [2018-09-01 11:16:58.390548] I [monitor(monitor):158:monitor] Monitor: > starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000 > > > I attach the logs as well. > > > Many thanks! > > > Best regards > > Marcus Pedersén > > > > > ------------------------------ > *Från:* [email protected] <gluster-users-bounces@ > gluster.org> för Marcus Pedersén <[email protected]> > *Skickat:* den 31 augusti 2018 16:09 > *Till:* [email protected] > > *Kopia:* [email protected] > *Ämne:* Re: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does > not work Now: Upgraded to 4.1.3 geo node Faulty > > > I realy appologize, third try to make mail smaller. > > > /Marcus > > > ------------------------------ > *Från:* Marcus Pedersén > *Skickat:* den 31 augusti 2018 16:03 > *Till:* Kotresh Hiremath Ravishankar > *Kopia:* [email protected] > *Ämne:* SV: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does > not work Now: Upgraded to 4.1.3 geo node Faulty > > > Sorry, resend due to too large mail. > > > /Marcus > ------------------------------ > *Från:* Marcus Pedersén > *Skickat:* den 31 augusti 2018 15:19 > *Till:* Kotresh Hiremath Ravishankar > *Kopia:* [email protected] > *Ämne:* SV: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does > not work Now: Upgraded to 4.1.3 geo node Faulty > > > Hi Kotresh, > > Please find attached logs, only logs from today. > > The python error was repeated over and over again until I disabled selinux. > > After that the node bacame active again. > > The return code 23 seems to be repeated over and over again. > > > rsync version 3.1.2 > > > Thanks a lot! > > > Best regards > > Marcus > > > ------------------------------ > *Från:* Kotresh Hiremath Ravishankar <[email protected]> > *Skickat:* den 31 augusti 2018 11:09 > *Till:* Marcus Pedersén > *Kopia:* [email protected] > *Ämne:* Re: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does > not work Now: Upgraded to 4.1.3 geo node Faulty > > Hi Marcus, > > Could you attach full logs? Is the same trace back happening repeatedly? > It will be helpful you attach the corresponding mount log as well. > What's the rsync version, you are using? > > Thanks, > Kotresh HR > > On Fri, Aug 31, 2018 at 12:16 PM, Marcus Pedersén <[email protected]> > wrote: > >> Hi all, >> >> I had problems with stopping sync after upgrade to 4.1.2. >> >> I upgraded to 4.1.3 and it ran fine for one day, but now one of the >> master nodes shows faulty. >> >> Most of the sync jobs have return code 23, how do I resolve this? >> >> I see messages like: >> >> _GMaster: Sucessfully fixed all entry ops with gfid mismatch >> >> Will this resolve error code 23? >> >> There is also a python error. >> >> The python error was a selinux problem, turning off selinux made node go >> to active again. >> >> See log below. >> >> >> CentOS 7, installed through SIG Gluster (OS updated to latest at the same >> time) >> >> Master cluster: 2 x (2 + 1) distributed, replicated >> >> Client cluster: 1 x (2 + 1) replicated >> >> >> Many thanks in advance! >> >> >> Best regards >> >> Marcus Pedersén >> >> >> >> gsyncd.log from Faulty node: >> >> [2018-08-31 06:25:51.375267] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.8099 num_files=57 job=3 return_code=23 >> [2018-08-31 06:25:51.465895] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0904 num_files=3 job=3 return_code=23 >> [2018-08-31 06:25:52.562107] E [repce(worker >> /urd-gds/gluster):197:__call__] RepceClient: call failed >> call=30069:139655665837888:1535696752.35 method=entry_ops >> error=OSError >> [2018-08-31 06:25:52.562346] E [syncdutils(worker >> /urd-gds/gluster):332:log_raise_exception] <top>: FAIL: >> Traceback (most recent call last): >> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, >> in main >> func(args) >> File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, >> in subcmd_worker >> local.service_loop(remote) >> File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line >> 1288, in service_loop >> g3.crawlwrap(oneshot=True) >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, >> in crawlwrap >> self.crawl() >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, >> in crawl >> self.changelogs_batch_process(changes) >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, >> in changelogs_batch_process >> self.process(batch) >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, >> in process >> self.process_change(change, done, retry) >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, >> in process_change >> failures = self.slave.server.entry_ops(entries) >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in >> __call__ >> return self.ins(self.meth, *a) >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in >> __call__ >> raise res >> OSError: [Errno 13] Permission denied >> [2018-08-31 06:25:52.578367] I [repce(agent >> /urd-gds/gluster):80:service_loop] >> RepceServer: terminating on reaching EOF. >> [2018-08-31 06:25:53.558765] I [monitor(monitor):279:monitor] Monitor: >> worker died in startup phase brick=/urd-gds/gluster >> [2018-08-31 06:25:53.569777] I [gsyncdstatus(monitor):244:set_worker_status] >> GeorepStatus: Worker Status Change status=Faulty >> [2018-08-31 06:26:03.593161] I [monitor(monitor):158:monitor] Monitor: >> starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000 >> [2018-08-31 06:26:03.636452] I [gsyncd(agent /urd-gds/gluster):297:main] >> <top>: Using session config file path=/var/lib/glusterd/geo-rep >> lication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf >> [2018-08-31 06:26:03.636810] I [gsyncd(worker /urd-gds/gluster):297:main] >> <top>: Using session config file path=/var/lib/glusterd/geo-rep >> lication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf >> [2018-08-31 06:26:03.637486] I [changelogagent(agent >> /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining... >> [2018-08-31 06:26:03.650330] I [resource(worker >> /urd-gds/gluster):1377:connect_remote] SSH: Initializing SSH connection >> between master and slave... >> [2018-08-31 06:26:05.296473] I [resource(worker >> /urd-gds/gluster):1424:connect_remote] SSH: SSH connection between >> master and slave established. duration=1.6457 >> [2018-08-31 06:26:05.297904] I [resource(worker >> /urd-gds/gluster):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> [2018-08-31 06:26:06.396939] I [resource(worker >> /urd-gds/gluster):1119:connect] GLUSTER: Mounted gluster volume >> duration=1.0985 >> [2018-08-31 06:26:06.397691] I [subcmds(worker >> /urd-gds/gluster):70:subcmd_worker] <top>: Worker spawn successful. >> Acknowledging back to monitor >> [2018-08-31 06:26:16.815566] I [master(worker >> /urd-gds/gluster):1593:register] _GMaster: Working dir >> path=/var/lib/misc/gluster/gsyncd/urd-gds-volume_urd-gds-geo >> -001_urd-gds-volume/urd-gds-gluster >> [2018-08-31 06:26:16.816423] I [resource(worker >> /urd-gds/gluster):1282:service_loop] GLUSTER: Register time >> time=1535696776 >> [2018-08-31 06:26:16.888772] I [gsyncdstatus(worker >> /urd-gds/gluster):277:set_active] GeorepStatus: Worker Status >> Change status=Active >> [2018-08-31 06:26:16.892049] I [gsyncdstatus(worker >> /urd-gds/gluster):249:set_worker_crawl_status] GeorepStatus: Crawl >> Status Change status=History Crawl >> [2018-08-31 06:26:16.892703] I [master(worker >> /urd-gds/gluster):1507:crawl] _GMaster: starting history crawl turns=1 >> stime=(1525739167, 0) entry_stime=(1525740143, 0) etime=1535696776 >> [2018-08-31 06:26:17.914803] I [master(worker >> /urd-gds/gluster):1536:crawl] _GMaster: slave's time >> stime=(1525739167, 0) >> [2018-08-31 06:26:18.521718] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1063 num_files=17 job=3 return_code=23 >> [2018-08-31 06:26:19.260137] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.3441 num_files=34 job=1 return_code=23 >> [2018-08-31 06:26:19.615191] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0923 num_files=7 job=3 return_code=23 >> [2018-08-31 06:26:19.891227] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1302 num_files=12 job=1 return_code=23 >> [2018-08-31 06:26:19.922700] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.5024 num_files=50 job=2 return_code=23 >> >> [2018-08-31 06:26:21.639342] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=1.5233 num_files=5 job=3 return_code=23 >> [2018-08-31 06:26:22.12726] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1191 num_files=7 job=1 return_code=23 >> [2018-08-31 06:26:22.86136] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0731 num_files=4 job=1 return_code=23 >> [2018-08-31 06:26:22.503290] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0779 num_files=15 job=2 return_code=23 >> [2018-08-31 06:26:23.214704] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0738 num_files=9 job=3 return_code=23 >> [2018-08-31 06:26:23.251876] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.2478 num_files=33 job=2 return_code=23 >> [2018-08-31 06:26:23.802699] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0873 num_files=9 job=3 return_code=23 >> [2018-08-31 06:26:23.828176] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0758 num_files=3 job=2 return_code=23 >> [2018-08-31 06:26:23.854063] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.2662 num_files=34 job=1 return_code=23 >> [2018-08-31 06:26:24.403228] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0997 num_files=30 job=3 return_code=23 >> [2018-08-31 06:26:25.526] I [master(worker /urd-gds/gluster):1944:syncjob] >> Syncer: Sync Time Taken duration=0.0965 num_files=8 job=3 >> return_code=23 >> [2018-08-31 06:26:25.438527] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0832 num_files=9 job=1 return_code=23 >> [2018-08-31 06:26:25.447256] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.6180 num_files=86 job=2 return_code=23 >> [2018-08-31 06:26:25.571913] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0706 num_files=2 job=3 return_code=0 >> [2018-08-31 06:26:27.21325] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0814 num_files=1 job=1 return_code=23 >> [2018-08-31 06:26:27.615520] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0933 num_files=13 job=1 return_code=23 >> [2018-08-31 06:26:27.668323] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.2190 num_files=95 job=2 return_code=23 >> [2018-08-31 06:26:27.740139] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0716 num_files=11 job=2 return_code=23 >> [2018-08-31 06:26:28.191068] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1167 num_files=38 job=3 return_code=23 >> [2018-08-31 06:26:28.268213] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0768 num_files=7 job=3 return_code=23 >> [2018-08-31 06:26:28.317909] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0770 num_files=4 job=2 return_code=23 >> [2018-08-31 06:26:28.710064] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0932 num_files=23 job=1 return_code=23 >> [2018-08-31 06:26:28.907250] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0886 num_files=26 job=2 return_code=23 >> [2018-08-31 06:26:28.976679] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0692 num_files=4 job=2 return_code=23 >> [2018-08-31 06:26:29.55774] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0788 num_files=9 job=2 return_code=23 >> [2018-08-31 06:26:29.295576] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0847 num_files=16 job=1 return_code=23 >> [2018-08-31 06:26:29.665076] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1087 num_files=25 job=2 return_code=23 >> [2018-08-31 06:26:30.277998] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.1122 num_files=40 job=2 return_code=23 >> [2018-08-31 06:26:31.153105] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.3822 num_files=74 job=3 return_code=23 >> [2018-08-31 06:26:31.227639] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0743 num_files=18 job=3 return_code=23 >> [2018-08-31 06:26:31.302660] I [master(worker >> /urd-gds/gluster):1944:syncjob] Syncer: Sync Time Taken >> duration=0.0748 num_files=18 job=3 return_code=23 >> >> >> --- >> När du skickar e-post till SLU så innebär detta att SLU behandlar dina >> personuppgifter. För att läsa mer om hur detta går till, klicka här >> <https://www.slu.se/om-slu/kontakta-slu/personuppgifter/> >> E-mailing SLU will result in SLU processing your personal data. For more >> information on how this is done, click here >> <https://www.slu.se/en/about-slu/contact-slu/personal-data/> >> >> _______________________________________________ >> Gluster-users mailing list >> [email protected] >> https://lists.gluster.org/mailman/listinfo/gluster-users >> > > > > -- > Thanks and Regards, > Kotresh H R > > --- > När du skickar e-post till SLU så innebär detta att SLU behandlar dina > personuppgifter. För att läsa mer om hur detta går till, klicka här > <https://www.slu.se/om-slu/kontakta-slu/personuppgifter/> > E-mailing SLU will result in SLU processing your personal data. For more > information on how this is done, click here > <https://www.slu.se/en/about-slu/contact-slu/personal-data/> > > --- > När du skickar e-post till SLU så innebär detta att SLU behandlar dina > personuppgifter. För att läsa mer om hur detta går till, klicka här > <https://www.slu.se/om-slu/kontakta-slu/personuppgifter/> > E-mailing SLU will result in SLU processing your personal data. For more > information on how this is done, click here > <https://www.slu.se/en/about-slu/contact-slu/personal-data/> > -- Thanks and Regards, Kotresh H R
_______________________________________________ Gluster-users mailing list [email protected] https://lists.gluster.org/mailman/listinfo/gluster-users
