We are running glusterfs-3.8.9-1.el7.x86_64 with geo-replication.
I have been having ongoing problems with the replication failing after some time.
Once it has failed, restarting it produces the log file snippet below.

--
Alvin Starr || land: (905)513-7688
Netvel Inc. || Cell: (416)806-0133
[email protected] ||

[2018-09-12 03:01:04.433048] I [monitor(monitor):267:monitor] Monitor: ------------------------------------------------------------
[2018-09-12 03:01:04.433470] I [monitor(monitor):268:monitor] Monitor: starting gsyncd worker
[2018-09-12 03:01:04.599227] D [gsyncd(agent):730:main_i] <top>: rpc_fd: '9,12,11,10'
[2018-09-12 03:01:04.600925] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2018-09-12 03:01:04.625732] I [gsyncd(/bricks/ccto_us/data):736:main_i] <top>: syncing: gluster://localhost:CCTO-US-EDOCS -> ssh://[email protected]:gluster://localhost:arch-CCTO-US-EDOCS
[2018-09-12 03:01:04.675003] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__() ...
[2018-09-12 03:01:06.518789] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__ -> 1.0
[2018-09-12 03:01:06.519186] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721266.52 version() ...
[2018-09-12 03:01:06.522499] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721266.52 version -> 1.0
[2018-09-12 03:01:06.522882] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721266.52 pid() ...
[2018-09-12 03:01:06.525834] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721266.52 pid -> 2647
[2018-09-12 03:01:06.623212] D [resource(/bricks/ccto_us/data):1281:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2018-09-12 03:01:07.678328] D [resource(/bricks/ccto_us/data):1336:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2018-09-12 03:01:07.679094] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up xsync change detection mode
[2018-09-12 03:01:07.679126] D [monitor(monitor):337:monitor] Monitor: worker(/bricks/ccto_us/data) connected
[2018-09-12 03:01:07.679547] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.681130] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up changelog change detection mode
[2018-09-12 03:01:07.681557] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.683561] I [master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up changeloghistory change detection mode
[2018-09-12 03:01:07.683960] I [master(/bricks/ccto_us/data):367:__init__] _GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.688644] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.69 version() ...
[2018-09-12 03:01:07.689547] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.69 version -> 1.0
[2018-09-12 03:01:07.689709] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:07.689863] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.69 init() ...
[2018-09-12 03:01:07.706136] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.69 init -> None
[2018-09-12 03:01:07.706440] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721267.71 register('/bricks/ccto_us/data', '/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6', '/var/log/glusterfs/geo-replication/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS.%2Fbricks%2Fccto_us%2Fdata-changes.log', 7, 5) ...
[2018-09-12 03:01:09.711715] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139692706621248:1536721267.71 register -> None
[2018-09-12 03:01:09.712357] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712651] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712901] D [master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.713129] I [master(/bricks/ccto_us/data):1251:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6/xsync
[2018-09-12 03:01:09.713479] I [resource(/bricks/ccto_us/data):1533:service_loop] GLUSTER: Register time: 1536721269
[2018-09-12 03:01:09.714504] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139691772856064:1536721269.71 keep_alive(None,) ...
[2018-09-12 03:01:09.719439] I [master(/bricks/ccto_us/data):510:crawlwrap] _GMaster: primary master with volume id 900656fd-3f13-4ba2-bf04-90832508566e ...
[2018-09-12 03:01:09.723702] D [repce(/bricks/ccto_us/data):209:__call__] RepceClient: call 26412:139691772856064:1536721269.71 keep_alive -> 1
[2018-09-12 03:01:09.726247] I [master(/bricks/ccto_us/data):519:crawlwrap] _GMaster: crawl interval: 1 seconds
[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl] _GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), etime: 1536721269
[2018-09-12 03:01:09.733824] D [repce(/bricks/ccto_us/data):191:push] RepceClient: call 26412:139692706621248:1536721269.73 history('/bricks/ccto_us/data/.glusterfs/changelogs', 1536718883, 1536721269, 3) ...
[2018-09-12 03:01:09.735060] E [repce(agent):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 54, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 100, in cl_history_changelog
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 2] No such file or directory
[2018-09-12 03:01:09.736625] E [repce(/bricks/ccto_us/data):207:__call__] RepceClient: call 26412:139692706621248:1536721269.73 (history) failed on peer with ChangelogException
[2018-09-12 03:01:09.736931] E [resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog History Crawl failed, [Errno 2] No such file or directory
[2018-09-12 03:01:09.737512] I [syncdutils(/bricks/ccto_us/data):220:finalize] <top>: exiting.
[2018-09-12 03:01:09.743961] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-12 03:01:09.744335] I [syncdutils(agent):220:finalize] <top>: exiting.
[2018-09-12 03:01:10.682538] I [monitor(monitor):344:monitor] Monitor: worker(/bricks/ccto_us/data) died in startup phase
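
Most of the snippet above is debug output, so here is a rough sketch of how the error-level entries could be pulled out of a gsyncd log like this one. It only assumes the "[timestamp] LEVEL [component] message" layout visible above; the names ENTRY_START, split_entries and errors_only are made up for illustration and are not part of glusterfs, and the sample text is copied from the log above.

```python
import re

# Start of one log entry as it appears above: "[timestamp] LEVEL [component] ".
# This layout is inferred from the pasted snippet only; it is an assumption,
# not a documented gsyncd log format.
ENTRY_START = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+([A-Z])\s+\[([^\]]+)\]\s+"
)


def split_entries(text):
    """Return (timestamp, level, component, message) tuples from raw log text.

    Works whether or not each entry sits on its own line, because entries are
    delimited by the timestamp header rather than by newlines.
    """
    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs up to the next entry header, or to the end of text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = " ".join(text[m.end():end].split())
        entries.append((m.group(1), m.group(2), m.group(3), message))
    return entries


def errors_only(text):
    """Keep only the entries logged at level E."""
    return [entry for entry in split_entries(text) if entry[1] == "E"]


# Three entries copied from the log above, joined on one line.
sample = (
    "[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl] "
    "_GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), "
    "etime: 1536721269 "
    "[2018-09-12 03:01:09.736931] E "
    "[resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog "
    "History Crawl failed, [Errno 2] No such file or directory "
    "[2018-09-12 03:01:09.737512] I "
    "[syncdutils(/bricks/ccto_us/data):220:finalize] <top>: exiting."
)

for timestamp, level, component, message in errors_only(sample):
    print(timestamp, component, message)
```

Run against the full log, this should leave only the three E entries around the failed history() call, which is the part that matters here.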
_______________________________________________ Gluster-users mailing list [email protected] https://lists.gluster.org/mailman/listinfo/gluster-users
