We are running glusterfs-3.8.9-1.el7.x86_64 with geo-replication.

I have been having ongoing problems with the replication failing after some time.

Once it has failed, restarting it produces the attached logfile snippet.


--
Alvin Starr                   ||   land:  (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
[email protected]              ||

[2018-09-12 03:01:04.433048] I [monitor(monitor):267:monitor] Monitor: 
------------------------------------------------------------
[2018-09-12 03:01:04.433470] I [monitor(monitor):268:monitor] Monitor: starting 
gsyncd worker
[2018-09-12 03:01:04.599227] D [gsyncd(agent):730:main_i] <top>: rpc_fd: 
'9,12,11,10'
[2018-09-12 03:01:04.600925] I [changelogagent(agent):73:__init__] 
ChangelogAgent: Agent listining...
[2018-09-12 03:01:04.625732] I [gsyncd(/bricks/ccto_us/data):736:main_i] <top>: 
syncing: gluster://localhost:CCTO-US-EDOCS -> 
ssh://[email protected]:gluster://localhost:arch-CCTO-US-EDOCS
[2018-09-12 03:01:04.675003] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__() ...
[2018-09-12 03:01:06.518789] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721264.67 __repce_version__ -> 1.0
[2018-09-12 03:01:06.519186] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721266.52 version() ...
[2018-09-12 03:01:06.522499] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721266.52 version -> 1.0
[2018-09-12 03:01:06.522882] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721266.52 pid() ...
[2018-09-12 03:01:06.525834] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721266.52 pid -> 2647
[2018-09-12 03:01:06.623212] D [resource(/bricks/ccto_us/data):1281:inhibit] 
DirectMounter: auxiliary glusterfs mount in place
[2018-09-12 03:01:07.678328] D [resource(/bricks/ccto_us/data):1336:inhibit] 
DirectMounter: auxiliary glusterfs mount prepared
[2018-09-12 03:01:07.679094] I 
[master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up xsync 
change detection mode
[2018-09-12 03:01:07.679126] D [monitor(monitor):337:monitor] Monitor: 
worker(/bricks/ccto_us/data) connected
[2018-09-12 03:01:07.679547] I [master(/bricks/ccto_us/data):367:__init__] 
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.681130] I 
[master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up changelog 
change detection mode
[2018-09-12 03:01:07.681557] I [master(/bricks/ccto_us/data):367:__init__] 
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.683561] I 
[master(/bricks/ccto_us/data):83:gmaster_builder] <top>: setting up 
changeloghistory change detection mode
[2018-09-12 03:01:07.683960] I [master(/bricks/ccto_us/data):367:__init__] 
_GMaster: using 'rsync' as the sync engine
[2018-09-12 03:01:07.688644] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721267.69 version() ...
[2018-09-12 03:01:07.689547] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721267.69 version -> 1.0
[2018-09-12 03:01:07.689709] D 
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog 
working dir 
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:07.689863] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721267.69 init() ...
[2018-09-12 03:01:07.706136] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721267.69 init -> None
[2018-09-12 03:01:07.706440] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721267.71 
register('/bricks/ccto_us/data', 
'/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6',
 
'/var/log/glusterfs/geo-replication/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS.%2Fbricks%2Fccto_us%2Fdata-changes.log',
 7, 5) ...
[2018-09-12 03:01:09.711715] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139692706621248:1536721267.71 register -> None
[2018-09-12 03:01:09.712357] D 
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog 
working dir 
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712651] D 
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog 
working dir 
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.712901] D 
[master(/bricks/ccto_us/data):726:setup_working_dir] _GMaster: changelog 
working dir 
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6
[2018-09-12 03:01:09.713129] I [master(/bricks/ccto_us/data):1251:register] 
_GMaster: xsync temp directory: 
/var/lib/misc/glusterfsd/CCTO-US-EDOCS/ssh%3A%2F%2Froot%401.3.4.5%3Agluster%3A%2F%2F127.0.0.1%3Aarch-CCTO-US-EDOCS/0a70d065ebfb511403fa881adc1073e6/xsync
[2018-09-12 03:01:09.713479] I 
[resource(/bricks/ccto_us/data):1533:service_loop] GLUSTER: Register time: 
1536721269
[2018-09-12 03:01:09.714504] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139691772856064:1536721269.71 keep_alive(None,) ...
[2018-09-12 03:01:09.719439] I [master(/bricks/ccto_us/data):510:crawlwrap] 
_GMaster: primary master with volume id 900656fd-3f13-4ba2-bf04-90832508566e ...
[2018-09-12 03:01:09.723702] D [repce(/bricks/ccto_us/data):209:__call__] 
RepceClient: call 26412:139691772856064:1536721269.71 keep_alive -> 1
[2018-09-12 03:01:09.726247] I [master(/bricks/ccto_us/data):519:crawlwrap] 
_GMaster: crawl interval: 1 seconds
[2018-09-12 03:01:09.733443] I [master(/bricks/ccto_us/data):1165:crawl] 
_GMaster: starting history crawl... turns: 1, stime: (1536718883, 0), etime: 
1536721269
[2018-09-12 03:01:09.733824] D [repce(/bricks/ccto_us/data):191:push] 
RepceClient: call 26412:139692706621248:1536721269.73 
history('/bricks/ccto_us/data/.glusterfs/changelogs', 1536718883, 1536721269, 
3) ...
[2018-09-12 03:01:09.735060] E [repce(agent):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 54, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 100, in cl_history_changelog
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 2] No such file or directory
[2018-09-12 03:01:09.736625] E [repce(/bricks/ccto_us/data):207:__call__] 
RepceClient: call 26412:139692706621248:1536721269.73 (history) failed on peer 
with ChangelogException
[2018-09-12 03:01:09.736931] E 
[resource(/bricks/ccto_us/data):1551:service_loop] GLUSTER: Changelog History 
Crawl failed, [Errno 2] No such file or directory
[2018-09-12 03:01:09.737512] I [syncdutils(/bricks/ccto_us/data):220:finalize] 
<top>: exiting.
[2018-09-12 03:01:09.743961] I [repce(agent):92:service_loop] RepceServer: 
terminating on reaching EOF.
[2018-09-12 03:01:09.744335] I [syncdutils(agent):220:finalize] <top>: exiting.
[2018-09-12 03:01:10.682538] I [monitor(monitor):344:monitor] Monitor: 
worker(/bricks/ccto_us/data) died in startup phase
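
For reference, the stime and etime values in the "starting history crawl" line are epoch seconds, and the register time 1536721269 lines up with the 03:01:09 log timestamp when read as UTC. Below is a minimal sketch for decoding the window that the failing history() call asks for. The helper name crawl_window is my own illustration and is not part of gsyncd.

```python
from datetime import datetime, timezone


def crawl_window(stime, etime):
    """Decode a history-crawl window given as epoch seconds.

    Hypothetical helper for reading the log above; not a gsyncd function.
    Returns (start, end, span_seconds) with start/end as UTC datetimes.
    """
    start = datetime.fromtimestamp(stime, tz=timezone.utc)
    end = datetime.fromtimestamp(etime, tz=timezone.utc)
    return start, end, etime - stime


# Values copied from the "starting history crawl" log line.
start, end, span = crawl_window(1536718883, 1536721269)
print(start.isoformat(), end.isoformat(), span)
# prints: 2018-09-12T02:21:23+00:00 2018-09-12T03:01:09+00:00 2386
```

So the crawl that fails with "No such file or directory" covers roughly the 40 minutes of changelog history before the worker was restarted.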
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
