Hi Mark,

Google Drive works for me.

Thanks,
Kotresh HR
On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham <mark.bet...@performancehorizon.com> wrote:

> Hi Kotresh,
>
> The memory issue has re-occurred. It looks like it will happen around
> once a day.
>
> Again, no traceback was listed in the log; the only new entries were as
> follows;
>
> [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop] GLUSTER: connection inactive, stopping timeout=120
> [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] <top>: exiting.
> [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER: Mounting gluster volume locally...
> [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER: Mounted gluster volume duration=1.2729
> [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop] GLUSTER: slave listening
>
> I have attached an image showing the latest memory usage pattern.
>
> Can you please advise how I can pass the log data across to you? As soon
> as I know this I will get the data uploaded for your review.
>
> Thanks,
>
> Mark Betham
>
> On 7 June 2018 at 08:19, Mark Betham <mark.bet...@performancehorizon.com> wrote:
>
>> Hi Kotresh,
>>
>> Many thanks for your prompt response.
>>
>> Below are my responses to your questions;
>>
>> 1. Is this traceback consistently hit? I just wanted to confirm whether
>> it's transient, occurring once in a while and then returning to normal.
>> It appears not. As soon as the geo-rep recovered yesterday from the high
>> memory usage, it immediately began rising again until it had consumed all
>> of the available RAM, but this time nothing was written to the log file.
>> I would like to add that this instance of geo-rep was only brought online
>> at the start of this week, due to the issues with glibc on CentOS 7.5.
>> This is the first time I have run geo-rep with Gluster 3.12.9; the
>> storage clusters at both physical sites were only rebuilt approx. 4 weeks
>> ago, due to the previous version in use going EOL. Prior to this I had
>> been running 3.13.2 (3.13.x is now EOL) at each of the sites, and it is
>> worth noting that the same behaviour was also seen on that version of
>> Gluster. Unfortunately I do not have any of the log data from then, but I
>> do not recall seeing any instances of the traceback message mentioned.
>>
>> 2. Please upload the complete geo-rep logs from both master and slave.
>> I have the log files and am just checking that there is no confidential
>> info inside. The log files are too big to send via email, even when
>> compressed. Do you have a preferred method to allow me to share this
>> data, or would a share from my Google Drive be sufficient?
>>
>> 3. Are the Gluster versions the same across master and slave?
>> Yes, all Gluster versions are the same across the two sites for all
>> storage nodes. See below for version info taken from the current geo-rep
>> master.
>>
>> glusterfs 3.12.9
>> Repository revision: git://git.gluster.org/glusterfs.git
>> Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
>> GlusterFS comes with ABSOLUTELY NO WARRANTY.
>> It is licensed to you under your choice of the GNU Lesser
>> General Public License, version 3 or any later version (LGPLv3
>> or later), or the GNU General Public License, version 2 (GPLv2),
>> in all cases as published by the Free Software Foundation.
>>
>> glusterfs-geo-replication-3.12.9-1.el7.x86_64
>> glusterfs-gnfs-3.12.9-1.el7.x86_64
>> glusterfs-libs-3.12.9-1.el7.x86_64
>> glusterfs-server-3.12.9-1.el7.x86_64
>> glusterfs-3.12.9-1.el7.x86_64
>> glusterfs-api-3.12.9-1.el7.x86_64
>> glusterfs-events-3.12.9-1.el7.x86_64
>> centos-release-gluster312-1.0-1.el7.centos.noarch
>> glusterfs-client-xlators-3.12.9-1.el7.x86_64
>> glusterfs-cli-3.12.9-1.el7.x86_64
>> python2-gluster-3.12.9-1.el7.x86_64
>> glusterfs-rdma-3.12.9-1.el7.x86_64
>> glusterfs-fuse-3.12.9-1.el7.x86_64
>>
>> I have also attached another screenshot showing the memory usage of the
>> Gluster slave for the last 48 hours. It shows the memory saturation from
>> yesterday, which correlates with the traceback sent yesterday, and the
>> subsequent memory saturation which occurred over the last 24 hours. For
>> info, all times are in UTC.
>>
>> Please advise on the preferred method for getting the log data across to
>> you, and let me know if you require any further information.
>>
>> Many thanks,
>>
>> Mark Betham
>>
>> On 7 June 2018 at 04:42, Kotresh Hiremath Ravishankar <khire...@redhat.com> wrote:
>>
>>> Hi Mark,
>>>
>>> A few questions.
>>>
>>> 1. Is this traceback consistently hit? I just wanted to confirm whether
>>> it's transient, occurring once in a while and then returning to normal.
>>> 2. Please upload the complete geo-rep logs from both master and slave.
>>> 3. Are the Gluster versions the same across master and slave?
>>>
>>> Thanks,
>>> Kotresh HR
>>>
>>> On Wed, Jun 6, 2018 at 7:10 PM, Mark Betham <mark.bet...@performancehorizon.com> wrote:
>>>
>>>> Dear Gluster-Users,
>>>>
>>>> I have geo-replication set up and configured between 2 Gluster pools
>>>> located at different sites. What I am seeing is an error being
>>>> reported within the geo-replication slave log, as follows;
>>>>
>>>> [2018-06-05 12:05:26.767615] E [syncdutils(slave):331:log_raise_exception] <top>: FAIL:
>>>> Traceback (most recent call last):
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
>>>>     tf(*aa)
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1009, in <lambda>
>>>>     t = syncdutils.Thread(target=lambda: (repce.service_loop(),
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 90, in service_loop
>>>>     self.q.put(recv(self.inf))
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 61, in recv
>>>>     return pickle.load(inf)
>>>> ImportError: No module named h_2013-04-26-04:02:49-2013-04-26_11:02:53.gz.15WBuUh
>>>> [2018-06-05 12:05:26.768085] E [repce(slave):117:worker] <top>: call failed:
>>>> Traceback (most recent call last):
>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
>>>>     res = getattr(self.obj, rmeth)(*in_data[2:])
>>>> TypeError: getattr(): attribute name must be string
>>>>
>>>> From this point in time the slave server begins to consume all of its
>>>> available RAM until it becomes unresponsive. Eventually the gluster
>>>> service seems to kill off the offending process and the memory is
>>>> returned to the system. Once the memory has been returned to the
>>>> remote slave system, the geo-replication often recovers and data
>>>> transfer resumes.
>>>>
>>>> I have attached the full geo-replication slave log containing the
>>>> error shown above. I have also attached an image file showing the
>>>> memory usage of the affected storage server.
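>>>>
>>>> My (possibly naive) reading of the first traceback: repce appears to
>>>> exchange its RPC messages as pickles over the connection, and
>>>> pickle.load() resolves any global reference in the stream by importing
>>>> the module it names. So if the stream ever gets out of sync, stray
>>>> bytes that happen to parse as a global reference (here, what looks
>>>> like a changelog archive filename) would surface as exactly this kind
>>>> of ImportError. A standalone sketch of that failure mode follows; it
>>>> has nothing to do with the gluster code itself, and the module name in
>>>> it is made up;
>>>>
>>>> import io
>>>> import pickle
>>>>
>>>> # Hand-crafted protocol-0 pickle: a GLOBAL opcode ('c<module>\n<attr>\n')
>>>> # naming a module that does not exist, followed by STOP ('.'). The bogus
>>>> # filename in the real traceback would play the role of <module>.
>>>> bogus = b"cno_such_changelog.gz.15WBuUh\nsome_attr\n."
>>>>
>>>> try:
>>>>     pickle.load(io.BytesIO(bogus))
>>>> except ImportError as e:
>>>>     print("unpickling tried to import a garbage name: %s" % e)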
>>>>
>>>> We are currently running Gluster version 3.12.9 on top of CentOS 7.5
>>>> x86_64. The system has been fully patched and is running the latest
>>>> software, excluding glibc, which had to be downgraded to get
>>>> geo-replication working.
>>>>
>>>> The Gluster volume runs on a dedicated partition using the XFS
>>>> filesystem, which in turn runs on an LVM thin volume. The physical
>>>> storage is presented as a single drive, the underlying disks being
>>>> part of a RAID 10 array.
>>>>
>>>> The master volume being replicated holds a total of 2.2 TB of data.
>>>> The total size of the volume fluctuates very little, as the data being
>>>> removed roughly equals the new data coming in. This data is made up of
>>>> many thousands of files across many separate directories. File sizes
>>>> vary from the very small (<1 KB) to the large (>1 GB). The Gluster
>>>> service itself is running with a single volume in a replicated
>>>> configuration across 3 bricks at each of the sites. The delta changes
>>>> being replicated average about 100 GB per day, including file
>>>> creation, deletion, and modification.
>>>>
>>>> The config for the geo-replication session is as follows, taken from
>>>> the current source server;
>>>>
>>>> special_sync_mode: partial
>>>> gluster_log_file: /var/log/glusterfs/geo-replication/glustervol0/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1.gluster.log
>>>> ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
>>>> change_detector: changelog
>>>> session_owner: 40e9e77a-034c-44a2-896e-59eec47e8a84
>>>> state_file: /var/lib/glusterd/geo-replication/glustervol0_storage-server.local_glustervol1/monitor.status
>>>> gluster_params: aux-gfid-mount acl
>>>> log_rsync_performance: true
>>>> remote_gsyncd: /nonexistent/gsyncd
>>>> working_dir: /var/lib/misc/glusterfsd/glustervol0/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1
>>>> state_detail_file: /var/lib/glusterd/geo-replication/glustervol0_storage-server.local_glustervol1/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1-detail.status
>>>> gluster_command_dir: /usr/sbin/
>>>> pid_file: /var/lib/glusterd/geo-replication/glustervol0_storage-server.local_glustervol1/monitor.pid
>>>> georep_session_working_dir: /var/lib/glusterd/geo-replication/glustervol0_storage-server.local_glustervol1/
>>>> ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
>>>> master.stime_xattr_name: trusted.glusterfs.40e9e77a-034c-44a2-896e-59eec47e8a84.ccfaed9b-ff4b-4a55-acfa-03f092cdf460.stime
>>>> changelog_log_file: /var/log/glusterfs/geo-replication/glustervol0/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1-changes.log
>>>> socketdir: /var/run/gluster
>>>> volume_id: 40e9e77a-034c-44a2-896e-59eec47e8a84
>>>> ignore_deletes: false
>>>> state_socket_unencoded: /var/lib/glusterd/geo-replication/glustervol0_storage-server.local_glustervol1/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1.socket
>>>> log_file: /var/log/glusterfs/geo-replication/glustervol0/ssh%3A%2F%2Froot%40storage-server.local%3Agluster%3A%2F%2F127.0.0.1%3Aglustervol1.log
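>>>>
>>>> In case it helps to correlate the memory growth with the log entries,
>>>> below is a small stdlib-only sketch that could be run on the slave to
>>>> record the resident set size of the worker processes once a minute.
>>>> Assumptions: a Linux /proc filesystem, and that the slave-side worker
>>>> shows up with "gsyncd" in its command line; adjust MATCH if not.
>>>>
>>>> import os
>>>> import time
>>>>
>>>> MATCH = "gsyncd"  # assumption: substring identifying the worker cmdline
>>>> PAGE = os.sysconf("SC_PAGE_SIZE")
>>>>
>>>> def rss_by_pid():
>>>>     """Return {pid: rss_bytes} for processes whose cmdline matches."""
>>>>     out = {}
>>>>     for pid in filter(str.isdigit, os.listdir("/proc")):
>>>>         try:
>>>>             with open("/proc/%s/cmdline" % pid, "rb") as f:
>>>>                 cmd = f.read().replace(b"\0", b" ").decode("utf-8", "replace")
>>>>             if MATCH not in cmd:
>>>>                 continue
>>>>             # The second field of /proc/<pid>/statm is resident pages.
>>>>             with open("/proc/%s/statm" % pid) as f:
>>>>                 rss_pages = int(f.read().split()[1])
>>>>             out[int(pid)] = rss_pages * PAGE
>>>>         except (IOError, OSError, ValueError):
>>>>             continue  # process exited between listdir and open
>>>>     return out
>>>>
>>>> while True:
>>>>     stamp = time.strftime("%Y-%m-%d %H:%M:%S")
>>>>     for pid, rss in sorted(rss_by_pid().items()):
>>>>         print("%s pid=%d rss_mb=%.1f" % (stamp, pid, rss / 1048576.0))
>>>>     time.sleep(60)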
>>>>
>>>> If any further information is required in order to troubleshoot this
>>>> issue then please let me know.
>>>>
>>>> I would be very grateful for any help or guidance received.
>>>>
>>>> Many thanks,
>>>>
>>>> Mark Betham
>>>
>>> --
>>> Thanks and Regards,
>>> Kotresh H R
>>
>
> --
> Mark Betham
> Senior System Administrator
> performancehorizon.com
>
> This email may contain confidential material; unintended recipients must
> not disseminate, use, or act upon any information in it. If you received
> this email in error, please contact the sender and permanently delete the
> email.
> Performance Horizon Group Limited | Registered in England & Wales 07188234
> | Level 8, West One, Forth Banks, Newcastle upon Tyne, NE1 3PA

--
Thanks and Regards,
Kotresh H R
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users