On 07/21/2017 02:55 PM, yayo (j) wrote:
2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishan...@redhat.com>:
But it does say something. All these gfids of completed heals in
the log below are for the ones that you have given the getfattr
output of. So what is likely happening is that there is an
intermittent connection problem between your mount and the brick
process, leading to pending heals again after the heal gets
completed, which is why the numbers vary each time. You would
need to check why that is the case.
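(A minimal sketch of one way to watch for that, assuming the volume is
named "engine" as in the logs below and that the gluster CLI is run on
one of the servers; the 60-second interval is arbitrary:)

  # Poll the pending-heal count periodically to see whether entries
  # keep reappearing after self-heal reports them as completed.
  while true; do
      date
      gluster volume heal engine info | grep "Number of entries"
      sleep 60
  done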
Hope this helps,
[2017-07-20 09:58:46.573079] I [MSGID: 108026] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources= 1 sinks=2
[2017-07-20 09:59:22.995003] I [MSGID: 108026] 0-engine-replicate-0: performing metadata selfheal on
[2017-07-20 09:59:22.999372] I [MSGID: 108026] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources= 1 sinks=2
But we have 2 gluster volumes on the same network and the other one
(the "Data" gluster) doesn't have any problems. Why do you think there is a
network problem?
Because pending self-heals come into the picture when I/O from the
clients (mounts) does not succeed on some bricks. They are mostly due to
(a) the client losing connection to some bricks (likely), or
(b) the I/O failing on the bricks themselves (unlikely).
If most of the I/O is also going to the 3rd brick (since you say the
files are already present on all bricks and I/O is successful), then it
is likely to be (a); see the quick check below.
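(A hedged example of that check, assuming the volume is named "engine"
and the command is run on one of the gluster servers; if the fuse
mount's IP address is missing from one brick's client list, that mount
has lost its connection to that brick:)

  # List the clients currently connected to each brick of the volume.
  gluster volume status engine clients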
In the fuse mount logs for the engine volume, check if there are any
messages for brick disconnects. Something along the lines of
"disconnected from volname-client-x".
Just guessing here, but maybe even the 'data' volume did experience
disconnects and subsequent self-heals, and you simply did not observe
them when you ran heal info. See the glustershd log or mount log for
self-heal completion messages on 0-data-replicate-0 as well.
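(Again only a sketch; glustershd normally logs to
/var/log/glusterfs/glustershd.log on each server, and the pattern
below mirrors the completion messages quoted above:)

  # Look for self-heal completion messages for the data volume in the
  # self-heal daemon log on each gluster server.
  grep "0-data-replicate-0: Completed" /var/log/glusterfs/glustershd.log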
How can I check this on a gluster infrastructure?