Re: [Gluster-users] Self heal issues

Ravishankar N Fri, 07 Aug 2015 00:19:06 -0700


On 08/07/2015 12:11 PM, Prasun Gera wrote:

No, no noticeable difference. Still very high, possibly higher than before.

I was guessing that the CPU usage could be because of the diff algorithm, which computes checksums (a CPU-intensive task). That doesn't seem to be the case. Could you do a volume profile, see which FOPs are happening on the bricks, and share the result?

1. gluster volume profile <volname> start
2. gluster volume profile <volname> info
3. wait 10-15 seconds
4. gluster volume profile <volname> info
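
The four steps above can be sketched as a small shell function. This is only an illustration: the function name `profile_window`, the `GLUSTER` variable, and the separator lines are assumptions added here, not part of the gluster CLI. The only gluster commands it runs are the `volume profile ... start` and `volume profile ... info` calls listed above.

```shell
#!/bin/sh
# Sketch only. GLUSTER and profile_window are hypothetical names introduced
# for this example; the gluster subcommands are the ones listed in the steps.
GLUSTER="${GLUSTER:-gluster}"

profile_window() {
    volname="$1"            # volume name, i.e. <volname> in the steps
    wait_secs="${2:-15}"    # pause between the two info calls (10-15 s suggested)

    # Step 1: turn profiling on for the volume.
    "$GLUSTER" volume profile "$volname" start

    # Step 2: first snapshot of the per-brick FOP statistics.
    echo "--- first snapshot ---"
    "$GLUSTER" volume profile "$volname" info

    # Step 3: let the heal traffic run for a short window.
    sleep "$wait_secs"

    # Step 4: second snapshot, taken after the window.
    echo "--- second snapshot ---"
    "$GLUSTER" volume profile "$volname" info
}
```

Calling `profile_window <volname> 15 > profile.txt` on one of the nodes would collect both snapshots in a single file that can be attached to a reply.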

The system has slowed to a crawl. It's difficult to even ssh or run any commands on the terminal. Do you make anything of the logs? The brick log is just a giant alternating stream of those two lines I mentioned earlier.

On Thu, Aug 6, 2015 at 10:10 PM, Ravishankar N <[email protected]> wrote:




    On 08/07/2015 01:33 AM, Prasun Gera wrote:

        I replaced the brick in a node in my 3x2 dist+repl volume (RHS
        3). I'm seeing that the heal process, which should essentially
        be a dump from the working replica to the newly added one, is
        taking exceptionally long. It has moved ~100 GB over a day on a
        1 Gigabit network. The CPU usage on both nodes of the
        replica has been pretty high.


    Does setting `cluster.data-self-heal-algorithm` to `full` make a
    difference in the CPU usage?
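
For reference, a volume option such as this one is normally changed with `gluster volume set`. The snippet below is a minimal sketch: the wrapper function `set_heal_algorithm`, the `GLUSTER` and `VOLNAME` variables, and the default name `myvol` are placeholders introduced here, not anything from the thread.

```shell
#!/bin/sh
# Sketch only. VOLNAME defaults to a made-up name; substitute the real volume.
VOLNAME="${VOLNAME:-myvol}"
GLUSTER="${GLUSTER:-gluster}"

set_heal_algorithm() {
    # $1 is the algorithm to use, e.g. "full" as suggested above, or "diff".
    "$GLUSTER" volume set "$VOLNAME" cluster.data-self-heal-algorithm "$1"
}
```

Running `set_heal_algorithm full` switches the option; `set_heal_algorithm diff` switches to the checksum-based algorithm if the change makes no difference.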


        I also think that nagios is making it worse. The heal is slow
        enough as it is, and nagios keeps triggering heal info, which
        I think never completes. I also see my logs filling up. These
        are some of the log contents, which I got by running tail on them:

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users