Re: [Gluster-devel] Query on healing process

Ravishankar N Thu, 25 Feb 2016 20:28:02 -0800

Hello,

On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,
Thanks for the response.

We are using glusterfs-3.7.8.

Here is the use case:
We have a log file that records the events for every board of a node, and these files are kept in sync using glusterfs. The system runs in replica 2 mode, which means that when one brick in a replicated volume goes offline, the glusterd daemons on the other nodes keep track of all the files that are not replicated to the offline brick. When the offline brick becomes available again, the cluster initiates a healing process, replicating the updated files to that brick. But in our case, we see that the log file of one board is not in sync and its format is corrupted, meaning the copies on the two bricks differ.

Just to understand you correctly: you have mounted the 2-node replica-2 volume on both these nodes and are writing to a logging file from the mounts, right?

Even the output of #gluster volume heal c_glusterfs info shows that there are no pending heals.
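As a rough sketch of how the heal info check above could be scripted, the snippet below runs the same command and collects the per-brick entry counts. It assumes the output consists of "Brick <host>:<path>" lines each followed by a "Number of entries: <n>" line; that layout is an assumption here and may differ between releases. The helper names parse_heal_info and pending_heals are hypothetical.

```python
import subprocess


def parse_heal_info(output):
    """Map each brick to its pending-heal count.

    Assumes 'Brick <host>:<path>' lines followed by
    'Number of entries: <n>' lines (format assumed, not guaranteed).
    A non-numeric count is stored as None.
    """
    counts = {}
    brick = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            counts[brick] = int(value) if value.isdigit() else None
    return counts


def pending_heals(volume):
    """Run 'gluster volume heal <volume> info' and parse its output."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return parse_heal_info(result.stdout)
```

Calling pending_heals("c_glusterfs") on one of the nodes would then return a dictionary keyed by brick. A count of zero on every brick only says that nothing is marked as pending; it does not by itself show that the copies are identical.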

Also, the log file being updated is of fixed size, and new entries wrap around, overwriting the old entries.

This way we have seen that, after a few restarts, the contents of the same file on the two bricks are different, but volume heal info shows zero entries.
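One way to confirm that the two copies really differ is to checksum the file on each brick and compare the digests. The sketch below is only illustrative: the function names are hypothetical, and since the bricks live on different nodes, sha256_of would be run against the brick path on each node and the printed digests compared by hand. It only reads the files; nothing is written to the bricks.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(path_a, path_b):
    """True if both files have identical contents (same digest)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If both brick directories happen to be reachable from one machine, copies_match can be called directly on the two paths. For a file that is still being written, the digests can differ briefly even on a healthy volume, so the comparison is more meaningful once logging is quiet.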


Solution:

But when we added a delay of more than 5 minutes before the healing, everything worked fine.


Regards,
Abhishek

On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N <[email protected]> wrote:


    On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:

    Hi,

    I have a query regarding the time taken by the healing process.
    In our current two-node setup, when we reboot one node, the
    self-healing process starts within 5 minutes on the board, which
    results in corruption of some files' data.


    Heal should start immediately after the brick process comes up.
    What version of gluster are you using? What do you mean by
    corruption of data? Also, how did you observe that the heal
    started after 5 minutes?
    -Ravi


    To resolve it, I searched on Google and found the following
    link:
    https://support.rackspace.com/how-to/glusterfs-troubleshooting/

    It mentions that the healing process can take up to 10 minutes
    to start.

    Here is the statement from the link:

    "Healing replicated volumes

    When any brick in a replicated volume goes offline, the glusterd
    daemons on the remaining nodes keep track of all the files that
    are not replicated to the offline brick. When the offline brick
    becomes available again, the cluster initiates a healing process,
    replicating the updated files to that brick. *The start of this
    process can take up to 10 minutes, based on observation.*"

    After allowing more than 5 minutes, the file corruption problem
    was resolved.

    So my question is: is there any way to reduce the time the
    healing process takes to start?


    Regards,
    Abhishek Paliwal




    _______________________________________________
    Gluster-devel mailing list
    [email protected]
    http://www.gluster.org/mailman/listinfo/gluster-devel






--




Regards
Abhishek Paliwal

