Re: [Gluster-users] recovery from reboot time?

Amar Tumballi Suryanarayan Tue, 19 Mar 2019 22:41:14 -0700

Two things happen after a reboot.

1. glusterd (the management layer) does a sanity check of its volumes, sees
whether anything changed while it was down, and tries to correct its state.
  - This is fine as long as the number of volumes or the number of nodes
is small (small here means < 100).


2. If it is a replicate or disperse volume, the self-heal daemon checks
whether any self-heals are pending.
  - This does an 'index' crawl to find which files actually changed while
one of the bricks/nodes was down.
  - If this list is big, it can take some time.
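
To see where each of the two steps stands, something like the sketch
below may help. It assumes the stock gluster CLI is on the node; the
helper name check_heal and the volume name myvol are placeholders, not
anything taken from your setup.

```shell
# Hypothetical helper: show the state behind both post-reboot steps.
check_heal() {
    vol="$1"
    # Step 1: what glusterd currently believes about peers and bricks.
    gluster peer status
    gluster volume status "$vol"
    # Step 2: how many entries each brick still has queued for self-heal.
    # heal-count prints totals only, so it stays readable with many files.
    gluster volume heal "$vol" statistics heal-count
}
```

Calling `check_heal myvol` a few minutes apart should show whether the
pending count is shrinking. 'gluster volume heal myvol info' lists the
individual entries instead, which can be a very long list on a volume
with millions of files.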

But 'days/weeks/months' is not expected or observed behavior. Are there any
messages in the log files? If not, can you run 'strace -f' on the pid that is
consuming the most CPU? (A 1-minute strace sample is good enough.)
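
A minimal sketch of collecting that sample, assuming a Linux node with
procps, coreutils and strace installed and a root shell. The helper
names top_cpu and sample_strace and the default output path under /tmp
are made up for illustration.

```shell
# Hypothetical helper: list the busiest processes first to find the pid.
top_cpu() {
    ps -eo pid,pcpu,comm --sort=-pcpu | head -n 5
}

# Hypothetical helper: trace one pid and its threads/children for a
# fixed number of seconds, writing the result to a file.
sample_strace() {
    pid="$1"
    secs="${2:-60}"
    out="${3:-/tmp/strace-$pid.txt}"
    # timeout sends SIGTERM after $secs seconds; strace then detaches
    # and exits, leaving the traced process running.
    timeout "$secs" strace -f -tt -p "$pid" -o "$out"
    echo "trace written to $out"
}
```

Run `top_cpu`, pick the gluster process at the top, then run
`sample_strace <pid> 60` and send the resulting file.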

-Amar


On Wed, Mar 20, 2019 at 2:05 AM Alvin Starr <[email protected]> wrote:

> We have a simple replicated volume with one 17TB brick on each node.
>
> There is something like 35M files and directories on the volume.
>
> One of the servers rebooted and is now "doing something".
>
> It kind of looks like it's doing some kind of sanity check with the node
> that did not reboot, but it's hard to say, and it looks like it may run for
> hours/days/months....
>
> Will Gluster take a long time to resync lots of little files?
>
>
> --
> Alvin Starr                   ||   land:  (905)513-7688
> Netvel Inc.                   ||   Cell:  (416)806-0133
> [email protected]              ||
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Amar Tumballi (amarts)

