Two things happen after a reboot.

1. glusterd (the management layer) does a sanity check of its volumes, sees whether anything changed while it was down, and tries to correct its state.
   - This is fine as long as the number of volumes or the number of nodes is small (small here means < 100).
2. If it is a replicate or disperse volume, the self-heal daemon checks whether any self-heals are pending.
   - It does an 'index' crawl to find which files actually changed while one of the bricks/nodes was down.
   - If this list is big, it can take some time. But days/weeks/months is not expected or observed behavior.

Are there any messages in the log file? If not, can you run 'strace -f' on the pid that is consuming the most CPU? (A 1-minute strace sample is good enough.)

-Amar

On Wed, Mar 20, 2019 at 2:05 AM Alvin Starr <[email protected]> wrote:

> We have a simple replicated volume with 1 brick on each node of 17TB.
>
> There is something like 35M files and directories on the volume.
>
> One of the servers rebooted and is now "doing something".
>
> It kind of looks like it's doing some kind of sanity check with the node
> that did not reboot but it's hard to say and it looks like it may run for
> hours/days/months....
>
> Will Gluster take a long time with lots of little files to resync?
>
>
> --
> Alvin Starr || land: (905)513-7688
> Netvel Inc. || Cell: (416)806-0133
> [email protected] ||
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> https://lists.gluster.org/mailman/listinfo/gluster-users

--
Amar Tumballi (amarts)
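
For reference, the checks asked for above could be run roughly like this. It is only a sketch: the function names heal_backlog and strace_sample_cmd are made up for illustration, the volume name and pid are placeholders you supply, and the /tmp output path is an assumption. The self-heal daemon log is typically /var/log/glusterfs/glustershd.log on a default install, but that may differ on your system.

```shell
#!/bin/sh
# Sketch only: the volume name, pid and output path are placeholders.

# Show what the self-heal daemon still has queued for a
# replicate/disperse volume (run on one of the gluster nodes).
heal_backlog() {
    volname="$1"
    gluster volume heal "$volname" info
    gluster volume heal "$volname" statistics heal-count
}

# Print the command for a timed strace sample of one pid.
# Defaults: 60 seconds, output in /tmp/strace-<pid>.txt (assumed path).
# timeout sends SIGINT, on which strace detaches and leaves the
# traced process running.
strace_sample_cmd() {
    pid="$1"
    seconds="${2:-60}"
    out="${3:-/tmp/strace-$pid.txt}"
    printf 'timeout -s INT %s strace -f -p %s -o %s\n' "$seconds" "$pid" "$out"
}
```

Find the busy pid with top first, then run something like eval "$(strace_sample_cmd 1234)" as root and attach the resulting file to the reply. Printing the command before running it makes it easy to double-check the pid.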
