On Tue, Aug 30, 2016 at 10:02 AM, David Gossage <[email protected]> wrote:
> updated test server to 3.8.3
>
> Brick1: 192.168.71.10:/gluster2/brick1/1
> Brick2: 192.168.71.11:/gluster2/brick2/1
> Brick3: 192.168.71.12:/gluster2/brick3/1
> Options Reconfigured:
> cluster.granular-entry-heal: on
> performance.readdir-ahead: on
> performance.read-ahead: off
> nfs.disable: on
> nfs.addr-namelookup: off
> nfs.enable-ino32: off
> cluster.background-self-heal-count: 16
> cluster.self-heal-window-size: 1024
> performance.quick-read: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: on
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-gid: 36
> storage.owner-uid: 36
> server.allow-insecure: on
> features.shard: on
> features.shard-block-size: 64MB
> performance.strict-o-direct: off
> cluster.locking-scheme: granular
>
> kill -15 brickpid
> rm -Rf /gluster2/brick3
> mkdir -p /gluster2/brick3/1
> mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
> setfattr -n "user.some-name" -v "some-value" /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
> gluster v start glustershard force
>
> At this point the brick process starts and all visible files, including the new dir, are created on the brick. A handful of shards are still in the heal statistics, but no .shard directory is created and there is no increase in the shard count.
>
> gluster v heal glustershard
>
> At this point there is still no increase in the count and no dir made; no additional healing activity shows up in the logs. I waited a few minutes tailing logs to check whether anything kicked in.
>
> gluster v heal glustershard full
>
> Now shards are added to the list and the heal commences. Logs show a full sweep starting on all 3 nodes, though this time it only shows as finishing on one, which looks to be the one that had its brick deleted.
>
> [2016-08-30 14:45:33.098589] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-0
> [2016-08-30 14:45:33.099492] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-1
> [2016-08-30 14:45:33.100093] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-2
> [2016-08-30 14:52:29.760213] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0: finished full sweep on subvol glustershard-client-2
>
> Just realized it's still healing, so that may be why the sweeps on the 2 other bricks haven't reported as finished yet.
>
> My hope is that later tonight a full heal will work on production. Is it possible the self-heal daemon can go stale or stop listening but still show as active? Would stopping and starting the self-heal daemon from the gluster CLI before doing these heals be helpful?
>
> On Tue, Aug 30, 2016 at 9:29 AM, David Gossage <[email protected]> wrote:
>
>> On Tue, Aug 30, 2016 at 8:52 AM, David Gossage <[email protected]> wrote:
>>
>>> On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <[email protected]> wrote:
>>>
>>>> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <[email protected]> wrote:
>>>>
>>>>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <[email protected]> wrote:
>>>>>
>>>>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <[email protected]> wrote:
>>>>>>
>>>>>>> Could you also share the glustershd logs?
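On the self-heal-daemon question above, a minimal sketch of how its liveness can be checked and how it can be bounced from the CLI without touching the bricks. The volume name glustershard is taken from the setup above; treat the exact commands as illustrative, since output formats vary between releases.

# Each node should show a "Self-heal Daemon" entry that is Online with a PID
gluster volume status glustershard

# Pending-heal counts per brick; if these never change, the daemon may be idle rather than just slow
gluster volume heal glustershard statistics heal-count

# Restart only the self-heal daemon, then re-trigger an index heal
gluster volume set glustershard cluster.self-heal-daemon off
gluster volume set glustershard cluster.self-heal-daemon on
gluster volume heal glustershard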
>>>>>>
>>>>>> I'll get them when I get to work, sure.
>>>>>>
>>>>>>> I tried the same steps that you mentioned multiple times, but heal is running to completion without any issues.
>>>>>>>
>>>>>>> It must be said that 'heal full' traverses the files and directories in a depth-first order and does heals also in the same order. But if it gets interrupted in the middle (say because self-heal-daemon was either intentionally or unintentionally brought offline and then brought back up), self-heal will only pick up the entries that are so far marked as new-entries that need heal, which it will find in the indices/xattrop directory. What this means is that those files and directories that were not visited during the crawl will remain untouched and unhealed in this second iteration of heal, unless you execute a 'heal full' again.
>>>>>>
>>>>>> So should it start healing shards as it crawls, or not until after it crawls the entire .shard directory? At the pace it was going, that could be a week, with one node appearing in the cluster but having no shard files if anything tries to access a file on that node. From my experience the other day, telling it to heal full again did nothing, regardless of the node used.
>>>>>
>>>> Crawl is started from '/' of the volume. Whenever self-heal detects during the crawl that a file or directory is present in some brick(s) and absent in others, it creates the file on the bricks where it is absent and marks the fact that the file or directory might need data/entry and metadata heal too (this also means that an index is created under .glusterfs/indices/xattrop of the src bricks). And the data/entry and metadata heal are picked up and done in the background with the help of these indices.
>>>
>>> Looking at my 3rd node as an example, I find nearly the exact same number of files in the xattrop dir as reported by heal count at the time I brought down node2, which I did to try to alleviate the read IO errors that I was guessing came from attempts to use the node with no shards for reads.
>>>
>>> Also attached are the glustershd logs from the 3 nodes, along with the test node I tried yesterday with the same results.
>>
>> Looking at my own logs I notice that a full sweep was only ever recorded in glustershd.log on the 2nd node, the one with the missing directory. I believe I should have found a sweep begun on every node, correct?
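One way to cross-check both observations above, sketched with the names used in this thread: the volume name GLUSTER1 and brick path /gluster1/BRICK1/1 are the production values mentioned earlier, and the log location assumes the default /var/log/glusterfs.

# On each node, list the full-sweep records written by the self-heal daemon
grep "full sweep" /var/log/glusterfs/glustershd.log

# Rough count of pending-heal indices on one brick, to compare with the heal count
# (the directory also holds a base xattrop-<gfid> entry, so the raw count can be off by one)
ls /gluster1/BRICK1/1/.glusterfs/indices/xattrop | wc -l
gluster volume heal GLUSTER1 statistics heal-count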
>>
>> On my test dev, when it did work, I do see that:
>>
>> [2016-08-30 13:56:25.223333] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-0
>> [2016-08-30 13:56:25.223522] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-1
>> [2016-08-30 13:56:25.224616] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0: starting full sweep on subvol glustershard-client-2
>> [2016-08-30 14:18:48.333740] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0: finished full sweep on subvol glustershard-client-2
>> [2016-08-30 14:18:48.356008] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0: finished full sweep on subvol glustershard-client-1
>> [2016-08-30 14:18:49.637811] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0: finished full sweep on subvol glustershard-client-0
>>
>> While looking at the past few days on the 3 prod nodes, I only found the following, and only on my 2nd node:
>>
>> [2016-08-27 01:26:42.638772] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 11:37:01.732366] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 12:58:34.597228] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 12:59:28.041173] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 20:03:42.560188] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 20:03:44.278274] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 21:00:42.603315] I [MSGID: 108026] [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>> [2016-08-27 21:00:46.148674] I [MSGID: 108026] [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>>
>>>>>>> My suspicion is that this is what happened on your setup. Could you confirm if that was the case?
>>>>>>
>>>>>> The brick was brought online with force start, then a full heal was launched. Hours later, after it became evident that it was not adding new files to heal, I did try restarting the self-heal daemon and relaunching a full heal again. But this was after the heal had basically already failed to work as intended.
>>>>>
>>>>> OK. How did you figure it was not adding any new files? I need to know what places you were monitoring to come to this conclusion.
>>>>>
>>>>> -Krutika
>>>>>
>>>>>>> As for those logs, I did manage to do something that caused the warning messages you shared earlier to appear in my client and server logs.
>>>>>>> Although these logs are annoying and a bit scary too, they didn't do any harm to the data in my volume. Why they appear just after a brick is replaced and under no other circumstances is something I'm still investigating.
>>>>>>>
>>>>>>> But for the future, it would be good to follow the steps Anuradha gave, as that would allow self-heal to at least detect that it has some repairing to do whenever it is restarted, whether intentionally or otherwise.
>>>>>>
>>>>>> I followed those steps as described on my test box and ended up with the exact same outcome: shards added at an agonizingly slow pace, and no creation of the .shard directory or heals on the shard directory. Directories visible from the mount healed quickly. This was with one VM, so it only has 800 shards as well. After hours at work it had added a total of 33 shards to be healed. I sent those logs yesterday as well, though not the glustershd logs.
>>>>>>
>>>>>> Does the replace-brick command copy files in the same manner? For these purposes I am contemplating just skipping the heal route.
>>>>>>
>>>>>>> -Krutika
>>>>>>>
>>>>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <[email protected]> wrote:
>>>>>>>
>>>>>>>> attached brick and client logs from the test machine where the same behavior occurred; not sure if anything new is there. It's still on 3.8.2.
>>>>>>>>
>>>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>>>> Options Reconfigured:
>>>>>>>> cluster.locking-scheme: granular
>>>>>>>> performance.strict-o-direct: off
>>>>>>>> features.shard-block-size: 64MB
>>>>>>>> features.shard: on
>>>>>>>> server.allow-insecure: on
>>>>>>>> storage.owner-uid: 36
>>>>>>>> storage.owner-gid: 36
>>>>>>>> cluster.server-quorum-type: server
>>>>>>>> cluster.quorum-type: auto
>>>>>>>> network.remote-dio: on
>>>>>>>> cluster.eager-lock: enable
>>>>>>>> performance.stat-prefetch: off
>>>>>>>> performance.io-cache: off
>>>>>>>> performance.quick-read: off
>>>>>>>> cluster.self-heal-window-size: 1024
>>>>>>>> cluster.background-self-heal-count: 16
>>>>>>>> nfs.enable-ino32: off
>>>>>>>> nfs.addr-namelookup: off
>>>>>>>> nfs.disable: on
>>>>>>>> performance.read-ahead: off
>>>>>>>> performance.readdir-ahead: on
>>>>>>>> cluster.granular-entry-heal: on
>>>>>>>>
>>>>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>> > From: "David Gossage" <[email protected]>
>>>>>>>>>> > To: "Anuradha Talur" <[email protected]>
>>>>>>>>>> > Cc: "[email protected] List" <[email protected]>, "Krutika Dhananjay" <[email protected]>
>>>>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>> >
>>>>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <[email protected]> wrote:
>>>>>>>>>> >
>>>>>>>>>> > > Response inline.
>>>>>>>>>> > >
>>>>>>>>>> > > ----- Original Message -----
>>>>>>>>>> > > > From: "Krutika Dhananjay" <[email protected]>
>>>>>>>>>> > > > To: "David Gossage" <[email protected]>
>>>>>>>>>> > > > Cc: "[email protected] List" <[email protected]>
>>>>>>>>>> > > > Sent: Monday, August 29, 2016 3:55:04 PM
>>>>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>> > > >
>>>>>>>>>> > > > Could you attach both client and brick logs? Meanwhile I will try these steps out on my machines and see if it is easily recreatable.
>>>>>>>>>> > > >
>>>>>>>>>> > > > -Krutika
>>>>>>>>>> > > >
>>>>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <[email protected]> wrote:
>>>>>>>>>> > > >
>>>>>>>>>> > > > Centos 7 Gluster 3.8.3
>>>>>>>>>> > > >
>>>>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>>>>> > > > Options Reconfigured:
>>>>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>>>>> > > > features.shard: on
>>>>>>>>>> > > > performance.readdir-ahead: on
>>>>>>>>>> > > > storage.owner-uid: 36
>>>>>>>>>> > > > storage.owner-gid: 36
>>>>>>>>>> > > > performance.quick-read: off
>>>>>>>>>> > > > performance.read-ahead: off
>>>>>>>>>> > > > performance.io-cache: off
>>>>>>>>>> > > > performance.stat-prefetch: on
>>>>>>>>>> > > > cluster.eager-lock: enable
>>>>>>>>>> > > > network.remote-dio: enable
>>>>>>>>>> > > > cluster.quorum-type: auto
>>>>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>>>>> > > > server.allow-insecure: on
>>>>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>>>>> > > > nfs.disable: on
>>>>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>>>>> > > > nfs.enable-ino32: off
>>>>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>>>>> > > >
>>>>>>>>>> > > > Friday I did a rolling upgrade from 3.8.3->3.8.3 with no issues. Following the steps detailed in previous recommendations, I began the process of replacing and healing bricks one node at a time:
>>>>>>>>>> > > >
>>>>>>>>>> > > > 1) kill pid of brick
>>>>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>>>>> > > > 3) recreate directory of brick
>>>>>>>>>> > > > 4) gluster volume start <> force
>>>>>>>>>> > > > 5) gluster volume heal <> full
>>>>>>>>>> > >
>>>>>>>>>> > > Hi,
>>>>>>>>>> > >
>>>>>>>>>> > > I'd suggest that full heal is not used. There are a few bugs in full heal. Better safe than sorry ;)
>>>>>>>>>> > > Instead I'd suggest the following steps:
>>>>>>>>>> > >
>>>>>>>>>> > Currently I brought the node down by systemctl stop glusterd, as I was getting sporadic IO issues and a few VMs paused, so I'm hoping that will help. I may wait to do this till around 4 PM when most work is done, in case it shoots the load up.
>>>>>>>>>> >
>>>>>>>>>> > > 1) kill pid of brick
>>>>>>>>>> > > 2) do the reconfiguring of the brick that you need
>>>>>>>>>> > > 3) recreate brick dir
>>>>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>>>>> > >    a) create a dummy non-existent dir under / of mount.
>>>>>>>>>> >
>>>>>>>>>> > So if node 2 is the down brick, do I pick a node, for example 3, and make a test dir under its brick directory that doesn't exist on 2, or should I be doing this over a gluster mount?
>>>>>>>>>> You should be doing this over the gluster mount.
>>>>>>>>>> >
>>>>>>>>>> > >    b) set a non-existent extended attribute on / of mount.
>>>>>>>>>> >
>>>>>>>>>> > Could you give me an example of an attribute to set? I've read a tad on this, and looked up attributes, but haven't set any yet myself.
>>>>>>>>>> >
>>>>>>>>>> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>>>>>>>>>> > > Doing these steps will ensure that heal happens only from the updated brick to the down brick.
>>>>>>>>>> > > 5) gluster v start <> force
>>>>>>>>>> > > 6) gluster v heal <>
>>>>>>>>>> >
>>>>>>>>>> > Will it matter if somewhere in gluster the full heal command was run the other day? Not sure if it eventually stops or times out.
>>>>>>>>>> >
>>>>>>>>>> full heal will stop once the crawl is done. So if you want to trigger heal again, run gluster v heal <>. Actually even brick up or volume start force should trigger the heal.
>>>>>>>>>
>>>>>>>>> Did this on the test bed today. It's one server with 3 bricks on the same machine, so take that for what it's worth. Also it still runs 3.8.2. Maybe I'll update and re-run the test.
>>>>>>>>>
>>>>>>>>> killed brick
>>>>>>>>> deleted brick dir
>>>>>>>>> recreated brick dir
>>>>>>>>> created fake dir on gluster mount
>>>>>>>>> set suggested fake attribute on it
>>>>>>>>> ran volume start <> force
>>>>>>>>>
>>>>>>>>> looked at the files it said needed healing, and it was just the 8 shards that were modified in the few minutes while I ran through the steps
>>>>>>>>>
>>>>>>>>> gave it a few minutes and it stayed the same
>>>>>>>>> ran gluster volume <> heal
>>>>>>>>>
>>>>>>>>> it healed all the directories and files you can see over the mount, including fakedir.
>>>>>>>>>
>>>>>>>>> same issue for shards though. It adds more shards to heal at a glacial pace. Slight jump in speed if I stat every file and dir in the running VM, but not all shards.
>>>>>>>>>
>>>>>>>>> It started with 8 shards to heal and is now only at 33 out of 800, and probably won't finish adding for a few days at the rate it goes.
>>>>>>>>>
>>>>>>>>>> > >
>>>>>>>>>> > > > 1st node worked as expected, took 12 hours to heal 1TB of data. Load was a little heavy but nothing shocking.
>>>>>>>>>> > > >
>>>>>>>>>> > > > About an hour after node 1 finished I began the same process on node 2. The heal process kicked in as before and the files in directories visible from the mount and .glusterfs healed in a short time. Then it began the crawl of .shard, adding those files to the heal count, at which point the entire process ground to a halt, basically.
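Put together, the brick-reset sequence suggested above would look roughly like the following on the affected node. This is a sketch only: the brick path, mount path, and volume name are the test values used earlier in the thread, and the brick PID is a placeholder taken from gluster volume status.

kill -15 <brick-pid>    # 1) stop just this brick; PID from "gluster volume status glustershard"
rm -rf /gluster2/brick3    # 2)-3) reconfigure the underlying storage as needed, then recreate an empty brick dir
mkdir -p /gluster2/brick3/1
# 4) while the brick is down, create a dummy dir and a dummy xattr on / of a client mount
mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
setfattr -n "user.some-name" -v "some-value" /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard
# 5)-6) bring the brick back and trigger an index heal (not "heal full")
gluster volume start glustershard force
gluster volume heal glustershard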
>>>>>>>>>> > > > After 48 hours, out of 19k shards it has added 5900 to the heal list. Load on all 3 machines is negligible. It was suggested to change cluster.data-self-heal-algorithm to full and restart the volume, which I did. No effect. Tried relaunching the heal, no effect, regardless of which node was picked. I started each VM and performed a stat of all files from within it, or a full virus scan, and that seemed to cause short small spikes in shards added, but not by much. Logs are showing no real messages indicating anything is going on. I get hits in the brick log on occasion of null lookups, making me think it's not really crawling the shards directory but waiting for a shard lookup to add it. I'll get the following in the brick log, but not constantly, and sometimes multiple entries for the same shard.
>>>>>>>>>> > > >
>>>>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009] [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type for (null) (LOOKUP)
>>>>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221) ==> (Invalid argument) [Invalid argument]
>>>>>>>>>> > > >
>>>>>>>>>> > > > This one repeated about 30 times in a row, then nothing for 10 minutes, then one hit for a different shard by itself.
>>>>>>>>>> > > >
>>>>>>>>>> > > > How can I determine if heal is actually running? How can I kill it or force a restart? Does the node I start it from determine which directory gets crawled to determine heals?
>>>>>>>>>> > > >
>>>>>>>>>> > > > David Gossage
>>>>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>>>>> > > > Office 708.613.2284
>>>>>>>>>> > >
>>>>>>>>>> > > --
>>>>>>>>>> > > Thanks,
>>>>>>>>>> > > Anuradha.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks,
>>>>>>>>>> Anuradha.
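On the question of telling whether a heal is actually running, a rough sketch of what the CLI already exposes; the volume name GLUSTER1 is taken from the logs above, and output formats differ slightly between releases.

# Per-sweep crawl statistics: type of crawl (INDEX/FULL), start/end times, entries healed or failed
gluster volume heal GLUSTER1 statistics

# Entries currently pending heal, and a quicker count-only variant
gluster volume heal GLUSTER1 info
gluster volume heal GLUSTER1 statistics heal-count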
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
