Additional information: after the volume was 100% full, I deleted some of the files, but not the files which are listed in heal info. When it was at 98%, I deleted the folder which was marked as to be healed: /archive1/data/fff
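For reference, a sketch of the volume stop/start cycle referred to below (the --mode=script flag suppresses gluster's interactive y/n confirmation; this is an illustrative transcript run on one of the cluster nodes, not the original one):

fs-davids-c2-n1:~ # gluster --mode=script volume stop archive1
fs-davids-c2-n1:~ # gluster --mode=script volume start archive1
fs-davids-c2-n1:~ # gluster volume heal archive1 info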
After stopping and starting the volume, the files in /archive1/data/fff were still there.

Regards
David Spisla

On Mon, 24 Jun 2019 at 15:33, David Spisla <[email protected]> wrote:

> Hello Ravi and Gluster Community,
>
> On Mon, 24 Jun 2019 at 14:25, David Spisla <[email protected]> wrote:
>
>> ---------- Forwarded message ---------
>> From: David Spisla <[email protected]>
>> Date: Fri, 21 Jun 2019 at 10:02
>> Subject: Re: [Gluster-users] Pending heal status when deleting files which are marked as to be healed
>> To: Ravishankar N <[email protected]>
>>
>> Hello Ravi,
>>
>> On Wed, 19 Jun 2019 at 18:06, Ravishankar N <[email protected]> wrote:
>>
>>> On 17/06/19 3:45 PM, David Spisla wrote:
>>>
>>> Hello Gluster Community,
>>>
>>> my newest observation concerns the self-heal daemon:
>>> Scenario: 2-node Gluster v5.5 cluster with a Replica 2 volume, just one brick per node. Access via SMB client from a Win10 machine.
>>>
>>> How to reproduce:
>>> I created a small folder with a lot of small files and copied that folder recursively into itself a few times. Additionally, I copied three big folders with a lot of content into the root of the volume.
>>> Note: No node was down, nor a brick or anything similar, so the whole volume was accessible.
>>>
>>> Because of the recursive copy action, all of these copied files were listed as to be healed (via gluster heal info).
>>>
>>> This is odd. How did you conclude that writing to the volume (i.e. the recursive copy) was the reason for the files needing heal? Did you check if there were any gluster messages about disconnects in the smb client logs?
>>>
>> There was no disconnection, I am sure. But overall I am not really sure what the cause of this problem is.
>>
> I reproduced it. Now I don't think that the recursive copy is the reason. I copied several small files into the volume (capacity 1GB) until it was full (see steps to reproduce below). I didn't set the files to read-only. There was never a disconnection.
>
>>> Now I set some of the affected files read-only (they get WORMed because worm-file-level is enabled). After this I tried to delete the parent folder of those files.
>>>
>>> Expected: All files should be healed.
>>> Actual: All files which are read-only are not healed. heal info permanently shows that these files have to be healed.
>>>
>>> Does disabling read-only let the files be healed?
>>>
>> I have to try this.
>>
> I tried it out and it had no effect.
>
>>> The glustershd log throws errors and the brick log (with level DEBUG) constantly emits a lot of messages which I don't understand. See the attached file, which contains all information (also heal info and volume info) besides the logs.
>>>
>>> Maybe some of you know what's going on there? Since we can reproduce this scenario, we can give more debug information if needed.
>>>
>>> Is it possible to script the list of steps to reproduce this issue?
>>>
>> I will do that and post it here, although I will collect more data when it happens.
>>
> Steps to reproduce (a scripted sketch follows the list):
>
> 1. Copy several small files into a volume (here: 1GB capacity).
> 2. Copy until the volume is nearly full (70-80% or more).
> 3. Now self-heal is listing files to be healed.
> 4. Move or delete all of these files, or just a part of them.
> 5. The files won't be healed and stay in the heal info list.
>
> In my case I copied until the volume was 100% full (storage.reserve was 1%).
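> A minimal scripted sketch of these steps (assumptions: the volume is
> FUSE-mounted locally at /mnt/archive1, which is an illustrative path;
> my own tests used SMB access from a Win10 machine instead):
>
> #!/bin/bash
> # Sketch only, assuming a 1GB volume named "archive1" mounted locally.
> MOUNT=/mnt/archive1   # hypothetical mount point
> VOLUME=archive1
>
> # Steps 1+2: write many small files until the volume is ~80% full.
> i=0
> while [ "$(df --output=pcent "$MOUNT" | tail -n 1 | tr -dc '0-9')" -lt 80 ]; do
>     dd if=/dev/urandom of="$MOUNT/file_$i" bs=64K count=1 2>/dev/null || break
>     i=$((i + 1))
> done
>
> # Step 3: heal info now lists entries although no brick was down.
> gluster volume heal "$VOLUME" info
>
> # Step 4: delete a part of the files.
> rm -f "$MOUNT"/file_1*
>
> # Step 5: the deleted files are expected to stay in the heal info list.
> gluster volume heal "$VOLUME" info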
> I deleted some of the files to get down to a level of 98%. I waited for a while, but nothing happened. After this I stopped and started the volume. The files were then healed.
> Attached is the glustershd.log, where you can see that performing entry.self-heal (2019-06-24 10:04:02.007328) could not be finished for pgfid:7e4fa649-434a-4bb7-a1c2-258818d76076 until the volume was stopped and started again. After starting it again, entry.self-heal could be finished for that pgfid (at 2019-06-24 12:38:38.689632). The pgfid refers to the files which were listed to be healed:
>
> fs-davids-c2-n1:~ # gluster vo heal archive1 info
> Brick fs-davids-c2-n1:/gluster/brick1/glusterbrick
> /archive1/data/fff/gg - Kopie.txt
> /archive1/data/fff
> /archive1/data/fff/gg - Kopie - Kopie.txt
> /archive1/data/fff/gg - Kopie - Kopie (2).txt
> Status: Connected
> Number of entries: 4
>
> All of these files have the same pgfid:
>
> fs-davids-c2-n1:~ # getfattr -e hex -d -m "" '/gluster/brick1/glusterbrick/archive1/data/fff/'* | grep trusted.pgfid
> getfattr: Removing leading '/' from absolute path names
> trusted.pgfid.7e4fa649-434a-4bb7-a1c2-258818d76076=0x00000001
> trusted.pgfid.7e4fa649-434a-4bb7-a1c2-258818d76076=0x00000001
> trusted.pgfid.7e4fa649-434a-4bb7-a1c2-258818d76076=0x00000001
>
> Summary: the pending heal problem seems to occur when a volume is nearly or completely full.
>
> Regards
> David Spisla
>
>> Regards
>> David
>>
>>> Regards,
>>>
>>> Ravi
>>>
>>> Regards
>>> David Spisla
_______________________________________________ Gluster-users mailing list [email protected] https://lists.gluster.org/mailman/listinfo/gluster-users
