Gaspar

This probably relates to https://access.redhat.com/solutions/2316

Deleting a file removes its directory entry, but the space isn't freed while any process still has the file open. That process could be something else inside the container or, in a containerised environment where the disk is mounted, host processes on the K8s node that are monitoring the storage.

The Red Hat article suggests some debugging steps for figuring out which processes might still be holding on to the old database files.

Rob

From: Gaspar Bartalus <[email protected]>
Date: Wednesday, 20 March 2024 at 11:41
To: [email protected] <[email protected]>
Subject: Re: Requesting advice on Fuseki memory settings

Hi Andy

On Sat, Mar 16, 2024 at 8:58 PM Andy Seaborne <[email protected]> wrote:

>
>
> On 12/03/2024 13:17, Gaspar Bartalus wrote:
> > On Mon, Mar 11, 2024 at 6:28 PM Andy Seaborne<[email protected]> wrote:
> >>
> >> On 11/03/2024 14:35, Gaspar Bartalus wrote:
> >>> Hi Andy,
> >>>
> >>> On Fri, Mar 8, 2024 at 4:41 PM Andy Seaborne<[email protected]> wrote:
> >>>
> >>>>
> >>>> On 08/03/2024 10:40, Gaspar Bartalus wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Thanks for the responses.
> >>>>>
> >>>>> We were actually curious if you'd have some explanation for the
> >>>>> linear increase in the storage, and why we are seeing differences
> >>>>> between the actual size of our dataset and the size it uses on
> >>>>> disk. (Changes between `df -h` and `du -lh`)?
> >>>> Linear increase between compactions or across compactions? The latter
> >>>> sounds like the previous version hasn't been deleted.
> >>>>
> >>> Across compactions, increasing linearly over several days, with
> >>> compactions running every day. The compaction is used with the
> >>> "deleteOld" parameter, and there is only one Data- folder in the
> >>> volume, so I assume compaction itself works as expected.
> >
> >> Strange - I can't explain that. Could you check that there is only one
> >> Data-NNNN directory inside the database directory?
> >>
> > Yes, there is surely just one Data-NNNN folder in the database directory.
> >
> >> What's the disk storage setup? e.g. filesystem type.
> >>
> > We have an Azure disk of type Standard SSD LRS with a filesystem of type
> > Ext4.
>
> Hi Gaspar,
>
> I still can't explain what you're seeing, I'm afraid.
>
> Can we get some more details?
>
> When the server has Data-N -- how big (as reported by 'du -sh') is that
> directory and how big is the whole directory for the database? They
> should be nearly equal.
>
> When a compaction is done, and the server is at Data-(N+1), what are the
> sizes of Data-(N+1) and the database directory?
>

What we see with respect to compaction is usually the following:
- We start with the Data-N folder of ~210MB.
- After compaction we have a Data-(N+1) folder of size ~185MB, the old Data-N being deleted.
- The sizes of the database directory and the Data-* directory are equal.

However, when we check with df -h we sometimes see that volume usage is not dropping; on the contrary, it goes up ~140MB after each compaction.

>
> Does stop/starting the server change those numbers?
>

Yes, then we start fresh where du -sh and df -h return the same numbers.

>
> Andy
>
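
For anyone wanting to check the du/df discrepancy discussed in this thread, here is a rough Python sketch. It is only an illustration under stated assumptions, not a procedure taken from the Red Hat article or from Fuseki. The function names `unlinked_but_open_demo` and `deleted_open_files` are made up for this example. The second function assumes a Linux `/proc` filesystem, where the symlinks under `/proc/<pid>/fd` for unlinked files end in " (deleted)".

```python
import os
import tempfile


def unlinked_but_open_demo(size=1024 * 1024):
    """Unlink a file while it is still open and report what the OS sees."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
        f.flush()
        os.unlink(path)               # directory entry gone: du no longer counts the file
        info = os.fstat(f.fileno())   # the open descriptor still refers to the data
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,
            "size_still_held": info.st_size,
        }
    # leaving the with-block closes the descriptor; only then can the space be released


def deleted_open_files(pid="self"):
    """List (fd, target, size) for unlinked files a process still holds open.

    Assumption: Linux only. Reading another process's fd directory needs
    suitable permissions (same user or root).
    """
    fd_dir = f"/proc/{pid}/fd"
    found = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)
            if target.endswith(" (deleted)"):
                # stat follows the link to the still-open file
                found.append((int(name), target, os.stat(link).st_size))
        except OSError:
            continue  # descriptor closed during the scan, or not readable
    return found


if __name__ == "__main__":
    print(unlinked_but_open_demo())
    if os.path.isdir("/proc/self/fd"):
        print(deleted_open_files())
```

To inspect the Fuseki server rather than the script itself, pass the server's process id, for example `deleted_open_files(12345)` with a placeholder pid. Entries pointing at files under an old Data-N directory would be consistent with the behaviour described above, where the space only comes back after a restart.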
