Re: Requesting advice on Fuseki memory settings

Andy Seaborne Mon, 25 Mar 2024 06:49:08 -0700



On 25/03/2024 07:05, Gaspar Bartalus wrote:

Dear Andy and co.,

Thanks for the support, I think we can close this thread for now.
We will continue to monitor this behaviour and if we can retrieve any
additional useful information then we might reopen it.

Please do pass on any information and techniques for operating
Fuseki/TDB. There is so much variety "out there" that all reports are
helpful.


    Andy


Best regards,
Gaspar

On Sun, Mar 24, 2024 at 5:00 PM Andy Seaborne <[email protected]> wrote:



On 21/03/2024 09:52, Rob @ DNR wrote:

Gaspar

This probably relates to https://access.redhat.com/solutions/2316

Deleting a file removes it from the file table but doesn’t immediately
free the space if a process is still accessing those files.  That could be
something else inside the container, or in a containerised environment
where the disk space is mounted that could potentially be host processes on
the K8S node that are monitoring the storage.

There are some suggested debugging steps in the RedHat article about ways
to figure out what processes might still be holding onto the old database
files.
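
One way to look for such processes on Linux is to scan /proc for open file descriptors whose target has been unlinked. The following is a minimal sketch, not taken from the RedHat article: the function names are hypothetical, it relies on the kernel appending " (deleted)" to the link target of an unlinked file, and it only sees processes the current user is allowed to inspect (run it as root inside the container or on the node to see everything).

```python
import os

DELETED_SUFFIX = " (deleted)"


def is_deleted_target(target: str) -> bool:
    """True if a /proc/<pid>/fd link target refers to an unlinked file."""
    return target.endswith(DELETED_SUFFIX)


def deleted_open_files(proc_root: str = "/proc"):
    """Yield (pid, fd, path) for each open descriptor whose file was deleted."""
    for pid in sorted(os.listdir(proc_root)):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = sorted(os.listdir(fd_dir))
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were scanning
            if is_deleted_target(target):
                yield int(pid), int(fd), target[: -len(DELETED_SUFFIX)]


if __name__ == "__main__":
    for pid, fd, path in deleted_open_files():
        print(f"pid={pid} fd={fd} {path}")
```

Any hit under the old Data-NNNN directory would point at the process that is keeping the space allocated.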

Rob


Fuseki does close the database connections after compact but only after
all read transactions on the old database have completed. That can hold
the database open for a while.

Another delay is the ext4 filing system. Deletes will be in the journal
and only when the journal operations are performed will the file system
be released. Usually this happens quickly, but I've seen it take an
appreciable length of time occasionally.

Gaspar wrote:
  > then we start fresh where du -sh and df -h return the same numbers.

This indicates the file space has been released. Restarting clears any
outstanding read transactions and likely gives the ext4 journal time to
run through.

Just about any layer (K8s, VMs) adds delays to real release of the space
but it should happen eventually.

      Andy

From: Gaspar Bartalus <[email protected]>
Date: Wednesday, 20 March 2024 at 11:41
To: [email protected] <[email protected]>
Subject: Re: Requesting advice on Fuseki memory settings
Hi Andy

On Sat, Mar 16, 2024 at 8:58 PM Andy Seaborne <[email protected]> wrote:



On 12/03/2024 13:17, Gaspar Bartalus wrote:

On Mon, Mar 11, 2024 at 6:28 PM Andy Seaborne <[email protected]> wrote:


On 11/03/2024 14:35, Gaspar Bartalus wrote:

Hi Andy,

On Fri, Mar 8, 2024 at 4:41 PM Andy Seaborne <[email protected]> wrote:


On 08/03/2024 10:40, Gaspar Bartalus wrote:

Hi,

Thanks for the responses.

We were actually curious if you'd have some explanation for the
linear increase in the storage, and why we are seeing differences
between the actual size of our dataset and the size it uses on disk.
(Changes between `df -h` and `du -lh`)?

Linear increase between compactions or across compactions? The latter
sounds like the previous version hasn't been deleted.

Across compactions, increasing linearly over several days, with
compactions running every day. The compaction is used with the
"deleteOld" parameter, and there is only one Data- folder in the volume,
so I assume compaction itself works as expected.
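
For readers following along, a daily compaction with "deleteOld" can be requested over HTTP. This is a minimal sketch assuming the Fuseki admin endpoint `/$/compact/{name}` with the `deleteOld=true` query parameter; the server URL and dataset name in the usage line are hypothetical, authentication is left out, and nothing here is taken from the setup described in this thread.

```python
import urllib.parse
import urllib.request


def compact_url(server: str, dataset: str, delete_old: bool = True) -> str:
    """Build the admin URL that asks Fuseki to compact one dataset."""
    name = urllib.parse.quote(dataset.strip("/"), safe="")
    query = "?deleteOld=true" if delete_old else ""
    return f"{server.rstrip('/')}/$/compact/{name}{query}"


def request_compaction(server: str, dataset: str, delete_old: bool = True) -> str:
    """POST the compaction request and return the raw response body."""
    req = urllib.request.Request(
        compact_url(server, dataset, delete_old), data=b"", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical server and dataset name; adjust to the real deployment.
    print(request_compaction("http://localhost:3030", "ds"))
```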

Strange - I can't explain that. Could you check that there is only one
Data-NNNN directory inside the database directory?

Yes, there is surely just one Data-NNNN folder in the database directory.

What's the disk storage setup? e.g. filesystem type.

We have an Azure disk of type Standard SSD LRS with a filesystem of type
Ext4.


Hi Gaspar,

I still can't explain what you're seeing, I'm afraid.

Can we get some more details?

When the server has Data-N -- how big (as reported by 'du -sh') is that
directory and how big is the whole directory for the database? They
should be nearly equal.

When a compaction is done, and the server is at Data-(N+1), what are the
sizes of Data-(N+1) and the database directory?
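
The two measurements asked for above can be collected in one pass. This is a rough sketch that approximates `du -s` by summing allocated 512-byte blocks (counting each inode once); the function names and the database path in the usage line are hypothetical, and it assumes a Unix-like system where `st_blocks` is available.

```python
import os
import re

DATA_DIR = re.compile(r"^Data-\d+$")


def disk_usage_bytes(path: str) -> int:
    """Approximate `du -s`: allocated bytes under path, each inode counted once."""
    seen = set()
    total = os.lstat(path).st_blocks * 512
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # hard link already counted
            seen.add(key)
            total += st.st_blocks * 512
    return total


def data_directories(db_dir: str) -> list:
    """Names of the Data-NNNN directories inside a database directory."""
    return sorted(
        name
        for name in os.listdir(db_dir)
        if DATA_DIR.match(name) and os.path.isdir(os.path.join(db_dir, name))
    )


def report(db_dir: str) -> dict:
    """Size of each Data-NNNN directory plus the whole database directory."""
    sizes = {n: disk_usage_bytes(os.path.join(db_dir, n)) for n in data_directories(db_dir)}
    sizes["(whole database directory)"] = disk_usage_bytes(db_dir)
    return sizes


if __name__ == "__main__":
    # Hypothetical path; point this at the real database directory.
    for name, size in report("/fuseki/databases/ds").items():
        print(f"{size / 1e6:10.1f} MB  {name}")
```

More than one Data-NNNN entry in the output after a compaction would mean the old generation was not deleted.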


What we see with respect to compaction is usually the following:
- We start with the Data-N folder of ~210MB
- After compaction we have a Data-(N+1) folder of size ~185MB, the old
Data-N being deleted.
- The sizes of the database directory and the Data-* directory are equal.

However when we check with df -h we sometimes see that volume usage is
not dropping, but on the contrary, it goes up ~140MB after each compaction.
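
The gap between the two views can be tracked over time. This is a small sketch assuming the database is the only significant content on the volume; the function names and the mount point are hypothetical, and `du_bytes` is whatever `du` (or an equivalent walk of the database directory) reports. Filesystem overhead means the gap is never exactly zero, so what matters is whether it grows after each compaction.

```python
import shutil


def gap_bytes(used_bytes: int, du_bytes: int) -> int:
    """Space the filesystem counts as used that no visible file accounts for."""
    return used_bytes - du_bytes


def unaccounted_bytes(mount_point: str, du_bytes: int) -> int:
    """Compare the filesystem's own 'used' figure (the df view) with a du total."""
    used = shutil.disk_usage(mount_point).used
    return gap_bytes(used, du_bytes)


if __name__ == "__main__":
    # Hypothetical mount point and du total (in bytes); substitute real values.
    print(unaccounted_bytes("/fuseki", 185 * 1000 * 1000))
```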


Does stop/starting the server change those numbers?


Yes, then we start fresh where du -sh and df -h return the same numbers.


       Andy
