[
https://issues.apache.org/jira/browse/JENA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andy Seaborne resolved JENA-1615.
---------------------------------
Resolution: Fixed
Assignee: Andy Seaborne
Fix Version/s: Jena 3.10.0
> Compaction leaks file descriptors
> ---------------------------------
>
> Key: JENA-1615
> URL: https://issues.apache.org/jira/browse/JENA-1615
> Project: Apache Jena
> Issue Type: Bug
> Components: Core, TDB2
> Affects Versions: Jena 3.8.0
> Environment: I reproduced the issue on the following environments:
> * OS / Java:
> ** MacOS 10.13.5, Java 1.8.0_161 (Oracle)
> ** Debian 9.5, Java 1.8.0_181 (OpenJDK)
> * Jena version 3.8.0
> * TDB2 mode: mapped
> Reporter: Damien Obrist
> Assignee: Andy Seaborne
> Priority: Major
> Fix For: Jena 3.10.0
>
> Attachments: open_files_after_compaction_after_gc.png,
> open_files_after_compaction_after_gc_with_fix.png,
> open_files_after_compaction_before_gc.png, open_files_before_compaction.png
>
>
> h3. Context
> I'm using a TDB2 dataset in a long-running Scala application, in which the
> dataset gets compacted regularly. After compactions, the application removes
> the {{Data-xxxx}} folder of the previous generation. However, the
> corresponding disk space isn't properly returned to the OS, but is still
> reported as being used by {{df}}. Indeed, {{lsof}} shows that the application
> keeps open file descriptors that point to the old generation's files. Only
> stopping / restarting the JVM frees the disk space for good.
> h3. Reproduction steps
> * Connect to an existing TDB2 dataset
> {code}val dataset = TDB2Factory.connectDataset("sample"){code}
> * Check open files
> [^open_files_before_compaction.png]
> * Compact the dataset
> {code}DatabaseMgr.compact(dataset.asDatasetGraph){code}
> * Check open files (before garbage collection)
> [^open_files_after_compaction_before_gc.png]
> * Check open files (after garbage collection)
> [^open_files_after_compaction_after_gc.png]
> The last screenshot shows that, even after garbage collection, there are still
> open file descriptors pointing to the old generation {{Data-0001}}.
> h3. Impact
> Depending on how disk usage is being reported, this can be quite problematic.
> In our case, we're running on an OpenShift infrastructure with limited
> storage. After only a handful of compactions, the storage is considered full
> and cannot be used anymore.
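
The "check open files" steps in the quoted report rely on {{lsof}} screenshots. As a rough sketch only, the same check could be approximated from inside the JVM. This assumes a Linux host (such as the Debian environment listed above) where /proc/self/fd exposes one symlink per open descriptor; it does not work on MacOS. The names OpenFiles, openTargets and openIn are hypothetical helpers, not part of Jena, and memory-mapped regions whose descriptor is already closed would not show up here.

```scala
import java.nio.file.{Files, Paths}
import scala.util.Try

object OpenFiles {
  // Assumption: Linux, where /proc/self/fd holds one symlink per open descriptor.
  // On other platforms the directory is missing and an empty list is returned.
  def openTargets(): Seq[String] = {
    val fdDir = Paths.get("/proc/self/fd")
    if (!Files.isDirectory(fdDir)) Seq.empty
    else {
      val stream = Files.newDirectoryStream(fdDir)
      try {
        val it = stream.iterator()
        val buf = Seq.newBuilder[String]
        while (it.hasNext) {
          // A descriptor can be closed between listing and readlink; skip those.
          Try(Files.readSymbolicLink(it.next()).toString).foreach(buf += _)
        }
        buf.result()
      } finally stream.close()
    }
  }

  // Targets that live inside the given generation folder, e.g. "Data-0001".
  // Removed files keep their path with a " (deleted)" suffix, so they still match.
  def openIn(generation: String): Seq[String] =
    openTargets().filter(_.contains(s"/$generation/"))
}
```

Calling {{OpenFiles.openIn("Data-0001").foreach(println)}} after a compaction and a garbage collection would list any descriptors still pointing at the old generation; an empty result would suggest the descriptors were released.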
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)