[ https://issues.apache.org/jira/browse/JENA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643243#comment-16643243 ]
ASF GitHub Bot commented on JENA-1615:
--------------------------------------

GitHub user dobrist opened a pull request:

    https://github.com/apache/jena/pull/481

    JENA-1615 - Compaction leaks file descriptors

    Close file channel when closing a block to release open file descriptors

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dobrist/jena JENA-1615

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/481.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #481

----
commit 199d4f267a70627b6a79f5a3081b37ca88224921
Author: damienobrist <damien@...>
Date:   2018-10-09T12:12:43Z

    JENA-1615: Close file channel when closing a block

    This is necessary to release open file descriptors

----


> Compaction leaks file descriptors
> ---------------------------------
>
>                 Key: JENA-1615
>                 URL: https://issues.apache.org/jira/browse/JENA-1615
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Core, TDB2
>    Affects Versions: Jena 3.8.0
>         Environment: I reproduced the issue on the following environments:
> * OS / Java:
> ** MacOS 10.13.5
> Java 1.8.0_161 (Oracle)
> ** Debian 9.5
> Java 1.8.0_181 (OpenJDK)
> * Jena version 3.8.0
> * TDB2 mode: mapped
>            Reporter: Damien Obrist
>            Priority: Major
>         Attachments: open_files_after_compaction_after_gc.png,
> open_files_after_compaction_before_gc.png, open_files_before_compaction.png
>
>
> h3. Context
> I'm using a TDB2 dataset in a long-running Scala application, in which the
> dataset gets compacted regularly. After compactions, the application removes
> the {{Data-xxxx}} folder of the previous generation. However, the
> corresponding disk space isn't properly returned back to the OS, but is still
> reported as being used by {{df}}. Indeed, {{lsof}} shows that the application
> keeps open file descriptors that point to the old generation's files.
> Only stopping / restarting the JVM frees the disk space for good.
> h3. Reproduction steps
> * Connect to an existing TDB2 dataset
> {code}
> val dataset = TDB2Factory.connectDataset("sample"){code}
> * Check open files
> [^open_files_before_compaction.png]
> * Compact the dataset
> {code}DatabaseMgr.compact(dataset.asDatasetGraph){code}
> * Check open files (before garbage collection)
> [^open_files_after_compaction_before_gc.png]
> * Check open files (after garbage collection)
> [^open_files_after_compaction_after_gc.png]
> The last screenshot shows that, even after garbage collection, there are still
> open file descriptors pointing to the old generation {{Data-0001}}.
> h3. Impact
> Depending on how disk usage is being reported, this can be quite problematic.
> In our case, we're running on an OpenShift infrastructure with limited
> storage. After only a handful of compactions, the storage is considered full
> and cannot be used anymore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
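
For illustration only, here is a minimal sketch of the idea behind the fix described in the pull request: an object that owns a FileChannel should close that channel in its own close(), otherwise the JVM keeps the OS file descriptor open. The class name BlockFileSketch and its methods are hypothetical and are not taken from the Jena code base; only java.nio calls from the standard library are used. This sketch does not cover memory-mapped buffers, whose mappings in Java stay valid independently of the channel that created them.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a block file; NOT Jena's actual class.
class BlockFileSketch implements AutoCloseable {
    private final FileChannel channel;

    public BlockFileSketch(Path path) throws IOException {
        // Opening the channel makes the JVM hold one OS file descriptor.
        this.channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
    }

    public void write(byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        // A single write() may be partial, so loop until the buffer is drained.
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    public boolean isOpen() {
        return channel.isOpen();
    }

    @Override
    public void close() throws IOException {
        // Closing the channel releases the descriptor. If this call is
        // skipped, the descriptor stays open for as long as the channel
        // object is reachable.
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("block-sketch", ".dat");
        try (BlockFileSketch block = new BlockFileSketch(file)) {
            block.write(new byte[] {1, 2, 3});
            System.out.println("open while in use: " + block.isOpen());
        }
        // The channel is closed here, so the file can be removed cleanly.
        Files.deleteIfExists(file);
    }
}
```

Running lsof against the process before and after the try-with-resources block should show the descriptor for the temporary file appearing and then disappearing, which is the behaviour the issue reports as missing for the old generation's files.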