[ 
https://issues.apache.org/jira/browse/JENA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643243#comment-16643243
 ] 

ASF GitHub Bot commented on JENA-1615:
--------------------------------------

GitHub user dobrist opened a pull request:

    https://github.com/apache/jena/pull/481

    JENA-1615 - Compaction leaks file descriptors

    Close file channel when closing a block to release open file descriptors

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dobrist/jena JENA-1615

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/481.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #481
    
----
commit 199d4f267a70627b6a79f5a3081b37ca88224921
Author: damienobrist <damien@...>
Date:   2018-10-09T12:12:43Z

    JENA-1615: Close file channel when closing a block
    
    This is necessary to release open file descriptors

----
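
The idea behind the commit message can be sketched with plain JDK classes. This is a minimal illustration, not the code from the pull request: `BlockFileDemo` and `BlockFile` are hypothetical names that do not exist in Jena. It only shows that an object holding a `FileChannel` should close that channel in its own `close()`, because the channel is what keeps the OS file descriptor open.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: BlockFile is NOT a Jena class.
final class BlockFileDemo {

    /** Owns one FileChannel, and with it one OS file descriptor. */
    static final class BlockFile implements AutoCloseable {
        private final FileChannel channel;

        private BlockFile(FileChannel channel) {
            this.channel = channel;
        }

        static BlockFile open(Path path) throws IOException {
            return new BlockFile(FileChannel.open(path,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE));
        }

        boolean isOpen() {
            return channel.isOpen();
        }

        /** Closing the channel releases the underlying file descriptor. */
        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("block", ".dat");
        // try-with-resources guarantees close() runs, even on exceptions.
        try (BlockFile block = BlockFile.open(path)) {
            System.out.println("open inside try: " + block.isOpen());
        }
        Files.delete(path);
    }
}
```

If `close()` were left empty, the descriptor would stay open for as long as the channel object remained reachable, which matches the symptom described in the issue below.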


> Compaction leaks file descriptors
> ---------------------------------
>
>                 Key: JENA-1615
>                 URL: https://issues.apache.org/jira/browse/JENA-1615
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Core, TDB2
>    Affects Versions: Jena 3.8.0
>         Environment: I reproduced the issue on the following environments:
>  * OS / Java:
>  ** MacOS 10.13.5, Java 1.8.0_161 (Oracle)
>  ** Debian 9.5, Java 1.8.0_181 (OpenJDK)
>  * Jena version 3.8.0
>  * TDB2 mode: mapped
>            Reporter: Damien Obrist
>            Priority: Major
>         Attachments: open_files_after_compaction_after_gc.png, 
> open_files_after_compaction_before_gc.png, open_files_before_compaction.png
>
>
> h3. Context
> I'm using a TDB2 dataset in a long-running Scala application, in which the 
> dataset gets compacted regularly. After compactions, the application removes 
> the {{Data-xxxx}} folder of the previous generation. However, the 
> corresponding disk space isn't returned to the OS; {{df}} still reports it 
> as used. Indeed, {{lsof}} shows that the application 
> keeps open file descriptors that point to the old generation's files. Only 
> stopping / restarting the JVM frees the disk space for good.
> h3. Reproduction steps
>  * Connect to an existing TDB2 dataset
> {code}
> val dataset = TDB2Factory.connectDataset("sample")
> {code}
>  * Check open files
>   [^open_files_before_compaction.png]
>  * Compact the dataset
>   {code}DatabaseMgr.compact(dataset.asDatasetGraph){code}
>  * Check open files (before garbage collection)
>  [^open_files_after_compaction_before_gc.png]
>  * Check open files (after garbage collection)
>  [^open_files_after_compaction_after_gc.png]
> The last screenshot shows that, even after garbage collection, there are still 
> open file descriptors pointing to the old generation {{Data-0001}}.
> h3. Impact
> Depending on how disk usage is being reported, this can be quite problematic. 
> In our case, we're running on an OpenShift infrastructure with limited 
> storage. After only a handful of compactions, the storage is considered full 
> and cannot be used anymore.
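
The "check open files" steps in the issue above rely on {{lsof}} screenshots. On Linux the same check can be made from inside the JVM by reading `/proc/self/fd`. The following is a rough sketch under that assumption; `OpenFiles` and `matching` are hypothetical helper names, not part of Jena, and the method simply returns an empty list on systems without `/proc` (such as MacOS).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, Linux only: lists this JVM's open descriptors.
final class OpenFiles {

    /** Targets of open descriptors containing `fragment`; empty if /proc is unavailable. */
    static List<String> matching(String fragment) {
        List<String> hits = new ArrayList<>();
        Path fdDir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return hits; // not Linux, or /proc not mounted
        }
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    // Each entry is a symlink to the file the descriptor refers to.
                    String target = Files.readSymbolicLink(fd).toString();
                    if (target.contains(fragment)) {
                        hits.add(target);
                    }
                } catch (IOException | RuntimeException e) {
                    // Descriptor closed in the meantime or unreadable: skip it.
                }
            }
        } catch (IOException e) {
            // Could not list the directory: report what was collected so far.
        }
        return hits;
    }

    public static void main(String[] args) {
        // "Data-0001" is the old generation folder named in the issue.
        String fragment = args.length > 0 ? args[0] : "Data-0001";
        matching(fragment).forEach(System.out::println);
    }
}
```

Calling `OpenFiles.matching("Data-0001")` before and after `DatabaseMgr.compact(...)` would give a programmatic version of the screenshots. For files that were deleted while still open, Linux appends ` (deleted)` to the link target, which is the situation described in the issue.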



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
