[jira] [Commented] (JENA-567) Improve Fuseki/TDB transaction memory usage

Andy Seaborne (JIRA) Sat, 26 Oct 2013 10:09:03 -0700

    [ https://issues.apache.org/jira/browse/JENA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806139#comment-13806139 ]


Andy Seaborne commented on JENA-567:
------------------------------------

Looks good.

To document it, do you think a paragraph or two on the TDB transactions is the 
right thing to do?  Or start an FAQ?

Are there any tests?

Can it (should it) be settable per dataset?  If the context object for the 
dataset is used, the setting is no longer global.

In Journal.java, would it be better to loop, reading bytes from the non-array 
ByteBuffer and updating the Adler checksum byte-wise?  This avoids the copy 
into an array, which might have CPU cache effects, but costs more method calls.
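
To make the trade-off concrete, here is a minimal sketch of the two ways to 
checksum a ByteBuffer that has no backing array.  It is not the code in 
Journal.java: the class and method names are hypothetical, and it assumes the 
checksum is java.util.zip.Adler32.

```java
import java.nio.ByteBuffer;
import java.util.zip.Adler32;

// Hypothetical illustration only; not taken from Journal.java.
public class ChecksumSketch {

    // Copy approach: bulk-read the buffer into a temporary array,
    // then update the checksum with one call.
    static long checksumViaCopy(ByteBuffer bb) {
        ByteBuffer dup = bb.duplicate();      // leave the caller's position untouched
        byte[] tmp = new byte[dup.remaining()];
        dup.get(tmp);
        Adler32 adler = new Adler32();
        adler.update(tmp, 0, tmp.length);
        return adler.getValue();
    }

    // Byte-wise approach: no temporary array, but one get() and one
    // update() call per byte.
    static long checksumByteWise(ByteBuffer bb) {
        ByteBuffer dup = bb.duplicate();
        Adler32 adler = new Adler32();
        while (dup.hasRemaining()) {
            adler.update(dup.get());          // update(int) uses the low eight bits
        }
        return adler.getValue();
    }
}
```

Both methods return the same value for the same buffer contents; which one is 
faster would need measuring.  On Java 8 and later, Adler32 also offers 
update(ByteBuffer), which may make the choice unnecessary.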


> Improve Fuseki/TDB transaction memory usage
> -------------------------------------------
>
>                 Key: JENA-567
>                 URL: https://issues.apache.org/jira/browse/JENA-567
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>            Reporter: Stephen Allen
>            Assignee: Stephen Allen
>
> TDB has to buffer in memory all of the modified blocks for a transaction 
> before committing it.  This causes out of memory exceptions when attempting 
> to add a large number of statements in a single transaction.
> An easy way to fix this would be to copy the write block contents into a 
> memory mapped file instead of heap memory ^†^.  We can provide three user 
> specified options for controlling the location of the temporary blocks:
> # JVM heap (default, and what we currently use)
> # Direct memory (process heap, but not in the JVM)
> # Memory mapped temporary file
> See this [Jena thread| http://markmail.org/thread/ckeevvhl2luevixw] for some 
> additional discussion.
> ^†^ _The harder way would involve writing the old blocks to the journal, then 
> writing the new blocks directly to the indexes, with a tombstone pointing to 
> the old block in the journal so that readers could still retrieve the old 
> version.  This however would seem to require a substantial refactor, as well 
> as a change to the on-disk database format._
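
The three block locations listed in the description map onto standard 
java.nio calls.  The following is a rough sketch under that assumption, not 
the patch attached to this issue: the enum, class, method, and temporary file 
prefix are all hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration only; names are not from the TDB code base.
public class BlockBufferSketch {

    enum Location { HEAP, DIRECT, MAPPED }

    // Allocate a buffer of the given size in the requested location.
    static ByteBuffer allocate(Location loc, int size) {
        switch (loc) {
            case HEAP:
                // JVM heap: counts against -Xmx.
                return ByteBuffer.allocate(size);
            case DIRECT:
                // Process memory outside the JVM heap.
                return ByteBuffer.allocateDirect(size);
            case MAPPED:
                try {
                    File f = File.createTempFile("block-sketch", ".tmp");
                    f.deleteOnExit();
                    try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                        // The mapping stays valid after the channel is closed.
                        return raf.getChannel()
                                  .map(FileChannel.MapMode.READ_WRITE, 0, size);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            default:
                throw new IllegalArgumentException("Unknown location: " + loc);
        }
    }
}
```

All three return a ByteBuffer, so code that writes block contents would not 
need to know which location is in use.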



--
This message was sent by Atlassian JIRA
(v6.1#6144)
