Jiri Frolik created JCR-3866:
--------------------------------

             Summary: Performance issue when adding large files to Apache Jackrabbit 2.8.0 using DbDataStore
                 Key: JCR-3866
                 URL: https://issues.apache.org/jira/browse/JCR-3866
             Project: Jackrabbit Content Repository
          Issue Type: Bug
          Components: jackrabbit-core, jackrabbit-data
    Affects Versions: 2.8
         Environment: Windows 8, Windows Server 2012 R2;
using Oracle jdk1.7.0_51
            Reporter: Jiri Frolik


When adding large (> 10 MB) files to Jackrabbit 2.8.0, there is a noticeable delay (e.g. 10-20 seconds for a 100 MB file, reproduced on several environments: Windows 8, Windows Server 2012 R2).
I am using org.apache.jackrabbit.core.data.db.DbDataStore (see the configuration in the attachment).

I found out that the delay is caused by the following call from my Java application:
org.apache.jackrabbit.core.NodeImpl.setProperty("jcr:data", new org.apache.jackrabbit.value.BinaryImpl(inputStream));   // input stream of the added large file
By debugging I further found out that the BinaryImpl object creates a temp file from the given input stream.
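
For completeness, here is a simplified sketch of the upload path as I call it (the surrounding method and variable names are illustrative only; session/node setup is omitted):

import java.io.FileInputStream;
import java.io.InputStream;
import javax.jcr.Node;
import javax.jcr.Session;
import org.apache.jackrabbit.value.BinaryImpl;

public class UploadSketch {
    // Sets the given file as jcr:data on an existing nt:resource node.
    public static void setData(Session session, Node contentNode, String filePath) throws Exception {
        try (InputStream in = new FileInputStream(filePath)) {
            // BinaryImpl first spools the whole stream into a temp file;
            // DbDataStore.addRecord() is then invoked from within setProperty
            // (see the stack trace below).
            contentNode.setProperty("jcr:data", new BinaryImpl(in));
            session.save();
        }
    }
}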

Assuming the default setting of the DbDataStore.storeStream property (which is "tempFile"), I found two long-running instructions in org.apache.jackrabbit.core.data.db.DbDataStore.addRecord(InputStream). (Here the input stream comes from the newly created temp file.)

One delay is caused by storing the content on line 357:
conHelper.exec(updateDataSQL, wrapper, tempId);
I understand, though, that this is a necessary step and cannot easily be optimized away.

The second delay, on line 350, is caused by creating a new temp file from the current input stream, just to determine its length, by calling org.apache.commons.io.IOUtils.copy(InputStream, OutputStream), which by default copies the file with a 4 kB (!) buffer.
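
To illustrate the buffer-size point, here is a small standalone sketch (not Jackrabbit code; the copyLarge variant with an explicit buffer assumes commons-io 2.2 or newer):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;

public class CopyBufferSketch {

    // Per my reading of addRecord() around line 350, this is effectively what runs:
    // IOUtils.copy() uses its built-in 4096-byte buffer (the commons-io default).
    static int copyWithDefaultBuffer(InputStream in, OutputStream out) throws IOException {
        return IOUtils.copy(in, out);
    }

    // The same copy with a caller-supplied, larger buffer.
    static long copyWithLargeBuffer(InputStream in, OutputStream out) throws IOException {
        return IOUtils.copyLarge(in, out, new byte[64 * 1024]);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[8 * 1024 * 1024]; // 8 MB of dummy data
        System.out.println(copyWithDefaultBuffer(new ByteArrayInputStream(data), new ByteArrayOutputStream()));
        System.out.println(copyWithLargeBuffer(new ByteArrayInputStream(data), new ByteArrayOutputStream()));
    }
}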

When I tried to change the value of the DbDataStore.storeStream property to "max" to skip the temp file creation, the following exception was raised from line 357 (conHelper.exec(updateDataSQL, wrapper, tempId)):
java.lang.RuntimeException: Unable to reset the Stream.
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.execute(ConnectionHelper.java:527)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.reallyExec(ConnectionHelper.java:313)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$1.call(ConnectionHelper.java:294)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$1.call(ConnectionHelper.java:290)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$RetryManager.doTry(ConnectionHelper.java:559)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.exec(ConnectionHelper.java:290)
        at org.apache.jackrabbit.core.data.db.DbDataStore.addRecord(DbDataStore.java:357)
        at org.apache.jackrabbit.core.value.BLOBInDataStore.getInstance(BLOBInDataStore.java:132)
        at org.apache.jackrabbit.core.value.InternalValue.getBLOBFileValue(InternalValue.java:626)
        at org.apache.jackrabbit.core.value.InternalValue.create(InternalValue.java:381)
        at org.apache.jackrabbit.core.value.InternalValueFactory.create(InternalValueFactory.java:108)
        at org.apache.jackrabbit.core.value.ValueFactoryImpl.createValue(ValueFactoryImpl.java:130)
        at org.apache.jackrabbit.core.value.ValueFactoryImpl.createValue(ValueFactoryImpl.java:119)
        at org.apache.jackrabbit.core.NodeImpl.setProperty(NodeImpl.java:3467)
...
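
My assumption (I have not verified this in the source) is that ConnectionHelper retries the statement and therefore needs to reset() the stream parameter; with storeStream="tempFile" the stream is backed by the temp file and can be reset, whereas the original input stream cannot. A small standalone illustration of that mark/reset limitation (not Jackrabbit code):

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ResetSketch {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("reset-sketch", ".bin");
        file.deleteOnExit();

        // A plain FileInputStream (the kind of stream my application passes in)
        // does not support mark/reset, so any retry that calls reset() must fail.
        try (InputStream raw = new FileInputStream(file)) {
            System.out.println("FileInputStream markSupported: " + raw.markSupported()); // false
        }

        // BufferedInputStream adds mark/reset support, but marking the whole stream
        // would mean keeping the entire (e.g. 100 MB) file buffered in memory.
        try (InputStream buffered = new BufferedInputStream(new FileInputStream(file))) {
            System.out.println("BufferedInputStream markSupported: " + buffered.markSupported()); // true
        }
    }
}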

Could you please advise me how to either avoid the creation of the second temp file in Jackrabbit and resolve the exception above, or how to configure Jackrabbit to improve performance when adding large files to the DbDataStore?

Thank you in advance
Jiri Frolik


