Joerg Hoh created OAK-10116:
-------------------------------

             Summary: Performance problem when importing nodes with many binary 
properties and remote blobstore
                 Key: OAK-10116
                 URL: https://issues.apache.org/jira/browse/OAK-10116
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: blob-cloud, blob-plugins, jcr
    Affects Versions: 1.48.0
            Reporter: Joerg Hoh


We often import binaryless packages (using Jackrabbit FileVault) into our Oak 
instances, which use a remote blobstore.

We observe poor performance when we import nodes with binary properties. Thread 
dumps taken during such an import often look like this:

{noformat}
"Queue Processor for Subscriber agent publishSubscriber" #311 daemon prio=5 os_prio=0 cpu=298928.76ms elapsed=576.04s tid=0x0000563f968c6800 nid=0x1644 runnable  [0x00007f2a609e3000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0([email protected]/Native Method)
        at java.net.SocketInputStream.socketRead([email protected]/SocketInputStream.java:115)
        at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:168)
        at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:140)
        at sun.security.ssl.SSLSocketInputRecord.read([email protected]/SSLSocketInputRecord.java:478)
        at sun.security.ssl.SSLSocketInputRecord.readHeader([email protected]/SSLSocketInputRecord.java:472)
        at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket([email protected]/SSLSocketInputRecord.java:70)
        at sun.security.ssl.SSLSocketImpl.readApplicationRecord([email protected]/SSLSocketImpl.java:1328)
        at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:971)
        at java.io.BufferedInputStream.fill([email protected]/BufferedInputStream.java:252)
        at java.io.BufferedInputStream.read1([email protected]/BufferedInputStream.java:292)
        at java.io.BufferedInputStream.read([email protected]/BufferedInputStream.java:351)
        - locked <0x00000007d98d0ca8> (a java.io.BufferedInputStream)
        at sun.net.www.http.HttpClient.parseHTTPHeader([email protected]/HttpClient.java:746)
        at sun.net.www.http.HttpClient.parseHTTP([email protected]/HttpClient.java:689)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0([email protected]/HttpURLConnection.java:1615)
        - locked <0x00000007d98cb480> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream([email protected]/HttpURLConnection.java:1520)
        - locked <0x00000007d98cb480> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
        at java.net.HttpURLConnection.getResponseCode([email protected]/HttpURLConnection.java:527)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode([email protected]/HttpsURLConnectionImpl.java:334)
        at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:115)
        at com.microsoft.azure.storage.blob.CloudBlob.downloadAttributes(CloudBlob.java:1414)
        at com.microsoft.azure.storage.blob.CloudBlob.downloadAttributes(CloudBlob.java:1381)
        at org.apache.jackrabbit.oak.blob.cloud.azure.blobstorage.AzureBlobStoreBackend.getRecord(AzureBlobStoreBackend.java:408)
        at org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore.getRecordIfStored(AbstractSharedCachingDataStore.java:210)
        at org.apache.jackrabbit.core.data.AbstractDataStore.getRecordFromReference(AbstractDataStore.java:72)
        at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getBlobId(DataStoreBlobStore.java:402)
        at org.apache.jackrabbit.oak.segment.SegmentNodeStore.getBlob(SegmentNodeStore.java:257)
        at org.apache.jackrabbit.oak.composite.CompositeNodeStore.getBlob(CompositeNodeStore.java:202)
        at org.apache.jackrabbit.oak.core.MutableRoot.getBlob(MutableRoot.java:342)
        at org.apache.jackrabbit.oak.plugins.value.jcr.ValueFactoryImpl.createValue(ValueFactoryImpl.java:111)
        at org.apache.jackrabbit.vault.util.DocViewProperty.apply(DocViewProperty.java:413)
        at org.apache.jackrabbit.vault.fs.impl.io.DocViewSAXImporter.createNode(DocViewSAXImporter.java:1131)
        at org.apache.jackrabbit.vault.fs.impl.io.DocViewSAXImporter.addNode(DocViewSAXImporter.java:891)
        at org.apache.jackrabbit.vault.fs.impl.io.DocViewSAXImporter.startElement(DocViewSAXImporter.java:681)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement([email protected]/AbstractSAXParser.java:510)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement([email protected]/XMLNSDocumentScannerImpl.java:374)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next([email protected]/XMLDocumentFragmentScannerImpl.java:2710)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next([email protected]/XMLDocumentScannerImpl.java:605)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next([email protected]/XMLNSDocumentScannerImpl.java:112)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument([email protected]/XMLDocumentFragmentScannerImpl.java:534)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse([email protected]/XML11Configuration.java:888)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse([email protected]/XML11Configuration.java:824)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse([email protected]/XMLParser.java:141)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse([email protected]/AbstractSAXParser.java:1216)
        at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse([email protected]/SAXParserImpl.java:635)
        at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse([email protected]/SAXParserImpl.java:324)
        at org.apache.jackrabbit.vault.fs.impl.io.GenericArtifactHandler.accept(GenericArtifactHandler.java:100)
        at org.apache.jackrabbit.vault.fs.io.Importer.commit(Importer.java:932)
        at org.apache.jackrabbit.vault.fs.io.Importer.commit(Importer.java:799)
{noformat}

In this scenario we can guarantee that all binaries are already available on 
the remote blobstore, so a call to the blobstore should not be required, at 
least not to validate their presence; all other information could/should be 
part of the filevault package.

In my opinion the ValueFactory should be able to create a binary property 
without reaching out to the blobstore, to avoid the network latency. This would 
speed up the import process dramatically: in this situation we can create 
approx. 20 binary properties per second (roughly 50 ms per property), while we 
can create thousands of non-binary properties in the same time.
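
To illustrate the idea, here is a minimal sketch of deferred resolution. It is 
not Oak or FileVault code: LazyBlobSketch, CountingRemoteStore, LazyBlob and 
importReferences are hypothetical names, and the "remote store" only simulates 
a network round trip by counting calls. The assumption is that the importer 
only needs the reference string at import time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these types exist in Oak or FileVault. */
final class LazyBlobSketch {

    /** Stand-in for a remote blobstore; counts simulated network round trips. */
    static final class CountingRemoteStore {
        final AtomicInteger roundTrips = new AtomicInteger();

        /** Pretends to fetch blob metadata over the network. */
        long fetchLength(String reference) {
            roundTrips.incrementAndGet();
            return reference.length(); // fake metadata, for illustration only
        }
    }

    /** Blob handle that keeps only the reference and resolves metadata on first use. */
    static final class LazyBlob {
        private final String reference;
        private final CountingRemoteStore store;
        private Long length; // null until the first call to length()

        LazyBlob(String reference, CountingRemoteStore store) {
            this.reference = reference;
            this.store = store;
        }

        String getReference() {
            return reference;
        }

        /** Contacts the store at most once, then serves the cached value. */
        synchronized long length() {
            if (length == null) {
                length = store.fetchLength(reference);
            }
            return length;
        }
    }

    /** Creates one handle per reference without contacting the store. */
    static List<LazyBlob> importReferences(List<String> references, CountingRemoteStore store) {
        List<LazyBlob> blobs = new ArrayList<>(references.size());
        for (String reference : references) {
            blobs.add(new LazyBlob(reference, store));
        }
        return blobs;
    }
}
```

In this sketch, importing any number of references costs no round trips; the 
latency is paid only by the first reader of a binary's length. Whether Oak can 
safely skip the existence check in this way is the question this issue raises.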







--
This message was sent by Atlassian Jira
(v8.20.10#820010)
