Yu-An Lin created OAK-9754:
------------------------------

             Summary: Increase default dump threshold for multithreaded download
                 Key: OAK-9754
                 URL: https://issues.apache.org/jira/browse/OAK-9754
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: indexing
            Reporter: Yu-An Lin


Looking at the detailed log output of an indexing job that uses Oak with the 
Multi-Threaded Download Strategy, lots of small files are being created because 
the default dump threshold is low, at 1 MB per file. 
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/FlatFileNodeStoreBuilder.java#L91]
 

We should increase the default threshold to 16 MB if possible. With 16 MB per 
thread and 8 threads, that is 128 MB in total. This would (hopefully) reduce 
the number of files from 22'972 to about 1'435, which is much more reasonable. 
Also, I don't think it would bring any risk of running out of memory.
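
For reference, the arithmetic behind these numbers can be sketched as below. 
This is only a back-of-the-envelope estimate, not Oak code: the function names 
are hypothetical, and it assumes each thread buffers up to one threshold's 
worth of data before dumping and that the total dumped volume stays the same, 
so the file count scales inversely with the threshold.

```python
MB = 1024 * 1024


def peak_buffered_bytes(threshold_bytes: int, threads: int) -> int:
    """Total buffered data if every thread fills its buffer up to the threshold.

    Assumption: one buffer per download thread, dumped once it reaches the threshold.
    """
    return threshold_bytes * threads


def estimated_file_count(current_files: int, old_threshold: int, new_threshold: int) -> int:
    """Scale the observed file count by the ratio of old to new threshold.

    Assumption: the total dumped volume is unchanged and each file is roughly
    one threshold in size, so the result is an estimate (rounded down).
    """
    return current_files * old_threshold // new_threshold


peak = peak_buffered_bytes(16 * MB, threads=8)
files = estimated_file_count(22972, old_threshold=1 * MB, new_threshold=16 * MB)

print(f"buffered with 8 threads: {peak // MB} MB")  # 128 MB
print(f"estimated number of files: {files}")        # 1435
```

The actual file count may differ, since files dumped at the end of a run can 
be smaller than the threshold.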



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
