nfsantos commented on code in PR #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577#discussion_r1745538475


##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/FlatFileNodeStoreBuilder.java:
##########
@@ -283,7 +324,7 @@ public List<IndexStore> buildList(IndexHelper indexHelper, 
IndexerSupport indexe
     private IndexStoreFiles createdSortedStoreFiles() throws IOException, 
CompositeException {
         // Check system property defined path
         String sortedFilePath = 
System.getProperty(OAK_INDEXER_SORTED_FILE_PATH);
-        if (StringUtils.isNotBlank(sortedFilePath)) {
+        if (sortedFilePath != null && !sortedFilePath.isEmpty()) {

Review Comment:
   This is not equivalent. If the string contains only whitespace, 
`StringUtils::isNotBlank` returns false, while this test here returns true, as 
`#isEmpty` tests that the String contains no characters.



##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/DocumentStoreIndexerBase.java:
##########
@@ -239,6 +239,12 @@ nodeStore, getMongoDocumentStore(), traversalLog))
         return storeList;
     }
 
+    public IndexStore buildTreeStore() throws IOException, 
CommitFailedException {
+        
System.setProperty(FlatFileNodeStoreBuilder.OAK_INDEXER_SORT_STRATEGY_TYPE,

Review Comment:
   This will leave this system property behind, even after finishing building 
the tree store, so it changes global state. Couldn't this cause unintended 
problems?



##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/FlatFileNodeStoreBuilder.java:
##########
@@ -19,17 +19,32 @@
 
 package org.apache.jackrabbit.oak.index.indexer.document.flatfile;
 
-import com.mongodb.MongoClientURI;
-import com.mongodb.client.MongoDatabase;
-import org.apache.commons.lang3.StringUtils;
+import static java.util.Collections.unmodifiableSet;
+import static org.apache.jackrabbit.guava.common.base.Preconditions.checkState;
+import static 
org.apache.jackrabbit.oak.index.indexer.document.indexstore.IndexStoreUtils.OAK_INDEXER_USE_LZ4;
+import static 
org.apache.jackrabbit.oak.index.indexer.document.indexstore.IndexStoreUtils.OAK_INDEXER_USE_ZIP;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.function.Predicate;
+import java.util.stream.Collectors;
+

Review Comment:
   The order of imports is different from the one of InteliJ. We do not have an 
agreed formatting style, but since most of us are using InteliJ and the file is 
already with the InteliJ ordering, I'd prefer we leave it the current order.



##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedTransformTask.java:
##########
@@ -160,7 +160,7 @@ public Result call() throws Exception {
                     for (NodeDocument nodeDoc : nodeDocumentBatch) {
                         statistics.incrementMongoDocumentsTraversed();
                         mongoObjectsProcessed++;
-                        if (mongoObjectsProcessed % 50000 == 0) {
+                        if (mongoObjectsProcessed % 200_000 == 0) {

Review Comment:
   Why the change? I don't find that this log line is printed too often. But 
adding a thousand separator to the `50000` would be nice. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to