thomasmueller commented on code in PR #536:
URL: https://github.com/apache/jackrabbit-oak/pull/536#discussion_r849616744


##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/NodeStateEntryTraverser.java:
##########
@@ -176,12 +175,6 @@ private boolean includeId(String id) {
         }
 
         String path = Utils.getPathFromId(id);
-
-        //Exclude hidden nodes from index data
-        if (NodeStateUtils.isHiddenPath(path)){
-            return false;
-        }
-
         return pathPredicate.test(path);

Review Comment:
   I think we should also move pathPredicate.test, not only 
NodeStateUtils.isHiddenPath... right? I mean, assuming there is a path 
predicate like "/content/project" then we only store the intermediate progress 
for nodes within "/content/project"... and that could in theory be: no node at 
all. So on each scale up / scale down, we download from scratch.



##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/DocumentStoreIndexerBase.java:
##########
@@ -172,6 +172,7 @@ private FlatFileStore buildFlatFileStore(NodeState 
checkpointedState, CompositeI
                         .withBlobStore(indexHelper.getGCBlobStore())
                         .withPreferredPathElements((preferredPathElements != 
null) ? preferredPathElements : indexer.getRelativeIndexedNodeNames())
                         
.addExistingDataDumpDir(indexerSupport.getExistingDataDumpDir())
+                        .withPathPredicate((pathPredicate != null) ? 
pathPredicate : indexer::shouldInclude)

Review Comment:
   We have removed the check for hidden nodes (NodeStateUtils.isHiddenPath), 
but I don't see where we add the check in the higher level (that is: here), in 
case there is a pathPredicate... Does pathPredicate do a check for hidden 
nodes? If yes, why did we check for both pathPredicate.test(path) AND hidden 
nodes before? (Maybe it's correct, but possibly not).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to