[ 
https://issues.apache.org/jira/browse/OAK-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain updated OAK-10048:
----------------------------
    Description: 
{{DocumentStoreIndexerBase#buildFlatFileStore}} (below) logs and returns only 
the first FlatFileStore and its path. This is incorrect when the 
FlatFileSplitter is triggered to split files based on config and index 
definitions. 

Also, {{buildFlatFileStoreList}} is always called with false for split; it 
should instead use the configuration 
{{IndexerConfiguration.parallelIndexEnabled()}} so that splitting is attempted 
when that configuration is enabled.

{{FlatFileNodeStoreBuilder#buildList}} also needs to handle split files that 
are already available.
{code:java}
public FlatFileStore buildFlatFileStore() throws IOException, CommitFailedException {
    NodeState checkpointedState = indexerSupport.retrieveNodeStateForCheckpoint();
    Set<String> preferredPathElements = new HashSet<>();
    Set<IndexDefinition> indexDefinitions = getIndexDefinitions();
    for (IndexDefinition indexDf : indexDefinitions) {
        preferredPathElements.addAll(indexDf.getRelativeNodeNames());
    }
    Predicate<String> predicate = s -> indexDefinitions.stream()
            .anyMatch(indexDef -> indexDef.getPathFilter().filter(s) != PathFilter.Result.EXCLUDE);
    FlatFileStore flatFileStore = buildFlatFileStoreList(checkpointedState, null, predicate,
            preferredPathElements, false, indexDefinitions).get(0);
    log.info("FlatFileStore built at {}. To use this flatFileStore in a reindex step, set System Property-{} with value {}",
            flatFileStore.getFlatFileStorePath(), OAK_INDEXER_SORTED_FILE_PATH, flatFileStore.getFlatFileStorePath());
    return flatFileStore;
}
{code}

  was:
DocumentStoreIndexerBase#buildFlatFileStore below outputs the first 
FlatFileStore and its path which is incorrect if the FlatFileSplitter is 
triggered to split files based on config and index definitions. 
Also, the buildFlatFileStoreList method always passes false to split but should 
instead use the configuration {{IndexerConfiguration.parallelIndexEnabled()}} 
to try splitting based on this configuration.

{code:java}
public FlatFileStore buildFlatFileStore() throws IOException, CommitFailedException {
    NodeState checkpointedState = indexerSupport.retrieveNodeStateForCheckpoint();
    Set<String> preferredPathElements = new HashSet<>();
    Set<IndexDefinition> indexDefinitions = getIndexDefinitions();
    for (IndexDefinition indexDf : indexDefinitions) {
        preferredPathElements.addAll(indexDf.getRelativeNodeNames());
    }
    Predicate<String> predicate = s -> indexDefinitions.stream()
            .anyMatch(indexDef -> indexDef.getPathFilter().filter(s) != PathFilter.Result.EXCLUDE);
    FlatFileStore flatFileStore = buildFlatFileStoreList(checkpointedState, null, predicate,
            preferredPathElements, false, indexDefinitions).get(0);
    log.info("FlatFileStore built at {}. To use this flatFileStore in a reindex step, set System Property-{} with value {}",
            flatFileStore.getFlatFileStorePath(), OAK_INDEXER_SORTED_FILE_PATH, flatFileStore.getFlatFileStorePath());
    return flatFileStore;
}
{code}


> DocumentStoreIndexerBase#buildFlatFileStore outputs the wrong path when 
> FlatFileSplitter used
> ---------------------------------------------------------------------------------------------
>
>                 Key: OAK-10048
>                 URL: https://issues.apache.org/jira/browse/OAK-10048
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: oak-run
>            Reporter: Amit Jain
>            Priority: Major
>
> {{DocumentStoreIndexerBase#buildFlatFileStore}} (below) logs and returns only 
> the first FlatFileStore and its path. This is incorrect when the 
> FlatFileSplitter is triggered to split files based on config and index 
> definitions. 
> Also, {{buildFlatFileStoreList}} is always called with false for split; it 
> should instead use the configuration 
> {{IndexerConfiguration.parallelIndexEnabled()}} so that splitting is attempted 
> when that configuration is enabled.
> {{FlatFileNodeStoreBuilder#buildList}} also needs to handle split files that 
> are already available.
> {code:java}
> public FlatFileStore buildFlatFileStore() throws IOException, CommitFailedException {
>     NodeState checkpointedState = indexerSupport.retrieveNodeStateForCheckpoint();
>     Set<String> preferredPathElements = new HashSet<>();
>     Set<IndexDefinition> indexDefinitions = getIndexDefinitions();
>     for (IndexDefinition indexDf : indexDefinitions) {
>         preferredPathElements.addAll(indexDf.getRelativeNodeNames());
>     }
>     Predicate<String> predicate = s -> indexDefinitions.stream()
>             .anyMatch(indexDef -> indexDef.getPathFilter().filter(s) != PathFilter.Result.EXCLUDE);
>     FlatFileStore flatFileStore = buildFlatFileStoreList(checkpointedState, null, predicate,
>             preferredPathElements, false, indexDefinitions).get(0);
>     log.info("FlatFileStore built at {}. To use this flatFileStore in a reindex step, set System Property-{} with value {}",
>             flatFileStore.getFlatFileStorePath(), OAK_INDEXER_SORTED_FILE_PATH, flatFileStore.getFlatFileStorePath());
>     return flatFileStore;
> }
> {code}
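
The reporting problem described in the issue can be illustrated with a 
minimal, self-contained Java sketch. This is not the actual Oak fix: Store and 
buildStoreList are hypothetical stand-ins for FlatFileStore and 
buildFlatFileStoreList, the file paths are made up, and the boolean flag only 
stands in for IndexerConfiguration.parallelIndexEnabled(). Under those 
assumptions, it shows one way to report every store path rather than only the 
first.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FlatFileStorePathsSketch {

    // Hypothetical stand-in for Oak's FlatFileStore; only the path accessor is modelled.
    static final class Store {
        private final String path;

        Store(String path) {
            this.path = path;
        }

        String getFlatFileStorePath() {
            return path;
        }
    }

    // Hypothetical stand-in for buildFlatFileStoreList(...): yields several
    // stores when splitting is requested, otherwise a single one.
    static List<Store> buildStoreList(boolean split) {
        if (split) {
            return List.of(new Store("/tmp/flat/split-1"), new Store("/tmp/flat/split-2"));
        }
        return List.of(new Store("/tmp/flat/store"));
    }

    // Joins the paths of all stores so that none is dropped from the log message.
    static String describePaths(List<Store> stores) {
        return stores.stream()
                .map(Store::getFlatFileStorePath)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Assumption: this flag would come from IndexerConfiguration.parallelIndexEnabled().
        boolean parallelIndexEnabled = true;
        List<Store> stores = buildStoreList(parallelIndexEnabled);
        System.out.println("FlatFileStore(s) built at " + describePaths(stores));
    }
}
```

With the flag set, the sketch reports both hypothetical split paths; calling 
get(0) on the same list would report only the first one, which is the 
behaviour this issue describes.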



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
