[ 
https://issues.apache.org/jira/browse/OAK-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994281#comment-15994281
 ] 

Chetan Mehrotra commented on OAK-6081:
--------------------------------------

This tooling can be implemented in the following phases:

# v0 - The target here is to support running async indexing, as it is done 
currently, via oak-run. In this phase we would not change the indexing logic 
much; we would just allow the async indexing run to be done via oak-run
# v1 - The indexer creates index files only on the file system, and at the end 
the files are copied to the repo
# v2 - Support interruption, i.e. if indexing gets killed midway it can resume 
from the last saved state
# v3 - multithreaded traversal
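
To make the v2 idea concrete, here is a rough sketch of a traversal that 
records the last indexed path and resumes after it. This is only an 
illustration under stated assumptions: the repository is modelled as a sorted 
set of paths, and ResumableTraversalSketch, Checkpoint and indexFrom are 
hypothetical names, not Oak or oak-run APIs. A real tool would have to persist 
the checkpoint durably and traverse the actual NodeStore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are Oak APIs.
class ResumableTraversalSketch {

    /** Stand-in for persisted indexer state: the last path that was indexed. */
    static final class Checkpoint {
        String lastIndexedPath; // null means nothing has been indexed yet
    }

    /**
     * Visits paths in sorted order, skipping everything up to and including
     * the checkpointed path. Stops after maxNodes paths to simulate the
     * process being killed. Returns the paths indexed in this run.
     */
    static List<String> indexFrom(NavigableSet<String> repo, Checkpoint cp, int maxNodes) {
        NavigableSet<String> remaining = cp.lastIndexedPath == null
                ? repo
                : repo.tailSet(cp.lastIndexedPath, false); // strictly after the checkpoint
        List<String> indexed = new ArrayList<>();
        for (String path : remaining) {
            if (indexed.size() == maxNodes) {
                break; // simulated interruption
            }
            indexed.add(path);         // a real indexer would write index data here
            cp.lastIndexedPath = path; // a real tool would persist this durably
        }
        return indexed;
    }

    public static void main(String[] args) {
        NavigableSet<String> repo = new TreeSet<>(List.of("/a", "/a/b", "/c", "/d"));
        Checkpoint cp = new Checkpoint();
        System.out.println("first run:  " + indexFrom(repo, cp, 2));
        System.out.println("second run: " + indexFrom(repo, cp, Integer.MAX_VALUE));
    }
}
```

The second call picks up after the checkpoint instead of starting over, which 
is the behaviour v2 is after. Relying on a stable sorted order of paths is an 
assumption of this sketch, not a statement about how the traversal would be 
ordered in Oak.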

> Indexing tooling via oak-run
> ----------------------------
>
>                 Key: OAK-6081
>                 URL: https://issues.apache.org/jira/browse/OAK-6081
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: indexing, run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
>
>
> To enable better management of indexing-related operations, especially 
> reindexing on large repository setups, we should implement some tooling as 
> part of oak-run.
> The tool would support:
> # *Resumable traversal* - It should be able to reindex a large repo with 
> resumable traversal, such that even if indexing breaks due to some issue it 
> can resume from the last state (OAK-5833)
> # *Multithreaded traversal* - Current indexing is single threaded and hence 
> for a large repo it can take a long time. The plan here is to support 
> multithreaded indexing, where each thread can be assigned a part of the 
> repository tree to index and at the end the indexes are merged
> # For a DocumentNodeStore setup it would be possible to connect oak-run to a 
> live cluster and it would take care of indexing -> storing index on disk -> 
> merging index -> importing it back at the end. This would ensure that the 
> live setup faces minimum disruption and is not loaded much
> # For a SegmentNodeStore setup it would be possible to index on a cloned 
> setup and then provide a way to copy the index back



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
