[
https://issues.apache.org/jira/browse/OAK-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chetan Mehrotra updated OAK-6081:
---------------------------------
Description:
To enable better management of indexing-related operations, especially
reindexing on large repository setups, we should implement some tooling as
part of oak-run.
The tool would support:
# For a DocumentNodeStore setup, it would be possible to connect oak-run to a
live cluster, and it would take care of indexing -> storing the index on disk
-> merging the index -> importing it back at the end. This would ensure that
the live setup faces minimal disruption and little extra load.
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup
and then provide a way to copy the index back.
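As a rough illustration of the out-of-band flow described above (index ->
store on disk -> import back), here is a minimal, language-neutral sketch in
Python. It is not the oak-run implementation: the dict-based "live store",
the function names and the index.json file are all hypothetical, and the
merge step is left out for brevity.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-ins only: none of these names are oak-run APIs.

def build_index(snapshot):
    """Index step: map each property value to the sorted paths that carry it."""
    index = {}
    for path, value in snapshot.items():
        index.setdefault(value, []).append(path)
    return {value: sorted(paths) for value, paths in index.items()}

def store_index(index, directory):
    """Store step: persist the index on local disk, away from the live store."""
    target = Path(directory) / "index.json"
    target.write_text(json.dumps(index, sort_keys=True))
    return target

def import_index(index_file, live_store):
    """Import step: a single short write to the live store at the very end."""
    live_store["index"] = json.loads(Path(index_file).read_text())

live_store = {"content": {"/a": "red", "/b": "blue", "/c": "red"}}
snapshot = dict(live_store["content"])  # read once; no writes while indexing

with tempfile.TemporaryDirectory() as workdir:
    index_file = store_index(build_index(snapshot), workdir)
    import_index(index_file, live_store)
```

The point of the shape is that the live store is only read during indexing
and receives a single write at the end, which is what keeps disruption low.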
Future Enhancements
# *Resumable traversal* - It should be possible to reindex a large repo with
resumable traversal, so that even if indexing breaks due to some issue it can
resume from the last state (OAK-5833).
# *Multithreaded traversal* - Current indexing is single-threaded and hence can
take a long time for a large repo. The plan is to support multithreaded
indexing, where each thread is assigned a part of the repository tree to
index and the indexes are merged at the end.
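The two enhancements above could combine roughly as follows. This is a
minimal Python sketch under stated assumptions, not Oak code: the repository
is a plain dict of path -> value, the checkpoint is an in-memory dict (a real
tool would persist it), and every function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def index_subtree(repo, root):
    """Build a partial index (value -> paths) for one subtree of the repo."""
    partial = {}
    for path, value in repo.items():
        if path == root or path.startswith(root + "/"):
            partial.setdefault(value, []).append(path)
    return partial

def merge_indexes(partials):
    """Merge per-subtree partial indexes into a single index."""
    merged = {}
    for partial in partials:
        for value, paths in partial.items():
            merged.setdefault(value, []).extend(paths)
    return {value: sorted(paths) for value, paths in merged.items()}

def reindex(repo, subtrees, checkpoint, workers=2):
    """Index subtrees in parallel, skipping those already in the checkpoint.

    The checkpoint maps a subtree root to its finished partial index, so a
    run that failed part-way can be restarted without redoing that work.
    """
    pending = [root for root in subtrees if root not in checkpoint]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda root: index_subtree(repo, root), pending)
        for root, partial in zip(pending, results):
            checkpoint[root] = partial  # record progress as results arrive
    return merge_indexes(checkpoint[root] for root in subtrees)

repo = {"/content/a": "red", "/content/b": "blue", "/apps/x": "red"}
checkpoint = {}
index = reindex(repo, ["/content", "/apps"], checkpoint)
```

Calling reindex again with the same checkpoint does no traversal at all, since
every subtree is already recorded; after a failure, only the missing subtrees
would be traversed.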
was:
To enable better management of indexing-related operations, especially
reindexing on large repository setups, we should implement some tooling as
part of oak-run.
The tool would support:
# *Resumable traversal* - It should be possible to reindex a large repo with
resumable traversal, so that even if indexing breaks due to some issue it can
resume from the last state (OAK-5833).
# *Multithreaded traversal* - Current indexing is single-threaded and hence can
take a long time for a large repo. The plan is to support multithreaded
indexing, where each thread is assigned a part of the repository tree to
index and the indexes are merged at the end.
# For a DocumentNodeStore setup, it would be possible to connect oak-run to a
live cluster, and it would take care of indexing -> storing the index on disk
-> merging the index -> importing it back at the end. This would ensure that
the live setup faces minimal disruption and little extra load.
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup
and then provide a way to copy the index back.
> Indexing tooling via oak-run
> ----------------------------
>
> Key: OAK-6081
> URL: https://issues.apache.org/jira/browse/OAK-6081
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: indexing, run
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.4