[
https://issues.apache.org/jira/browse/OAK-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312866#comment-16312866
]
Chetan Mehrotra edited comment on OAK-7122 at 1/5/18 10:24 AM:
---------------------------------------------------------------
Implemented the script at [1]. Currently it builds up the structure in memory.
If this proves to be problematic for large indexes, we can look into building
the structure on the file system.
*Usage*
{code}
java -DindexPath=/path/to/indexing-result/indexes/lucene/data \
-jar oak-run-*.jar \
console /path/to/segmentstore \
":load https://raw.githubusercontent.com/chetanmeh/oak-console-scripts/master/src/main/groovy/lucene/luceneIndexDumper.groovy"
{code}
[1]
https://github.com/chetanmeh/oak-console-scripts/tree/master/src/main/groovy/lucene
> Implement script to compare lucene indexes logically
> ----------------------------------------------------
>
> Key: OAK-7122
> URL: https://issues.apache.org/jira/browse/OAK-7122
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: run
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.8
>
>
> With Document Traversal based indexing we have implemented a newer indexing
> logic. To validate that the index produced by it is the same as the one
> produced by the existing indexing flow, we need to implement a script which
> enables comparing the index content logically.
> This was recently discussed on the Lucene mailing list [1], and the suggestion
> there was that it can be done by un-inverting the index. To enable that we
> need to implement a script which can
> # Open a Lucene index
> # Map the Lucene Document to path of node
> # For each document determine all fields associated with it (stored
> and non-stored)
> # Dump this content to a file sorted by path, with the field names on each
> line sorted by name
> Such dumps can then be generated for the old and the new index and compared
> via a simple text diff.
> [1] http://lucene.markmail.org/thread/wt22gk6aufs4uz55
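The dump step described in the issue (one line per path, paths sorted, field names sorted within each line) could look roughly like the sketch below. It is only an illustration and not the logic of luceneIndexDumper.groovy: the class name IndexDumpSketch, the render method and the "path|field=[values]" line format are all made up here, and the input map stands in for whatever the script collects while un-inverting the index. Opening the Lucene index and recovering non-stored fields is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and line format are assumptions, not taken from the actual script.
final class IndexDumpSketch {

    // Renders one line per node path. Paths are sorted, and within each line the
    // field names are sorted, so two dumps can be compared with a plain text diff.
    static String render(Map<String, Map<String, List<String>>> fieldsByPath) {
        StringBuilder out = new StringBuilder();
        // TreeMap iterates its keys in natural (lexicographic) String order.
        for (Map.Entry<String, Map<String, List<String>>> doc
                : new TreeMap<>(fieldsByPath).entrySet()) {
            out.append(doc.getKey());
            for (Map.Entry<String, List<String>> field
                    : new TreeMap<>(doc.getValue()).entrySet()) {
                // Sort a copy of the values so their order of discovery does not matter.
                List<String> values = new ArrayList<>(field.getValue());
                Collections.sort(values);
                out.append('|').append(field.getKey()).append('=').append(values);
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the un-inverted index content.
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("title", Arrays.asList("world", "hello"));
        fields.put("author", Arrays.asList("someone"));

        Map<String, Map<String, List<String>>> docs = new HashMap<>();
        docs.put("/content/a", fields);

        System.out.print(render(docs));
    }
}
```

Because both paths and field names are emitted in a fixed order, rendering the old and the new index this way should yield files that differ only where the index content itself differs. Like the current script, this sketch holds everything in memory.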
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)