Based on the suggestions here, I implemented a script to un-invert the index
(details at OAK-7122 [1], [2]).

The un-inverting was done with the following logic:

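    import org.apache.lucene.index.DirectoryReader
    import org.apache.lucene.index.DocsEnum
    import org.apache.lucene.index.Fields
    import org.apache.lucene.index.MultiFields
    import org.apache.lucene.index.Terms
    import org.apache.lucene.index.TermsEnum
    import org.apache.lucene.search.DocIdSetIterator
    import org.apache.lucene.util.Bits

    // Walks field -> term -> postings and records, per live document, which
    // fields it has. `infos` and `getFieldId` come from the rest of the script.
    def collectFieldNames(DirectoryReader reader) {
        println "Proceeding to collect the field names per document"

        Bits liveDocs = MultiFields.getLiveDocs(reader)  // null if no deletions
        Fields fields = MultiFields.getFields(reader)
        fields.each { String fieldName ->
            Terms terms = fields.terms(fieldName)
            if (terms == null) {
                return  // no terms for this field; move on to the next one
            }
            TermsEnum termsEnum = terms.iterator(null)

            while (termsEnum.next() != null) {
                // Only doc ids are needed, so skip term frequencies
                DocsEnum docsEnum = termsEnum.docs(liveDocs, null, DocsEnum.FLAG_NONE)
                while (docsEnum.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
                    int docId = docsEnum.docID()
                    DocInfo di = infos.get(docId)
                    assert di : "No DocInfo for docId : $docId"
                    di.fieldIds << getFieldId(fieldName)
                }
            }
        }
    }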
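
For anyone reading this in the archive, the same idea can be sketched
without Lucene at all. This is only an illustration, not the script from [2]:
the `uninvert` method and the nested-map `postings` layout
(field -> term -> doc ids) are hypothetical stand-ins for the postings that
Lucene exposes through Fields, Terms and DocsEnum.

```groovy
// Minimal sketch of un-inverting with plain collections (no Lucene involved).
// Input : field name -> term -> list of doc ids (a toy inverted index).
// Output: doc id -> sorted set of field names that occur in that document.
Map<Integer, Set<String>> uninvert(Map<String, Map<String, List<Integer>>> postings) {
    Map<Integer, Set<String>> fieldsPerDoc = new TreeMap<Integer, Set<String>>()
    postings.each { String field, Map<String, List<Integer>> terms ->
        terms.each { String term, List<Integer> docIds ->
            docIds.each { Integer docId ->
                Set<String> names = fieldsPerDoc.get(docId)
                if (names == null) {
                    names = new TreeSet<String>()
                    fieldsPerDoc.put(docId, names)
                }
                names << field
            }
        }
    }
    return fieldsPerDoc
}

// Toy data, made up for the example
def postings = [
    title: [lucene: [0, 2], oak: [1]],
    body : [index: [0, 1]]
]
println uninvert(postings)
```

Using a set per document means a field is recorded once even when several of
its terms point at the same document, which is the same effect as collecting
field ids per DocInfo above.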

Thanks for all the help!

Chetan Mehrotra
[1] https://issues.apache.org/jira/browse/OAK-7122
[2] https://github.com/chetanmeh/oak-console-scripts/tree/master/src/main/groovy/lucene
