[
https://issues.apache.org/jira/browse/LUCENE-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915264#action_12915264
]
Michael McCandless commented on LUCENE-2664:
--------------------------------------------
Committed, but I had to leave SimpleText out of the nightly rotation: some
tests run incredibly slowly because they rely heavily on the terms dict cache
(which SimpleText doesn't have). I'd like to fix that separately and then
hopefully put SimpleText into the rotation, so I'll leave this issue open for
that.
> Add SimpleText codec
> --------------------
>
> Key: LUCENE-2664
> URL: https://issues.apache.org/jira/browse/LUCENE-2664
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2664.patch
>
>
> Inspired by Sahin Buyrukbilen's question here:
>
> http://www.lucidimagination.com/search/document/b68846e383824653/how_to_export_lucene_index_to_a_simple_text_file#b68846e383824653
> I made a simple read/write codec that stores all postings data into a
> single text file (_X.pst), looking like this:
> {noformat}
> field contents
> term file
> doc 0
> pos 5
> term is
> doc 0
> pos 1
> term second
> doc 0
> pos 3
> term test
> doc 0
> pos 4
> term the
> doc 0
> pos 2
> term this
> doc 0
> pos 0
> END
> {noformat}
> The codec is fully functional -- all Lucene & Solr tests pass with
> -Dtests.codec=SimpleText -- but its performance is obviously poor.
> However, it should be useful for debugging, transparency,
> understanding just what Lucene stores in its index, etc. And it's a
> quick way to gain some understanding of how a codec works...
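
Because the listing above is plain text, it can be read back with a few lines of script. The following is only a rough sketch in Python, not part of the patch: the names parse_postings and tokens_in_order are hypothetical, and it assumes the file looks exactly like the quoted sample (one "keyword value" pair per line, terminated by END). The real _X.pst file may differ in indentation or carry additional line types.

```python
# Sample text copied from the issue description above.
SAMPLE = """\
field contents
term file
doc 0
pos 5
term is
doc 0
pos 1
term second
doc 0
pos 3
term test
doc 0
pos 4
term the
doc 0
pos 2
term this
doc 0
pos 0
END
"""


def parse_postings(text):
    """Parse a listing into {field: {term: {doc_id: [positions]}}}.

    Hypothetical helper: it only understands the four keywords that
    appear in the sample (field, term, doc, pos) plus the END marker.
    """
    index = {}
    field = term = doc = None
    for raw in text.splitlines():
        line = raw.strip()  # tolerate indentation, if any
        if not line:
            continue
        if line == "END":
            break
        key, _, value = line.partition(" ")
        if key == "field":
            field = index.setdefault(value, {})
            term = doc = None
        elif key == "term":
            if field is None:
                raise ValueError("term before any field: %r" % line)
            term = field.setdefault(value, {})
            doc = None
        elif key == "doc":
            if term is None:
                raise ValueError("doc before any term: %r" % line)
            doc = term.setdefault(int(value), [])
        elif key == "pos":
            if doc is None:
                raise ValueError("pos before any doc: %r" % line)
            doc.append(int(value))
        else:
            raise ValueError("unrecognized line: %r" % line)
    return index


def tokens_in_order(index, field, doc_id):
    """Return the terms of one document in a field, sorted by position."""
    pairs = [
        (pos, term)
        for term, docs in index[field].items()
        for pos in docs.get(doc_id, [])
    ]
    return [term for _, term in sorted(pairs)]


if __name__ == "__main__":
    postings = parse_postings(SAMPLE)
    print(" ".join(tokens_in_order(postings, "contents", 0)))
```

Run on the sample, this recovers the token order of document 0 in the "contents" field from the postings alone, which is the kind of transparency the codec is meant to provide.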