[ 
https://issues.apache.org/jira/browse/LUCENE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3490.
---------------------------------

    Resolution: Fixed

Thanks again for the help here, guys.
                
> Restructure codec hierarchy
> ---------------------------
>
>                 Key: LUCENE-3490
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3490
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-3490.patch, LUCENE-3490_SPI.patch, 
> LUCENE-3490_reintegrate.patch, lucene2621-trunk-2.patch, 
> lucene2621-trunk-3.patch, lucene2621-trunk.patch
>
>
> Spinoff of LUCENE-2621. (Hoping we can do some of the renaming etc here in a 
> rote way to make progress).
> Currently Codec.java only represents a portion of the index, but there are 
> other parts of the index 
> (stored fields, term vectors, fieldinfos, ...) that we want under codec 
> control. There is also some 
> inconsistency about what a Codec currently is. For example, Memory and Pulsing 
> are really just 
> PostingsFormats that you might apply only to a specific field. On the other 
> hand, PreFlex actually
> is a Codec: it represents the Lucene 3.x index format (just not all parts 
> yet). I imagine we would
> like SimpleText to be the same way.
> So, I propose restructuring the classes so that we have something like:
> * CodecProvider <-- dead, replaced by Java's service provider (SPI) mechanism. All 
> indexes are 'readable' if codecs are on the classpath.
> * Codec <-- represents the index format (PostingsFormat + FieldsFormat + ...)
> * PostingsFormat: this is what Codec controls today, and Codec will return 
> one of these for a field.
> * FieldsFormat: Stored Fields + Term Vectors + FieldInfos?
> I think for PreFlex, it doesn't make sense to expose its PostingsFormat as a 
> 'public' class, because PreFlex
> can never be per-field, so there is no use in allowing you to configure 
> PreFlex for a specific field.
> Similarly, I think in the future we should do the same thing for SimpleText. 
> Nobody needs SimpleText for production; it should
> just be a Codec where we try to make as much of the index as possible plain 
> text and simple, for debugging/learning/etc.
> So we don't need to expose its PostingsFormat. On the other hand, I don't 
> think we need Pulsing or Memory codecs,
> because it's pretty silly to make your entire index use one of their 
> PostingsFormats. To parallel with analysis:
> PostingsFormat is like Tokenizer and Codec is like Analyzer, and we don't 
> need Analyzers to "show off" every Tokenizer.
> We can also move the baked-in PerFieldCodecWrapper out (it would basically be 
> PerFieldPostingsFormat). Privately it would
> write the ids to the file like it does today. In the future, all the hairy 3.x 
> backwards-compatibility code would move to PreflexCodec. 
> SimpleTextCodec would get a plain text fieldinfos impl, etc.
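
The split proposed above (a Codec that represents the whole index format and hands out a PostingsFormat per field) could be sketched roughly as follows. This is only an illustration of the idea, not the Lucene API: the interfaces are reduced to a name() method, and PerFieldSketchCodec, CodecSketch, use(), and the format names "Standard", "Pulsing", and "DefaultFields" are all hypothetical stand-ins. Service-provider discovery of codecs on the classpath is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: how the postings of a single field are encoded.
interface PostingsFormat {
    String name();
}

// Hypothetical stand-in: stored fields, term vectors, and possibly FieldInfos.
interface FieldsFormat {
    String name();
}

// A Codec represents the whole index format and chooses a PostingsFormat per field.
abstract class Codec {
    private final String name;

    protected Codec(String name) {
        this.name = name;
    }

    public final String name() {
        return name;
    }

    public abstract PostingsFormat postingsFormat(String field);

    public abstract FieldsFormat fieldsFormat();
}

// Assumed per-field behaviour: one default format plus explicit per-field overrides.
final class PerFieldSketchCodec extends Codec {
    private final PostingsFormat defaultFormat;
    private final FieldsFormat fields;
    private final Map<String, PostingsFormat> overrides = new HashMap<>();

    PerFieldSketchCodec(String name, PostingsFormat defaultFormat, FieldsFormat fields) {
        super(name);
        this.defaultFormat = defaultFormat;
        this.fields = fields;
    }

    // Registers an override for one field and returns this codec for chaining.
    PerFieldSketchCodec use(String field, PostingsFormat format) {
        overrides.put(field, format);
        return this;
    }

    @Override
    public PostingsFormat postingsFormat(String field) {
        return overrides.getOrDefault(field, defaultFormat);
    }

    @Override
    public FieldsFormat fieldsFormat() {
        return fields;
    }
}

class CodecSketch {
    // Builds a codec that applies a Pulsing-like format to the "id" field only.
    static Codec example() {
        PostingsFormat standard = () -> "Standard";
        PostingsFormat pulsing = () -> "Pulsing";
        FieldsFormat fields = () -> "DefaultFields";
        return new PerFieldSketchCodec("Sketch", standard, fields).use("id", pulsing);
    }

    public static void main(String[] args) {
        Codec codec = example();
        for (String field : new String[] {"id", "body"}) {
            System.out.println(field + " -> " + codec.postingsFormat(field).name());
        }
    }
}
```

In this sketch the "id" field resolves to the Pulsing-like format and every other field falls back to the default, which mirrors the Tokenizer/Analyzer analogy: formats are the building blocks, and the codec decides where each one applies.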

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
