[ 
https://issues.apache.org/jira/browse/LUCENE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006439#comment-13006439
 ] 

Simon Willnauer commented on LUCENE-2621:
-----------------------------------------

bq. When you mentioned Codec API, do you mean the abstract class 
org.apache.lucene.index.codecs.Codec?
Yes, that is the main entry point. Currently a codec offers a FieldsConsumer 
which is pulled by the IndexWriter upon a flush request. Codecs are assigned 
per field and segment via the CodecProvider, so each field can have its own 
codec and each codec can have a different implementation. Yet, currently we 
only provide codec support for the inverted index, so a codec can customize the 
term dictionary (TermsEnum would be the API counterpart) and posting lists 
(DocsEnum / DocsAndPositionsEnum in the API). What this issue tries to do is to 
open up this API as a general low-level customization layer that enables users 
to also customize how stored fields and term vectors are stored on disk.
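
To make the per-field assignment a bit more concrete, here is a rough toy sketch 
of the idea. It does not use the Lucene classes: ToyCodec, ToyFieldsConsumer, 
ToyCodecProvider, the codec names and the flush() helper are all hypothetical 
stand-ins invented for illustration, and the real Codec / FieldsConsumer / 
CodecProvider signatures differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.HashMap;
import java.util.TreeMap;

// Toy model only: every type here is hypothetical and is NOT a Lucene class.
public class PerFieldCodecSketch {

    /** Receives the terms of one field at flush time (stand-in for a FieldsConsumer). */
    interface ToyFieldsConsumer {
        void addTerm(String term, int docFreq);
        String finish();
    }

    /** Decides how term data is encoded (stand-in for a Codec). */
    interface ToyCodec {
        ToyFieldsConsumer fieldsConsumer();
    }

    /** Maps each field to a codec, with a fallback (stand-in for a CodecProvider). */
    static final class ToyCodecProvider {
        private final Map<String, ToyCodec> perField = new HashMap<>();
        private final ToyCodec defaultCodec;

        ToyCodecProvider(ToyCodec defaultCodec) {
            this.defaultCodec = defaultCodec;
        }

        void setFieldCodec(String field, ToyCodec codec) {
            perField.put(field, codec);
        }

        ToyCodec lookup(String field) {
            return perField.getOrDefault(field, defaultCodec);
        }
    }

    /** Builds a codec whose "on-disk format" is just a string with a custom separator. */
    static ToyCodec codec(String codecName, String separator) {
        return () -> {
            List<String> entries = new ArrayList<>();
            return new ToyFieldsConsumer() {
                @Override
                public void addTerm(String term, int docFreq) {
                    entries.add(term + separator + docFreq);
                }

                @Override
                public String finish() {
                    return codecName + "[" + String.join(",", entries) + "]";
                }
            };
        };
    }

    /** Simulates a flush: the writer pulls a consumer from the codec assigned to the field. */
    static String flush(ToyCodecProvider provider, String field, Map<String, Integer> terms) {
        ToyFieldsConsumer consumer = provider.lookup(field).fieldsConsumer();
        terms.forEach(consumer::addTerm);
        return consumer.finish();
    }

    public static void main(String[] args) {
        ToyCodecProvider provider = new ToyCodecProvider(codec("plain", ":"));
        provider.setFieldCodec("id", codec("compact", "="));

        Map<String, Integer> terms = new TreeMap<>();
        terms.put("a", 2);
        terms.put("b", 1);

        System.out.println(flush(provider, "body", terms)); // falls back to the default codec
        System.out.println(flush(provider, "id", terms));   // uses the codec assigned to "id"
    }
}
```

The point of the sketch is only that the writer never knows the encoding: it asks 
the provider for the field's codec and hands the terms to whatever consumer that 
codec returns. Extending this to stored fields and term vectors would mean the 
codec hands out additional consumers for those data types as well.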

bq. Term vectors refer to org.apache.lucene.index.TermFreqVector, and it is 
processed by TermVectorsWriter now, correct?
Yes, that's true.

bq. But what are the stored fields? I cannot find them immediately.
There should be a StoredFieldsWriter and a FieldsReader.

bq. BTW, is there any design document of Lucene in the Wiki?

Nothing that I would call a design document. There are some pages which could 
be similar to what you are looking for, but those might be out of date. You 
should maybe look into the corresponding issues to find design decisions.

> Extend Codec to handle also stored fields and term vectors
> ----------------------------------------------------------
>
>                 Key: LUCENE-2621
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2621
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki 
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Currently Codec API handles only writing/reading of term-related data, while 
> stored fields data and term frequency vector data writing/reading is handled 
> elsewhere.
> I propose to extend the Codec API to handle this data as well.
