[ https://issues.apache.org/jira/browse/LUCENE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006439#comment-13006439 ]
Simon Willnauer commented on LUCENE-2621:
-----------------------------------------

bq. When you mentioned Codec API, do you mean the abstract class org.apache.lucene.index.codecs.Codec?

Yes, that is the main entry point. Currently a codec offers a FieldsConsumer, which the IndexWriter pulls upon a flush request. Codecs are assigned per field and per segment via the CodecProvider, so each field can have its own codec and each codec can have a different implementation. However, we currently only provide codec support for the inverted index, so a codec can customize the term dictionary (TermsEnum is the API counterpart) and the posting lists (DocsEnum / DocsAndPositionsEnum in the API). This issue tries to open up this API as a general low-level customization layer that also lets users customize how stored fields and term vectors are stored on disk.

bq. Term vectors refer to org.apache.lucene.index.TermFreqVector, and it is processed by TermVectorsWriter now, correct?

Yes, that's true.

bq. But what are the stored fields? I cannot find them immediately.

There should be a StoredFieldsWriter and a FieldsReader.

bq. BTW, is there any design document of Lucene in the Wiki?

Nothing that I would call a design document. There are some pages that could be similar to what you are looking for, but those might be out of date. You should probably look into the corresponding issues to find the design decisions.

> Extend Codec to handle also stored fields and term vectors
> ----------------------------------------------------------
>
>                 Key: LUCENE-2621
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2621
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Currently Codec API handles only writing/reading of term-related data, while
> stored fields data and term frequency vector data writing/reading is handled
> elsewhere.
> I propose to extend the Codec API to handle this data as well.
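
For readers unfamiliar with the per-field codec idea described in the comment, the following is a minimal toy model in Java, not Lucene code. Every type in it (ToyCodec, ToyFieldsConsumer, ToyCodecProvider) and the codec names "default" and "compact" are hypothetical stand-ins invented for illustration; the real Codec, FieldsConsumer and CodecProvider classes have different signatures. It only sketches the flow the comment describes: a provider maps each field to a codec, and at flush time the writer pulls a consumer from that field's codec.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these types are hypothetical and are NOT Lucene's classes.
final class PerFieldCodecSketch {

    /** Hypothetical stand-in for the consumer a codec hands out at flush time. */
    interface ToyFieldsConsumer {
        void addTerm(String field, String term);
    }

    /** Hypothetical codec: a name plus a factory for its consumer. */
    static final class ToyCodec {
        private final String name;
        // Records what this codec was asked to write, in place of real files.
        final List<String> written = new ArrayList<>();

        ToyCodec(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }

        ToyFieldsConsumer fieldsConsumer() {
            return (field, term) -> written.add(name + ":" + field + ":" + term);
        }
    }

    /** Hypothetical provider: maps a field name to a codec, with a fallback. */
    static final class ToyCodecProvider {
        private final ToyCodec defaultCodec;
        private final Map<String, ToyCodec> perField = new HashMap<>();

        ToyCodecProvider(ToyCodec defaultCodec) {
            this.defaultCodec = defaultCodec;
        }

        void setFieldCodec(String field, ToyCodec codec) {
            perField.put(field, codec);
        }

        ToyCodec lookup(String field) {
            return perField.getOrDefault(field, defaultCodec);
        }
    }

    /** Mimics a flush: for each field, pull a consumer from that field's codec. */
    static void flush(ToyCodecProvider provider, Map<String, List<String>> bufferedTerms) {
        for (Map.Entry<String, List<String>> entry : bufferedTerms.entrySet()) {
            String field = entry.getKey();
            ToyFieldsConsumer consumer = provider.lookup(field).fieldsConsumer();
            for (String term : entry.getValue()) {
                consumer.addTerm(field, term);
            }
        }
    }

    public static void main(String[] args) {
        ToyCodec standard = new ToyCodec("default");
        ToyCodec compact = new ToyCodec("compact");
        ToyCodecProvider provider = new ToyCodecProvider(standard);
        provider.setFieldCodec("id", compact); // only "id" gets the special codec

        Map<String, List<String>> buffered = new LinkedHashMap<>();
        buffered.put("id", List.of("42"));
        buffered.put("body", List.of("lucene", "codec"));
        flush(provider, buffered);

        System.out.println(compact.written);
        System.out.println(standard.written);
    }
}
```

In this sketch, extending the codec to stored fields and term vectors, as the issue proposes, would amount to letting ToyCodec hand out further consumers alongside fieldsConsumer(); how the actual API should look is exactly what the issue leaves open.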