[
https://issues.apache.org/jira/browse/LUCENE-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144724#comment-13144724
]
Uwe Schindler commented on LUCENE-3560:
---------------------------------------
bq. I'd like to extend an existing codec to add one file to files() - bummer, I
have to reimplement the whole codec now
The abstract base class Codec is as stupidly simple as Analyzer. There is no
logic in it; it just defines the following:
- name of codec (which cannot be changed by subclassing!!!)
- factory methods for the format readers/writers of the different parts of an
index (postings, stored fields, segments file,...)
If you want to create a new codec, you simply have to write this wrapper with a
new name; otherwise SPI won't work.
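
As a rough illustration of the wrapper idea described above, here is a minimal,
self-contained sketch. Everything in it is an assumption made for illustration:
SketchCodec, SketchPostingsFormat, SketchCodecRegistry and the names "Base" and
"Extended" are hypothetical stand-ins, not Lucene's classes, and the registry
only imitates name-based lookup rather than using the real SPI mechanism.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one of the per-part formats; not a Lucene class.
interface SketchPostingsFormat {
    String describe();
}

// Hypothetical stand-in for the abstract base class: a fixed name plus factory methods.
abstract class SketchCodec {
    private final String name;

    protected SketchCodec(String name) {
        this.name = name;
    }

    // Final, so a subclass cannot change the name it is looked up by.
    public final String getName() {
        return name;
    }

    public abstract SketchPostingsFormat postingsFormat();
}

// An existing codec that we would like to reuse.
final class BaseSketchCodec extends SketchCodec {
    BaseSketchCodec() {
        super("Base");
    }

    @Override
    public SketchPostingsFormat postingsFormat() {
        return () -> "base postings";
    }
}

// The wrapper: a new name, with the factory method delegating to the existing codec.
final class ExtendedSketchCodec extends SketchCodec {
    private final SketchCodec delegate = new BaseSketchCodec();

    ExtendedSketchCodec() {
        super("Extended");
    }

    @Override
    public SketchPostingsFormat postingsFormat() {
        return delegate.postingsFormat();
    }
}

// Simplified name-based lookup; an assumption standing in for SPI discovery.
final class SketchCodecRegistry {
    private static final Map<String, SketchCodec> BY_NAME = new HashMap<>();

    static {
        register(new BaseSketchCodec());
        register(new ExtendedSketchCodec());
    }

    private SketchCodecRegistry() {}

    static void register(SketchCodec codec) {
        BY_NAME.put(codec.getName(), codec);
    }

    static SketchCodec forName(String name) {
        SketchCodec codec = BY_NAME.get(name);
        if (codec == null) {
            throw new IllegalArgumentException("No codec registered under name: " + name);
        }
        return codec;
    }
}
```

In this sketch, SketchCodecRegistry.forName("Extended") returns the wrapper,
which reuses the base codec's postings format but is resolved under its own
name.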
> add extra safety to concrete codec implementations
> --------------------------------------------------
>
> Key: LUCENE-3560
> URL: https://issues.apache.org/jira/browse/LUCENE-3560
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Robert Muir
> Attachments: LUCENE-3560.patch
>
>
> In LUCENE-3490, we reorganized the codec model, and a key part of this is that
> Codecs are "safer" and don't rely upon client-side configuration: IndexReader
> doesn't take Codec or anything of that nature, only IndexWriter.
>
> Instead for "read" all codecs are initialized from the classpath via a no-arg
> ctor from Java's Service Provider Mechanism.
>
> So, although Codecs can still take parameters in the constructors, be
> subclassable, etc (for passing to IndexWriter), this enforces that they must
> write any configuration information they need into the index, so that we don't
> have a flimsy API.
>
> I think we should go even further, for additional safety. Any methods on our
> concrete codecs that are not intended to be subclassed should be final, and we
> should add assertions to verify this.
>
> For example, SimpleText's files() implementation should be final. If you want
> to make an extension of simpletext that has additional files, then this is a
> different index format and should have a different name!
>
> Note: This doesn't stop extensibility, only stupid mistakes.
>
> For example, this means that Lucene40Codec's postingsFormat() implementation
> is final, even though it offers a configurable "hook"
> (getPostingsFormatForField) for you to specify per-field postings formats
> (which it writes into a .per file into the index, so that it knows how to read
> each field).
> {code}
>   private final PostingsFormat postingsFormat = new PerFieldPostingsFormat() {
>     @Override
>     public PostingsFormat getPostingsFormatForField(String field) {
>       return Lucene40Codec.this.getPostingsFormatForField(field);
>     }
>   };
>   ...
>   @Override
>   public final PostingsFormat postingsFormat() {
>     return postingsFormat;
>   }
>   ...
>   /** Returns the postings format that should be used for writing
>    *  new segments of <code>field</code>.
>    *
>    *  The default implementation always returns "Lucene40"
>    */
>   public PostingsFormat getPostingsFormatForField(String field) {
>     return defaultFormat;
>   }
> {code}
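
The description above proposes adding assertions to verify that methods not
meant to be overridden are final. One possible way to express such a check is
sketched below using reflection. The class SketchSimpleTextCodec and the helper
FinalityCheck.isFinalMethod are hypothetical names invented for this example and
are not part of Lucene or of the attached patch; only the java.lang.reflect
calls are real.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec-like class; not Lucene's SimpleText codec.
class SketchSimpleTextCodec {
    // Not meant to be overridden: a different file set is a different index format.
    public final List<String> files() {
        return Arrays.asList("segment.txt");
    }

    // A deliberate extension hook, left overridable.
    public String postingsFormatForField(String field) {
        return "default";
    }
}

final class FinalityCheck {
    private FinalityCheck() {}

    // Returns true if the named public method cannot be overridden, either
    // because the method itself is final or because its declaring class is final.
    static boolean isFinalMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            Method method = type.getMethod(name, parameterTypes);
            return Modifier.isFinal(method.getModifiers())
                || Modifier.isFinal(method.getDeclaringClass().getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No public method " + name + " on " + type, e);
        }
    }
}
```

A test could then assert that isFinalMethod(SketchSimpleTextCodec.class,
"files") holds, while the hook postingsFormatForField is expected to remain
overridable.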