On Thu, Mar 11, 2010 at 12:35 PM, Marvin Humphrey <mar...@rectangular.com> wrote:
> On Mon, Mar 08, 2010 at 02:10:35PM -0500, Michael McCandless wrote:
>
>> We ask it to give us a Codec.
>
> There's a conflict between the segment-wide role of the "Codec" class and its
> role as specifier for posting format.
>
> In some sense, you could argue that the "codec" reads/writes the entire index
> segment -- which includes not only postings files, but also stored fields,
> term vectors, etc. However, the compression algorithms after which these
> codecs are named have nothing to do with those other files. PFORCodec isn't
> relevant to stored fields.
>
> I'd argue for limiting the role of "Codec" to encoding and decoding posting
> files.

Yeah perhaps we should rename Codec -> PostingsCodec. And with time
add different interfaces for the other components of a segment (eg
StoredFieldsCodec).

> As far as modularizing other aspects of index reading and writing, I don't
> think a simple factory is the way to go. I favor using a composite design
> pattern for SegWriter and SegReader (rather than subclassing), and an
> initialization phase controlled by an Architecture object.
>
> It was Earwin Burrfoot who persuaded me of the merits of a user-defined
> initialization phase over a user-defined factory method:
> <http://markmail.org/message/ukhcvp2ydfxpcg7q>.

How would this work specifically for postings reading & writing?

When a segment is opened (eg via IndexReader.open/reopen,
IndexWriter.getReader), we need to fully init all components before
returning control.

>> So far my fav is still CodecProvider ;)
>
> It seems that the primary reason this object is needed is that IndexReader
> needs to be able to find the right decoder when it encounters an unfamiliar
> codec name. Since the core doesn't know about user-created codecs, it's
> necessary for the user to register the name => codec pairing in advance so
> that core can find it.
>
> If that's this object's main role, I'd suggest "CodecRegistry".

Well, it also provides a writer for newly created segments...

>> Naming is the hardest part!!
>
> For me, the hardest parts of API design are...
>
> A) Designing public abstract classes / interfaces.
> B) Compensating for the curse of knowledge.

Yes both of these are hard.

Mike
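
The two roles discussed above (registering a name => codec pairing so
that a reader can resolve a codec name it finds in a segment, and
supplying the codec used for newly written segments) might be sketched
roughly as follows. This is only an illustration under assumptions:
PostingsCodec is the rename floated in this thread, while NamedCodec
and the CodecRegistry methods register, lookup, and forNewSegment are
hypothetical names chosen for the example, not Lucene's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these types are Lucene's real API.
interface PostingsCodec {
    // Name recorded with a segment so a reader can find the decoder later.
    String name();
}

// Trivial stand-in codec that carries nothing but its name.
final class NamedCodec implements PostingsCodec {
    private final String name;

    NamedCodec(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

// Combines both roles from the thread: name lookup and write-time default.
final class CodecRegistry {
    private final Map<String, PostingsCodec> byName = new HashMap<>();
    private final PostingsCodec defaultForWrite;

    CodecRegistry(PostingsCodec defaultForWrite) {
        this.defaultForWrite = defaultForWrite;
        register(defaultForWrite);
    }

    // The user registers custom codecs in advance, since core cannot know them.
    void register(PostingsCodec codec) {
        byName.put(codec.name(), codec);
    }

    // Read path: resolve the codec name found in an existing segment.
    PostingsCodec lookup(String name) {
        PostingsCodec codec = byName.get(name);
        if (codec == null) {
            throw new IllegalArgumentException("unknown codec: " + name);
        }
        return codec;
    }

    // Write path: the codec to use for a newly created segment.
    PostingsCodec forNewSegment() {
        return defaultForWrite;
    }
}
```

In this sketch an unregistered name fails loudly at lookup time, which
is one possible way to handle a segment written with a codec the
reader has not been told about.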