I agree with Robert, as I have had similar wishes about more interface capabilities, but also agree with Eric in that Lucene works great in a lot of ways. I have found the current design causes you to have to hard code things that shouldn't need to be hard coded, especially in the TokenStream area. The idea of writing a new Analyzer every time you want to change a Tokenizer or TokenFilter is very limiting. In my application I need the flexibility to re-index and evaluate fairly often. The current Analyzer implementation would require me to write a new Analyzer for every experiment and that is not manageable. Do others have this issue?
I submitted a "broken" patch that converts the analyzers and token streams to interfaces, but as Doug pointed out, it is not currently thread safe (I have another version that uses reflection that is thread safe). I intend to go back and make it thread-safe, but haven't had the time. Anyway, this patch contains an interface implementation of Analyzer and TokenStream that we may find useful in the future and if someone else wants to take up the ball and make it thread-safe, I don't think it would take too long. >>> [EMAIL PROTECTED] 04/20/04 10:14AM >>> Erik Hatcher wrote: > Robert, > > I agree with your sentiment about interfaces. I haven't looked into > the specifics of what you mention about IndexReader though. > > Out of curiosity - is there something about Lucene's design that is > preventing you from extending Lucene? Pragmatically speaking, we > shouldn't change anything unless there is a compelling reason to do > so, and while a clean architecture is a nice ideal, if it ain't broke > don't fix it :) > > Erik > > > On Apr 19, 2004, at 8:12 PM, Robert Engels wrote: > >> Lucene is often cited as an excellent example of OO design. >> >> I see several areas where this does not seem to be the case, mainly >> because >> of the Lucene code uses abstract base classes where it should be using >> interfaces, and other cases, where an interface makes reference to an >> abstract base class as a method/return parameter. >> >> It seems the abstract base classes were created so there was a place >> to put >> the static 'helper' methods. Would it not be better to have separate >> interfaces and abstract base classes that the implementations could >> extend >> if necessary? >> >> Some examples are listed below. Am I missing something, or is the code >> structured this way for a reason? >> >> ----- >> >> The main problem seems to be the use of IndexReader in many >> interfaces, when >> it should be using a Searchable. Otherwise, what is the point of >> Searchable. >> >> The following object use chain shows the 'pointlessness' of Searchable: >> >> Searchable->Query->Weight->IndexReader >> >> ----- >> >> The Query object has the method: >> >> public Query rewrite(IndexReader reader) throws IOException... >> >> but the entire search package is based on 'Searchable's, should it >> not be >> >> public Query rewrite(Searchable searchable) throws IOException... >> >> If not, the entire idea of 'external' searchable implementations break, >> because the end up having to 'know' an IndexReader, and IndexReader is a >> baseclass, etc. >> >> The same goes for >> >> public Weight weight(Searcher searcher); >> >> should this not be a 'Searchable'? >> >> ----- >> >> The Weight interface has the following: >> >> Scorer scorer(IndexReader reader) throws IOException; >> >> Should it not be >> >> Scorer scorer(Searchable searchable) throws IOException; >> >> The same goes for explain(). >> >> ----- >> >> It seems that Searchable was created because IndexReader was not an >> interface, but the methods have not been changed to use a Searchable >> instead >> of IndexReader. >> >> ----- >> >> It seems many of the 'search' package classes use a 'Searcher', when >> they >> should be using a 'Searchable'. If these core classes really need the >> methods defined in 'Searcher', then these methods should be moved to >> 'Searchable'. >> >> ----- >> >> Shouldn't 'Filter' just be an interface, with the method >> >> boolean filter(int docnum); >> >> as to whether or not the document should be filtered? >> >> ----- >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > I have had similar wishes recently for interfaces for TokenStream and Analyzer. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]