I believe this is a question of identity. What is Lucene?
IMO Lucene is a full text search library; that is its purpose. It
isn't trying to be a search server or a search engine. It is easy to
include as a library and is used on everything from embedded servers to
www search engines.
Quoting from Yonik's previous posting:
> Some in Lucene development have expressed a desire to make Lucene more
> of a complete solution, rather than just a core full-text search
> library... things like a data schema, faceting, etc. The Lucene
> project already has an enterprise search platform with these
> features... that's Solr.
So is Lucene a full text search library or is it something different?
And isn't that something different already Solr? Why should they be the
same thing when their goals aren't the same?
> Trying to pull popular pieces out of Solr
> makes life harder for Solr developers, brings our projects into
> conflict, and is often unsuccessful (witness the largely failed
> migration of FunctionQueries from Solr to Lucene).
I feel for you, really. I remember trying to develop in Nutch on Hadoop
0.04. But the logic is not correct. Solr wanting feature X, and Solr
using Lucene, does not mean that everyone who uses Lucene wants X.
Faceting, for example, is a great feature, but it is not useful in every
full text search.
> For Lucene to achieve the ultimate in usability for users, it can't
> require Java experience... it needs higher level abstractions provided
> by Solr.
I don't believe this to be true. If the Lucene community had wanted
very general language agnostic search, it would have happened by now.
Lucene is a Java API. Solr on the other hand is a server and therefore
should be language agnostic.
> The other benefit to Lucene would be to bring features to developers
> much sooner... Solr has had features years before they were developed
> in Lucene, and currently has more developers working with it.
"We have more developers than you do" isn't a valid reason to merge,
especially in open source software. Maybe in the corporate world. IMO
if Solr has more developers and they want some architecture changed in
Lucene, and the change benefits the entire Lucene community, then those
changes can be proposed and voted upon.
> Esp with Solr not using Lucene trunk, if a Solr developer wants a
> feature quickly, they cannot add it to Lucene (even if it might make
> sense there) since that introduces a big unpredictable lag
Solr has the option of not using Lucene. If something needs to go into
Lucene, it should be voted on and support all of the different uses for
Lucene. As a friend told me recently, specialization is for insects.
> 1) Solr would go back to using Lucene's trunk
Use trunk, don't use trunk. That is up to the Solr project. It
shouldn't influence Lucene's behavior.
> 2) For new Solr features, there would be an effort to abstract it such
> that non-Solr users could use the functionality (faceting, field
> collapsing, etc)
Can you say that every feature would be applicable to a full text search
library? If not, then it is beyond the core responsibilities of Lucene.
> 3) For new Lucene features, there would be an effort to integrate it
> into Solr.
No. Because by specializing towards Solr, or Nutch, or any of the
hundred other applications that use Lucene, it loses its general
applicability. Where would Hadoop be if it never made it past Nutch?
> 4) Releases would be synchronized... Lucene and Solr would release at
> the same time.
So synchronize your releases. Communicate.
I am open to listening to your responses, but all of this is to say my
vote is still currently -1.
Dennis