On Jun 13, 2009, at 8:58 AM, Michael McCandless wrote:
OK, good points Grant. I now agree that it's not a simple task, moving core stuff from Solr -> Lucene. So summing this all up:

* Some feel Lucene should only aim to be the core "expert" engine used by Solr/Nutch/etc., so things like moving trie to core (with consumable naming, good defaults, etc.) are near zero priority.
I agree on the engine part, but don't agree on the expert part. Many people who have their own frameworks and needs should be able to plug in Lucene and it should just work. Likewise, there is a huge install base that must be thought of. Still Solr/Nutch are the single largest users of Lucene and, wearing my PMC hat, I think it makes sense that we make it obvious for newbies coming in where their time is best spent. If someone shows up in Solr-land and just needs Lucene because they want to be next to the metal, we should tell them that. Likewise if they don't want to spend time doing warming, faceting, etc. they should just go use Solr.
Also, I have no problem with Trie being in core. If someone wants to do it, go for it. That's how it all works anyway. Do-acracy in action. It's not a priority for me, but that shouldn't stop anyone else.
While I see & agree that this is indeed what Solr needs of Lucene, I still think direct consumability of Lucene is important and Lucene should try to have a consumable API, good names for classes and methods, good defaults, etc.
Agreed, although I think all the deprecation stuff severely limits Lucene's consumability. You, as a writer of LIA, know this firsthand, and I also experience this firsthand when doing Lucene training. As I've pointed out countless times lately, so much cruft builds up in Lucene by the time that we get to X.Y release (for Y > 2, as in 2.2) that consumability suffers greatly.
And I don't see those two goals as being in conflict (ie, I don't see Lucene having a consumable API as preventing Solr from using Lucene's advanced APIs), except for the fact that we all have limited time.

* We have two communities. Each has its own goal (to make its product good), its own committers, etc. While technically we seem to agree that certain things (function queries, NumberUtils, highlighters, analyzers, faceted nav, etc.) logically "belong" as Lucene modules, the logistics and work required and different requirements (both one time, and ongoing) are in fact sizable challenges/barriers.
I take the "I know where to put it when I do it" approach, but as is obvious, not everyone has that luxury b/c they aren't committers on both projects. Integrating Tika into Solr was logical, while the DelimitedPayload stuff logically belonged in contrib/analyzers (to me anyway, and one of my primary motivations for that patch is to easily enable Payloads in Solr w/o having to modify how Solr works). Likewise, I think it makes sense for Solr's analyzers (WordDelimiter) to be in contrib/analyzers too, but I don't particularly think moving Solr's faceting stuff to Lucene is necessarily core to Lucene. As seems to be my theme lately, I take it on a "case-by-case" basis.
Perhaps once Lucene "modularizes", in the future, such consolidation may be easier, ie if/once there are committers focused on "analyzers" I could see them helping out all around in pulling all analyzers together.

* We all are obviously busy and there are more important things to work on than "shuffling stuff around".
+1

-Grant