+1. I can foresee a lot of components that do not need the QueryComponent, SOLR-706 being one.
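A rough sketch of the two options Ryan lists in the quoted message below, just to make the difference concrete. It does not use Solr's real SearchComponent API: the `Component` interface, its `needsDocs()` flag, and both method names are hypothetical, and `runQuery` is only the parameter name floated in the thread.

```java
import java.util.List;
import java.util.Map;

public class QuerySkipSketch {

    /** Hypothetical stand-in for a search component (not Solr's actual API). */
    public interface Component {
        String name();
        /** True if the component reads the docList/docSet produced by the query. */
        boolean needsDocs();
    }

    /** Trivial implementation used only for the demo below. */
    public static final class SimpleComponent implements Component {
        private final String name;
        private final boolean needsDocs;

        public SimpleComponent(String name, boolean needsDocs) {
            this.name = name;
            this.needsDocs = needsDocs;
        }

        @Override public String name() { return name; }
        @Override public boolean needsDocs() { return needsDocs; }
    }

    /**
     * Option 1: skip the query only when the query string is empty
     * AND no other component needs the docList/docSet.
     */
    public static boolean shouldRunQueryImplicit(String q, List<? extends Component> others) {
        boolean queryEmpty = (q == null || q.trim().isEmpty());
        if (!queryEmpty) {
            return true;
        }
        for (Component c : others) {
            if (c.needsDocs()) {
                return true;
            }
        }
        return false;
    }

    /**
     * Option 2: an explicit 'runQuery' request parameter that defaults to true
     * and can be switched off by the caller.
     */
    public static boolean shouldRunQueryExplicit(Map<String, String> params) {
        String value = params.get("runQuery");
        return value == null || Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // Illustrative component names; the needsDocs flags are assumptions.
        List<Component> standalone = List.of(new SimpleComponent("analysis", false));
        List<Component> withFacets = List.of(new SimpleComponent("facet", true));

        System.out.println(shouldRunQueryImplicit("", standalone));
        System.out.println(shouldRunQueryImplicit("", withFacets));
        System.out.println(shouldRunQueryExplicit(Map.of("runQuery", "false")));
    }
}
```

With option 1 the caller passes nothing extra and the decision falls out of the request itself; option 2 leaves it to the client to say so explicitly.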
On Tue, Oct 21, 2008 at 8:39 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> On Oct 21, 2008, at 8:17 AM, Grant Ingersoll wrote:
>
>>
>> On Oct 20, 2008, at 11:35 PM, Otis Gospodnetic wrote:
>>
>>> This is related to something I must have only day dreamed (dreamt?)
>>> about, but not actually mentioned on solr-dev.
>>> My feeling is we are moving Solr in a direction of a more general web
>>> service that can host various NLP and ML components, and no longer only do
>>> IR/Lucene.  We see that with a few patches that Grant is cooking, I think
>>> we'll see that in the Solr+Mahout marriage down the road, and so on.
>>
>> I somewhat agree, but I hesitate to go so far as saying a "general web
>> service".
>
> I won't suggest that solr is (or should be) a general web service, but
> wt=json/xml/python + RequestHandler makes a pretty nice cross platform
> interface all on its own.
>
>
>> I see Solr as a pretty nice platform for doing things like NLP/ML (see the
>> AnalysisRequestHandler, TermVectorComponent, ClusteringComponent,
>> LukeReqHandler, FacetingComp., Payloads, etc.), but I mostly view them as
>> enhancing search/navigation.  That is, things like clustering/faceting
>> (they are closely related), named entity recognition, search, etc. all act
>> as organizing components for structured and unstructured data.  Expressing
>> my vision for Solr (and actually, the Lucene TLP, too, if I put on my PMC
>> hat) it's one that aims to bring coherence to (structured and unstructured)
>> content.  This starts with search as a foundation, since the indexing
>> process creates much of the information that empowers the others.  I think
>> once you see the flexible indexing stuff added to Lucene Java, we'll see
>> even more opportunity for making Solr even more powerful in these regards.
>>
>
> agree.
>
>
>>>
>>>
>>> Is it time to start thinking about Solr as a server for IR and ML and NLP
>>> tasks and see how the tightly coupled Lucene can be made more....pluggable?
>>
>> Yeah, this is what the Solr 2.0 thread that Yonik started a few weeks ago
>> aims to discuss, along with scalability/fault tolerance.  More important,
>> for me anyway, is the decoupling of the configuration.  For instance, I see
>> no reason why IndexSchema needs to know anything about an InputStream.
>
> also agree.  The biggest challenge for 2.0 is decoupling configuration
>
>> As for Lucene, it's really quite good at serving as the backend
>> store/enabler for all these tasks.
>>
>
> I have not messed with it yet, but perhaps also HBase...
>
>>
>> At any rate, the question still remains as to how best to handle the
>> QueryComponent :-)
>>
>
> aaah, your question!
>
> I see two options:
> 1. If no other component needs docList or docSet and the query is empty,
> then skip the QueryComponent
> 2. add a 'runQuery' param (or something like that) and default to true.  It
> can be turned off when not necessary.
>
> I like option 1 better.
>
> ryan
>
>
>

--
--Noble Paul