Re: Must QueryComponent always be on and other Design Questions

Noble Paul നോബിള്‍ नोब्ळ् Tue, 21 Oct 2008 08:22:24 -0700

+1
I can foresee a lot of components that do not need the
QueryComponent, SOLR-706 being one.




On Tue, Oct 21, 2008 at 8:39 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> On Oct 21, 2008, at 8:17 AM, Grant Ingersoll wrote:
>
>>
>> On Oct 20, 2008, at 11:35 PM, Otis Gospodnetic wrote:
>>
>>> This is related to something I must have only day dreamed (dreamt?)
>>> about, but not actually mentioned on solr-dev.
>>> My feeling is we are moving Solr in a direction of a more general web
>>> service that can host various NLP and ML components, and no longer only do
>>> IR/Lucene.  We see that with a few patches that Grant is cooking, I think
>>> we'll see that in the Solr+Mahout marriage down the road, and so on.
>>
>> I somewhat agree, but I hesitate to go so far as saying a "general web
>> service".
>
> I won't suggest that solr is (or should be) a general web service, but
> wt=json/xml/python + RequestHandler makes a pretty nice cross platform
> interface all on its own.
>
>
>> I see Solr as a pretty nice platform for doing things like NLP/ML (see the
>> AnalysisRequestHandler, TermVectorComponent, ClusteringComponent,
>> LukeReqHandler, FacetingComp., Payloads, etc.), but I mostly view them as
>> enhancing search/navigation.   That is, things like clustering/faceting
>> (they are closely related), named entity recognition, search, etc. all act
>> as organizing components for structured and unstructured data.  My vision
>> for Solr (and actually, the Lucene TLP, too, if I put on my PMC hat) is one
>> that aims to bring coherence to (structured and unstructured) content.
>> This starts with search as a foundation, since the indexing
>> process creates much of the information that empowers the others.  I think
>> once you see the flexible indexing stuff added to Lucene Java, we'll see
>> even more opportunity for making Solr even more powerful in these regards.
>>
>
> agree.
>
>
>>>
>>>
>>> Is it time to start thinking about Solr as a server for IR and ML and NLP
>>> tasks and see how the tightly coupled Lucene can be made more... pluggable?
>>
>> Yeah, this is what the Solr 2.0 thread that Yonik started a few weeks ago
>> aims to discuss, along with scalability/fault tolerance.  More important,
>> for me anyway, is the decoupling of the configuration.  For instance, I see
>> no reason why IndexSchema needs to know anything about an InputStream.
>
> also agree.  The biggest challenge for 2.0 is decoupling configuration.
>
>> As for Lucene, it's really quite good at serving as the backend
>> store/enabler for all these tasks.
>>
>
> I have not messed with it yet, but perhaps also HBase...
>
>>
>> At any rate, the question still remains as to how best to handle the
>> QueryComponent :-)
>>
>
> aaah, your question!
>
> I see two options:
> 1.  If no other component needs docList or docSet and the query is empty,
> then skip the QueryComponent
> 2.  add a 'runQuery' param (or something like that) and default to true.  It
> can be turned off when not necessary.
>
> I like option 1 better.
>
> ryan
>
>
>
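
For illustration, a rough sketch of how the two options Ryan lists above
might be expressed. This is plain Java with made-up names (SketchComponent,
needsDocs, runQueryOption1, runQueryOption2); none of it is the actual Solr
SearchComponent API, and 'runQuery' is only the placeholder parameter name
from option 2.

```java
import java.util.List;

// Hypothetical stand-in for a search component; NOT Solr's SearchComponent.
final class SketchComponent {
    final String name;
    final boolean needsDocs; // assumed flag: true if it reads docList/docSet

    SketchComponent(String name, boolean needsDocs) {
        this.name = name;
        this.needsDocs = needsDocs;
    }
}

final class QuerySkipSketch {
    // Option 1: skip the query step only when the query is empty
    // and no other component needs the docList/docSet.
    static boolean runQueryOption1(String q, List<SketchComponent> others) {
        boolean queryEmpty = (q == null || q.trim().isEmpty());
        if (!queryEmpty) {
            return true;
        }
        for (SketchComponent c : others) {
            if (c.needsDocs) {
                return true;
            }
        }
        return false;
    }

    // Option 2: an explicit request parameter that defaults to true
    // when absent, so existing requests keep their behavior.
    static boolean runQueryOption2(String runQueryParam) {
        return runQueryParam == null || Boolean.parseBoolean(runQueryParam);
    }
}
```

Option 1 needs no new parameter but requires each component to declare
whether it depends on the query results; option 2 leaves that decision to
the caller.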



-- 
--Noble Paul
