Re: Filtering results

ristretto.rb Thu, 18 Sep 2008 20:41:01 -0700

Thanks, Otis, for the reply!  Always appreciated!

That is indeed what we are looking to implement.  But I'm running
out of time to prototype or experiment for this release.
I'm going to run the two-index setup for now, unless I find something
saying it is really easy and sensible to run one index and collapse
on a field.
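
For reference, a minimal sketch of the count-only (rows=0) queries against
the main index, one per category returned by the category index.  The
`category` field name, the `fq` filter, the `wt=json` response format, and
the `fetch` helper are assumptions for illustration, not something
confirmed in this thread:

```python
import json
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8280/solr/select/"  # base URL quoted in the thread


def select_url(query, rows, filter_query=None):
    """Build a Solr select URL (wt=json is an assumed response format)."""
    params = [("q", query), ("start", 0), ("rows", rows), ("wt", "json")]
    if filter_query is not None:
        # Assumed: restrict the main-index query to a single category.
        params.append(("fq", filter_query))
    return SOLR_SELECT + "?" + urlencode(params)


def count_per_category(terms, categories, fetch):
    """Run one rows=0 query per category and return {category: numFound}.

    `fetch` is any callable that takes a URL and returns the response
    body as text (hypothetical helper, e.g. a thin urllib wrapper).
    """
    counts = {}
    for category in categories:
        # "category" is a hypothetical field name in the main index.
        url = select_url(terms, rows=0, filter_query='category:"%s"' % category)
        body = json.loads(fetch(url))
        counts[category] = body["response"]["numFound"]
    return counts
```

With 10 categories from the first query this is still 1+10 requests, but
each follow-up returns only a match count and no documents.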


thanks
gene


On Fri, Sep 19, 2008 at 3:24 PM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Gene,
> I haven't looked at Field Collapsing for a while, but if you have a single
> index and collapse hits on your category field, then won't the first 10 hits
> be the items you are looking for: the top 1 item for each category x 10,
> using a single query?
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: ristretto.rb <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, September 18, 2008 7:35:43 PM
>> Subject: Re: Filtering results
>>
>> Otis,
>>
>> Would it be reasonable to run a query like this
>>
>> http://localhost:8280/solr/select/?q=terms_x&version=2.2&start=0&rows=0&indent=on
>>
>> 10 times, once for each result from an initial category query on a
>> different index?
>> So it's still 1+10, but I'm not returning values.
>> This would give me the number of pages that would match, and I can
>> display that number.
>> Not ideal, but better than nothing, and hopefully not a problem with scaling.
>>
>> cheers
>> gene
>>
>>
>>
>> On Wed, Sep 17, 2008 at 1:21 PM, Gene Campbell wrote:
>> > OK thanks Otis.  Any gut feeling on the best approach to get this
>> > collapsed data?  I hate to ask you to do my homework, but I'm coming
>> > to the end of my Solr/Lucene knowledge.  I don't code Java too well - used
>> > to, but switched to Python a while back.
>> >
>> > gene
>> >
>> >
>> >
>> >
>> > On Wed, Sep 17, 2008 at 12:47 PM, Otis Gospodnetic
>> > wrote:
>> >> Gene,
>> >>
>> >> The latest patch from Bojan for SOLR-236 works with whatever revision of
>> >> Solr he used when he made the patch.
>> >>
>> >> I didn't follow this thread to know your original requirements, but
>> >> running 1+10 queries doesn't sound good to me from a scalability/performance
>> >> point of view.
>> >>
>> >> Otis
>> >> --
>> >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >>
>> >>
>> >>
>> >> ----- Original Message ----
>> >>> From: ristretto.rb
>> >>> To: solr-user@lucene.apache.org
>> >>> Sent: Tuesday, September 16, 2008 6:45:02 PM
>> >>> Subject: Re: Filtering results
>> >>>
>> >>> thanks.  very interesting.  The plot thickens.  And, yes, I think
>> >>> field collapsing is exactly what I'm after.
>> >>>
>> >>> I'm now considering trying this patch.  I have a Solr 1.2 instance
>> >>> on Jetty.  It looks like I need to install the patch.
>> >>> Does anyone use that patch?  Recommend it?  The wiki page
>> >>> (http://wiki.apache.org/solr/FieldCollapsing) says
>> >>> "This patch is not complete, but it will be useful to keep this page
>> >>> updated while the interface evolves."  And the page
>> >>> was last updated over a year ago, so I'm not sure if that is a good sign.
>> >>> I'm trying to read through all the comments now.
>> >>>
>> >>> .....  I'm also considering creating a second index of just the
>> >>> categories, which contains all the content from the main index
>> >>> collapsed down into the corresponding categories - basically a complete
>> >>> collapsed index.
>> >>> Initial searches will be done against this collapsed category index,
>> >>> and then the first 10 results
>> >>> will be used to do 10 field queries against the main index to get the
>> >>> "top" records to return with each Category.
>> >>>
>> >>> Haven't decided which path to take yet.
>> >>>
>> >>> cheers
>> >>> gene
>> >>>
>> >>>
>> >>> On Wed, Sep 17, 2008 at 9:42 AM, Chris Hostetter
>> >>> wrote:
>> >>> >
>> >>> > : 1.  Identify all records that would match search terms.  (Suppose I
>> >>> > : search for 'dog', and get 450,000 matches)
>> >>> > : 2.  Of those records, find the distinct list of groups over all the
>> >>> > : matches.  (Suppose there are 300.)
>> >>> > : 3.  Now get the top ranked record from each group, as if you search
>> >>> > : just for docs in the group.
>> >>> >
>> >>> > this sounds similar to "Field Collapsing" although i don't really
>> >>> > understand it or your specific use case enough to be certain that it's
>> >>> > the same thing.  You may find the patch, and/or the discussions about the
>> >>> > patch useful starting points...
>> >>> >
>> >>> > https://issues.apache.org/jira/browse/SOLR-236
>> >>> > http://wiki.apache.org/solr/FieldCollapsing
>> >>> >
>> >>> >
>> >>> > -Hoss
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>
>
