Re: Best solution to avoiding multiple query requests

2010-08-04 Thread kenf_nc

Not sure the processing would be any faster than just querying again, but in
your original result set, the first doc whose field value matches a top 10
facet value will also be the number 1 item if you fq on that facet value. So
you don't need to query for it again. You would only need to query for the
facet values that aren't in your result set.
i.e.:
   q=dog&facet=on&facet.field=foo
returns 10 docs
   id=1, foo=A
   id=2, foo=A
   id=3, foo=B
   id=4, foo=C
   id=5, foo=B
   id=6, foo=A
   id=7, foo=Z
   id=8, foo=T
   id=9, foo=B
   id=10, foo=J

If your facet results top 10 were (A, B, T, J, D, X, Q, O, P, I)
you already have the number 1 for A (id 1), B (id 3), T (id 8) and J (id 10)
in your very first query. You only need to query D, X, Q, O, P, I. 
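
The bookkeeping described above could look roughly like this. It is only a sketch: it assumes the docs arrive in relevance order, that foo is single-valued, and that the fq does not change scoring; the helper name first_hit_per_value is made up for illustration.

```python
def first_hit_per_value(docs, field):
    """Map each field value to the first (highest-ranked) doc that carries it."""
    hits = {}
    for doc in docs:  # docs are assumed to be in relevance order
        hits.setdefault(doc[field], doc)
    return hits


# The 10 docs and the facet top 10 from the example above.
docs = [{"id": i, "foo": v} for i, v in enumerate("AABCBAZTBJ", start=1)]
top_values = ["A", "B", "T", "J", "D", "X", "Q", "O", "P", "I"]

hits = first_hit_per_value(docs, "foo")
covered = {v: hits[v]["id"] for v in top_values if v in hits}
missing = [v for v in top_values if v not in hits]

print(covered)  # {'A': 1, 'B': 3, 'T': 8, 'J': 10}
print(missing)  # ['D', 'X', 'Q', 'O', 'P', 'I']
```

Only the values in missing would need a follow-up query.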

If your first query returned 100 docs instead of 10, you might have even more
of the top 10 represented. Again, the processing steps you would need may not
be any faster than re-querying; it depends on the speed of your index,
network, etc.

I would think that if your second query was
q=dog&fq=(foo:A OR foo:B OR foo:T...etc) then you have an even greater chance
of having the number 1 result for each of the top 10 in just your second
query.
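
Building that second request might be sketched as below. The helper name combined_filter_query and the rows=100 value are assumptions for illustration, not something from the thread; the fq uses the standard Lucene field:(a OR b) grouping syntax with quoted values.

```python
from urllib.parse import urlencode


def combined_filter_query(field, values):
    """Build one fq clause matching any of the values, e.g. foo:("A" OR "B")."""
    quoted = [
        '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for v in values
    ]
    return f"{field}:({' OR '.join(quoted)})"


values = ["A", "B", "T"]  # would be the top 10 facet values in practice
params = {"q": "dog", "fq": combined_filter_query("foo", values), "rows": 100}
query_string = urlencode(params)  # URL-encoded, ready to append to the select URL

print(params["fq"])  # foo:("A" OR "B" OR "T")
print(query_string)
```

The result set of this one request can then be scanned for the first doc per facet value, just as with the original query.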

  


Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
Field Collapsing (currently as patch) is exactly what you're looking for
imo.

http://wiki.apache.org/solr/FieldCollapsing

Geert-Jan


Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Ken Krugler

Hi Geert-Jan,

On Aug 4, 2010, at 5:30am, Geert-Jan Brits wrote:

Field Collapsing (currently as patch) is exactly what you're looking for
imo.

http://wiki.apache.org/solr/FieldCollapsing

Thanks for the ref, good stuff.

I think it's close, but if I understand this correctly, then I could get
(using just top two, versus top 10 for simplicity) results that looked like

dog training (faceted field value A)
super dog (faceted field value B)

but if the actual faceted field value/hit counts were:

C (10)
D (8)
A (2)
B (1)

Then what I'd want is the top hit for dog AND facet field:C, followed by
the top hit for dog AND facet field:D.

Using field collapsing would improve the probability that if I asked for the
top 100 hits, I'd find entries for each of my top N faceted field values.
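
To make the mismatch concrete, here is a small sketch using the numbers above. The variable names are made up for illustration; collapsed stands for the relevance-ordered output of field collapsing.

```python
# Facet counts and collapsed results from the example above.
facet_counts = {"C": 10, "D": 8, "A": 2, "B": 1}
collapsed = [("dog training", "A"), ("super dog", "B")]  # relevance order

top_n = 2
# Facet values I actually care about: the top N by hit count.
wanted = sorted(facet_counts, key=facet_counts.get, reverse=True)[:top_n]

# Look each wanted value up among the collapsed results.
by_value = {value: title for title, value in collapsed}
result = [(value, by_value.get(value)) for value in wanted]

print(result)  # [('C', None), ('D', None)]
```

Neither C nor D shows up, so the top two collapsed results alone don't answer the question.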


Thanks again,

-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
If I understand correctly: you want to sort your collapsed results by 'nr of
collapsed results' / hits.

It seems this can't be done out of the box using this patch. (I'm not
entirely sure; at least it doesn't follow from the wiki page. Perhaps it's
best to check the JIRA issues to make sure this isn't already available but
just not yet documented on the wiki.)
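
If the sort has to happen on the client, a rough sketch could look like this. It assumes a response dict shaped like the grouped JSON output of the result grouping feature in later Solr releases (grouped, groups, groupValue, doclist.numFound); the patch discussed here may return something different, and the sample data (including the titles for C and D) is made up. Only groups that the query actually returned can be re-sorted this way.

```python
def groups_by_hit_count(response, field):
    """Return (value, hit count, top doc) per group, largest group first."""
    groups = response["grouped"][field]["groups"]
    ranked = sorted(groups, key=lambda g: g["doclist"]["numFound"], reverse=True)
    return [
        (g["groupValue"], g["doclist"]["numFound"], g["doclist"]["docs"][0])
        for g in ranked
    ]


# Hypothetical response: groups arrive in relevance order of their top doc.
response = {"grouped": {"foo": {"matches": 21, "groups": [
    {"groupValue": "A", "doclist": {"numFound": 2, "docs": [{"title": "dog training"}]}},
    {"groupValue": "B", "doclist": {"numFound": 1, "docs": [{"title": "super dog"}]}},
    {"groupValue": "C", "doclist": {"numFound": 10, "docs": [{"title": "hypothetical C hit"}]}},
    {"groupValue": "D", "doclist": {"numFound": 8, "docs": [{"title": "hypothetical D hit"}]}},
]}}}

ranked = groups_by_hit_count(response, "foo")
print([value for value, count, doc in ranked])  # ['C', 'D', 'A', 'B']
```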

Also, I found a blog post (from the patch creator, afaik) where someone in
the comments has the same issue, plus some pointers:
http://blog.jteam.nl/2009/10/20/result-grouping-field-collapsing-with-solr/

hope that helps,
Geert-jan

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Ken Krugler

Hi Geert-jan,

On Aug 4, 2010, at 12:04pm, Geert-Jan Brits wrote:

If I understand correctly: you want to sort your collapsed results by 'nr of
collapsed results'/ hits.

It seems this can't be done out-of-the-box using this patch (I'm not
entirely sure, at least it doesn't follow from the wiki-page. Perhaps best
is to check the jira-issues to make sure this isn't already available now,
but just not updated on the wiki)

Also I found a blogpost (from the patch creator afaik) with in the comments
someone with the same issue + some pointers.
http://blog.jteam.nl/2009/10/20/result-grouping-field-collapsing-with-solr/


Yup, that's the one:
http://blog.jteam.nl/2009/10/20/result-grouping-field-collapsing-with-solr/comment-page-1/#comment-1249

So with some modifications to that patch, it could work. Thanks for the
info!


-- Ken



Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Best solution to avoiding multiple query requests

2010-08-03 Thread Ken Krugler

Hi all,

I've got a situation where the key result from an initial search request
(let's say for dog) is the list of values from a faceted field, sorted by
hit count.

For the top 10 of these faceted field values, I need to get the top hit for
the target request (dog) restricted to that value for the faceted field.

Currently this is 11 total requests, of which the 10 requests following the
initial query can be made in parallel. But that's still a lot of requests.
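
For reference, the current fan-out looks roughly like the sketch below. The search callable is a hypothetical stand-in for whatever client issues the Solr request and returns a list of docs; fake_search exists only to make the example runnable.

```python
from concurrent.futures import ThreadPoolExecutor


def top_hit_per_value(search, query, field, values, max_workers=10):
    """Run one filtered search per facet value in parallel; keep each top hit."""
    def top_hit(value):
        docs = search(q=query, fq=f'{field}:"{value}"', rows=1)
        return value, (docs[0] if docs else None)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of values.
        return dict(pool.map(top_hit, values))


def fake_search(q, fq, rows):
    """Stand-in for a real Solr call: filters a tiny in-memory 'index'."""
    index = [{"id": 1, "foo": "A"}, {"id": 3, "foo": "B"}]
    return [d for d in index if fq == f'foo:"{d["foo"]}"'][:rows]


top_hits = top_hit_per_value(fake_search, "dog", "foo", ["A", "B", "D"])
print(top_hits)
# {'A': {'id': 1, 'foo': 'A'}, 'B': {'id': 3, 'foo': 'B'}, 'D': None}
```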


So my questions are:

1. Is there any magic query to handle this with Solr as-is?

2. If not, is the best solution to create my own request handler?

3. And in that case, any input/tips on developing this type of custom
request handler?


Thanks,

-- Ken



Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g