Re: Showing facet of first N docs

2011-06-20 Thread Toke Eskildsen
On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
 Do you know if it is possible to show the facets for a particular field
 related only to the first N docs of the total number of results?

It collides with the inner workings of Solr, as faceting does not process
the doc-IDs from the matching documents in result order. It also uses
all the hits, but that could be hacked.

What is N? If it is a fairly low number (hundreds) and your documents
are indexed with a unique ID, you can extract the IDs and perform a
facet-request with the ORed IDs as query.
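
A minimal sketch of this two-step approach, assuming a unique key field
named `id`, a facet field named `lemmas`, and a core at a hypothetical URL.
Only standard request parameters (`q`, `fl`, `rows`, `facet`, `facet.field`,
`facet.mincount`) are used; the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust to your own Solr core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def top_ids_url(user_query, n):
    """First request: fetch only the unique IDs of the top-n hits."""
    params = [("q", user_query), ("fl", "id"), ("rows", str(n)), ("wt", "json")]
    return SOLR_SELECT + "?" + urlencode(params)


def escape_id(doc_id):
    """Quote an ID as a phrase so special query characters are harmless."""
    return '"' + doc_id.replace("\\", "\\\\").replace('"', '\\"') + '"'


def facet_on_ids_url(ids, facet_field):
    """Second request: facet over exactly the given documents."""
    if not ids:
        raise ValueError("need at least one ID")
    id_query = "id:(" + " OR ".join(escape_id(i) for i in ids) + ")"
    params = [
        ("q", id_query),
        ("rows", "0"),              # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", "1"),    # hide terms absent from these docs
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


print(top_ids_url("lemmas:xyz", 50))
print(facet_on_ids_url(["doc1", "doc2"], "lemmas"))
```

The ORed query grows with N, which is why this only suits a low N: to my
knowledge Solr caps boolean queries via `maxBooleanClauses` (1024 in stock
configurations).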


I am a bit curious about what you're trying to achieve here.
Conventionally, faceting provides an overview of all data, often
prioritized by occurrence count. While I understand the idea of trying
to use weights to prioritize, limiting the faceting to a subset of the
result set seems very much like a standard ranked document search.



Re: Showing facet of first N docs

2011-06-20 Thread Tommaso Teofili
2011/6/18 Dmitry Kan dmitry@gmail.com

 Do you mean you would like to boost the facets that contain most of the
 lemmas?


That would be good, but I'd prefer getting facets from, for example, only the
first 50 of 500 docs.


 What is the user query in this case and, if possible, what is the use case
 (maybe some other solution exists for what you are trying to achieve)?


The use case is to help the user refine a query with the most relevant
facets, which in theory come from the most relevant documents.
With 500 results sorted by score (desc), the facet counts would also come
from the documents ranked 490 to 500, which contain less relevant
information.


2011/6/18 lee carroll lee.a.carr...@googlemail.com

 Hi Tommaso

 I don't think you can achieve what you want using vanilla Solr.
 Facet counts are computed over the whole matching result set, not over
 the top n results.

 However, what is your use case? Assuming it's for faceted navigation,
 showing facets for the top n results could be confusing to your users.
 The next incremental filter applied by the user would change the
 relevancy focus and produce another set of top-n facet counts from a
 document set unrelated to the last result set. This could be a very
 bad user experience, producing fluctuating facet counts (i.e. a filter
 narrowing the search could produce an increase in a facet term count,
 which is very odd). The result set could also change strangely, with
 docs floating in and out of the result list.


Right :-) Thanks for pointing this out.



 Relevancy seems to be the answer here: if your docs are scored
 correctly, then counting all docs in the result set for the facet
 counts is correct. Do you need to improve relevancy?


I have quite good relevance, obtained after playing a bit with dismax and
bq.
I think the problem is just in how the facets are being used; a
customized SpellChecker sounds like the right component to provide smart
suggestions.


2011/6/20 Toke Eskildsen t...@statsbiblioteket.dk

 On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
  Do you know if it is possible to show the facets for a particular field
  related only to the first N docs of the total number of results?

 It collides with the inner workings of Solr, as faceting does not process
 the doc-IDs from the matching documents in result order. It also uses
 all the hits, but that could be hacked.

 What is N? If it is a fairly low number (hundreds) and your documents
 are indexed with a unique ID, you can extract the IDs and perform a
 facet-request with the ORed IDs as query.


 I am a bit curious about what you're trying to achieve here.
 Conventionally, faceting provides an overview of all data, often
 prioritized by occurrence count. While I understand the idea of trying
 to use weights to prioritize, limiting the faceting to a subset of the
 result set seems very much like a standard ranked document search.


My use case (that is, my customer's) sounds like a mixed one. As I said, I
suspect it would be interesting to try mixing the spellcheck results with
facets, using the spellcheck suggestions as facet queries.
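
A rough sketch of that idea, assuming the suggestions have already been
read from a spellcheck response and that they should be counted against a
field named `lemmas`. `facet.query` is a standard, repeatable Solr
parameter; the helper name and the sample terms are hypothetical.

```python
from urllib.parse import urlencode


def suggestion_facet_params(user_query, suggestions, field):
    """Ask for one facet.query count per suggested term."""
    params = [("q", user_query), ("rows", "0"), ("facet", "true")]
    for term in suggestions:
        # Quote the term so multi-word suggestions stay one phrase.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("facet.query", '%s:"%s"' % (field, escaped)))
    return params


# Sample suggestions, made up for illustration.
params = suggestion_facet_params("solr", ["lucene", "apache solr"], "lemmas")
print(urlencode(params))
```

Each suggestion then comes back with the number of matching documents in
the current result set, which could be used to rank or drop suggestions.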

Thanks all for your responses; they were very useful for understanding how
to approach my use case.
Regards,
Tommaso


Re: Showing facet of first N docs

2011-06-18 Thread Dmitry Kan
Do you mean you would like to boost the facets that contain most of the
lemmas?
What is the user query in this case and, if possible, what is the use case
(maybe some other solution exists for what you are trying to achieve)?

On Thu, Jun 16, 2011 at 5:23 PM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:

 Thanks Dmitry, but maybe I didn't explain myself correctly. I am not sure
 facet.offset is the right solution, as I'd like to filter facets rather
 than page through them.
 I'll try to explain better with an example.
 Imagine I make a query and the first 2 docs in the results both have 'xyz'
 and 'abc' as values for the field 'lemmas', while other docs in the results
 also have 'xyz' or 'abc' as values of the field 'lemmas'. I would then like
 to show facets coming from only the first 2 docs in the results, thus having:
 <lst name="lemmas">
   <str name="xyz">2</str>
   <str name="abc">2</str>
 </lst>
 You can think of this as a 'give me only the facets related to the most
 relevant docs in the results' feature.
 Any idea on how to do that?
 Tommaso


-- 
Regards,

Dmitry Kan


Re: Showing facet of first N docs

2011-06-18 Thread lee carroll
Hi Tommaso,

I don't think you can achieve what you want using vanilla Solr.
Facet counts are computed over the whole matching result set, not over
the top n results.

However, what is your use case? Assuming it's for faceted navigation,
showing facets for the top n results could be confusing to your users.
The next incremental filter applied by the user would change the
relevancy focus and produce another set of top-n facet counts from a
document set unrelated to the last result set. This could be a very
bad user experience, producing fluctuating facet counts (i.e. a filter
narrowing the search could produce an increase in a facet term count,
which is very odd). The result set could also change strangely, with
docs floating in and out of the result list.

Relevancy seems to be the answer here: if your docs are scored
correctly, then counting all docs in the result set for the facet
counts is correct. Do you need to improve relevancy?
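
The fluctuation can be reproduced on a toy data set. The sketch below is
plain Python with no Solr involved; the documents, the `type` filter field
and the window size of 2 are all made up for illustration.

```python
from collections import Counter

# Toy index, listed in descending score order; all names are made up.
DOCS = [
    {"id": 1, "type": "A", "lemmas": ["xyz"]},
    {"id": 2, "type": "A", "lemmas": ["xyz"]},
    {"id": 3, "type": "B", "lemmas": ["abc"]},
    {"id": 4, "type": "B", "lemmas": ["abc"]},
    {"id": 5, "type": "B", "lemmas": ["abc", "xyz"]},
]


def facet_counts(docs, n=None):
    """Facet on `lemmas` over all docs, or over the first n if n is given."""
    window = docs if n is None else docs[:n]
    return Counter(v for d in window for v in d["lemmas"])


filtered = [d for d in DOCS if d["type"] == "B"]   # user applies a filter

top2_before = facet_counts(DOCS, n=2)
top2_after = facet_counts(filtered, n=2)
all_before = facet_counts(DOCS)
all_after = facet_counts(filtered)

# Top-2 faceting: 'abc' rises although the search got narrower.
print(top2_before["abc"], "->", top2_after["abc"])   # 0 -> 2
# Full faceting: a filter can only keep or lower a count.
print(all_before["abc"], "->", all_after["abc"])     # 3 -> 3
```

With faceting over the whole result set, a narrowing filter never raises a
count; with a top-n window it can, because the window slides onto different
documents.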




On 18 June 2011 08:23, Dmitry Kan dmitry@gmail.com wrote:
 Do you mean you would like to boost the facets that contain most of the
 lemmas?
 What is the user query in this case and, if possible, what is the use case
 (maybe some other solution exists for what you are trying to achieve)?




Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
Hi all,
Do you know if it is possible to show the facets for a particular field
related only to the first N docs of the total number of results?
It seems facet.limit doesn't help with it as it defines a window in the
facet constraints returned.
Thanks in advance,
Tommaso


Re: Showing facet of first N docs

2011-06-16 Thread Dmitry Kan
http://wiki.apache.org/solr/SimpleFacetParameters
facet.offset

This param indicates an offset into the list of constraints to allow paging.

The default value is 0.

This parameter can be specified on a per field basis.


Dmitry
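
For reference, a small sketch of how `facet.offset` combines with
`facet.limit` to page through the constraints of one field. Both are
standard parameters from the wiki page above; the helper name, query and
field are hypothetical. The counts are still computed over all matching
documents, only the window of returned constraints moves.

```python
from urllib.parse import urlencode


def facet_page_params(query, field, page, page_size):
    """Parameters for one page of facet constraints on `field`."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size > 0")
    return [
        ("q", query),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(page_size)),
        ("facet.offset", str(page * page_size)),
    ]


# Third page (zero-based page 2) of ten constraints each.
print(urlencode(facet_page_params("solr", "lemmas", page=2, page_size=10)))
```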


On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:

 Hi all,
 Do you know if it is possible to show the facets for a particular field
 related only to the first N docs of the total number of results?
 It seems facet.limit doesn't help with it as it defines a window in the
 facet constraints returned.
 Thanks in advance,
 Tommaso




-- 
Regards,

Dmitry Kan


Re: Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
Thanks Dmitry, but maybe I didn't explain myself correctly. I am not sure
facet.offset is the right solution, as I'd like to filter facets rather
than page through them.
I'll try to explain better with an example.
Imagine I make a query and the first 2 docs in the results both have 'xyz'
and 'abc' as values for the field 'lemmas', while other docs in the results
also have 'xyz' or 'abc' as values of the field 'lemmas'. I would then like
to show facets coming from only the first 2 docs in the results, thus having:
<lst name="lemmas">
  <str name="xyz">2</str>
  <str name="abc">2</str>
</lst>
You can think of this as a 'give me only the facets related to the most
relevant docs in the results' feature.
Any idea on how to do that?
Tommaso
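
The desired output can be sketched on the client side, assuming the
`lemmas` field is stored and returned with each hit. This is plain Python
over an already fetched, score-sorted result list, not a Solr feature; the
sample documents are invented to mirror the example above.

```python
from collections import Counter


def facets_from_top_docs(docs, field, n):
    """Count the values of `field` over the first n docs only."""
    counts = Counter()
    for doc in docs[:n]:
        values = doc.get(field, [])
        if isinstance(values, str):   # single-valued field
            values = [values]
        counts.update(set(values))    # count each value once per document
    return counts


# Toy result list, already sorted by score (descending).
results = [
    {"id": "1", "lemmas": ["xyz", "abc"]},
    {"id": "2", "lemmas": ["xyz", "abc"]},
    {"id": "3", "lemmas": ["xyz"]},
    {"id": "4", "lemmas": ["abc", "def"]},
]

top2 = facets_from_top_docs(results, "lemmas", 2)
print(top2["xyz"], top2["abc"], top2["def"])   # 2 2 0
```

This costs nothing extra when N is no larger than the page already being
fetched, but it needs the field values in the response.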


2011/6/16 Dmitry Kan dmitry@gmail.com

 http://wiki.apache.org/solr/SimpleFacetParameters
 facet.offset

 This param indicates an offset into the list of constraints to allow
 paging.

 The default value is 0.

 This parameter can be specified on a per field basis.


 Dmitry





Re: Showing facet of first N docs

2011-06-16 Thread karsten-solr
Hi Tommaso,

the FacetComponent works with DocListAndSet#docSet.
It should be easy to switch to DocListAndSet#docList, which contains all
the documents of the result list (by default the top 10, but it could also
be documents 15-25 if start=15 and rows=11). However, that means changing
the source code.

Instead of changing the source code, the easier way should be to send a
second request with a relevance filter (if your sort criterion is relevance):
 http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html

Best regards
  Karsten
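
One way such a second request might look is sketched below. I am assuming
the linked thread describes something along these lines; the `{!frange}`
parser and the `query()` function are standard Solr features, while the
helper name, field and threshold are illustrative only.

```python
from urllib.parse import urlencode


def relevance_filtered_facet_params(query, field, min_score):
    """Facet only over hits whose score for `query` is at least min_score."""
    return [
        ("q", query),
        # frange keeps documents whose function value is >= l;
        # query($q) evaluates to the score of the query given in q.
        ("fq", "{!frange l=%s}query($q)" % min_score),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),
    ]


print(urlencode(relevance_filtered_facet_params("lemmas:xyz", "lemmas", 0.5)))
```

Scores are not normalized across queries, so a fixed threshold is only a
rough stand-in for "the first N docs". The nested query is also parsed by
its own query parser, so its scores may differ from those of a dismax main
query.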

http://lucene.472066.n3.nabble.com/Showing-facet-of-first-N-docs-td3071395.html
 Original message
 Date: Thu, 16 Jun 2011 12:39:32 +0200
 From: Tommaso Teofili tommaso.teof...@gmail.com
 To: solr-user@lucene.apache.org
 Subject: Showing facet of first N docs

 Hi all,
 Do you know if it is possible to show the facets for a particular field
 related only to the first N docs of the total number of results?
 It seems facet.limit doesn't help with it as it defines a window in the
 facet constraints returned.
 Thanks in advance,
 Tommaso