Re: Showing facet of first N docs
On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results?

This collides with the inner workings of Solr, as faceting does not process the doc IDs of the matching documents in result order. It also uses all the hits, but that could be hacked.

What is N? If it is a fairly low number (hundreds) and your documents are indexed with a unique ID, you can extract the IDs and perform a facet request with the ORed IDs as the query.

I am a bit curious about what you're trying to achieve here. Conventionally, faceting provides an overview of all data, often prioritized by occurrence count. While I understand the idea of trying to use weights to prioritize, limiting the faceting to a subset of the result set seems very much like a standard ranked document search.
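The two-request approach described above could be sketched roughly as follows. This is only an illustration that builds the request parameters; the unique key name `id`, the facet field `lemmas`, and both helper functions are assumptions for the example, not part of the original suggestion. Nothing here talks to a real server.

```python
from urllib.parse import urlencode


def top_n_params(user_query, n, id_field="id"):
    """Request 1: fetch only the unique IDs of the N best-scoring documents."""
    return {"q": user_query, "rows": n, "fl": id_field, "sort": "score desc"}


def facet_params_for_ids(doc_ids, facet_field, id_field="id"):
    """Request 2: facet over exactly those documents by ORing their IDs."""
    # Quote each ID so characters such as ':' or spaces do not break the query.
    quoted = [
        '"{}"'.format(str(i).replace("\\", "\\\\").replace('"', '\\"'))
        for i in doc_ids
    ]
    return {
        "q": "{}:({})".format(id_field, " OR ".join(quoted)),
        "rows": 0,  # only the facet counts are needed, not the documents
        "facet": "true",
        "facet.field": facet_field,
        "facet.mincount": 1,
    }


# Hypothetical IDs, as they might come back from request 1.
ids = ["doc7", "doc3"]
params = facet_params_for_ids(ids, "lemmas")
query_string = urlencode(params)
```

The "fairly low number" caveat matters in practice: the ORed query grows with N, and Solr's `maxBooleanClauses` setting (1024 by default) caps the number of clauses in a boolean query.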
Re: Showing facet of first N docs
2011/6/18 Dmitry Kan dmitry@gmail.com
> Do you mean you would like to boost the facets that contain the most of the lemmas?

That would be good, but I'd prefer getting facets from, for example, only the first 50 of 500 docs.

> What is the user query in this case and, if possible, what is the use case (maybe some other solution exists for what you are trying to achieve)?

The use case is to help the user refine a query with the most relevant facets, which in theory come from the most relevant documents. With 500 results sorted by score (desc), the facet counts would also come from the documents ranked 490 to 500, which contain less relevant information.

2011/6/18 lee carroll lee.a.carr...@googlemail.com
> Hi Tommaso
>
> I don't think you can achieve what you want using vanilla Solr. Facet counts are computed for the whole matching result set, not for the top n matching results.
>
> However, what is your use case? Assuming it's for faceted navigation, showing facets for the top n results could be confusing to your users. The next incremental filter applied by the user would change the relevancy focus and produce another set of top n facet counts, from a document set unrelated to the last result set. This could be a very bad user experience, producing fluctuating facet counts (i.e. a filter narrowing the search could produce an increase in a facet term count, which is very odd). The result set could also change strangely, with docs floating in and out of the result list.

Right :-) Thanks for pointing this out.

> Relevancy seems to be the answer here: if your docs are scored correctly, then counting all docs in the result set for the facet counts is correct. Do you need to improve relevancy?

I have quite good relevance, obtained after playing a bit with dismax and bq. I think the problem is just in how the facets are being used; a customized SpellChecker sounds like the right component to provide smart suggestions.

2011/6/20 Toke Eskildsen t...@statsbiblioteket.dk
> On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
>> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results?
>
> This collides with the inner workings of Solr, as faceting does not process the doc IDs of the matching documents in result order. It also uses all the hits, but that could be hacked.
>
> What is N? If it is a fairly low number (hundreds) and your documents are indexed with a unique ID, you can extract the IDs and perform a facet request with the ORed IDs as the query.
>
> I am a bit curious about what you're trying to achieve here. Conventionally, faceting provides an overview of all data, often prioritized by occurrence count. While I understand the idea of trying to use weights to prioritize, limiting the faceting to a subset of the result set seems very much like a standard ranked document search.

My use case (that is, my customer's) sounds like a mixed one. As I said, I suspect an interesting thing to try would be mixing the spellcheck results with facets, using the spellcheck suggestions as facet queries.

Thanks all for your responses; they were very useful for understanding how to approach my use case.

Regards,
Tommaso
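The idea of using spellcheck suggestions as facet queries might look something like the sketch below. It assumes the suggestions have already been extracted from a spellcheck response into a plain list of terms; the helper name, the field `lemmas`, and the sample terms are hypothetical. Each suggestion becomes one `facet.query` parameter, so Solr would return one count per suggested term.

```python
from urllib.parse import urlencode


def facet_queries_from_suggestions(user_query, suggestions, field):
    """Build request parameters with one facet.query per suggested term."""
    # A list of pairs is used because facet.query is repeated several times.
    params = [("q", user_query), ("rows", "0"), ("facet", "true")]
    for term in suggestions:
        params.append(("facet.query", '{}:"{}"'.format(field, term)))
    return params


# Hypothetical suggestions, as if taken from a spellcheck response.
params = facet_queries_from_suggestions("solr", ["xyz", "abc"], "lemmas")
query_string = urlencode(params)
```

The counts returned for these facet queries would still be computed over the whole result set; only the choice of which terms to show is driven by the suggestions.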
Re: Showing facet of first N docs
Do you mean you would like to boost the facets that contain the most of the lemmas?

What is the user query in this case and, if possible, what is the use case (maybe some other solution exists for what you are trying to achieve)?

On Thu, Jun 16, 2011 at 5:23 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
> Thanks Dmitry, but maybe I didn't explain correctly, as I am not sure facet.offset is the right solution. I'd like not to page but to filter facets. I'll try to explain better with an example.
>
> Imagine I make a query and the first 2 docs in the results have both 'xyz' and 'abc' as values for the field 'lemmas', while other docs in the results also have 'xyz' or 'abc' as values of 'lemmas'. Then I would like to show facets coming from only the first 2 docs in the results, thus having:
>
> <lst name="lemmas">
>   <str name="xyz">2</str>
>   <str name="abc">2</str>
> </lst>
>
> You can imagine this as a 'give me only facets related to the most relevant docs in the results' functionality. Any idea on how to do that?
>
> Tommaso
>
> 2011/6/16 Dmitry Kan dmitry@gmail.com
>> http://wiki.apache.org/solr/SimpleFacetParameters
>>
>> facet.offset
>> This param indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified on a per field basis.
>>
>> Dmitry
>>
>> On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
>>> Hi all,
>>> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window in the facet constraints returned.
>>> Thanks in advance,
>>> Tommaso
>>
>> --
>> Regards,
>> Dmitry Kan

--
Regards,
Dmitry Kan
Re: Showing facet of first N docs
Hi Tommaso

I don't think you can achieve what you want using vanilla Solr. Facet counts are computed for the whole matching result set, not for the top n matching results.

However, what is your use case? Assuming it's for faceted navigation, showing facets for the top n results could be confusing to your users. The next incremental filter applied by the user would change the relevancy focus and produce another set of top n facet counts, from a document set unrelated to the last result set. This could be a very bad user experience, producing fluctuating facet counts (i.e. a filter narrowing the search could produce an increase in a facet term count, which is very odd). The result set could also change strangely, with docs floating in and out of the result list.

Relevancy seems to be the answer here: if your docs are scored correctly, then counting all docs in the result set for the facet counts is correct. Do you need to improve relevancy?

On 18 June 2011 08:23, Dmitry Kan dmitry@gmail.com wrote:
> Do you mean you would like to boost the facets that contain the most of the lemmas?
>
> What is the user query in this case and, if possible, what is the use case (maybe some other solution exists for what you are trying to achieve)?
>
> On Thu, Jun 16, 2011 at 5:23 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
>> Thanks Dmitry, but maybe I didn't explain correctly, as I am not sure facet.offset is the right solution. I'd like not to page but to filter facets. I'll try to explain better with an example.
>>
>> Imagine I make a query and the first 2 docs in the results have both 'xyz' and 'abc' as values for the field 'lemmas', while other docs in the results also have 'xyz' or 'abc' as values of 'lemmas'. Then I would like to show facets coming from only the first 2 docs in the results, thus having:
>>
>> <lst name="lemmas">
>>   <str name="xyz">2</str>
>>   <str name="abc">2</str>
>> </lst>
>>
>> You can imagine this as a 'give me only facets related to the most relevant docs in the results' functionality. Any idea on how to do that?
>>
>> Tommaso
>>
>> 2011/6/16 Dmitry Kan dmitry@gmail.com
>>> http://wiki.apache.org/solr/SimpleFacetParameters
>>>
>>> facet.offset
>>> This param indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified on a per field basis.
>>>
>>> Dmitry
>>>
>>> On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
>>>> Hi all,
>>>> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window in the facet constraints returned.
>>>> Thanks in advance,
>>>> Tommaso
>>>
>>> --
>>> Regards,
>>> Dmitry Kan
>
> --
> Regards,
> Dmitry Kan
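The fluctuating-count concern raised in this message can be seen with a toy example. The four documents below are invented purely for illustration and are listed in assumed score order; `facet_counts` is a hypothetical helper that counts either over every matching document or over the top n only.

```python
from collections import Counter

# Invented documents, already sorted by descending score.
RANKED = [
    {"id": "d1", "type": "article", "lemmas": ["abc"]},
    {"id": "d2", "type": "book", "lemmas": ["xyz"]},
    {"id": "d3", "type": "book", "lemmas": ["xyz"]},
    {"id": "d4", "type": "book", "lemmas": ["abc"]},
]


def facet_counts(docs, field="lemmas", top_n=None):
    """Count field values over all docs, or over the first top_n docs only."""
    window = docs if top_n is None else docs[:top_n]
    counts = Counter()
    for doc in window:
        counts.update(set(doc[field]))  # each doc counts once per value
    return counts


# The user narrows the search with a filter on type=book.
books = [d for d in RANKED if d["type"] == "book"]

before = facet_counts(RANKED, top_n=2)  # top 2 of the unfiltered results
after = facet_counts(books, top_n=2)    # top 2 after the narrowing filter

full_before = facet_counts(RANKED)      # conventional faceting, all hits
full_after = facet_counts(books)
```

With top-2 faceting, the count for 'xyz' goes from 1 to 2 even though the filter made the result set smaller. With conventional faceting over all hits, no count can grow when a filter is added: 'xyz' stays at 2 and 'abc' drops from 2 to 1.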
Showing facet of first N docs
Hi all, Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it as it defines a window in the facet constraints returned. Thanks in advance, Tommaso
Re: Showing facet of first N docs
http://wiki.apache.org/solr/SimpleFacetParameters

facet.offset
This param indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified on a per field basis.

Dmitry

On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
> Hi all,
> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window in the facet constraints returned.
> Thanks in advance,
> Tommaso

--
Regards,
Dmitry Kan
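To make the distinction concrete: facet.offset and facet.limit select a window over the list of constraints (the facet values), not over the documents. The following is a small simulation, not Solr code; the counts are made up, and sorting by descending count with ties broken alphabetically is an assumption for the example.

```python
def page_constraints(counts, offset=0, limit=100):
    """Return one page of (value, count) pairs, highest counts first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[offset:offset + limit]


# Made-up counts, computed over ALL matching documents.
counts = {"xyz": 40, "abc": 25, "foo": 3}

first_page = page_constraints(counts, offset=0, limit=2)
second_page = page_constraints(counts, offset=2, limit=2)
```

Whatever page is requested, each count still reflects every matching document, which is why neither parameter restricts faceting to the first N docs.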
Re: Showing facet of first N docs
Thanks Dmitry, but maybe I didn't explain correctly, as I am not sure facet.offset is the right solution. I'd like not to page but to filter facets. I'll try to explain better with an example.

Imagine I make a query and the first 2 docs in the results have both 'xyz' and 'abc' as values for the field 'lemmas', while other docs in the results also have 'xyz' or 'abc' as values of 'lemmas'. Then I would like to show facets coming from only the first 2 docs in the results, thus having:

<lst name="lemmas">
  <str name="xyz">2</str>
  <str name="abc">2</str>
</lst>

You can imagine this as a 'give me only facets related to the most relevant docs in the results' functionality. Any idea on how to do that?

Tommaso

2011/6/16 Dmitry Kan dmitry@gmail.com
> http://wiki.apache.org/solr/SimpleFacetParameters
>
> facet.offset
> This param indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified on a per field basis.
>
> Dmitry
>
> On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
>> Hi all,
>> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window in the facet constraints returned.
>> Thanks in advance,
>> Tommaso
>
> --
> Regards,
> Dmitry Kan
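For small N, the result described in this example can also be computed on the client from the returned rows, without any server-side faceting. The sketch below assumes a response already parsed from Solr's JSON output (wt=json) and assumes that 'lemmas' is a stored field included in the returned documents; the function name and the sample data are hypothetical.

```python
from collections import Counter


def facets_from_top_docs(solr_response, field, n):
    """Count the values of a stored field over the first n returned docs."""
    docs = solr_response["response"]["docs"][:n]
    counts = Counter()
    for doc in docs:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]  # single-valued fields come back as scalars
        counts.update(set(values))
    return counts


# Sample data mirroring the example: the first 2 docs have both values.
response = {
    "response": {
        "numFound": 4,
        "start": 0,
        "docs": [
            {"id": "1", "lemmas": ["xyz", "abc"]},
            {"id": "2", "lemmas": ["xyz", "abc"]},
            {"id": "3", "lemmas": ["xyz"]},
            {"id": "4", "lemmas": ["abc"]},
        ],
    }
}

top2 = facets_from_top_docs(response, "lemmas", 2)
```

Here `top2` holds a count of 2 for both 'xyz' and 'abc', matching the desired output above. This only works when the field is stored and N rows are actually fetched.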
Re: Showing facet of first N docs
Hi Tommaso,

the FacetComponent works with DocListAndSet#docSet. It should be easy to switch to DocListAndSet#docList, which contains only the documents for the result list (by default the top 10, but possibly documents 15-25 if start=15 and rows=11). That would mean changing the source code.

Instead of changing the source code, the easier way should be to send a second request with a relevance filter (if your sort criterion is relevance):
http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html

Best regards
Karsten

http://lucene.472066.n3.nabble.com/Showing-facet-of-first-N-docs-td3071395.html

Original Message
Date: Thu, 16 Jun 2011 12:39:32 +0200
From: Tommaso Teofili tommaso.teof...@gmail.com
To: solr-user@lucene.apache.org
Subject: Showing facet of first N docs

> Hi all,
> Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window in the facet constraints returned.
> Thanks in advance,
> Tommaso
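Two points from this message can be illustrated without touching Solr's source. First, the window of rank positions that the docList covers for a given start and rows. Second, one possible shape of the "second request with a relevance filter": the sketch assumes the filter is expressed with Solr's function range parser over the score of the main query (`{!frange}` with `query($q)`), which may or may not be what the linked thread proposes. The helper names and the threshold are hypothetical, and since scores are not normalized, a fixed threshold would be query-dependent.

```python
def doclist_window(start=0, rows=10):
    """Zero-based rank positions covered by the result list (the docList)."""
    return range(start, start + rows)


def relevance_filtered_facet_params(user_query, min_score, facet_field):
    """Second request: facet only over docs scoring at least min_score."""
    return {
        "q": user_query,
        # Assumption: keep only docs whose score for $q is >= min_score.
        "fq": "{{!frange l={}}}query($q)".format(min_score),
        "rows": 0,
        "facet": "true",
        "facet.field": facet_field,
    }


window = list(doclist_window(start=15, rows=11))
params = relevance_filtered_facet_params("solr", 0.5, "lemmas")
```

With start=15 and rows=11 the window runs from position 15 through 25, as in the message; with the defaults it is the top 10.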