Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery

2018-08-24 Thread Mikhail Khludnev
There might be something like fq=filter(foo:[2 TO 3]) OR filter(foo:[3 TO
100])


-- 
Sincerely yours
Mikhail Khludnev


Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery

2018-08-24 Thread Shawn Heisey

On 8/24/2018 5:23 AM, zhenyuan wei wrote:
> I am confused about how to hit the filterCache.
>
> Suppose a filter query for the range [3 TO 100] is not yet in the filterCache,
> but the filterCache already holds an entry for the range [2 TO 100].
>
> My question is: will the filter query for [3 TO 100] fetch its DocSet
> from the cached entry for [2 TO 100]?


Each entry in the filterCache uses the query as its key.  So for the 
first one, the key will be something like "field:[3 TO 100]" or whatever 
your fq parameter value was.  When the second one is executed, it will 
have a different key, so it will not be found in the cache.  Once it 
executes, it will be added to the cache as an additional entry.
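
As a rough illustration of the behaviour described above, here is a minimal Python sketch of a cache keyed by the filter query. The `FilterCache` class, the `run_range_query` helper and the sample index are all invented for this example; none of them are Solr APIs, and the key format is only an approximation of what Solr stores.

```python
# Toy model of a filter cache keyed by the query (not Solr code).
# `index` maps a hypothetical doc id to the value of its `field`.
index = {1: 2, 2: 3, 3: 50, 4: 100, 5: 101}


def run_range_query(low, high):
    """Scan the toy index for docs with low <= value <= high."""
    return frozenset(doc for doc, value in index.items() if low <= value <= high)


class FilterCache:
    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def get_doc_set(self, low, high):
        key = f"field:[{low} TO {high}]"   # the query itself is the key
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = run_range_query(low, high)
        return self.entries[key]


cache = FilterCache()
cache.get_doc_set(2, 100)   # miss: computed and stored
cache.get_doc_set(3, 100)   # miss again: different key, stored as a second entry
cache.get_doc_set(3, 100)   # hit: same key as the previous call
print(cache.hits, cache.misses, sorted(cache.entries))
```

The narrower range never consults the entry for the wider one; it only hits once the identical key has been stored.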


Thanks,
Shawn



Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery

2018-08-24 Thread Emir Arnautović
Hi,
No, it will not, and it would not make sense to: Solr would still have to apply 
a filter on top of the cached results, since they can include documents with 
the value 2. You can think of the query itself as the key of the cache entry.
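
To make that concrete, here is a small sketch with made-up data. A cached DocSet holds only document ids, so serving [3 TO 100] from the entry for [2 TO 100] would still mean checking each cached document against the new lower bound. The ids and values below are hypothetical.

```python
# Hypothetical doc id -> field value pairs.
values = {10: 2, 11: 3, 12: 42, 13: 100}

# What a cached entry for [2 TO 100] would hold: just the matching doc ids.
cached_2_to_100 = {doc for doc, v in values.items() if 2 <= v <= 100}

# The cached set is a superset of the answer for [3 TO 100] ...
wanted_3_to_100 = {doc for doc, v in values.items() if 3 <= v <= 100}
extra = cached_2_to_100 - wanted_3_to_100

# ... so reusing it means re-checking every cached doc against the new bound.
refiltered = {doc for doc in cached_2_to_100 if values[doc] >= 3}

print(sorted(extra))
print(refiltered == wanted_3_to_100)
```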

Thanks,
Emir


How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery

2018-08-24 Thread zhenyuan wei
Hi All,
I am confused about how to hit the filterCache.

Suppose a filter query for the range [3 TO 100] is not yet in the filterCache,
but the filterCache already holds an entry for the range [2 TO 100].

My question is: will the filter query for [3 TO 100] fetch its DocSet
from the cached entry for [2 TO 100]?


is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
Let's say I have a largish set of data (120M docs) and that I am partitioning
my data by groups of states (using the state codes).

Someone suggested that I could use the following format in my solrconfig.xml
when defining the filter queries:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AL</str>
      <str name="fq">State:AK</str>
      ...
      <str name="fq">State:WY</str>
    </lst>
  </arr>
</listener>

Would that work, and if so how would I know that the cache is being hit?

Or do I need to use the following traditional syntax instead:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AL</str>
    </lst>
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AK</str>
    </lst>
    ...
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:WY</str>
    </lst>
  </arr>
</listener>

any help appreciated



--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-it-possible-to-consolidate-filterquery-cache-strings-tp4121005.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
note: by partitioning I mean that I have sharded the 120M docs into 9 Solr
partitions (each on a separate server)






Re: is it possible to consolidate filterquery cache strings

2014-03-03 Thread Chris Hostetter


: Would that work, and if so how would I know that the cache is being hit?

It should work -- filters are evaluated independently, so the fact that 
you are using all of them in one query (vs. each of them in an individual 
query) won't change anything as far as the filterCache goes.

You can prove that it works by looking at the cache stats (available 
from the Admin UI) after opening a new searcher and verifying that they 
are all in the new caches.  You can also then do a query for something 
like q=foo&fq=State:AK, reload the cache stats, and see a hit on 
your filterCache.

: Or do I need to use the following traditional syntax instead:

The only reason to break them all out like that is if, in addition to 
populating the *filterCache*, you also want to populate the 
*queryResultCache* with ~50 queries for *:*, each with a different fq 
applied.
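
A toy model of the two warming styles may help here. It assumes, following the description above, that each fq gets its own filterCache entry while the queryResultCache holds one entry per whole request. The `warm` function and both cache variables are invented for this sketch and only three state codes stand in for the full list.

```python
# Toy model of the two warming styles (names are invented, not Solr APIs).
filter_cache = set()        # one entry per individual fq
query_result_cache = set()  # one entry per whole request (q plus its fq list)


def warm(q, fqs):
    """Pretend to run one warming query with the given fq parameters."""
    for fq in fqs:
        filter_cache.add(fq)                  # each fq is cached on its own
    query_result_cache.add((q, tuple(fqs)))   # the request as a whole


states = ["AL", "AK", "WY"]  # stand-in for the full list of state codes

# Style 1: a single warming query carrying every fq.
warm("*:*", [f"State:{s}" for s in states])
style1 = (set(filter_cache), len(query_result_cache))

# Style 2: one warming query per state.
filter_cache.clear()
query_result_cache.clear()
for s in states:
    warm("*:*", [f"State:{s}"])
style2 = (set(filter_cache), len(query_result_cache))

print(style1[0] == style2[0])   # same filterCache contents either way
print(style1[1], style2[1])     # but a different number of queryResultCache entries
```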



-Hoss
http://www.lucidworks.com/


Re: is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
would not breaking the FQs out by state be faster for warming up the fq
caches?






Re: Problems with SolrEnitityProcessor + frange filterQuery

2012-09-21 Thread Dirceu Vieira
Hi Jack,

Your suggestion works perfectly!
Thank you very much!!

it ended up being something like this:

query=_query_:'status:1 AND NOT priority:\-1' AND _query_:'{!frange l=3000
u=5000}max(sum(suser_count), sum(user_count))' 

Regards,

Dirceu


-- 
Dirceu Vieira Júnior
---
+47 9753 2473
dirceuvjr.blogspot.com
twitter.com/dirceuvjr


Problems with SolrEnitityProcessor + frange filterQuery

2012-09-20 Thread Dirceu Vieira
Hi,

I'm attempting to write a filter query for my SolrEntityProcessor using
{!frange} over a function.
It works fine when I test it in the admin UI, but once I move it into my
data-config.xml the query blows up because of the commas in the function.
The problem is that the fq parameter can be a comma-separated list, which
means that if I have commas within my query, it gets split into multiple
filter queries.

Does anybody know a way of escaping the comma, or another way I can work
around that?

I've been using SolrEntityProcessor to import filtered data from one core to
another; here are the queries:

query=status:1 AND NOT priority:\-1
fq={!frange l=3000 u=5000}max(sum(suser_count), sum(user_count))

I'm using Solr-4.0.0-BETA.



Best regards,

-- 
Dirceu Vieira Júnior
---
+47 9753 2473
dirceuvjr.blogspot.com
twitter.com/dirceuvjr


Re: Problems with SolrEnitityProcessor + frange filterQuery

2012-09-20 Thread Dirceu Vieira
Hi guys,

Has anybody got any idea about that?
I'm really open for any suggestions

Thanks!

Dirceu



Re: Problems with SolrEnitityProcessor + frange filterQuery

2012-09-20 Thread Jack Krupansky
Sorry, but it looks like the SolrEntityProcessor does a raw split on commas 
of its fq parameter, with no provision for escaping.


You should be able to combine the fq into the query parameter as a nested 
query which does not have the split issue.
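
For illustration only, here is a short Python sketch of why a raw comma split breaks the frange filter, and what the nested-query workaround looks like as a string. It models the behaviour described above and is not the SolrEntityProcessor source; the variable names are made up.

```python
# Naive split on commas, as the behaviour described above suggests
# (a model of the problem, not the actual SolrEntityProcessor source).
fq = "{!frange l=3000 u=5000}max(sum(suser_count), sum(user_count))"
pieces = fq.split(",")
print(pieces)

# Workaround from the thread: fold the filter into the main query as a
# nested query, so no fq parameter (and no comma split) is involved.
main = "status:1 AND NOT priority:\\-1"
combined = f"_query_:'{main}' AND _query_:'{fq}'"
print(combined)
```

The split yields two fragments, neither of which is a valid filter on its own, while the combined string keeps the function intact.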


-- Jack Krupansky




RE: OR-FilterQuery

2012-02-15 Thread spring
> > q=some text
> > fq=id:(1 OR 2 OR 3...)
> >
> > Should I better use q:some text AND id:(1 OR 2 OR 3...)?
>
> 1. These two options score differently.
> 2. If you hit the same fq=id:(1 OR 2 OR 3...) many times, you benefit from
> reading the DocSet from the heap instead of searching on disk.

OK, understood.
Thank you.



RE: OR-FilterQuery

2012-02-15 Thread spring
> In other words, there's no attempt to decompose the fq clause
> and store parts of it in the cache, it's exact-match or
> nothing.

Ah ok, thank you.



Re: OR-FilterQuery

2012-02-14 Thread Mikhail Khludnev
On Mon, Feb 13, 2012 at 11:17 PM, spr...@gmx.eu wrote:

> Hi,
>
> how efficient is such a query:
>
> q=some text
> fq=id:(1 OR 2 OR 3...)
>
> Should I better use q:some text AND id:(1 OR 2 OR 3...)?

1. These two options score differently.
2. If you hit the same fq=id:(1 OR 2 OR 3...) many times, you benefit from
reading the DocSet from the heap instead of searching on disk.

> Is the Filter Cache used for the OR'ed fq?

The filter cache is used for any filter. I guess I didn't get you. Could
you rephrase your question?

> Thank you




-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: OR-FilterQuery

2012-02-14 Thread Erick Erickson
bq:  Is the Filter Cache used for the OR'ed fq?

The filter cache is actually pretty simple conceptually. It's
just a map where the key is the fq and the value is the set
of documents that satisfy that fq (we'll skip the implementation
here, just think of it as the list of all the docs that the fq selects).

Solr doesn't attempt to do much with the key, just think of it
as a single string. Whether or not an fq is reused from the
cache depends upon whether the key is in the map.

So fq=id:(1 OR 2 OR 3) will just look to see if
id:(1 OR 2 OR 3) is a key. If so, it'll just use the
document list stored in the cache.

It won't match
id:(1 OR 2)
or
id:(2)
or
id:1 OR id:2 OR id:3

In other words, there's no attempt to decompose the fq clause
and store parts of it in the cache, it's exact-match or
nothing.
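
A simplified model of the above, with the fq string as the map key. The cached document ids are hypothetical. Real Solr compares parsed query objects rather than raw strings, so treat the literal string comparison as a simplification.

```python
# Simplified model of the description above: the fq string is the map key.
filter_cache = {"id:(1 OR 2 OR 3)": {1, 2, 3}}   # hypothetical cached doc ids

candidates = [
    "id:(1 OR 2 OR 3)",
    "id:(1 OR 2)",
    "id:(2)",
    "id:1 OR id:2 OR id:3",
]
results = {fq: fq in filter_cache for fq in candidates}
for fq, hit in results.items():
    print(f"{fq!r}: {'hit' if hit else 'miss'}")
```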

Hope that helps
Erick

On Mon, Feb 13, 2012 at 2:17 PM,  spr...@gmx.eu wrote:
> Hi,
>
> how efficient is such a query:
>
> q=some text
> fq=id:(1 OR 2 OR 3...)
>
> Should I better use q:some text AND id:(1 OR 2 OR 3...)?
>
> Is the Filter Cache used for the OR'ed fq?
>
> Thank you



Re: OR-FilterQuery

2012-02-14 Thread Mikhail Khludnev
Hi Em,

I briefly read the thread. Are you talking about combining cached clauses
of a BooleanQuery, instead of evaluating the whole BQ as a filter?

I found something like that in the API (but only in the API):
http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)

Am I getting you right? Why do you need it, btw? If I'm ..
I have an idea how to do it in two minutes:

q=+f:text
+(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)...

The right-hand clause will be a BooleanQuery with SHOULD clauses backed by
cached queries (see below).

If you are not scared by the syntax yet, you can implement a trivial
fqQParserPlugin, which will be just

// lazily through User/Generic Cache
// (closing part reconstructed: subQuery(...) returns a QParser, so take its Query)
Query sub = subQuery(localParams.get(QueryParsing.V), null).getQuery();
q = new FilteredQuery(new MatchAllDocsQuery(),
    new CachingWrapperFilter(new QueryWrapperFilter(sub)));
return q;

It will use a per-segment bitset, in contrast to Solr's fq, which caches
for the top-level reader.

WDYT?

On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote:

> Hi,
>
> have a look at:
> http://search-lucene.com/m/Z8lWGEiKoI
>
> I think not much had changed since then.
>
> Regards,
> Em
>
> On 13.02.2012 20:17, spr...@gmx.eu wrote:
> > Hi,
> >
> > how efficient is such a query:
> >
> > q=some text
> > fq=id:(1 OR 2 OR 3...)
> >
> > Should I better use q:some text AND id:(1 OR 2 OR 3...)?
> >
> > Is the Filter Cache used for the OR'ed fq?
> >
> > Thank you
 




-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: OR-FilterQuery

2012-02-14 Thread Em
Hi Mikhail,

thanks for kicking in some brainstorming code!
The given thread is almost a year old; I was working with Solr in my
free time to see where it fails to behave or perform as I expect.

I found out that if you have a lot of different access patterns for a
filter query, you might end up with either a big cache to keep things
fast or with lower performance (the impact depends on use case and
circumstances).

Scenario:
You have a permission field and the client is able to filter by one to
three permission values.
That is:
fq=foo:user
fq=foo:moderator
fq=foo:manager

If you cannot control or guarantee the order of the fq's values, you could
end up with a lot of entries that all return the same result.

Example:
fq=permission:user OR permission:moderator OR permission:manager
fq=permission:user OR permission:manager OR permission:moderator
fq=permission:moderator OR permission:user OR permission:manager
...
They all return the same documents but are cached separately, which
wastes a lot of memory.
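
If the client does build the fq string itself, one way to sidestep these duplicate entries is to normalize the clause order before sending. This is only a sketch of that idea, not something Solr does for you; the helper name `canonical_or_filter` is made up.

```python
def canonical_or_filter(field, values):
    """Build an OR filter with values de-duplicated and sorted,
    so the same set of values always yields the same fq string."""
    return " OR ".join(f"{field}:{v}" for v in sorted(set(values)))


a = canonical_or_filter("permission", ["user", "moderator", "manager"])
b = canonical_or_filter("permission", ["moderator", "user", "manager"])
print(a)
print(a == b)
```

With a single canonical string per set of values, the three permutations above collapse into one cache entry.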

Furthermore, if your access pattern leads to a lot of different fqs
over a small set of distinct values, it may make more sense, from a
memory-consumption point of view, to cache each filter criterion on its
own (this may cost a little performance).

That being said, if you cache a filter for foo:user, foo:moderator and
foo:manager, you can combine those filters with AND, OR, NOT or whatever
without recomputing every filter over and over again, which would be the
case if your filter cache is not large enough.

However, I never compared the performance (in terms of
speed) of a cached filter query like
foo:bar OR foo:baz
with a combination of two cached filter queries like
foo:bar
foo:baz
combined by a logical OR.

That's the background.
Unfortunately, I haven't had the time to implement this so far.

Back to your post:
Looks like a cool idea and is almost what I had in mind!

I would formulate an easier syntax so that one is able to parse each
fq clause on its own and cache its CachingWrapperFilter for reuse.

> it will use per segment bitset at contrast to Solr's fq which caches for
> top level reader.
Could you explain why this bitset would be per-segment, please?
I don't see a reason why this *has* to be so.
What is the benefit you are seeing?

Kind regards,
Em


Re: OR-FilterQuery

2012-02-14 Thread Erick Erickson
Whoa!

fq=id:(1 OR 2)
is not the same thing at all as
fq=id:1&fq=id:2

Assuming that any document had one and only one ID,  the second clause
would return exactly 0 documents, each and every time.

Multiple fq clauses are essentially set intersections. So the first query is the
set of all documents where id is 1 or 2
the second is the intersection of two sets of documents, one set
with an id of 1 and one with an id of 2. Not the same thing at all.

There's no support for the concept of
(fq=id:1 OR fq=id:2)
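
The difference can be shown with plain Python sets standing in for the document sets. The document names are hypothetical and each document is assumed to have exactly one id.

```python
# Hypothetical doc sets; each document has exactly one id.
docs_id_1 = {"docA"}
docs_id_2 = {"docB"}

# fq=id:(1 OR 2): a single filter matching the union.
single_fq = docs_id_1 | docs_id_2

# fq=id:1&fq=id:2: two filters, intersected with each other.
two_fqs = docs_id_1 & docs_id_2

print(sorted(single_fq))
print(two_fqs)
```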

Best
Erick


Re: OR-FilterQuery

2012-02-14 Thread Erick Erickson
BTW, you're not the first person who would like this capability, see:
https://issues.apache.org/jira/browse/SOLR-1223

But the fact that this JIRA was originally opened in June of 2009
and hasn't been implemented yet indicates that it's not super-high
priority.

Best
Erick


Re: OR-FilterQuery

2012-02-14 Thread Em
Hi Erick,

> Whoa!
>
> fq=id:(1 OR 2)
> is not the same thing at all as
> fq=id:1&fq=id:2
Ahm, who said they would be the same? :)
I mean, you are completely right in what you are saying, but it seems to
me that we are talking about two different things.

I was talking about caching each filter criterion instead of the whole
filter query, and recombining the cached criteria based on the
boolean operators the client sends.

In other words:
currently
fq=id:1 OR id:2
results in ONE cached filter entry.

fq=id:2 OR id:1
results in ANOTHER cached filter entry.

fq=id:2 AND id:1
results in (surprise, surprise) a third filter entry (although this
example does not make sense).

My idea was to cache each filter criterion, that is, to cache the bitset
for id:1 and the bitset for id:2, and to recombine both bitsets via AND, OR,
NOT etc. whenever this is necessary.

This way one could save memory (and maybe computing time as well), which
definitely makes sense when you have a much smaller set of
filter criteria but a much larger set of possible (and used)
combinations of those criteria, with a small number of repetitions
per combination (which would destroy the benefit of caching).
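
A rough sketch of this idea, with Python frozensets standing in for bitsets. The index contents and the helper names `docs_for` and `or_filter` are hypothetical; this is only meant to show that new combinations reuse the per-criterion entries, not how Solr would implement it.

```python
# Sketch of the idea above: cache one doc set per criterion, then combine.
# The index contents and helper names are hypothetical.
index = {1: "user", 2: "moderator", 3: "manager", 4: "user"}  # doc id -> permission

criterion_cache = {}
computed = []  # records which criteria actually had to be evaluated


def docs_for(value):
    if value not in criterion_cache:
        computed.append(value)
        criterion_cache[value] = frozenset(d for d, p in index.items() if p == value)
    return criterion_cache[value]


def or_filter(*values):
    result = frozenset()
    for v in values:
        result |= docs_for(v)
    return result


r1 = or_filter("user", "moderator", "manager")
r2 = or_filter("manager", "user", "moderator")   # different order, nothing recomputed
r3 = or_filter("user", "manager")                # new combination, still nothing recomputed
print(r1 == r2, sorted(r3), computed)
```

Three cache entries serve every combination here, at the cost of a set union per request.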

Don't you agree?

Kind regards,
Em


Am 14.02.2012 22:33, schrieb Erick Erickson:
 Whoa!
 
 fq=id(1 OR 2)
 is not the same thing at all as
 fq=id:1fq=id:2
 
 Assuming that any document had one and only one ID,  the second clause
 would return exactly 0 documents, each and every time.
 
 Multiple fq clauses are essentially set intersections. So the first query is 
 the
 set of all documents where id is 1 or 2
 the second is the intersection of two sets of documents, one set
 with an id of 1 and one with an id of 2. Not the same thing at all.
 
 There's no support for the concept of
 (fq=id:1 OR fq=id:2)
 
 Best
 Erick
 
 On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote:
 Hi Mikhail,

 thanks for kicking in some brainstorming-code!
 The given thread is almost a year old and I was working with Solr in my
 freetime to see where it fails to behave/perform as I expect/wish.

 I found out that if you got a lot of different access-patterns for a
 filter-query, you might end up with either a big cache to make things
 fast or with lower performance (impact depends on usecase and
 circumstances).

 Scenario:
 You got a permission-field and the client is able to filter by one to
 three permission-values.
 That is:
 fq=foo:user
 fq=foo:moderator
 fq=foo:manager

 If you can not control/guarantee the order of the fq's values, you could
 end up with a lot of mess which all returns the same.

 Example:
 fq=permission:user OR permission:moderator OR permission:manager
 fq=permission:user OR permission:manager OR permission:moderator
 fq=permission:moderator OR permission:user OR permission:manager
 ...
 They all return the same but where cached seperately which leads to the
 fact that you are wasting memory a lot.

 Furthermore, if your access pattern leads to many different fq's
 over a small set of distinct values, it may make more sense, from a
 memory point of view, to cache each filter criterion on its own (this
 may cost a little performance).

 That being said, if you cache a filter for foo:user, foo:moderator and
 foo:manager, you can combine those filters with AND, OR, NOT or whatever
 without recomputing every filter over and over again, which is what
 happens if your filter cache is not large enough.

 However, I never compared the performance differences (in terms of
 speed) of a cached filter-query like
 foo:bar OR foo:baz
 With a combination of two cached filter-queries like
 foo:bar
 foo:baz
 combined by a logical OR.
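The idea of caching each criterion and recombining the cached sets can be sketched as follows. This is a toy model under stated assumptions, not how Solr's filterCache works: `ClauseCache`, its `index` mapping and the `computed` counter are all hypothetical names used only to show that two clauses are evaluated once and then reused for any boolean combination.

```python
class ClauseCache:
    """Caches one document set per single clause, e.g. 'foo:bar'."""

    def __init__(self, index):
        self.index = index    # clause -> iterable of matching doc numbers
        self.cache = {}       # clause -> cached frozenset
        self.computed = 0     # how many clauses had to be evaluated

    def docs(self, clause):
        if clause not in self.cache:
            self.computed += 1
            self.cache[clause] = frozenset(self.index.get(clause, ()))
        return self.cache[clause]

    def any_of(self, *clauses):
        """Logical OR: union of the cached per-clause sets."""
        result = set()
        for clause in clauses:
            result |= self.docs(clause)
        return result

    def all_of(self, *clauses):
        """Logical AND: intersection of the cached per-clause sets."""
        sets = [set(self.docs(clause)) for clause in clauses]
        return set.intersection(*sets) if sets else set()


cache = ClauseCache({"foo:bar": {0, 1}, "foo:baz": {1, 2}})
either = cache.any_of("foo:bar", "foo:baz")
swapped = cache.any_of("foo:baz", "foo:bar")   # different order, same sets
both = cache.all_of("foo:bar", "foo:baz")
print(sorted(either), sorted(both), cache.computed)  # [0, 1, 2] [1] 2
```

Whether recombining is faster than a single cached entry for the whole expression is exactly the open question raised above; the sketch only shows the memory side (two entries instead of one per combination).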

 That's the background.
 Unfortunately I didn't have the time to implement this in the past.

 Back to your post:
 Looks like a cool idea and is almost what I had in mind!

 I would formulate an easier syntax so that one is able to parse each
 fq-clause on its own to cache the CachingWrapperFilter to reuse it again.

 > it will use per segment bitset at contrast to Solr's fq which caches for
 > top level reader.
 Could you explain why this bitset would be per-segment, please?
 I don't see a reason why this *has* to be so.
 What benefit do you see?

 Kind regards,
 Em


Re: OR-FilterQuery

2012-02-14 Thread Erick Erickson
Ah, OK, I misread your post apparently. And yes, what you suggest
would result in some efficiencies, but at present I don't think there's any
syntax that allows one to combine filter queries as you suggest. There
was some discussion about it in the JIRA I referenced, but no action that
I could see.

That is, efficiencies in some circumstances, though I think it would be
hard to predict. For instance, imagine a set of 100 entries in an FQ. And
no, I'm not making things up, I've seen applications where this makes
sense. Splitting that out into 100 separate entries in the filterCache would
use up a lot of space. Likewise, I suspect that the actual process of
creating the heuristics that were able to analyze an incoming filter
query and do the right thing in terms of splitting it up and recombining
it would be pretty hairy. Local parameters, for instance, and let's throw in
dereferencing too...
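To put a rough number on the space concern, here is a back-of-the-envelope estimate. It assumes each cached filter is held as a bitset with one bit per document in the index; Solr can store small result sets more compactly, so treat this as an upper bound. `MAX_DOC` is a hypothetical index size, not a figure from this thread.

```python
def bitset_bytes(max_doc: int) -> int:
    """Bytes needed for a bitset with one bit per document."""
    return (max_doc + 7) // 8


MAX_DOC = 10_000_000  # hypothetical number of documents in the index

one_entry_bytes = bitset_bytes(MAX_DOC)     # the whole FQ cached as one entry
split_entries_bytes = 100 * one_entry_bytes  # 100 clauses cached separately

print(one_entry_bytes)      # 1250000
print(split_entries_bytes)  # 125000000
```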

So I suspect that this is one of those features that is quite easy to see
the benefits of in the simple case, but pretty quickly becomes a
nightmare to actually implement correctly, but that's mostly
a guess.

And before putting the work into it, I think modeling the actual
benefits would be wise, as well as convincing myself that there
are enough cases where this *would* be beneficial. I mean Solr
does a pretty reasonable job of caching these anyway, and with the
non-cached filters it's not clear to me that the benefits are
sufficient...

Good luck, though, if you want to tackle it!
Erick



On Tue, Feb 14, 2012 at 4:54 PM, Em mailformailingli...@yahoo.de wrote:
 Hi Erick,

 Whoa!

 fq=id:(1 OR 2)
 is not the same thing at all as
 fq=id:1&fq=id:2
 Ahm, who said they would be the same? :)
 I mean, you are completely right in what you are saying but it seems to
 me that we are talking about two different things.

 I was talking about caching each filter criterion instead of the whole
 filter query, and recombining the cached criteria based on the
 boolean operators the client sends.

 In other words:
 currently
 fq=id:1 OR id:2
 results in ONE cached filter entry.

 fq=id:2 OR id:1
 results in ANOTHER cached filter entry.

 fq=id:2 AND id:1
 results in (surprise, surprise) a third filter entry (although this
 example does not make sense).

 My idea was to cache each filter criterion, that is, to cache the bitset
 for id:1 and the bitset for id:2 and recombine both bitsets via AND, OR,
 NOT etc. whenever this is necessary.

 This way one could save memory (and maybe computing time as well), which
 definitely makes sense when you have a much smaller set of
 filter criteria but a much larger set of possible (and used)
 combinations of those criteria, with only a few repetitions
 per combination (which would destroy the benefit of caching).

 Don't you agree?

 Kind regards,
 Em



Re: OR-FilterQuery

2012-02-14 Thread Mikhail Khludnev
On Tue, Feb 14, 2012 at 11:13 PM, Em mailformailingli...@yahoo.de wrote:

 Hi Mikhail,

  it will use per segment bitset at contrast to Solr's fq which caches for
  top level reader.
 Could you explain why this bitset would be per-segment based, please?

I don't see a reason why this *have* to be so.

It's just how org.apache.lucene.search.CachingWrapperFilter works. It was
the first out-of-the-box thing I found.
As a top-level alternative we need
org.apache.solr.search.SolrIndexSearcher.getDocSet(Query).

Btw, here is one more snippet, this time top-level:

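class FQParser extends QParser {

  Query parse(...) {
    // Evaluate the wrapped query through the searcher's top-level DocSet
    // lookup, then expose the resulting set as a constant-score query.
    return new SolrConstantScoreQuery(
        solrIndexSearcher.getDocSet(
            subQuery(localParam.get(V))
        ).getTopFilter());
  }
}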



 What is the benefit you are seeing?

It seems like two different points of view: Lucene prefers per-segment caching
in order to have fast incremental updates, but maybe 'because it's good but not
in the worst case' (I guess I heard it there:
http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011/many-facets-apache-solr).
Solr prefers top-level reader caches.
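The trade-off can be illustrated with a toy model. This is not Lucene or Solr code; segments are modelled as immutable tuples of document numbers, each cache holds a single filter, and the class names and the `segments_scanned` counter are made up. The only point is that after an incremental update the per-segment cache scans just the new segment, while the top-level cache is rebuilt from all segments.

```python
def evaluate(segment, predicate):
    """Scan one segment (a tuple of doc numbers) and return the matches."""
    return frozenset(doc for doc in segment if predicate(doc))


class TopLevelCache:
    """One cached set for the whole index view, rebuilt when it changes."""

    def __init__(self):
        self.view = None
        self.docs = frozenset()
        self.segments_scanned = 0

    def get(self, segments, predicate):
        view = tuple(segments)
        if view != self.view:
            self.segments_scanned += len(view)
            self.docs = frozenset().union(*(evaluate(s, predicate) for s in view))
            self.view = view
        return self.docs


class PerSegmentCache:
    """One cached set per segment; unchanged segments are never rescanned."""

    def __init__(self):
        self.entries = {}
        self.segments_scanned = 0

    def get(self, segments, predicate):
        result = set()
        for segment in segments:
            if segment not in self.entries:
                self.segments_scanned += 1
                self.entries[segment] = evaluate(segment, predicate)
            result |= self.entries[segment]
        return frozenset(result)


def is_even(doc):
    return doc % 2 == 0


seg_a, seg_b = (0, 1, 2, 3), (4, 5, 6)
top, per = TopLevelCache(), PerSegmentCache()

top.get([seg_a], is_even)
per.get([seg_a], is_even)

# An incremental update adds a new segment; seg_a itself is unchanged.
top_result = top.get([seg_a, seg_b], is_even)
per_result = per.get([seg_a, seg_b], is_even)

print(top.segments_scanned, per.segments_scanned)  # 3 2
```

The per-segment variant pays for that by merging the segment sets on every lookup, which is the other side of the two points of view.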


 Kind regards,
 Em

 Am 14.02.2012 19:33, schrieb Mikhail Khludnev:
  Hi Em,
 
  I briefly read the thread. Are you talking about combining cached clauses
  of a BooleanQuery, instead of evaluating the whole BQ as a filter?

  I found something like that in the API (but only in the API):
  http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)

  Do I understand you correctly? Why do you need it, btw? If I'm ..
  I have an idea how to do it in two minutes:

  q=+f:text
  +(_query_:"{!fq}id:1" _query_:"{!fq}id:2" _query_:"{!fq}id:3"
  _query_:"{!fq}id:4")...

  The right leg will be a BooleanQuery with SHOULD clauses backed by cached
  queries (see below).

  If you are not scared by the syntax yet, you can implement a trivial
  fqQParserPlugin, which will be just

  // lazily through User/Generic Cache
  q = new FilteredQuery(new MatchAllDocsQuery(),
      new CachingWrapperFilter(
          new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
  return q;

  It will use a per-segment bitset, in contrast to Solr's fq, which caches for
  the top-level reader.
 
  WDYT?
 




-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: OR-FilterQuery

2012-02-14 Thread Em
Hi Mikhail,

 it's just how org.apache.lucene.search.CachingWrapperFilter works. The
 first out-of-the box stuff which I've found.
Thanks for your explanation and snippets - I thought this was configurable.

Regards,
Em



OR-FilterQuery

2012-02-13 Thread spring
Hi,

How efficient is a query like this:

q=some text
fq=id:(1 OR 2 OR 3...)

Would it be better to use q=some text AND id:(1 OR 2 OR 3...)?

Is the Filter Cache used for the OR'ed fq?

Thank you



Re: OR-FilterQuery

2012-02-13 Thread Em
Hi,

have a look at:
http://search-lucene.com/m/Z8lWGEiKoI

I think not much has changed since then.

Regards,
Em



Re: filterQuery (fq=) vs q differences other than scoring.

2011-12-11 Thread Erick Erickson
Hmmm, are you talking about SOLR-2429? Some context would help here...

But if you are, that capability was added to deal with situations where
calculating the fq for the entire corpus and *then* applying it to the query
results was too expensive. So when you specify one of these high-cost
filters, Solr calculates the set of docs that satisfies the initial query,
complete with relevance scores. Then all the lower-cost fqs are applied,
which will be cached (assuming they aren't identified as high cost). Then
the high-cost fq is applied to the result set. That is, for each document
that has made it through the initial query selection and all of the
lower-cost fqs, the high-cost fq value (i.e. inclusion/exclusion) is
calculated and the doc is removed from the result set if it should be. I
haven't dived into the code to really understand the bit about being
calculated in parallel...
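For context, a request that mixes a normal cached fq with a non-cached, high-cost one could be built roughly like this. The `cache` and `cost` local params are the ones introduced by SOLR-2429; the field names (`title`, `type`, `popularity`, `price`), the range bounds and the cost value are hypothetical. The sketch only builds the query string and does not contact a Solr server.

```python
from urllib.parse import parse_qs, urlencode

params = [
    ("q", "title:solr"),
    # Cheap filter: evaluated once and kept in the filterCache.
    ("fq", "type:book"),
    # Expensive filter: not cached, and given a high cost so that it is
    # checked last, only for documents that passed everything else.
    ("fq", "{!frange l=10 u=100 cache=false cost=200}mul(popularity,price)"),
]

query_string = urlencode(params)
decoded = parse_qs(query_string)

print(query_string)
print(decoded["fq"])
```

As far as I know the "checked per surviving document" behaviour only applies to query types that support post-filtering, such as frange, and only when the cost is 100 or more.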

Best
Erick





filterQuery (fq=) vs q differences other than scoring.

2011-12-09 Thread Andrew Lundgren
I know that fq's are used to improve performance by reducing the data set that 
you score.

I have read the documentation that says that non-cached fq's are created in 
parallel to your query, but would like to know more about how that is done.

Does it do a match on all the FQ's, then AND the resulting doc sets and then 
once that is done score the query based on the resulting subset of documents?


--
Andrew Lundgren
lundg...@familysearch.org






Re: custom filterquery

2011-09-01 Thread Chris Hostetter

: pricing.  I have written a functionquery to get the pricing, which works
: fine as part of the search query, but doesn't seem to be doing anything when
: I try to use it in a filter query.  I wrote my pricing function query based

how are you trying to use it in a filter query?

function queries by definition match all documents -- the function value 
just determines the score.

If you want to filter on a function query you have to use something like 
the frange parser to specify that only certain function values should 
match...

https://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html
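A small sketch of what such a filter string could look like. `l` and `u` are the lower and upper bound local params of the frange parser; `frange_filter` is a hypothetical helper and `customerprice(42)` stands in for whatever custom function is registered, so both names are assumptions.

```python
def frange_filter(function, lower=None, upper=None):
    """Build an fq value that keeps only documents whose function value
    lies within [lower, upper]; either bound may be omitted."""
    parts = ["frange"]
    if lower is not None:
        parts.append(f"l={lower}")
    if upper is not None:
        parts.append(f"u={upper}")
    return "{!" + " ".join(parts) + "}" + function


# Keep only products that have a positive price for customer 42.
fq = frange_filter("customerprice(42)", lower=0.01)
print(fq)  # {!frange l=0.01}customerprice(42)
```

Used this way the function values are actually evaluated per document, instead of the function query simply matching everything.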


-Hoss


custom filterquery

2011-08-16 Thread Jon Wagoner
Hello,

I am writing software for an e-commerce site.  Different customers can have
different selections of product depending on what is priced out for them, so
to get the faceting counts correct I need to filter the values based on the
pricing.  I have written a functionquery to get the pricing, which works
fine as part of the search query, but doesn't seem to be doing anything when
I try to use it in a filter query.  I wrote my pricing function query based
on
http://www.supermind.org/blog/756/how-to-write-a-custom-solr-functionquery,
and I can see the parser part getting logged from the filter query, but
nothing ever calls getValues on my ValueSource.  If I use my function query
as part of the main query, getValues is getting called.  Can anyone point me
in the right direction to get this working in the filter query?

Jon Wagoner


FilterQuery and Ors

2011-06-08 Thread Jamie Johnson
I'm looking for a way to do a filter query and Ors.  I've done a bit of
googling and found an open jira but nothing indicating this is possible.
I'm looking to do something like the search at
http://www.lucidimagination.com/search/?q=test
where you can do multi-selects for the facets. I've read about it at
http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
so I have the tag/exclusion working, but if I select two items from a facet
group (say age from 1 to 10 and age from 10 to 20) I get nothing because
nothing meets both of those criteria. I can obviously write something
custom to build an OR out of this, but that seems less elegant. Any guidance
would be appreciated.


Re: FilterQuery and Ors

2011-06-08 Thread Erick Erickson
try fq=age:[1 TO 10] OR age:[10 TO 20]

I'm pretty sure

fq=age:([1 TO 10] OR [10 TO 20])

will work too.

But you're right, multiple fq clauses are intersections, so specifying more
than one fq clause on the SAME field results in what you're seeing.
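Combined with the tag/exclusion approach from the wiki page, the selected ranges of one facet group could be folded into a single fq like this. It is a sketch that only builds the parameters: `range_filter` is a hypothetical helper, the tag name `age` is arbitrary, and the grouped `age:(... OR ...)` form is the variant described above as "pretty sure" to work.

```python
from urllib.parse import parse_qs, urlencode


def range_filter(field, ranges, tag):
    """OR all selected ranges of one facet group into a single tagged fq."""
    clauses = " OR ".join(f"[{low} TO {high}]" for low, high in ranges)
    return f"{{!tag={tag}}}{field}:({clauses})"


fq = range_filter("age", [(1, 10), (10, 20)], tag="age")

params = [
    ("q", "*:*"),
    ("fq", fq),
    ("facet", "true"),
    # Exclude the tagged filter when counting this facet (multi-select).
    ("facet.field", "{!ex=age}age"),
]
query_string = urlencode(params)

print(fq)  # {!tag=age}age:([1 TO 10] OR [10 TO 20])
```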

Best
Erick




Regarding filterquery

2011-04-13 Thread soumya rao
Hi,

I am a newbie to solr. I could see that the queries are not cached. Would
like to apply filterCache to queries in ruby. Can anyone provide me the
syntax for this please?

Thanks.


RE: Regarding filterquery

2011-04-13 Thread Joshua Bouchair
Uncomment the following section in solrconfig.xml:

   <!-- An optimization that attempts to use a filter to satisfy a search.
        If the requested sort does not include score, then the filterCache
        will be checked for a filter matching the query. If found, the filter
        will be used as the source of document ids, and then the sort will be
        applied to that.
   <useFilterForSortedQuery>true</useFilterForSortedQuery>
   -->

Josh B.


Re: Regarding filterquery

2011-04-13 Thread soumya rao
Thanks for the reply Josh.

And where should I make changes in ruby to add filters?

Soumya



Re: Regarding filterquery

2011-04-13 Thread Li
You should just ask me.




RE: Regarding filterquery

2011-04-13 Thread Joshua Bouchair
You have to specify the query. In the query you add an fq parameter, which 
means filter query.
http://wiki.apache.org/solr/solr-ruby



FilterQuery OR statement

2011-03-03 Thread Tanner Postert
Trying to figure out how I can run something similar to this for the fq
parameter

Field1 in ( 1, 2, 3, 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this: fq=+field1:(1 2 3
4) +field2:(4 5 6 7) but that yields no results.


Re: FilterQuery OR statement

2011-03-03 Thread Ahmet Arslan

Maybe your default operator is set to AND in schema.xml?
If so, try using +field2:(4 OR 5 OR 6 OR 7)


  


Re: FilterQuery OR statement

2011-03-03 Thread Ahmet Arslan


Actually you can use local params for that.
http://wiki.apache.org/solr/LocalParams

fq={!q.op=OR df=field1}1 2 3 4&fq={!q.op=OR df=field2}4 5 6 7
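If the request is assembled in code, the two local-params filters could be generated like this. `any_of` is a hypothetical helper, and `q=*:*` is used here only as a match-all placeholder for the main query; the sketch builds the query string and nothing more.

```python
from urllib.parse import parse_qs, urlencode


def any_of(field, values):
    """Build an fq that matches documents whose field has any of the values,
    using q.op and df local params instead of explicit OR operators."""
    return "{!q.op=OR df=" + field + "}" + " ".join(str(v) for v in values)


params = [
    ("q", "*:*"),
    ("fq", any_of("field1", [1, 2, 3, 4])),
    ("fq", any_of("field2", [4, 5, 6, 7])),
]
query_string = urlencode(params)

print(params[1][1])  # {!q.op=OR df=field1}1 2 3 4
```

The two fq parameters are intersected, so this corresponds to "Field1 in (...) AND Field2 in (...)".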


  


Re: FilterQuery OR statement

2011-03-03 Thread Tanner Postert
That worked. I thought I had tried it before; not sure why it didn't work then.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the field results where field1:(1 OR 2
OR 3) etc., so I figured I'd use the fq param for caching purposes because
those queries will likely be run a lot, but if I leave the q parameter off I
get a null pointer error.






Re: FilterQuery OR statement

2011-03-03 Thread Jonathan Rochkind
You might also consider splitting your two separate AND clauses into 
two separate fq's:


fq=field1:(1 OR 2 OR 3 OR 4)
fq=field2:(4 OR 5 OR 6 OR 7)

That will cache the two clauses separately in the filter cache, 
which is probably preferable in general, without knowing more about your 
usage characteristics.


ALSO, instead of either supplying the OR explicitly as above, or 
changing the default operator in schema.xml for everything, I believe it 
would work to supply it as a local param:


fq={!q.op=OR}field1:(1 2 3 4)

If you want to do that.

AND, your question, can you search without a 'q'?  No, but you can 
search with a 'q' that selects all documents, to be limited by the fq's.


q=*:*
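Why separate fq's tend to cache better can be shown with a toy model in which every distinct fq string is one cache key. The `filter_cache_keys` set, `or_filter` and `request` are hypothetical names, and the match-all main query is written as `*:*`; nothing here talks to Solr.

```python
def or_filter(field, values):
    """Build 'field:(a OR b OR c)' for one field."""
    return f"{field}:(" + " OR ".join(str(v) for v in values) + ")"


filter_cache_keys = set()  # stand-in for the keys held by the filter cache


def request(filters):
    """Return the request parameters and record each fq as its own key."""
    filter_cache_keys.update(filters)
    return [("q", "*:*")] + [("fq", f) for f in filters]


first = request([or_filter("field1", [1, 2, 3, 4]),
                 or_filter("field2", [4, 5, 6, 7])])
# A later request changes only the field2 filter; field1's entry is reused.
second = request([or_filter("field1", [1, 2, 3, 4]),
                  or_filter("field2", [8, 9])])

print(len(filter_cache_keys))  # 3
```

Had each request used one combined fq, the two requests would share nothing and produce two unrelated entries.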







FilterQuery reaching maxBooleanClauses, alternatives?

2011-01-17 Thread Stefan Matheis
Hi List,

we are sometimes reaching the maxBooleanClauses limit (which is 1024 by
default). The query we use looks like:

?q=name:Stefan&fq=5 10 12 15 16 [...]

where the values are ids of users which the current user is allowed to see
- so far, nothing special. Sometimes the filter query includes user ids
from a different type of user (let's say we have TypeA and TypeB), where
TypeB contains more than 2k users. Then we hit the limit.

Now the question is: is it possible to enable a filter/function/feature
in Solr that makes it unnecessary to send over all the
user ids of TypeB users? Just to tell Solr to include all TypeB users in the
(given) filter query (or something in that direction)?

If so, what's the Name of this Filter/Function/Feature? :)

Don't hesitate to ask, if my question/description is weird!

Thanks
Stefan


Re: FilterQuery reaching maxBooleanClauses, alternatives?

2011-01-17 Thread Salman Akram
You can index a field which holds the user type, e.g. UserType (possible
values can be TypeA, TypeB and so on...), and then you can just do

?q=name:Stefan&fq=UserType:TypeB

BTW you can even increase maxBooleanClauses, but in this case
that is definitely not a good idea. You would also hit the max length of an HTTP
GET, so you would have to change it to POST. Better handle it with a new
field.
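A quick way to see the difference in clause counts, as a sketch. The limit of 1024 is the default mentioned in this thread; the 2500 ids are made up to stand in for "more than 2k" TypeB users, and `clause_count` simply counts whitespace-separated terms, which is only an approximation of how the parser counts boolean clauses.

```python
MAX_BOOLEAN_CLAUSES = 1024  # default limit mentioned in this thread


def id_list_filter(ids):
    """The original approach: one term per permitted user id."""
    return " ".join(str(i) for i in ids)


def clause_count(fq):
    """Approximate the number of boolean clauses as the number of terms."""
    return len(fq.split())


type_b_ids = range(1, 2501)            # hypothetical: 2500 TypeB users
explicit = id_list_filter(type_b_ids)  # "1 2 3 ... 2500"
by_type = "UserType:TypeB"             # the indexed-field approach

print(clause_count(explicit) > MAX_BOOLEAN_CLAUSES)  # True
print(clause_count(by_type))                         # 1
```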





-- 
Regards,

Salman Akram


Re: FilterQuery reaching maxBooleanClauses, alternatives?

2011-01-17 Thread Stefan Matheis
Thanks Salman,

talking with others about problems really helps. Adding another FilterQuery
is a bit too much - but combining both is working fine!

not seen the wood for the trees =)
Thanks, Stefan



Re: FilterQuery reaching maxBooleanClauses, alternatives?

2011-01-17 Thread Salman Akram
You are welcome.

By new field I meant if you don't have a field for UserType already.


-- 
Regards,

Salman Akram


Re: Query or FilterQuery for exact field match

2010-02-23 Thread Chris Hostetter

: I read that, but I'm outside of the typical usage I believe (as I have
: no additional parameters so I'm not getting a subset): in my case it
: seems the result would be in the queryResultCache anyway if I do a
: normal search , or am I missing something?

you're not missing anything -- each of the filters you care about will 
be in the filterCache, but each of the overall requests will wind up in 
the queryResultCache as well.

It's the kind of situation where you just have to do some performance 
testing to figure out which one makes more sense for you -- if you also 
facet on the filters you are interested in, then the q=*:*&fq=brand:foo 
style queries might be better overall ... but if this is your one and only 
use case then something like q=brand:foo&sort=_docid_ asc might be more 
efficient (only populate the queryResultCache, not the filterCache)
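
For comparison, the two request styles discussed above can be sketched as parameter builders. This is only an illustration: the helper names filter_style and query_style are made up, and which variant is faster has to be measured against your own index, as noted above.

```python
from urllib.parse import urlencode


def filter_style(field, value):
    # Match-all main query; the restriction lives in fq, so the matching
    # document set can be stored in the filterCache.
    return urlencode([("q", "*:*"), ("fq", f"{field}:{value}")])


def query_style(field, value):
    # Restriction in the main query, with results ordered by internal
    # document id instead of by score.
    return urlencode([("q", f"{field}:{value}"), ("sort", "_docid_ asc")])


print(filter_style("brand", "foo"))
print(query_style("brand", "foo"))
```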


-Hoss



Query or FilterQuery for exact field match

2010-02-16 Thread gabriele renzi
Hi everyone,

in our app we sometimes use Solr programmatically to retrieve all the
elements that have a certain value in a single-valued, single-token
field (brand:xxx).
Since we are not interested in scoring these results, I was thinking
that maybe this should be performed as a filterQuery (fq=brand:xxx),
and in that case I guess I should use a match-all query
(q=*:*), as I'd get an NPE on the missing parameter otherwise.

Does something like this even make sense? Is there a proper way to do
a query like this, or is the normal route of using q=brand:xxx already
the best way?

Thanks in advance for any answer.


-- 
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com


Re: Query or FilterQuery for exact field match

2010-02-16 Thread gabriele renzi
On Tue, Feb 16, 2010 at 2:04 PM, NarasimhaRaju rajux...@yahoo.com wrote:
> Hi,
>
> using filterQuery (fq) is more efficient because SolrIndexSearcher will make
> use of the filterCache, and in your case it returns the entire set from the
> cache instead of searching the entire index.
> More info about the Solr caches at
> http://wiki.apache.org/solr/SolrCaching#filterCache

I read that, but I believe I'm outside of the typical usage (as I have
no additional parameters, so I'm not getting a subset): in my case it
seems the result would be in the queryResultCache anyway if I do a
normal search, or am I missing something?

Anyway, thanks for your answer.

-- 
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com


Re: Restricting Facet to FilterQuery in combination with mincount

2010-01-20 Thread Chantal Ackermann
Thank you, Chris!

That did clarify it. :-)
Cheers,
Chantal



Re: Restricting Facet to FilterQuery in combination with mincount

2010-01-19 Thread Shalin Shekhar Mangar
I've read this twice but the problem is still not clear to me. I guess you
will have to explain it better to get a meaningful response.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Restricting Facet to FilterQuery in combination with mincount

2010-01-19 Thread Chantal Ackermann

Hi Shalin,

thanks for taking your time (reading it twice!).

Rephrasing the question:
(suppose mincount=0 and facet.limit > the number of possible facet values)

Currently, the facet results include ALL values for that facet field.
Say I have a field color, and when I look at the statistics (LUKE), I can 
see that my index contains 7 different colors altogether. This is 
comparable to a group/count/distinct query in a SQL db.

Querying for color as facet field with mincount=0 should thus return 7 
facet values with various counts.
This fact (7 different counts returned for color) will not change no 
matter what the query (q) or the filter queries (fq) are - unless I 
change mincount.

Is that correct?

If so, then I was considering the cases in which a facet count would be 0 
(always suppose mincount=0).

Case 1) No hit of the query (q parameter) contains that 
specific facet value (e.g. the colors blue and green).
Case 2) This is like Case (1), but there is a filter query on top that 
excludes certain values of the facet field, so even before q is 
executed, it's clear that certain facet counts are 0.
(E.g. the filter includes only hits with the colors yellow and orange. So, 
by this filter, documents with the colors blue and green are already 
excluded from the set that is considered for the actual query (q).)

For me, this results in two different flavours of 0 counts: either the 
0 is the result of executing the query (q), or it is the result of a 
filter query.

Now, I was wondering whether it is possible to find that out. It would 
allow showing the 0 counts that are produced by the query (q) and, 
at the same time, excluding all facet values that are already excluded by 
the filter query.

Applying faceting to a subset (subselect / filter set) of the index, not 
to everything - that might describe it as well.



Does that make sense?
Thanks,
Chantal



Re: Restricting Facet to FilterQuery in combination with mincount

2010-01-19 Thread Chris Hostetter

: Now, I was wondering whether it is possible to find that out. It would allow
: to show 0 counts of values that are produced by the query (q), and at the same
: time exclude all facet values that are already excluded by the filter query.
: 
: Applying facetting to a subset (subselect / filterset) of the index not to
: everything - that might describe it, as well.

you can tag a filter query so that facet.field knows to ignore that fq 
when computing the constraint counts...

http://wiki.apache.org/solr/SimpleFacetParameters#LocalParams_for_faceting

...but i'm pretty sure that still won't give you what you are looking for. 
In your mammal example it would just mean that the counts for your 
name facet would ignore the fq=type:mammal restriction and be based 
purely on the main q=area:water query ... so instead of excluding 
salmon(0) from the results, and leaving lion(0) and dog(0), you would 
presumably start getting a positive count for salmon, but lion and dog 
still wouldn't match

:   q=area:water&fq=type:mammal&facet.field=name&facet.mincount=0
:   
:   would return something like
:   dolphin (20)
:   blue whale (20)
:   salmon (0) = not covered by filter query
:   lion (0)
:   dog (0)

...even if you swapped the fq and q (which would alter your scores 
drastically), what tagging and excluding changes is the *counts* associated 
with a facet value -- there is no way to get some zeros to show while 
other zeros don't.

Typically the driving force behind something like this is a hierarchical 
taxonomy -- your animal example fitting nicely.  In those cases, you can 
make your facets use the full hierarchy (ie: mammal/lion, mammal/dog, 
fish/salmon instead of just lion/dog/salmon) and you can use facet.prefix 
to get the type of behavior you are talking about.
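
The two approaches described above might be sketched as request parameters like this. It is a rough illustration under assumptions: the tag name typeFilter and the field name_path (holding values such as mammal/lion) are made up for the example, while the tag/ex local params and facet.prefix are the Solr features referred to above.

```python
from urllib.parse import urlencode


def tagged_exclusion_params():
    """The fq is tagged, and the name facet excludes it when counting."""
    return [
        ("q", "area:water"),
        ("fq", "{!tag=typeFilter}type:mammal"),
        ("facet", "true"),
        ("facet.field", "{!ex=typeFilter}name"),
        ("facet.mincount", "0"),
    ]


def hierarchical_prefix_params(parent):
    """Facet on a path-style field and keep only values under one parent."""
    return [
        ("q", "area:water"),
        ("fq", f"type:{parent}"),
        ("facet", "true"),
        ("facet.field", "name_path"),    # hypothetical field, e.g. "mammal/lion"
        ("facet.prefix", f"{parent}/"),  # only values starting with "mammal/"
        ("facet.mincount", "0"),
    ]


print(urlencode(tagged_exclusion_params()))
print(urlencode(hierarchical_prefix_params("mammal")))
```

With the second variant, facet values such as fish/salmon are dropped by the prefix, while mammal/lion and mammal/dog can still be listed with a count of 0.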


-Hoss



Restricting Facet to FilterQuery in combination with mincount

2010-01-13 Thread Chantal Ackermann

Hi all,

is it possible to restrict the returned facets to only those that apply 
to the filter query but still use mincount=0? Keeping those that have a 
count of 0 but apply to the filter, and at the same time leaving out 
those that are not covered by the filter (and thus 0, as well).



Some longer explanation of the question:


Example (don't nail me down on biology here, it's just for illustration):
q=type:mammal&facet.mincount=0&facet.field=type

returns facets for all values stored in the field type. Results would 
look like:


mammal(2123)
bird(0)
dinosaur(0)
fish(0)
...

In this case setting facet.mincount=1 solves the problem. But consider:

q=area:water&fq=type:mammal&facet.field=name&facet.mincount=0

would return something like
dolphin (20)
blue whale (20)
salmon (0) = not covered by filter query
lion (0)
dog (0)
... (all sorts of animals, every possible value in field name)

My question is: how can I exclude those facets from the result that are 
not covered by the filter query. In this example: how can I exclude the 
non-mammals from the facets but keep all those mammals that are not 
matched by the actual query parameter?
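
To make the two kinds of zero counts concrete, here is a small in-memory sketch. It does not call Solr at all: the document list and the function facet_counts are invented for illustration, and the predicates stand in for q and fq. It computes both the listing described above (every value of the field, zeros included) and the listing being asked for (only values that occur in documents passing the filter).

```python
from collections import Counter

# Toy index, invented for illustration.
DOCS = [
    {"name": "dolphin", "type": "mammal", "area": "water"},
    {"name": "blue whale", "type": "mammal", "area": "water"},
    {"name": "salmon", "type": "fish", "area": "water"},
    {"name": "lion", "type": "mammal", "area": "land"},
    {"name": "dog", "type": "mammal", "area": "land"},
]


def facet_counts(docs, q, fq, field):
    """Return (all_values, filter_restricted) facet counts with mincount=0."""
    all_values = {d[field] for d in docs}
    filtered = [d for d in docs if fq(d)]
    # Count only documents that match both the filter and the query.
    hits = Counter(d[field] for d in filtered if q(d))
    # Values that survive the filter, whether or not the query matches them.
    in_filter = {d[field] for d in filtered}
    all_counts = {v: hits[v] for v in all_values}
    restricted = {v: hits[v] for v in in_filter}
    return all_counts, restricted


all_counts, restricted = facet_counts(
    DOCS,
    q=lambda d: d["area"] == "water",
    fq=lambda d: d["type"] == "mammal",
    field="name",
)
print(all_counts)   # includes salmon with a count of 0
print(restricted)   # salmon is gone; lion and dog remain with 0
```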


Thanks!
Chantal