Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery
There might be something like

fq=filter(foo:[2 TO 3]) OR filter(foo:[3 TO 100])

On Fri, Aug 24, 2018 at 2:23 PM zhenyuan wei wrote:
> Hi All,
> I am confuse about How to hit filterCache?
>
> If filterQuery is range [3 to 100] , but not cache in FilterCache,
> and filterCache already exists filterQuery range [2 to 100],
>
> My question is " Dose this filterQuery range [3 to 100] will fetch DocSet
> from FilterCache range[2 to 100]" ?

--
Sincerely yours
Mikhail Khludnev
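The idea behind the suggestion above can be sketched in a few lines: if the wide range is expressed as a union of separately cached clauses, a later request for one of those clauses is an exact key match. This is only a toy model, not Solr code; the in-memory index, the `cached_range` helper and the `computed` log are hypothetical names introduced for illustration.

```python
# Toy model of a filter cache: query string -> set of matching doc ids.
# The index contents and all helper names are hypothetical.
docs = {1: 2, 2: 3, 3: 50, 4: 100, 5: 101}  # doc id -> value of field "foo"

filter_cache = {}
computed = []  # records which clauses had to be evaluated against the index


def cached_range(lo, hi):
    """Return the doc set for foo:[lo TO hi], computing it only on a cache miss."""
    key = f"foo:[{lo} TO {hi}]"
    if key not in filter_cache:
        computed.append(key)
        filter_cache[key] = {d for d, v in docs.items() if lo <= v <= hi}
    return filter_cache[key]


# First request: [2 TO 100] expressed as two separately cached clauses.
wide = cached_range(2, 3) | cached_range(3, 100)

# Second request: [3 TO 100] is now an exact key match, nothing is recomputed.
narrow = cached_range(3, 100)
```

The trade-off is that the client has to know in advance which sub-ranges are worth caching on their own.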
Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery
On 8/24/2018 5:23 AM, zhenyuan wei wrote:
> I am confuse about How to hit filterCache?
>
> If filterQuery is range [3 to 100] , but not cache in FilterCache,
> and filterCache already exists filterQuery range [2 to 100],
>
> My question is " Dose this filterQuery range [3 to 100] will fetch DocSet
> from FilterCache range[2 to 100]" ?

Each entry in the filterCache uses the query as its key. So for the first one, the key will be something like "field:[3 TO 100]" or whatever your fq parameter value was.

When the second one is executed, it will have a different key, so it will not be found in the cache. Once it executes, it will be added to the cache as an additional entry.

Thanks,
Shawn
Re: How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery
Hi,

No, it will not, and it would not make sense to: Solr would still have to apply a filter on top of the cached results, since they can include documents with the value 2. You can think of each query as its own entry in the cache.

Thanks,
Emir

> On 24 Aug 2018, at 13:23, zhenyuan wei wrote:
>
> Hi All,
> I am confuse about How to hit filterCache?
>
> If filterQuery is range [3 to 100] , but not cache in FilterCache,
> and filterCache already exists filterQuery range [2 to 100],
>
> My question is " Dose this filterQuery range [3 to 100] will fetch DocSet
> from FilterCache range[2 to 100]" ?
How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery
Hi All,

I am confused about how to hit the filterCache.

Suppose a filter query with range [3 TO 100] is not yet cached in the filterCache, but the filterCache already holds an entry for the filter query with range [2 TO 100].

My question is: will the filter query for range [3 TO 100] fetch its DocSet from the cached entry for range [2 TO 100]?
is it possible to consolidate filterquery cache strings
Let's say I have a largish set of data (120M docs) and that I am partitioning my data by groups of states (using the state codes).

Someone suggested that the following format in my solrconfig.xml would work for defining the filter queries:

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="fq">State:AL</str>
        <str name="fq">State:AK</str>
        ...
        <str name="fq">State:WY</str>
      </lst>
    </arr>
  </listener>

Would that work, and if so how would I know that the cache is being hit?

Or do I need to use the following traditional syntax instead:

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="fq">State:AL</str>
      </lst>
      <lst>
        <str name="q">*:*</str>
        <str name="fq">State:AK</str>
      </lst>
      ...
      <lst>
        <str name="q">*:*</str>
        <str name="fq">State:WY</str>
      </lst>
    </arr>
  </listener>

Any help appreciated.

--
View this message in context: http://lucene.472066.n3.nabble.com/is-it-possible-to-consolidate-filterquery-cache-strings-tp4121005.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: is it possible to consolidate filterquery cache strings
Note: by partitioning I mean that I have sharded the 120M docs into 9 Solr partitions (each on a separate server).
Re: is it possible to consolidate filterquery cache strings
: Would that work, and if so how would I know that the cache is being hit?

It should work -- filters are evaluated independently, so the fact that you are using all of them in one query (vs. each of them in an individual query) won't change anything as far as the filterCache goes.

You can prove that it works by looking at the cache stats (available from the Admin UI) after opening a new searcher and verifying that they are all in the new caches. You can then also run a query for something like q=foo&fq=State:AK, reload the cache stats, and see a hit on your filterCache.

: Or do I need to use the following traditional syntax instead:

The only reason to break them all out like that is if, in addition to populating the *filterCache*, you also want to populate the *queryResultCache* with ~50 queries for *:*, each with a different fq applied.

-Hoss
http://www.lucidworks.com/
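The difference between the two warming layouts can be illustrated with a small simulation. It is a simplified sketch, not Solr's implementation: the `warm` function is hypothetical, each fq is modelled as one filterCache key, and each whole request (q plus its fq list) as one queryResultCache key, ignoring sort and other parameters.

```python
def warm(queries):
    """Simulate warming. Each query is (q, [fq, ...]); returns both caches."""
    filter_cache = set()
    query_result_cache = set()
    for q, fqs in queries:
        filter_cache.update(fqs)                 # every fq is cached on its own
        query_result_cache.add((q, tuple(fqs)))  # one entry per whole request
    return filter_cache, query_result_cache


states = ["AL", "AK", "WY"]  # stand-in for the full list of state codes
fqs = [f"State:{s}" for s in states]

# Layout 1: a single warming query carrying every fq.
combined = warm([("*:*", fqs)])

# Layout 2: one warming query per state.
separate = warm([("*:*", [fq]) for fq in fqs])
```

Both layouts end up with the same filterCache keys; only the second also leaves one queryResultCache entry per state.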
Re: is it possible to consolidate filterquery cache strings
Would not breaking the FQs out by state be faster for warming up the fq caches?
Re: Problems with SolrEnitityProcessor + frange filterQuery
Hi Jack,

Your suggestion works perfectly! Thank you very much!!

It ended up being something like this:

query=_query_:'status:1 AND NOT priority:\-1' AND _query_:'{!frange l=3000 u=5000}max(sum(suser_count), sum(user_count))'

Regards,
Dirceu

On Thu, Sep 20, 2012 at 10:46 PM, Jack Krupansky wrote:
> Sorry, but it looks like the SolrEntityProcessor does a raw split on commas
> of its fq parameter, with no provision for escaping.
>
> You should be able to combine the fq into the query parameter as a nested
> query which does not have the split issue.
Problems with SolrEnitityProcessor + frange filterQuery
Hi,

I'm attempting to write a filter query for my SolrEntityProcessor using {!frange} over a function. It works fine when I test it in the admin UI, but once I move it into my data-config.xml the query blows up because of the commas in the function.

The problem is that the fq parameter can be a comma-separated list, which means that if I have commas within my query, it will try to split it into multiple filter queries.

Does anybody know a way of escaping the comma, or another way I can work around that?

I've been using SolrEntityProcessor to import filtered data from one core to another. Here are the queries:

query=status:1 AND NOT priority:\-1
fq={!frange l=3000 u=5000}max(sum(suser_count), sum(user_count))

I'm using Solr-4.0.0-BETA.

Best regards,

--
Dirceu Vieira Júnior
+47 9753 2473
dirceuvjr.blogspot.com
twitter.com/dirceuvjr
Re: Problems with SolrEnitityProcessor + frange filterQuery
Hi guys,

Has anybody got any idea about that? I'm really open to any suggestions.

Thanks!
Dirceu
Re: Problems with SolrEnitityProcessor + frange filterQuery
Sorry, but it looks like the SolrEntityProcessor does a raw split on commas of its fq parameter, with no provision for escaping.

You should be able to combine the fq into the query parameter as a nested query, which does not have the split issue.

-- Jack Krupansky

-----Original Message-----
From: Dirceu Vieira
Sent: Thursday, September 20, 2012 4:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Problems with SolrEnitityProcessor + frange filterQuery

Hi guys,

Has anybody got any idea about that? I'm really open for any suggestions

Thanks!
Dirceu
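To see why a raw comma split breaks this particular filter, here is a minimal illustration. It only mimics the behaviour described above with an ordinary string split; it is not the SolrEntityProcessor source, and the variable names are made up for the example.

```python
fq = "{!frange l=3000 u=5000}max(sum(suser_count), sum(user_count))"

# A raw split on commas, with no escaping, cuts the function in two.
pieces = [p.strip() for p in fq.split(",")]

# Neither piece is a well-formed filter: the parentheses no longer balance.
unbalanced = [p for p in pieces if p.count("(") != p.count(")")]

# Nesting the filter inside the main query keeps it away from the split.
query = "_query_:'status:1 AND NOT priority:\\-1' AND _query_:'" + fq + "'"
```

Because only the fq value is split, moving the function into the query parameter leaves its comma untouched.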
RE: OR-FilterQuery
>> q=some text
>> fq=id:(1 OR 2 OR 3...)
>>
>> Should I better use q:some text AND id:(1 OR 2 OR 3...)?
>
> 1. These two opts have the different scoring.
> 2. if you hit same fq=id:(1 OR 2 OR 3...) many times you have a benefit due
> to reading docset from heap instead of searching on disk.

OK, understood. Thank you.
RE: OR-FilterQuery
> In other words, there's no attempt to decompose the fq clause
> and store parts of it in the cache, it's exact-match or
> nothing.

Ah ok, thank you.
Re: OR-FilterQuery
On Mon, Feb 13, 2012 at 11:17 PM, spr...@gmx.eu wrote:
> Hi,
>
> how efficient is such a query:
>
> q=some text
> fq=id:(1 OR 2 OR 3...)
>
> Should I better use q:some text AND id:(1 OR 2 OR 3...)?

1. These two options score differently: an fq only restricts the result set, while a clause added to q also contributes to the score.
2. If you hit the same fq=id:(1 OR 2 OR 3...) many times, you benefit from reading the DocSet from the heap instead of searching on disk.

> Is the Filter Cache used for the OR'ed fq?

The filter cache is used for whatever filter you send. I guess I didn't get you. Could you rephrase your question?

> Thank you

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
Re: OR-FilterQuery
bq: Is the Filter Cache used for the OR'ed fq?

The filter cache is actually pretty simple conceptually. It's just a map where the key is the fq and the value is the set of documents that satisfy that fq (we'll skip the implementation here, just think of it as the list of all the docs that the fq selects).

Solr doesn't attempt to do much with the key, just think of it as a single string. Whether or not an fq is reused from the cache depends upon whether the key is in the map.

So fq=id:(1 OR 2 OR 3) will just look to see if id:(1 OR 2 OR 3) is a key. If so, it'll just use the document list stored in the cache. It won't match

id:(1 OR 2)
or
id:(2)
or
id:1 OR id:2 OR id:3

In other words, there's no attempt to decompose the fq clause and store parts of it in the cache, it's exact-match or nothing.

Hope that helps
Erick

On Mon, Feb 13, 2012 at 2:17 PM, spr...@gmx.eu wrote:
> Hi,
>
> how efficent is such an query:
>
> q=some text
> fq=id:(1 OR 2 OR 3...)
>
> Should I better use q:some text AND id:(1 OR 2 OR 3...)?
>
> Is the Filter Cache used for the OR'ed fq?
>
> Thank you
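Taking the "single string as key" simplification from the explanation above at face value, the lookup behaviour can be sketched with a plain dictionary. This is a conceptual model only; the `lookup` helper and the doc ids are hypothetical.

```python
# Conceptual model: fq string -> set of matching doc ids (ids are made up).
filter_cache = {"id:(1 OR 2 OR 3)": {101, 102, 103}}


def lookup(fq):
    """Exact-match lookup; returns None on a miss, with no decomposition."""
    return filter_cache.get(fq)


hit = lookup("id:(1 OR 2 OR 3)")

# None of these reuse the cached entry, even though they are subsets of it
# or logically equivalent to it.
misses = [lookup(fq) for fq in ("id:(1 OR 2)", "id:(2)", "id:1 OR id:2 OR id:3")]
```

Each of the missing keys would be evaluated against the index and then stored as an additional entry.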
Re: OR-FilterQuery
Hi Em,

I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter?

I found something like that in the API (but only in the API):
http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)

Did I get you right? Why do you need it, btw? If I'm .. I have an idea how to do it in two minutes:

q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)...

The right leg will be a BooleanQuery with SHOULD clauses backed by cached queries (see below).

If you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just

  // lazily through User/Generic Cache
  q = new FilteredQuery(new MatchAllDocsQuery(),
        new CachingWrapperFilter(
          new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
  return q;

It will use a per-segment bitset, in contrast to Solr's fq, which caches for the top-level reader.

WDYT?

On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote:
> Hi,
>
> have a look at:
> http://search-lucene.com/m/Z8lWGEiKoI
>
> I think not much had changed since then.
>
> Regards,
> Em
>
> Am 13.02.2012 20:17, schrieb spr...@gmx.eu:
> > Hi,
> >
> > how efficent is such an query:
> >
> > q=some text
> > fq=id:(1 OR 2 OR 3...)
> >
> > Should I better use q:some text AND id:(1 OR 2 OR 3...)?
> >
> > Is the Filter Cache used for the OR'ed fq?
> >
> > Thank you

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
Re: OR-FilterQuery
Hi Mikhail,

thanks for kicking in some brainstorming code!

The given thread is almost a year old; I was working with Solr in my free time to see where it fails to behave or perform as I expect. I found out that if you have a lot of different access patterns for a filter query, you might end up with either a big cache to keep things fast or with lower performance (the impact depends on use case and circumstances).

Scenario: you have a permission field and the client is able to filter by one to three permission values. That is:

fq=foo:user
fq=foo:moderator
fq=foo:manager

If you cannot control or guarantee the order of the fq's values, you could end up with a lot of entries that all return the same result. Example:

fq=permission:user OR permission:moderator OR permission:manager
fq=permission:user OR permission:manager OR permission:moderator
fq=permission:moderator OR permission:user OR permission:manager
...

They all return the same documents but are cached separately, which wastes a lot of memory.

Furthermore, if your access pattern leads to a lot of different fqs over a small set of distinct values, it may make more sense, from a memory point of view, to cache each filter criterion on its own (this may cost a little performance).

That being said, if you cache a filter for each of foo:user, foo:moderator and foo:manager, you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again, which is what happens when your filter cache is not large enough.

However, I never compared the speed of a single cached filter query like

foo:bar OR foo:baz

with a combination of two cached filter queries,

foo:bar
foo:baz

combined by a logical OR.

That's the background. Unfortunately I didn't have the time to implement this in the past.

Back to your post: looks like a cool idea and is almost what I had in mind! I would formulate an easier syntax so that each fq clause can be parsed on its own and its CachingWrapperFilter cached for reuse.

> it will use per segment bitset at contrast to Solr's fq which caches for
> top level reader.

Could you explain why this bitset would be per-segment, please? I don't see a reason why this *has* to be so. What is the benefit you are seeing?

Kind regards,
Em
Re: OR-FilterQuery
Whoa!

fq=id:(1 OR 2)
is not the same thing at all as
fq=id:1&fq=id:2

Assuming that any document has one and only one ID, the second form would return exactly 0 documents, each and every time.

Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all.

There's no support for the concept of (fq=id:1 OR fq=id:2).

Best
Erick

On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote:
> Hi Mikhail,
>
> thanks for kicking in some brainstorming-code!
> [...]
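The set semantics described in the reply above can be checked with ordinary Python sets. This is a sketch over a made-up three-document index, assuming each document has exactly one id value; `match` is a hypothetical helper.

```python
# Hypothetical index: internal doc number -> value of the single-valued id field.
index = {10: "1", 11: "2", 12: "3"}


def match(value):
    """Doc set for the term query id:<value>."""
    return {doc for doc, doc_id in index.items() if doc_id == value}


# fq=id:(1 OR 2): one filter whose doc set is a union.
one_fq = match("1") | match("2")

# fq=id:1&fq=id:2: two filters, and multiple filters are intersected.
two_fqs = match("1") & match("2")
```

With single-valued ids the intersection is always empty, which is why the two request forms are not interchangeable.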
Re: OR-FilterQuery
BTW, you're not the first person who would like this capability, see:
https://issues.apache.org/jira/browse/SOLR-1223

But the fact that this JIRA was originally opened in June of 2009 and hasn't been implemented yet indicates that it's not super-high priority.

Best
Erick

On Tue, Feb 14, 2012 at 4:33 PM, Erick Erickson erickerick...@gmail.com wrote:
> Whoa!
>
> fq=id(1 OR 2)
> is not the same thing at all as
> fq=id:1&fq=id:2
> [...]
Re: OR-FilterQuery
Hi Erick,

> Whoa!
>
> fq=id(1 OR 2)
> is not the same thing at all as
> fq=id:1&fq=id:2

Ahm, who said they would be the same? :)
I mean, you are completely right in what you are saying, but it seems to me that we are talking about two different things.

I was talking about caching each filter criterion instead of the whole filter query, and recombining the cached criteria based on the boolean operators the client sends.

In other words, currently:

fq=id:1 OR id:2
results in ONE cached filter entry.

fq=id:2 OR id:1
results in ANOTHER cached filter entry.

fq=id:2 AND id:1
results in (surprise, surprise) a third filter entry (although this example does not make sense).

My idea was to cache each filter criterion, that is, to cache the bitset for id:1 and the bitset for id:2, and to recombine both bitsets via AND, OR, NOT etc. whenever this is necessary.

This way one could save memory (and maybe computing time as well). That definitely makes sense when you have a much smaller set of filter criteria than of possible (and used) combinations of those criteria, with only a small number of repetitions per combination (which would destroy the benefit of caching whole queries).

Don't you agree?

Kind regards,
Em

Am 14.02.2012 22:33, schrieb Erick Erickson:
> Whoa!
> [...]
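The memory argument in this proposal can be made concrete with a toy comparison. The sketch below is not an existing Solr feature; the role data, `criterion_cache`, `any_of` and `all_of` are hypothetical names, and cache keys are modelled as plain strings.

```python
from itertools import permutations

roles = ["user", "moderator", "manager"]
# Hypothetical doc sets per permission value.
docs_by_role = {"user": {1, 2, 3}, "moderator": {3, 4}, "manager": {5}}

# Whole-query caching: every ordering of the same OR clause is a distinct key.
whole_query_keys = {
    " OR ".join(f"permission:{r}" for r in order) for order in permutations(roles)
}

# Per-criterion caching: one entry per distinct value, recombined on demand.
criterion_cache = {f"permission:{r}": docs for r, docs in docs_by_role.items()}


def any_of(*keys):
    """Logical OR of cached criteria (set union)."""
    result = set()
    for key in keys:
        result |= criterion_cache[key]
    return result


def all_of(*keys):
    """Logical AND of cached criteria (set intersection)."""
    return set.intersection(*(criterion_cache[key] for key in keys))
```

Three cached criteria cover all six orderings of the three-way OR, plus any AND combination, at the cost of a set operation per request.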
Re: OR-FilterQuery
Ah, OK, I misread your post apparently. And yes, what you suggest would result in some efficiencies, but at present I don't think there's any syntax that allows one to combine filter queries as you suggest. There was some discussion about it in the JIRA I referenced, but no action that I could see.

That is, efficiencies in some circumstances, though I think it would be hard to predict. For instance, imagine a set of 100 entries in an FQ. And no, I'm not making things up, I've seen applications where this makes sense. Splitting that out into 100 separate entries in the filterCache would use up a lot of space.

Likewise, I suspect that the actual process of creating heuristics that could analyze an incoming filter query and do the right thing in terms of splitting it up and recombining it would be pretty hairy. Local parameters for instance, and let's throw in dereferencing too <G>...

So I suspect that this is one of those features that is quite easy to see the benefits of in the simple case, but pretty quickly becomes a nightmare to actually implement correctly, but that's mostly a guess. And before putting the work into it, I think modeling the actual benefits would be wise, as well as convincing myself that there are enough cases where this *would* be beneficial. I mean, Solr does a pretty reasonable job of caching these anyway, and with the non-cached filters it's not clear to me that the benefits are sufficient...

Good luck, though, if you want to tackle it!

Erick

On Tue, Feb 14, 2012 at 4:54 PM, Em mailformailingli...@yahoo.de wrote:
> Hi Erick,
>
>> Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2
>
> Ahm, who said they would be the same? :)
> I mean, you are completely right in what you are saying, but it seems to me that we are talking about two different things.
>
> I was talking about caching each filter criterion instead of the whole filter query, to recombine the cached filter criteria based on the boolean operators the client sends.
>
> In other words, currently:
> fq=id:1 OR id:2 results in ONE cached filter entry.
> fq=id:2 OR id:1 results in ANOTHER cached filter entry.
> fq=id:2 AND id:1 results in (surprise, surprise) a third filter entry (although this example does not make sense).
>
> My idea was to cache each filter criterion, that means caching the bitset for id:1 and the bitset for id:2, to recombine both bitsets via AND, OR, NOT etc. whenever this is necessary.
>
> This way one could save memory (and maybe computing time as well), which definitely makes sense when you have a much smaller set of filter criteria and a much larger set of possible (and used) combinations of them, with a small number of repetitions per combination (which would destroy the benefit of caching).
>
> Don't you agree?
>
> Kind regards,
> Em
>
> On 14.02.2012 22:33, Erick Erickson wrote:
>> Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2
>>
>> Assuming that any document had one and only one ID, the second clause would return exactly 0 documents, each and every time.
>>
>> Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all.
>>
>> There's no support for the concept of (fq=id:1 OR fq=id:2)
>>
>> Best
>> Erick
>>
>> On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote:
>>> Hi Mikhail,
>>>
>>> thanks for kicking in some brainstorming-code!
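To make the union-versus-intersection point in Erick's quoted message concrete, here is a small sketch on a hypothetical three-document index in which every document has exactly one id. The class and method names are illustrative only and model the set logic, not how Solr executes filters.

```java
import java.util.Set;
import java.util.TreeSet;

public class FqIntersectionSketch {
    // Hypothetical index: document number -> value of its single-valued "id" field.
    static final int[] IDS = {1, 2, 3};

    // Documents matching a single clause such as id:1.
    static Set<Integer> docsWithId(int id) {
        Set<Integer> docs = new TreeSet<>();
        for (int doc = 0; doc < IDS.length; doc++) {
            if (IDS[doc] == id) docs.add(doc);
        }
        return docs;
    }

    // One fq with OR inside, e.g. fq=id:(1 OR 2): the union of the clauses.
    static Set<Integer> singleFqWithOr(int... ids) {
        Set<Integer> result = new TreeSet<>();
        for (int id : ids) result.addAll(docsWithId(id));
        return result;
    }

    // Several fq parameters, e.g. fq=id:1&fq=id:2: each one narrows the result.
    static Set<Integer> multipleFqs(int... ids) {
        Set<Integer> result = null;
        for (int id : ids) {
            Set<Integer> docs = docsWithId(id);
            if (result == null) result = docs; else result.retainAll(docs);
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        System.out.println(singleFqWithOr(1, 2)); // documents holding id 1 or id 2
        System.out.println(multipleFqs(1, 2));    // empty: no document has both ids
    }
}
```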
Re: OR-FilterQuery
On Tue, Feb 14, 2012 at 11:13 PM, Em mailformailingli...@yahoo.de wrote:
> Hi Mikhail,
>
>> it will use per segment bitset at contrast to Solr's fq which caches for top level reader.
>
> Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *have* to be so.

It's just how org.apache.lucene.search.CachingWrapperFilter works. It was the first out-of-the-box thing I found. As a top-level-reader alternative we need org.apache.solr.search.SolrIndexSearcher.getDocSet(Query).

Btw, one more snippet, this time top-level:

    class FQParser extends QParser {
      Query parse(...) {
        return new SolrConstantScoreQuery(
            solrIndexSearcher.getDocSet(
                subQuery(localParam.get(V))
            ).getTopFilter());
      }
    }

> What is the benefit you are seeing?

It seems like two different points of view: Lucene prefers per-segment caching to get fast incremental updates, but maybe 'because it's good but not in worst case' (I guess I heard it at http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011/many-facets-apache-solr). Solr prefers top-reader caches.

> Kind regards,
> Em
>
> On 14.02.2012 19:33, Mikhail Khludnev wrote:
>> Hi Em,
>>
>> I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter?
>>
>> I found something like that in the API (but only in the API): http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)
>>
>> Do I get you right? Why do you need it, btw? If I'm .. I have an idea how to do it in two minutes:
>>
>> q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)...
>>
>> The right leg will be a BooleanQuery with SHOULD clauses backed by cached queries (see below).
>>
>> If you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just
>>
>>     // lazily through User/Generic Cache
>>     q = new FilteredQuery(new MatchAllDocsQuery(),
>>         new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
>>     return q;
>>
>> It will use a per-segment bitset, in contrast to Solr's fq, which caches for the top-level reader.
>>
>> WDYT?

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
Re: OR-FilterQuery
Hi Mikhail,

> it's just how org.apache.lucene.search.CachingWrapperFilter works. The first out-of-the box stuff which I've found.

Thanks for your explanation and snippets - I thought this was configurable.

Regards,
Em
OR-FilterQuery
Hi,

how efficient is such a query:

q=some text
fq=id:(1 OR 2 OR 3...)

Should I rather use q=some text AND id:(1 OR 2 OR 3...)?

Is the Filter Cache used for the OR'ed fq?

Thank you
Re: OR-FilterQuery
Hi,

have a look at: http://search-lucene.com/m/Z8lWGEiKoI

I think not much has changed since then.

Regards,
Em
Re: filterQuery (fq=) vs q differences other than scoring.
Hmmm, are you talking about SOLR-2429? Some context would help here...

But if you are, that capability was added to deal with situations where calculating the fq for the entire corpus *then* applying it to the query results was too expensive. So when you specify one of these high-cost filters, Solr calculates the set of docs that satisfies the initial query, complete with relevance scores. Then all the lower-cost fqs are applied, which will be cached (assuming they aren't identified as high cost), then the high-cost fq is applied to the result set. That is, each document that has made it through the initial query selection and all of the lower-cost fqs has the high-cost fq value (i.e. inclusion/exclusion) calculated, and the doc is removed from the result set if it should be.

I haven't dived into the code to really understand the bit about "calculated in parallel"...

Best
Erick
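The ordering described above can be sketched as follows. This is a simplified model, not Solr's implementation: the CostedFilter class, the cost values and the per-document loop are assumptions made for illustration. It only shows why the expensive check ends up being evaluated for fewer documents when cheaper filters run first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

public class FilterCostOrderSketch {
    // A filter with a relative cost; names and costs are illustrative only.
    static final class CostedFilter {
        final String name;
        final int cost;
        final IntPredicate matches;

        CostedFilter(String name, int cost, IntPredicate matches) {
            this.name = name;
            this.cost = cost;
            this.matches = matches;
        }
    }

    // Check each document matched by the main query against the filters,
    // cheapest first, and record which filter was evaluated for which document.
    static List<Integer> apply(List<Integer> queryMatches, List<CostedFilter> filters,
                               List<String> evaluations) {
        List<CostedFilter> ordered = new ArrayList<>(filters);
        ordered.sort(Comparator.comparingInt(f -> f.cost));
        List<Integer> survivors = new ArrayList<>();
        for (int doc : queryMatches) {
            boolean keep = true;
            for (CostedFilter f : ordered) {
                evaluations.add(f.name + ":" + doc);
                if (!f.matches.test(doc)) {
                    keep = false;
                    break; // later (more expensive) filters never see this document
                }
            }
            if (keep) survivors.add(doc);
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<CostedFilter> filters = List.of(
            new CostedFilter("expensive", 100, doc -> doc > 3),
            new CostedFilter("cheap", 0, doc -> doc % 2 == 0));
        List<String> evaluations = new ArrayList<>();
        System.out.println(apply(List.of(1, 2, 3, 4, 5, 6), filters, evaluations));
        System.out.println(evaluations);
    }
}
```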
filterQuery (fq=) vs q differences other than scoring.
I know that fq's are used to improve performance by reducing the data set that you score. I have read the documentation that says that non-cached fq's are created in parallel to your query, but would like to know more about how that is done. Does it do a match on all the FQ's, then AND the resulting doc sets and then once that is done score the query based on the resulting subset of documents? -- Andrew Lundgren lundg...@familysearch.org NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
Re: custom filterquery
: pricing. I have written a functionquery to get the pricing, which works
: fine as part of the search query, but doesn't seem to be doing anything when
: I try to use it in a filter query. I wrote my pricing function query based

How are you trying to use it in a filter query?

Function queries by definition match all documents -- the function value just determines the score. If you want to filter on a function query you have to use something like the frange parser to specify that only certain function values should match...

https://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html

-Hoss
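A rough illustration of the difference Hoss describes, using a toy price array rather than a real ValueSource. The class and method names are hypothetical; the sketch only models "a function matches everything" versus "a range over the function value restricts the set", which is the role the frange parser plays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntToDoubleFunction;

public class FunctionRangeSketch {
    // A bare function query: every document matches; the value only affects the score.
    static List<Integer> matchAll(int maxDoc) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) docs.add(doc);
        return docs;
    }

    // A range over the function value: only documents with lower <= value <= upper match.
    static List<Integer> inRange(int maxDoc, IntToDoubleFunction value, double lower, double upper) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            double v = value.applyAsDouble(doc);
            if (v >= lower && v <= upper) docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        // Hypothetical prices; 0.0 stands for "not priced for this customer".
        double[] prices = {5.0, 0.0, 12.5, 99.0};
        System.out.println(matchAll(prices.length));
        System.out.println(inRange(prices.length, doc -> prices[doc], 0.01, 50.0));
    }
}
```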
custom filterquery
Hello, I am writing software for an e-commerce site. Different customers can have different selections of product depending on what is priced out for them, so to get the faceting counts correct I need to filter the values based on the pricing. I have written a functionquery to get the pricing, which works fine as part of the search query, but doesn't seem to be doing anything when I try to use it in a filter query. I wrote my pricing function query based on http://www.supermind.org/blog/756/how-to-write-a-custom-solr-functionquery, and I can see the parser part getting logged from the filter query, but nothing ever calls getValues on my ValueSource. If I use my function query as part of the main query, getValues is getting called. Can anyone point me in the right direction to get this working in the filter query? Jon Wagoner
FilterQuery and Ors
I'm looking for a way to do a filter query with ORs. I've done a bit of googling and found an open JIRA but nothing indicating this is possible. I'm looking to do something like the search at http://www.lucidimagination.com/search/?q=test where you can do multi-selects for the facets. I've read about it at http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams so I have the tag/exclusion working, but if I select two items from a facet group (say age from 1 to 10 and age from 10 to 20) I get nothing because nothing meets both of those criteria. I can obviously write something custom to build an OR out of this but that seems less elegant. Any guidance would be appreciated.
Re: FilterQuery and Ors
Try

fq=age:[1 TO 10] OR age:[10 TO 20]

I'm pretty sure fq=age:([1 TO 10] OR [10 TO 20]) will work too.

But you're right, multiple fq clauses are intersections, so specifying more than one fq clause on the SAME field results in what you're seeing.

Best
Erick
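On the client side, turning the selected facet values of one group into a single OR'ed fq could look roughly like this. It is a sketch with a made-up helper name; it does no URL encoding or escaping, which a real client would need.

```java
import java.util.List;
import java.util.StringJoiner;

public class MultiSelectFqSketch {
    // Join all selections for ONE field into a single fq so they are OR'ed
    // instead of being intersected as separate fq parameters.
    static String orFilter(String field, List<String> selectedValues) {
        if (selectedValues.isEmpty()) return ""; // nothing selected: no filter at all
        StringJoiner clauses = new StringJoiner(" OR ");
        for (String value : selectedValues) clauses.add(field + ":" + value);
        return "fq=" + clauses;
    }

    public static void main(String[] args) {
        System.out.println(orFilter("age", List.of("[1 TO 10]", "[10 TO 20]")));
    }
}
```

Selections from different facet groups would still go into separate fq parameters, since those are meant to intersect.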
Regarding filterquery
Hi, I am a newbie to solr. I could see that the queries are not cached. Would like to apply filterCache to queries in ruby. Can anyone provide me the syntax for this please? Thanks.
RE: Regarding filterquery
Uncomment the following in solrconfig.xml:

<!-- An optimization that attempts to use a filter to satisfy a search.
     If the requested sort does not include score, then the filterCache
     will be checked for a filter matching the query. If found, the filter
     will be used as the source of document ids, and then the sort will be
     applied to that.
<useFilterForSortedQuery>true</useFilterForSortedQuery>
-->

Josh B.

The recipient of this email should check this email and any attachments for the presence of viruses. The Wasserstrom Companies accepts no liability for any damage caused by any virus transmitted by this email. This footnote also confirms that this email message has been scanned for the presence of computer viruses. The Wasserstrom Companies
Re: Regarding filterquery
Thanks for the reply Josh.

And where should I make changes in ruby to add filters?

Soumya
Re: Regarding filterquery
You should just ask me.

Sent from my iPhone
RE: Regarding filterquery
You have to specify the query. In the query you will have the fq parameter, which means filter query.

http://wiki.apache.org/solr/solr-ruby
FilterQuery OR statement
Trying to figure out how I can run something similar to this for the fq parameter Field1 in ( 1, 2, 3 4 ) AND Field2 in ( 4, 5, 6, 7 ) I found some examples on the net that looked like this: fq=+field1:(1 2 3 4) +field2(4 5 6 7) but that yields no results.
Re: FilterQuery OR statement
> Trying to figure out how I can run something similar to this for the fq parameter
>
> Field1 in ( 1, 2, 3 4 ) AND Field2 in ( 4, 5, 6, 7 )
>
> I found some examples on the net that looked like this: fq=+field1:(1 2 3 4) +field2(4 5 6 7) but that yields no results.

Maybe your default operator is set to AND in schema.xml? If so, try using +field2(4 OR 5 OR 6 OR 7)
Re: FilterQuery OR statement
--- On Thu, 3/3/11, Ahmet Arslan iori...@yahoo.com wrote:
> Maybe your default operator is set to AND in schema.xml? If so, try using +field2(4 OR 5 OR 6 OR 7)

Actually you can use local params for that. http://wiki.apache.org/solr/LocalParams

fq={!q.op=OR df=field1}1 2 3 4&fq={!q.op=OR df=field2}4 5 6 7
Re: FilterQuery OR statement
That worked. I thought I had tried it before; not sure why it didn't work then.

Also, is there a way to query without a q parameter? I'm just trying to pull back all of the field results where field1:(1 OR 2 OR 3) etc., so I figured I'd use the fq param for caching purposes because those queries will likely be run a lot, but if I leave the q parameter off I get a null pointer error.
Re: FilterQuery OR statement
You might also consider splitting your two separate AND clauses into two separate fq's:

fq=field1:(1 OR 2 OR 3 OR 4)
fq=field2:(4 OR 5 OR 6 OR 7)

That will cache the two clauses separately in the filter cache, which is probably preferable in general, without knowing more about your use characteristics.

ALSO, instead of either supplying the OR explicitly as above, or changing the default operator in schema.xml for everything, I believe it would work to supply it as a local param:

fq={!q.op=OR}field1:(1 2 3 4)

If you want to do that.

AND, your question, can you search without a 'q'? No, but you can search with a 'q' that selects all documents, to be limited by the fq's:

q=*:*
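A sketch of how a client might assemble such a request: a match-all q plus one fq parameter per clause, each URL-encoded. The class and method names are made up for illustration, and this builds only the query string, not a full Solr request.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class MatchAllWithFiltersSketch {
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // q selects every document; each filter becomes its own fq parameter,
    // so each one can be cached and reused independently.
    static String queryString(List<String> filters) {
        StringBuilder sb = new StringBuilder("q=").append(encode("*:*"));
        for (String fq : filters) sb.append("&fq=").append(encode(fq));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(queryString(List.of(
            "field1:(1 OR 2 OR 3 OR 4)",
            "field2:(4 OR 5 OR 6 OR 7)")));
    }
}
```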
FilterQuery reaching maxBooleanClauses, alternatives?
Hi List,

we are sometimes reaching the maxBooleanClauses limit (which is 1024 by default). The query we use looks like:

?q=name:Stefan&fq=5 10 12 15 16 [...]

where the values are ids of users which the current user is allowed to see - so far, nothing special. Sometimes the filter query includes user ids from a different type of user (let's say we have TypeA and TypeB), where TypeB contains more than 2k users. Then we hit the limit.

Now the question is: is there a filter/function/feature in Solr that would let us avoid sending over all the user ids of TypeB users? Just to tell Solr "include all TypeB users in the (given) filter query" (or something in that direction)? If so, what's the name of this filter/function/feature? :)

Don't hesitate to ask if my question/description is weird!

Thanks
Stefan
Re: FilterQuery reaching maxBooleanClauses, alternatives?
You can index a field that holds the user type, e.g. UserType (possible values can be TypeA, TypeB and so on...), and then you can just do

?q=name:Stefan&fq=UserType:TypeB

BTW you can even increase maxBooleanClauses, but in this case that is definitely not a good idea. Also you would hit the length limit of HTTP GET, so you would have to change it to POST. Better handle it with a new field.

--
Regards,
Salman Akram
Re: FilterQuery reaching maxBooleanClauses, alternatives?
Thanks Salman,

talking with others about problems really helps. Adding another filter query is a bit too much - but combining both is working fine! Couldn't see the wood for the trees =)

Thanks,
Stefan
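One way the combination mentioned here might look on the client side: a single clause covering the whole bulk user type, OR'ed with the explicit ids that remain. The field name userId and the helper names are assumptions (the thread does not say which field holds the ids); the 1024 figure is the default quoted in the thread.

```java
import java.util.List;
import java.util.stream.Collectors;

public class UserTypeFilterSketch {
    static final int MAX_BOOLEAN_CLAUSES = 1024; // default limit mentioned in the thread

    // An explicit id list needs one boolean clause per id.
    static boolean exceedsLimit(List<Integer> userIds) {
        return userIds.size() > MAX_BOOLEAN_CLAUSES;
    }

    // One clause for every user of the bulk type, plus explicit ids for the rest.
    static String combinedFilter(String bulkType, List<Integer> otherIds) {
        String typeClause = "UserType:" + bulkType;
        if (otherIds.isEmpty()) return typeClause;
        String ids = otherIds.stream().map(String::valueOf).collect(Collectors.joining(" OR "));
        return typeClause + " OR userId:(" + ids + ")";
    }

    public static void main(String[] args) {
        System.out.println(combinedFilter("TypeB", List.of(5, 10, 12)));
    }
}
```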
Re: FilterQuery reaching maxBooleanClauses, alternatives?
You are welcome. By "new field" I meant: if you don't have a field for UserType already.

--
Regards,
Salman Akram
Re: Query or FilterQuery for exact field match
: I read that, but I'm outside of the typical usage I believe (as I have
: no additional parameters so I'm not getting a subset): in my case it
: seems the result would be in the queryResultCache anyway if I do a
: normal search, or am I missing something?

You're not missing anything: each of the filters you care about will be in the filterCache, but each of the overall requests will wind up in the queryResultCache as well.

It's the kind of situation where you just have to do some performance testing to figure out which one makes more sense for you. If you also facet on the filters you are interested in, then the q=*:*&fq=brand:foo style queries might be better overall, but if this is your one and only use case then something like q=brand:foo&sort=_docid_ might be more efficient (it only populates the queryResultCache, not the filterCache).

-Hoss
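For performance testing, the two request shapes discussed here can be generated side by side. This is only a sketch: the helper names `as_filter` and `as_query` are made up, and the explicit `asc` sort direction is an addition (the message above only writes sort=_docid_).

```python
from urllib.parse import urlencode


def as_filter(field, value):
    # Match everything in q; the restriction goes into fq (filterCache).
    return [("q", "*:*"), ("fq", f"{field}:{value}")]


def as_query(field, value):
    # Restriction in q, sorted by internal document order instead of score.
    # The "asc" direction is an assumption; the thread only says sort=_docid_.
    return [("q", f"{field}:{value}"), ("sort", "_docid_ asc")]


for build in (as_filter, as_query):
    print(urlencode(build("brand", "foo")))
```

Timing both variants against a realistic index is the only way to tell which one wins for a given setup, as the reply says.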
Query or FilterQuery for exact field match
Hi everyone,

in our app we sometimes use Solr programmatically to retrieve all the elements that have a certain value in a single-valued, single-token field (brand:xxx). Since we are not interested in scoring these results, I was thinking that maybe this should be performed as a filter query (fq=brand:xxx), and in that case I guess I should use a wildcard for the query (q=*:*), as I'd get an NPE on the missing parameter otherwise.

Does something like this even make sense? Is there a proper way to do a query like this, or is the normal route of using q=brand:xxx already the best way?

Thanks in advance for any answer.

--
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com
Re: Query or FilterQuery for exact field match
On Tue, Feb 16, 2010 at 2:04 PM, NarasimhaRaju rajux...@yahoo.com wrote:
> Hi,
> using filterQuery (fq) is more efficient because SolrIndexSearcher will make use of the filterCache, and in your case it returns the entire set from the cache instead of searching the entire index.
> More info about Solr caches at http://wiki.apache.org/solr/SolrCaching#filterCache

I read that, but I'm outside of the typical usage I believe (as I have no additional parameters so I'm not getting a subset): in my case it seems the result would be in the queryResultCache anyway if I do a normal search, or am I missing something?

Anyway, thanks for your answer.

--
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com
AW: Restricting Facet to FilterQuery in combination with mincount
Thank you, Chris! That did clarify it. :-)

Cheers,
Chantal

From: Chris Hostetter [hossman_luc...@fucit.org]
Sent: Tuesday, 19 January 2010 23:27
To: solr-user@lucene.apache.org
Subject: Re: Restricting Facet to FilterQuery in combination with mincount
Re: Restricting Facet to FilterQuery in combination with mincount
On Wed, Jan 13, 2010 at 4:55 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote:
> Hi all,
>
> is it possible to restrict the returned facets to only those that apply to the filter query but still use mincount=0? Keeping those that have a count of 0 but apply to the filter, and at the same time leaving out those that are not covered by the filter (and thus 0, as well).
>
> Some longer explanation of the question:
>
> Example (don't nail me down on biology here, it's just for illustration):
>
> q=type:mammal&facet.mincount=0&facet.field=type
>
> returns facets for all values stored in the field type. Results would look like:
>
> mammal(2123)
> bird(0)
> dinosaur(0)
> fish(0)
> ...
>
> In this case setting facet.mincount=1 solves the problem. But consider:
>
> q=area:water&fq=type:mammal&facet.field=name&facet.mincount=0
>
> would return something like
>
> dolphin (20)
> blue whale (20)
> salmon (0) = not covered by filter query
> lion (0)
> dog (0)
> ... (all sorts of animals, every possible value in field name)
>
> My question is: how can I exclude those facets from the result that are not covered by the filter query. In this example: how can I exclude the non-mammals from the facets but keep all those mammals that are not matched by the actual query parameter?

I've read this twice but the problem is still not clear to me. I guess you will have to explain it better to get a meaningful response.

--
Regards,
Shalin Shekhar Mangar.
Re: Restricting Facet to FilterQuery in combination with mincount
Hi Shalin,

thanks for taking your time (reading it twice!).

Rephrasing the question (suppose mincount=0 and facet.limit is high enough to return all possible facet values):

Currently, the facet results include ALL values for that facet field. Say I have a field color, and when I look at the statistics (Luke) I can see that my index contains altogether 7 different colors. This is comparable to a group/count/distinct query in an SQL db. Querying for color as facet field with mincount=0 should thus return 7 facet values with various counts. This fact (7 different counts returned for color) will not change no matter what the query (q) or the filter queries (fq) are, unless I change mincount.

Is that correct?

If so, then I was considering the cases in which a facet count would be 0 (always suppose mincount=0).

Case 1) No hit of the query (q parameter) contains that specific facet value (e.g. the colors blue and green).

Case 2) This is like Case 1, but there is a filter query on top that excludes certain values of the facet field, so even before q is executed it is clear that certain facet counts are 0. (E.g. the filter includes only hits with the colors yellow and orange. So, by this filter, documents with the colors blue and green are already excluded from the set that is considered for the actual query (q).)

For me, this results in two different flavours of 0 counts: either the 0 is the result of executing the query (q), or it is the result of a filter query.

Now, I was wondering whether it is possible to find that out. It would allow showing 0 counts of values that are produced by the query (q), and at the same time excluding all facet values that are already excluded by the filter query.

Applying faceting to a subset (subselect / filter set) of the index, not to everything: that might describe it as well.

Does that make sense?
Thanks,
Chantal
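To make the two flavours of zero concrete, here is a small in-memory sketch. It does not call Solr; it only mimics field faceting over a toy list of documents (all data invented). It also shows one possible client-side workaround that nobody in this thread proposes: run one facet pass with a match-all query plus the filter and mincount=1 to learn which values the filter covers, then keep only those values from the real mincount=0 result.

```python
# Toy index; the documents and field values are invented for illustration.
DOCS = [
    {"name": "dolphin", "type": "mammal", "area": "water"},
    {"name": "blue whale", "type": "mammal", "area": "water"},
    {"name": "lion", "type": "mammal", "area": "land"},
    {"name": "dog", "type": "mammal", "area": "land"},
    {"name": "salmon", "type": "fish", "area": "water"},
]


def facet_counts(docs, field, q, fq, mincount):
    """Count values of `field` among docs matching q AND fq.

    Every value present in the index starts at 0, so with mincount=0
    all values are reported, whatever q and fq are.
    """
    counts = {d[field]: 0 for d in docs}
    for d in docs:
        if q(d) and fq(d):
            counts[d[field]] += 1
    return {v: c for v, c in counts.items() if c >= mincount}


def is_mammal(d):
    return d["type"] == "mammal"


def in_water(d):
    return d["area"] == "water"


def match_all(d):
    return True


# Pass 1 (like q=*:*, fq=type:mammal, mincount=1): values the filter covers.
covered = facet_counts(DOCS, "name", match_all, is_mammal, 1)
# Pass 2 (like q=area:water, fq=type:mammal, mincount=0): the real request.
actual = facet_counts(DOCS, "name", in_water, is_mammal, 0)
# Keep zero counts only for values that the filter covers.
result = {name: actual[name] for name in covered}
print(result)
```

In this toy data lion and dog keep their zero counts (zeros caused by q), while salmon disappears (a zero caused by the filter). The cost is a second facet request per search.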
Re: Restricting Facet to FilterQuery in combination with mincount
: Now, I was wondering whether it is possible to find that out. It would allow
: to show 0 counts of values that are produced by the query (q), and at the same
: time exclude all facet values that are already excluded by the filter query.
:
: Applying facetting to a subset (subselect / filterset) of the index not to
: everything - that might describe it, as well.

You can tag a filter query so that facet.field knows to ignore that fq when computing the constraint counts...

http://wiki.apache.org/solr/SimpleFacetParameters#LocalParams_for_faceting

...but I'm pretty sure that still won't give you what you are looking for. In your mammal example it would just mean that the counts for your name facet would ignore the fq=type:mammal restriction and be based purely on the main q=area:water query. So instead of excluding salmon(0) from the results and leaving lion(0) and dog(0), you would presumably start getting a positive count for salmon, but lion and dog still wouldn't match...

: q=area:water&fq=type:mammal&facet.field=name&facet.mincount=0
:
: would return something like
: dolphin (20)
: blue whale (20)
: salmon (0) = not covered by filter query
: lion (0)
: dog (0)

...even if you swapped the fq and q (which would alter your scores drastically). What tagging and excluding changes is the *counts* associated with a facet value. There is no way to get some zeros to show while other zeros don't.

Typically the driving force behind something like this is a hierarchical taxonomy, your animal example fitting nicely. In those cases, you can make your facets use the full hierarchy (i.e. mammal/lion, mammal/dog, fish/salmon instead of just lion/dog/salmon) and you can use facet.prefix to get the type of behavior you are talking about.

-Hoss
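A small in-memory sketch of the hierarchical-value idea, assuming the name field is indexed as "type/name". It does not talk to Solr; `facet_with_prefix` is a made-up helper that only mimics what facet.prefix does to the returned values, and the documents are invented.

```python
# Toy index with hierarchical facet values ("type/name"); data is invented.
DOCS = [
    {"name": "mammal/dolphin", "area": "water"},
    {"name": "mammal/blue whale", "area": "water"},
    {"name": "mammal/lion", "area": "land"},
    {"name": "mammal/dog", "area": "land"},
    {"name": "fish/salmon", "area": "water"},
]


def facet_with_prefix(docs, field, matches, prefix, mincount=0):
    """Count values of `field` among matching docs, but report only
    values that start with `prefix` (the effect of facet.prefix)."""
    counts = {d[field]: 0 for d in docs if d[field].startswith(prefix)}
    for d in docs:
        if matches(d) and d[field] in counts:
            counts[d[field]] += 1
    return {v: c for v, c in counts.items() if c >= mincount}


# Roughly: q=area:water, facet.field=name, facet.prefix=mammal/, facet.mincount=0
counts = facet_with_prefix(
    DOCS, "name", lambda d: d["area"] == "water", "mammal/"
)
print(counts)
```

With the prefix in place, mammal/lion and mammal/dog still appear with a count of 0, while fish/salmon is never listed, which is the behavior asked for in the original question. The price is that the type has to be baked into the facet value at index time.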
Restricting Facet to FilterQuery in combination with mincount
Hi all,

is it possible to restrict the returned facets to only those that apply to the filter query but still use mincount=0? Keeping those that have a count of 0 but apply to the filter, and at the same time leaving out those that are not covered by the filter (and thus 0, as well).

Some longer explanation of the question:

Example (don't nail me down on biology here, it's just for illustration):

q=type:mammal&facet.mincount=0&facet.field=type

returns facets for all values stored in the field type. Results would look like:

mammal(2123)
bird(0)
dinosaur(0)
fish(0)
...

In this case setting facet.mincount=1 solves the problem. But consider:

q=area:water&fq=type:mammal&facet.field=name&facet.mincount=0

would return something like

dolphin (20)
blue whale (20)
salmon (0) = not covered by filter query
lion (0)
dog (0)
... (all sorts of animals, every possible value in field name)

My question is: how can I exclude those facets from the result that are not covered by the filter query? In this example: how can I exclude the non-mammals from the facets but keep all those mammals that are not matched by the actual query parameter?

Thanks!
Chantal