Re: Query timeAllowed and its behavior.

2015-08-29 Thread Erick Erickson
See: https://issues.apache.org/jira/browse/SOLR-7990


On Sat, Aug 29, 2015 at 1:04 PM, Erick Erickson erickerick...@gmail.com wrote:
 OK, belay that. On a whim I decided to look at what happens if
 I changed things around to use an fq clause. It's apparently
 not the queryResultCache that's the problem, it's the filterCache.

 Raising a JIRA soon. But I'm not sure where things are going
 wrong, the filterCache stats aren't indicating a problem but the
 number of returned docs is definitely wrong.

 Best,
 Erick

 On Sat, Aug 29, 2015 at 12:29 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Hmmm, I took a whack at trying to create a unit test for this
 and I can't get it to fail. The test works like this

 index 100 docs
 send a query that exceeds timeAllowed
 check that the stats on the queryResultCache show no inserts
 check that partial results are indicated
 check that the number of docs found < 100
 re-send the same query with long timeAllowed
 check that there has been a single insert into the queryResultCache
 check that there are still no hits on the queryResultCache
 check that the number of docs found == 100

 I do see one anomaly: after the second call the response _still_ indicates
 partial results, but this isn't quite the same thing.

 Are you sure that some other layer isn't caching things? What do you see if
 you look at the admin/plugins-stats/queryResultCache hits before and after
 the calls? If it's truly the queryResultCache, you should see no additional
 insert for the call that exceeds timeAllowed, and for the call that completes
 before timeAllowed expires you should see an additional insert but no
 increment in the hit count for that cache.

 Best,
 Erick

 On Fri, Aug 28, 2015 at 10:48 PM, Shawn Heisey apa...@elyograg.org wrote:
 On 8/28/2015 10:47 PM, William Bell wrote:
 As we reported, we are having issues with timeAllowed on 5.2.1. If we set a
 timeAllowed=1 and then run the same query with timeAllowed=3 we get the
 # of rows that was returned on the first query.

 It appears the results are cached when timeAllowed is exceeded, as if they
 were complete, when in fact they are truncated.

 SEEMS LIKE A BUG TO ME.

 That sounds like a bug to me, too.

 Is there any indication in the results the first time that the query was
 aborted before it finished?  If Solr can detect that it aborted the
 query, it should not be caching the results.

 Thanks,
 Shawn



Re: Query timeAllowed and its behavior.

2015-08-29 Thread Erick Erickson
OK, belay that. On a whim I decided to look at what happens if
I changed things around to use an fq clause. It's apparently
not the queryResultCache that's the problem, it's the filterCache.

Raising a JIRA soon. But I'm not sure where things are going
wrong, the filterCache stats aren't indicating a problem but the
number of returned docs is definitely wrong.
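
For anyone who wants to reproduce the check by hand, here is a rough sketch in Python of the comparison described above. The base URL, the collection name and the helper names are made up for illustration; the only Solr-specific parts are the q, fq, timeAllowed, rows and wt request parameters and the partialResults / numFound fields of the JSON response. It builds the request URL and inspects a parsed response, and does not talk to a server itself.

```python
from urllib.parse import urlencode


def select_url(base_url, q, fq, time_allowed_ms):
    """Build a /select URL that runs q with a filter query and a time budget.

    base_url is hypothetical, e.g. "http://localhost:8983/solr/collection1".
    """
    params = {
        "q": q,
        "fq": fq,
        "timeAllowed": time_allowed_ms,
        "rows": 0,
        "wt": "json",
    }
    return base_url + "/select?" + urlencode(params)


def summarize(response):
    """Pull the partial-results flag and the hit count out of a parsed JSON response."""
    header = response.get("responseHeader", {})
    return {
        "partial": bool(header.get("partialResults", False)),
        "num_found": response.get("response", {}).get("numFound", 0),
    }


def looks_truncated(response, expected_total):
    """True if a run that should have completed still reports too few docs."""
    return summarize(response)["num_found"] < expected_total
```

Send the URL once with a tiny timeAllowed and once with a generous one, then call looks_truncated on the second response with the number of docs you indexed. If it returns True, the second run was served a truncated result.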

Best,
Erick

On Sat, Aug 29, 2015 at 12:29 PM, Erick Erickson
erickerick...@gmail.com wrote:
 Hmmm, I took a whack at trying to create a unit test for this
 and I can't get it to fail. The test works like this

 index 100 docs
 send a query that exceeds timeAllowed
 check that the stats on the queryResultCache show no inserts
 check that partial results are indicated
 check that the number of docs found < 100
 re-send the same query with long timeAllowed
 check that there has been a single insert into the queryResultCache
 check that there are still no hits on the queryResultCache
 check that the number of docs found == 100

 I do see one anomaly: after the second call the response _still_ indicates
 partial results, but this isn't quite the same thing.

 Are you sure that some other layer isn't caching things? What do you see if
 you look at the admin/plugins-stats/queryResultCache hits before and after
 the calls? If it's truly the queryResultCache, you should see no additional
 insert for the call that exceeds timeAllowed, and for the call that completes
 before timeAllowed expires you should see an additional insert but no
 increment in the hit count for that cache.

 Best,
 Erick

 On Fri, Aug 28, 2015 at 10:48 PM, Shawn Heisey apa...@elyograg.org wrote:
 On 8/28/2015 10:47 PM, William Bell wrote:
 As we reported, we are having issues with timeAllowed on 5.2.1. If we set a
 timeAllowed=1 and then run the same query with timeAllowed=3 we get the
 # of rows that was returned on the first query.

 It appears the results are cached when timeAllowed is exceeded, as if they
 were complete, when in fact they are truncated.

 SEEMS LIKE A BUG TO ME.

 That sounds like a bug to me, too.

 Is there any indication in the results the first time that the query was
 aborted before it finished?  If Solr can detect that it aborted the
 query, it should not be caching the results.

 Thanks,
 Shawn



Re: Query timeAllowed and its behavior.

2015-08-28 Thread William Bell
As we reported, we are having issues with timeAllowed on 5.2.1. If we set a
timeAllowed=1 and then run the same query with timeAllowed=3 we get the
# of rows that was returned on the first query.

It appears the results are cached when timeAllowed is exceeded, as if they
were complete, when in fact they are truncated.

SEEMS LIKE A BUG TO ME.

On Tue, Aug 25, 2015 at 5:16 AM, Jonathon Marks (BLOOMBERG/ LONDON) 
jmark...@bloomberg.net wrote:

 timeAllowed applies to the time taken by the collector in each shard
 (TimeLimitingCollector). Once timeAllowed is exceeded the collector
 terminates early, returning any partial results it has and freeing the
 resources it was using.
 From Solr 5.0 timeAllowed also applies to the query expansion phase and
 SolrClient request retry.

 From: solr-user@lucene.apache.org At: Aug 25 2015 10:18:07
 Subject: Re:Query timeAllowed and its behavior.

 Hi,

 Kindly help me understand the query time allowed attribute. The following
 is set in solrconfig.xml.
 <int name="timeAllowed">30</int>

 Does this setting stop the query from running after timeAllowed is
 reached? If not, is there a way to stop it, as it will occupy resources in
 the background for no benefit?

 Thanks,
 Modassar





-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Query timeAllowed and its behavior.

2015-08-28 Thread Shawn Heisey
On 8/28/2015 10:47 PM, William Bell wrote:
 As we reported, we are having issues with timeAllowed on 5.2.1. If we set a
 timeAllowed=1 and then run the same query with timeAllowed=3 we get the
 # of rows that was returned on the first query.
 
 It appears the results are cached when timeAllowed is exceeded, as if they
 were complete, when in fact they are truncated.
 
 SEEMS LIKE A BUG TO ME.

That sounds like a bug to me, too.

Is there any indication in the results the first time that the query was
aborted before it finished?  If Solr can detect that it aborted the
query, it should not be caching the results.
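
To make that rule concrete, here is a toy sketch in Python. It is not Solr's cache code; the class and method names are invented for illustration, and the query is just a callable that returns a (docs, partial) pair. The point is only that a result flagged as partial is returned to the caller but never stored.

```python
class ResultCache:
    """Toy result cache that refuses to store partial results."""

    def __init__(self):
        self._entries = {}
        self.inserts = 0  # number of results stored
        self.hits = 0     # number of lookups served from the cache

    def search(self, key, run_query):
        """Return (docs, partial) for key, running run_query on a cache miss.

        run_query is a zero-argument callable returning (docs, partial).
        """
        if key in self._entries:
            self.hits += 1
            return self._entries[key], False
        docs, partial = run_query()
        if not partial:
            # Only complete results are safe to reuse for later requests.
            self._entries[key] = docs
            self.inserts += 1
        return docs, partial
```

With this in place, a timed-out run leaves inserts at 0, the next complete run for the same key bumps inserts to 1 with hits still at 0, and only a third run is served from the cache.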

Thanks,
Shawn



Re: Query timeAllowed and its behavior.

2015-08-25 Thread Shawn Heisey
On 8/25/2015 3:18 AM, Modassar Ather wrote:
 Kindly help me understand the query time allowed attribute. The following
 is set in solrconfig.xml.
 <int name="timeAllowed">30</int>

 Does this setting stop the query from running after timeAllowed is
 reached? If not, is there a way to stop it, as it will occupy resources in
 the background for no benefit?

That is certainly the *goal* of timeAllowed ... but mostly it serves as
a way to try and offer a guarantee that a query will not take longer
than a certain amount of time, so your user application will receive a
response, which might be an error or negative response, within that
stated timeframe.  Multithreaded programming is tricky in the best
circumstances.  If you introduce the idea of killing threads into the
mix, it becomes REALLY complicated.  I would not be very surprised to
learn that parts of the query which run in parallel, such as the filter
queries, continue to run in the background and populate caches even if
the user query has been aborted because of timeAllowed.
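
As a rough illustration of the cooperative approach (as opposed to killing a thread), here is a small Python sketch. It is a toy model, not the actual Lucene or Solr collector code; the function names and the injected clock are assumptions made so the example is reproducible. The collecting loop checks the deadline itself between documents and returns whatever it has, with a flag saying the result is partial.

```python
import itertools


def collect_with_time_limit(doc_ids, matches, time_allowed_ms, clock):
    """Collect matching doc ids until the time budget is spent.

    clock is a zero-argument callable returning the current time in
    milliseconds. Returns (hits, partial); partial is True when the loop
    stopped early because the deadline passed.
    """
    deadline = clock() + time_allowed_ms
    hits = []
    for doc_id in doc_ids:
        if clock() > deadline:
            return hits, True  # out of time: hand back what we have so far
        if matches(doc_id):
            hits.append(doc_id)
    return hits, False


def fake_clock(step_ms):
    """Deterministic clock for experiments: advances step_ms on every call."""
    counter = itertools.count(0, step_ms)
    return lambda: next(counter)
```

Note that only the loop that checks the clock stops. Anything running elsewhere that never looks at the deadline would carry on, which is the kind of leftover work described above.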

You could open a feature request issue in Jira, but I suspect that
aborting *everything* for timeAllowed is a really hard problem that
nobody wants to tackle.  If you can figure out how to solve it, your
patch will be reviewed and possibly committed.

Thanks,
Shawn



Re: Query timeAllowed and its behavior.

2015-08-25 Thread Modassar Ather
Thanks for your response Jonathon.

Please correct me if I am wrong on the following points.
   - The query actually ceases to run once timeAllowed is reached and
     releases all its resources.
   - Query expansion is stopped and the query is terminated, releasing
     all its resources.

Thanks,
Modassar

On Tue, Aug 25, 2015 at 4:46 PM, Jonathon Marks (BLOOMBERG/ LONDON) 
jmark...@bloomberg.net wrote:

 timeAllowed applies to the time taken by the collector in each shard
 (TimeLimitingCollector). Once timeAllowed is exceeded the collector
 terminates early, returning any partial results it has and freeing the
 resources it was using.
 From Solr 5.0 timeAllowed also applies to the query expansion phase and
 SolrClient request retry.

 From: solr-user@lucene.apache.org At: Aug 25 2015 10:18:07
 Subject: Re:Query timeAllowed and its behavior.

 Hi,

 Kindly help me understand the query time allowed attribute. The following
 is set in solrconfig.xml.
 <int name="timeAllowed">30</int>

 Does this setting stop the query from running after timeAllowed is
 reached? If not, is there a way to stop it, as it will occupy resources in
 the background for no benefit?

 Thanks,
 Modassar