Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-04-15 Thread Salman Akram
Looking at this, sharding seems to be the best and simplest option to handle
such queries.


-- 
Regards,

Salman Akram


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-04-15 Thread Steve Davids
I have also experienced a similar problem on our cluster, so I went ahead and
opened SOLR-5986 to track the issue. I know Apache Blur has implemented a
mechanism to kill these long-running term enumerations; it would be fantastic
if Solr could get a similar mechanism.

-Steve


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-04-01 Thread Salman Akram
So you too never got any response...


-- 
Regards,

Salman Akram


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-04-01 Thread Luis Lebolo
I got responses, but no easy solution that would let me directly cancel a
request. The responses did point to:

   - The timeAllowed query parameter, which returns partial results:
     https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter
   - A possible hack that I never followed through on:
     http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201401.mbox/%3CCANGii8eaSouePGxa7JfvOBhrnJUL++Ct4rQha2pxMefvaWhH=g...@mail.gmail.com%3E

Maybe one of those will help you? If they do, make sure to report back!
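
For reference, a minimal sketch of what passing timeAllowed looks like when the
request URL is built by hand. The host, collection name and query below are
placeholders rather than anything from this thread, and the helper class is
hypothetical; only the timeAllowed parameter itself (a limit in milliseconds)
comes from the Solr documentation linked above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class TimeAllowedQuery {
    /** Builds a /select URL whose search is capped by timeAllowed (milliseconds). */
    public static String buildUrl(String baseUrl, String query, long timeAllowedMs) {
        // URL-encode the query so characters such as ':' survive transport.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&timeAllowed=" + timeAllowedMs;
    }

    public static void main(String[] args) {
        // Placeholder host and collection; adjust to your own setup.
        System.out.println(
            buildUrl("http://localhost:8983/solr/collection1", "title:foo*", 1000));
    }
}
```

Keep in mind that this only bounds the part of the request that Solr actually
checks against the limit, so it may not help with every kind of slow query.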

-Luis


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-04-01 Thread Mikhail Khludnev
Hello Salman,
Let me drop a few thoughts on
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E

There are two aspects to this question:
1. dealing with long-running processing (thread divergence actions,
http://docs.oracle.com/javase/specs/jls/se5.0/html/memory.html#65310) and
2. the actual time checking.
Terminating or aborting a thread (2.) is just a way of tracking time
externally and sending interrupt(), which the thread should react to. The
threads don't do that now, so we return to the core issue (1.).
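
To illustrate the point about interrupt(): in Java it only sets a flag on the
target thread, so a long-running loop stops only if it checks that flag itself.
The following is a minimal, self-contained sketch; the class and method names
are made up for illustration and are not Solr or Lucene code.

```java
public class InterruptDemo {
    /** Spins until the current thread is interrupted; returns iterations done. */
    static long spinUntilInterrupted() {
        long iterations = 0;
        // Without this check, interrupt() would have no effect on the loop.
        while (!Thread.currentThread().isInterrupted()) {
            iterations++;
        }
        return iterations;
    }

    /** Starts a worker, interrupts it, and reports whether it terminated. */
    public static boolean stopsWhenInterrupted() {
        Thread worker = new Thread(InterruptDemo::spinUntilInterrupted);
        worker.setDaemon(true); // never keep the JVM alive for the demo thread
        worker.start();
        try {
            Thread.sleep(50);     // let the worker run for a moment
            worker.interrupt();   // external "timeout" signal
            worker.join(2000);    // wait for the worker to notice the flag
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWhenInterrupted());
    }
}
```

A loop that never looks at the interrupted flag would keep running after
interrupt(), which is the situation described above.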

Solr's timeAllowed is the proper way to handle these things; the only
problem is that it expects the core search to be the only long-running part,
but in your case rewriting MultiTermQuery-s takes a huge amount of time.
Let's consider this problem. First of all, MultiTermQuery.rewrite() is
nearly a design issue: a heavy rewrite occurs, and its result is thrown away
after the search is done. I think the most straightforward way to address this
issue is to cache these expensive queries. Solr does it well
http://wiki.apache.org/solr/CommonQueryParameters#fq However, this works only
for http://en.wikipedia.org/wiki/Conjunctive_normal_form like queries; there is
a workaround that allows caching disjunction legs, see
http://blog.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html
If you still want to run expensively rewritten queries, you need to
implement a timeout check (similar to TimeLimitingCollector) for the TermsEnum
returned from MultiTermQuery.getTermsEnum(); wrapping the actual TermsEnum
is a good way to do it. To apply this to queries by injecting a time-limiting
wrapper TermsEnum, you might consider overriding methods like
SolrQueryParserBase.newWildcardQuery(Term) or post-processing the query tree
after parsing.
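
The wrapping idea can be sketched without any Lucene dependency. The class
below is not a TermsEnum; it is a plain java.util.Iterator wrapper with
hypothetical names, shown only to illustrate the pattern: delegate every call
and check a deadline first. A real implementation would apply the same check
in a wrapper around the TermsEnum, for example in next().

```java
import java.util.Iterator;
import java.util.List;

public class TimeLimitedIterator<T> implements Iterator<T> {
    /** Thrown when the time budget runs out mid-enumeration (hypothetical name). */
    public static class TimeExceededException extends RuntimeException {
        public TimeExceededException(String message) {
            super(message);
        }
    }

    private final Iterator<T> delegate;
    private final long deadlineNanos;

    public TimeLimitedIterator(Iterator<T> delegate, long budgetNanos) {
        this.delegate = delegate;
        this.deadlineNanos = System.nanoTime() + budgetNanos;
    }

    private void checkDeadline() {
        // Compare by subtraction, as recommended for System.nanoTime() values.
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new TimeExceededException("time budget exhausted during enumeration");
        }
    }

    @Override
    public boolean hasNext() {
        checkDeadline();
        return delegate.hasNext();
    }

    @Override
    public T next() {
        checkDeadline();
        return delegate.next();
    }

    public static void main(String[] args) {
        // One-second budget over a tiny stand-in "term dictionary".
        Iterator<String> terms = new TimeLimitedIterator<>(
                List.of("apple", "apply", "apricot").iterator(), 1_000_000_000L);
        while (terms.hasNext()) {
            System.out.println(terms.next());
        }
    }
}
```

Throwing an unchecked exception keeps the wrapper transparent to callers that
only know the wrapped interface; the caller that set the budget is expected to
catch it and decide whether to return partial results or fail the request.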



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-03-31 Thread Salman Akram
Anyone?


On Wed, Mar 26, 2014 at 7:55 PM, Salman Akram 
salman.ak...@northbaysolutions.net wrote:

 With reference to this thread
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E
 I wanted to know if there was any response to that or if Chris Harris
 himself can comment on what he ended up doing, that would be great!


 --
 Regards,

 Salman Akram




-- 
Regards,

Salman Akram


Re: More Robust Search Timeouts (to Kill Zombie Queries)?

2014-03-31 Thread Luis Lebolo
Hi Salman,

I was interested in something similar; take a look at the following thread:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201401.mbox/%3CCADSoL-i04aYrsOo2%3DGcaFqsQ3mViF%2Bhn24ArDtT%3D7kpALtVHzA%40mail.gmail.com%3E#archives

I never followed through, however.

-Luis

