Re: More Robust Search Timeouts (to Kill Zombie Queries)?
Looking at this, sharding seems to be the best and simplest option for handling such queries.

On Wed, Apr 2, 2014 at 1:26 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

Hello Salman,

Let me drop a few thoughts on
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E

There are two aspects to this question: 1. dealing with long-running processing (thread divergence actions, http://docs.oracle.com/javase/specs/jls/se5.0/html/memory.html#65310) and 2. the actual time checking.

Terminating or aborting a thread (2.) is just a way of tracking time externally and sending interrupt(), which the thread should react to. Threads don't do that now, so we are back to the core issue (1.).

Solr's timeAllowed is the proper way to handle these things. The only problem is that it expects the core search to be the only long-running part, but in your case rewriting MultiTermQuery-s takes a huge amount of time. Let's consider this problem.

First of all, MultiTermQuery.rewrite() is nearly a design issue: after a heavy rewrite occurs, the result is thrown away once the search is done. I think the most straightforward way to address this is to cache these expensive queries. Solr does it well: http://wiki.apache.org/solr/CommonQueryParameters#fq
However, that works only for queries in something like conjunctive normal form (http://en.wikipedia.org/wiki/Conjunctive_normal_form). There is a workaround that allows caching disjunction legs, see
http://blog.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html

If you still want to run expensively rewritten queries, you need to implement a timeout check (similar to TimeLimitingCollector) for the TermsEnum returned from MultiTermQuery.getTermsEnum(). Wrapping the actual TermsEnum is a good way to do it. To inject the time-limiting wrapper TermsEnum into queries, you might consider overriding methods like SolrQueryParserBase.newWildcardQuery(Term), or post-processing the query tree after parsing.

--
Regards,
Salman Akram
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
I have also experienced a similar problem on our cluster, so I went ahead and opened SOLR-5986 to track the issue. I know Apache Blur has implemented a mechanism to kill these long-running term enumerations; it would be fantastic if Solr could get a similar mechanism.

-Steve

On Apr 15, 2014, at 5:23 AM, Salman Akram salman.ak...@northbaysolutions.net wrote:

Looking at this, sharding seems to be the best and simplest option for handling such queries.
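
A rough sketch of what "killing" a long-running enumeration looks like in plain Java, for readers following the interrupt() discussion in this thread. This is not how Solr or Blur implement it; InterruptibleWork, countUntilInterrupted and runWithTimeout are names made up for the example. The point is that cancellation only works if the worker itself polls the interrupt flag inside its loop.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical example class, not part of Solr, Lucene or Blur.
class InterruptibleWork {
    // Stand-in for a long enumeration: polls the interrupt flag on every step.
    static long countUntilInterrupted(long maxSteps) throws InterruptedException {
        long steps = 0;
        while (steps < maxSteps) {
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedException("cancelled after " + steps + " steps");
            }
            steps++;
        }
        return steps;
    }

    // Runs the task with a time budget.
    // Returns true if it finished in time, false if it was cancelled.
    static boolean runWithTimeout(Callable<Long> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Long> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // delivers interrupt() to the worker thread
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> countUntilInterrupted(10), 5000));
        System.out.println(runWithTimeout(() -> countUntilInterrupted(Long.MAX_VALUE), 100));
    }
}
```

If the loop never checked the flag, cancel(true) would have no effect and the worker would keep running, which is exactly the zombie-query situation described in this thread.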
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
So you too never got any response...

On Mon, Mar 31, 2014 at 6:57 PM, Luis Lebolo luis.leb...@gmail.com wrote:

Hi Salman,

I was interested in something similar; take a look at the following thread:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201401.mbox/%3CCADSoL-i04aYrsOo2%3DGcaFqsQ3mViF%2Bhn24ArDtT%3D7kpALtVHzA%40mail.gmail.com%3E#archives

I never followed through, however.

-Luis

--
Regards,
Salman Akram
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
I got responses, but no easy solution that would let me directly cancel a request. The responses did point to:

- The timeAllowed query parameter, which returns partial results
  https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter
- A possible hack that I never followed through on
  http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201401.mbox/%3CCANGii8eaSouePGxa7JfvOBhrnJUL++Ct4rQha2pxMefvaWhH=g...@mail.gmail.com%3E

Maybe one of those will help you? If they do, make sure to report back!

-Luis

On Tue, Apr 1, 2014 at 3:13 AM, Salman Akram salman.ak...@northbaysolutions.net wrote:

So you too never got any response...
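
For reference, a minimal sketch of how the timeAllowed parameter mentioned in the message above might be attached to a request. The host, the core name (collection1), the query and the helper name selectUrl are placeholders made up for this example; only q and timeAllowed (a budget in milliseconds) are actual Solr parameters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; builds a select URL that carries a time budget.
class TimeAllowedExample {
    // baseUrl points at a core (placeholder), query is the raw q value,
    // millis is the budget passed as timeAllowed.
    static String selectUrl(String baseUrl, String query, long millis) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&timeAllowed=" + millis;
    }

    public static void main(String[] args) {
        System.out.println(
            selectUrl("http://localhost:8983/solr/collection1", "title:foo*", 2000));
    }
}
```

As far as I know, when the budget runs out Solr returns whatever was collected so far and flags the response header with partialResults=true, so a client should check for that flag before trusting the hit count. As discussed elsewhere in this thread, the budget does not cover every phase of a request.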
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
Hello Salman,

Let me drop a few thoughts on
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E

There are two aspects to this question: 1. dealing with long-running processing (thread divergence actions, http://docs.oracle.com/javase/specs/jls/se5.0/html/memory.html#65310) and 2. the actual time checking.

Terminating or aborting a thread (2.) is just a way of tracking time externally and sending interrupt(), which the thread should react to. Threads don't do that now, so we are back to the core issue (1.).

Solr's timeAllowed is the proper way to handle these things. The only problem is that it expects the core search to be the only long-running part, but in your case rewriting MultiTermQuery-s takes a huge amount of time. Let's consider this problem.

First of all, MultiTermQuery.rewrite() is nearly a design issue: after a heavy rewrite occurs, the result is thrown away once the search is done. I think the most straightforward way to address this is to cache these expensive queries. Solr does it well: http://wiki.apache.org/solr/CommonQueryParameters#fq
However, that works only for queries in something like conjunctive normal form (http://en.wikipedia.org/wiki/Conjunctive_normal_form). There is a workaround that allows caching disjunction legs, see
http://blog.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html

If you still want to run expensively rewritten queries, you need to implement a timeout check (similar to TimeLimitingCollector) for the TermsEnum returned from MultiTermQuery.getTermsEnum(). Wrapping the actual TermsEnum is a good way to do it. To inject the time-limiting wrapper TermsEnum into queries, you might consider overriding methods like SolrQueryParserBase.newWildcardQuery(Term), or post-processing the query tree after parsing.

On Mon, Mar 31, 2014 at 2:24 PM, Salman Akram salman.ak...@northbaysolutions.net wrote:

Anyone?

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
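
To illustrate the wrapping idea from the message above without pulling in Lucene, here is a self-contained sketch of the same pattern on a plain java.util.Iterator: a decorator that checks a deadline on every step and aborts the enumeration once the deadline has passed. The names TimeLimitedIterator and TimeExceededException and the injected clock are made up for this example. A real version would presumably wrap the TermsEnum returned from MultiTermQuery.getTermsEnum() and perform the same check when advancing to the next term.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.LongSupplier;

// Thrown when the enumeration runs past its deadline (hypothetical name).
class TimeExceededException extends RuntimeException {
    TimeExceededException(String message) { super(message); }
}

// Decorator: delegates to another iterator but checks a deadline on each step.
class TimeLimitedIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final LongSupplier clockNanos; // injected so the check is testable
    private final long deadlineNanos;

    TimeLimitedIterator(Iterator<T> delegate, long budgetNanos, LongSupplier clockNanos) {
        this.delegate = delegate;
        this.clockNanos = clockNanos;
        this.deadlineNanos = clockNanos.getAsLong() + budgetNanos;
    }

    private void checkDeadline() {
        // Subtraction form stays correct even if the nano clock wraps around.
        if (clockNanos.getAsLong() - deadlineNanos > 0) {
            throw new TimeExceededException("enumeration exceeded its time budget");
        }
    }

    @Override public boolean hasNext() { return delegate.hasNext(); }

    @Override public T next() {
        checkDeadline();
        return delegate.next();
    }
}

class TimeLimitedIteratorDemo {
    public static void main(String[] args) {
        Iterator<String> terms = Arrays.asList("apple", "apply", "apricot").iterator();
        Iterator<String> limited =
            new TimeLimitedIterator<>(terms, 1_000_000_000L, System::nanoTime);
        while (limited.hasNext()) {
            System.out.println(limited.next());
        }
    }
}
```

Checking the clock on every step keeps the sketch simple. With millions of terms one might check only every N-th call to keep the overhead down, at the cost of a slightly later abort.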
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
Anyone?

On Wed, Mar 26, 2014 at 7:55 PM, Salman Akram salman.ak...@northbaysolutions.net wrote:

With reference to this thread
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E
I wanted to know if there was any response to that. If Chris Harris himself can comment on what he ended up doing, that would be great!

--
Regards,
Salman Akram
Re: More Robust Search Timeouts (to Kill Zombie Queries)?
Hi Salman,

I was interested in something similar; take a look at the following thread:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201401.mbox/%3CCADSoL-i04aYrsOo2%3DGcaFqsQ3mViF%2Bhn24ArDtT%3D7kpALtVHzA%40mail.gmail.com%3E#archives

I never followed through, however.

-Luis

On Mon, Mar 31, 2014 at 6:24 AM, Salman Akram salman.ak...@northbaysolutions.net wrote:

Anyone?