Hi Nicolas,
Restarting the node does not help; the node keeps trying to recover and
always fails.

Here is the log:
2019-07-31 06:10:08.049 INFO
 (coreZkRegister-1-thread-1-processing-n:replica_host:8983_solr
x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698)
x:parts_shard30_replica_n2697 o.a.s.c.ZkController Core needs to
recover:parts_shard30_replica_n2697

2019-07-31 06:10:08.050 INFO
 (updateExecutor-3-thread-1-processing-n:replica_host:8983_solr
x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698)
x:parts_shard30_replica_n2697 o.a.s.u.DefaultSolrCoreState Running recovery

2019-07-31 06:10:08.056 INFO
 (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr
x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698)
x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy Starting recovery
process. recoveringAfterStartup=true

2019-07-31 06:10:08.261 INFO
 (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr
x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698)
x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy startupVersions
size=49956 range=[1640550593276674048 to 1640542396328443904]

2019-07-31 06:10:08.328 INFO  (qtp689401025-58)  o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/info/key params={omitHeader=true&wt=json}
status=0 QTime=0

2019-07-31 06:10:09.276 INFO
 (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr
x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698)
x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy Failed to connect
leader http://hostname:8983/solr on recovery, try again

The ping request is issued by Solr itself during recovery, not by an
external script, so there is no way to stop it.

Here is the code where the timeout is hardcoded to 1 second:

try (HttpSolrClient httpSolrClient = new HttpSolrClient.Builder(leaderReplica.getCoreUrl())
    .withSocketTimeout(1000)
    .withConnectionTimeout(1000)
    .withHttpClient(cc.getUpdateShardHandler().getRecoveryOnlyHttpClient())
    .build()) {
  SolrPingResponse resp = httpSolrClient.ping();
  return leaderReplica;
} catch (IOException e) {
  log.info("Failed to connect leader {} on recovery, try again", leaderReplica.getBaseUrl());
  Thread.sleep(500);
} catch (Exception e) {
  if (e.getCause() instanceof IOException) {
    log.info("Failed to connect leader {} on recovery, try again", leaderReplica.getBaseUrl());
    Thread.sleep(500);
  } else {
    return leaderReplica;
  }
}
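
To illustrate why recovery can never start when the ping takes longer than the timeout, here is a minimal, Solr-free simulation of the retry loop above. It is only a sketch: the class and method names (PingTimeoutSketch, pingWithin, attemptsUntilSuccess, fakeLeader) are hypothetical, the timeout is modelled with Future.get rather than the socket and connection timeouts of HttpSolrClient, the attempt count is capped so the example terminates, and the durations are scaled down so it runs quickly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Simulation only: no Solr classes are used and every name here is hypothetical.
class PingTimeoutSketch {

  /** Runs one ping attempt and reports whether it completed within timeoutMs. */
  static boolean pingWithin(Callable<Void> ping, long timeoutMs) throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<Void> future = executor.submit(ping);
    try {
      future.get(timeoutMs, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException | ExecutionException e) {
      // Stands in for the IOException branch of the recovery code.
      return false;
    } finally {
      future.cancel(true);    // interrupt a ping that is still running
      executor.shutdownNow();
    }
  }

  /**
   * Retries the ping up to maxAttempts times, sleeping sleepMs between failures.
   * Returns the 1-based number of the first successful attempt, or -1 if none succeeded.
   */
  static int attemptsUntilSuccess(Callable<Void> ping, long timeoutMs, int maxAttempts, long sleepMs)
      throws InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (pingWithin(ping, timeoutMs)) {
        return attempt;
      }
      Thread.sleep(sleepMs);
    }
    return -1;
  }

  /** A fake leader whose ping handler needs latencyMs to answer. */
  static Callable<Void> fakeLeader(long latencyMs) {
    return () -> {
      Thread.sleep(latencyMs);
      return null;
    };
  }

  public static void main(String[] args) throws Exception {
    long timeoutMs = 200; // plays the role of the hardcoded 1000 ms
    // A slow "*:*"-style ping (2000 ms here) versus a cheap ping (5 ms here).
    System.out.println("slow ping: " + attemptsUntilSuccess(fakeLeader(2000), timeoutMs, 3, 50));
    System.out.println("fast ping: " + attemptsUntilSuccess(fakeLeader(5), timeoutMs, 3, 50));
  }
}
```

In this model, no number of retries helps while the ping latency stays above the timeout; the only ways out are a cheaper ping query or a larger timeout.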



On Mon, Aug 5, 2019 at 1:19 PM Nicolas Franck <nicolas.fra...@ugent.be>
wrote:

> If the ping request handler is taking too long,
> and the server is not recovering automatically,
> there is not much you can do automatically on that server.
> You have to intervene manually, and restart Solr on that node.
>
> First of all: the ping is just an internal check. If it takes too long
> to respond, the requester (i.e. the script calling it) should stop
> the request and mark that node as problematic. If there are, for
> example, memory problems, every subsequent request will only worsen
> the problem, and Solr cannot recover from that.
>
> > On 5 Aug 2019, at 06:15, dinesh naik <dineshkumarn...@gmail.com> wrote:
> >
> > Thanks Jörn, Erick and Furkan.
> >
> > I have already defined the ping request handler in solrconfig.xml as
> > below:
> >
> > <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
> >   <lst name="invariants">
> >     <str name="qt">/select</str><!-- handler to delegate to -->
> >     <str name="q">_root_:abc</str>
> >   </lst>
> > </requestHandler>
> >
> > My question is about the custom query. Here I am querying the _root_
> > field, which is available in all of my clusters and is defined as a
> > string field. The query _root_:abc might not match anything (I am OK
> > with no matches; the point is that the query should not take 10-15
> > seconds to respond).
> >
> > If the response comes back within 1 second, the core recovery issue is
> > solved. Hence I need your suggestion: is it fine to use the _root_ field
> > in the custom query?
> >
> >
> > On Mon, Aug 5, 2019 at 2:49 AM Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> >
> >> Hi,
> >>
> >> You can change invariants i.e. *qt* and *q* of a *PingRequestHandler*:
> >>
> >> <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
> >>   <lst name="invariants">
> >>     <str name="qt">/search</str><!-- handler to delegate to -->
> >>     <str name="q">some test query</str>
> >>   </lst>
> >> </requestHandler>
> >>
> >> Check the documentation for more info:
> >>
> >> https://lucene.apache.org/solr/7_6_0//solr-core/org/apache/solr/handler/PingRequestHandler.html
> >>
> >> Kind Regards,
> >> Furkan KAMACI
> >>
> >> On Sat, Aug 3, 2019 at 4:17 PM Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >>
> >>> You can also (I think) explicitly define the ping request handler in
> >>> solrconfig.xml to do something else.
> >>>
> >>>> On Aug 2, 2019, at 9:50 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> >>>>
> >>>> Not sure if this is possible, but why not create a query handler in
> >>>> Solr with any custom query and use that as a ping replacement?
> >>>>
> >>>>> On 02.08.2019 at 15:48, dinesh naik <dineshkumarn...@gmail.com> wrote:
> >>>>>
> >>>>> Hi all,
> >>>>> I have a few clusters with huge data sets, and whenever a node goes
> >>>>> down it is not able to recover, for the reasons below:
> >>>>>
> >>>>> 1. The ping request handler takes more than 10-15 seconds to respond.
> >>>>> The ping request, however, is expected to return in less than 1
> >>>>> second, and the recovery request fails if it does not respond in
> >>>>> that time. Therefore recoveries never start.
> >>>>>
> >>>>> 2. The soft commit interval is very low, i.e. 5 seconds. This is a
> >>>>> business requirement, so not much can be done here.
> >>>>>
> >>>>> As the standard/default admin/ping request handler uses *:* queries,
> >>>>> the response time is much higher, and I am looking for a way to
> >>>>> change it so that the ping handler returns results within a few
> >>>>> milliseconds.
> >>>>>
> >>>>> Here is an example of the standard query time:
> >>>>>
> >>>>> ----snip---
> >>>>> curl "http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false&debug=timing"
> >>>>> {
> >>>>> "responseHeader":{
> >>>>>  "zkConnected":true,
> >>>>>  "status":0,
> >>>>>  "QTime":16620,
> >>>>>  "params":{
> >>>>>    "q":"*:*",
> >>>>>    "distrib":"false",
> >>>>>    "debug":"timing",
> >>>>>    "indent":"on",
> >>>>>    "rows":"0",
> >>>>>    "wt":"json"}},
> >>>>> "response":{"numFound":1329638799,"start":0,"docs":[]
> >>>>> },
> >>>>> "debug":{
> >>>>>  "timing":{
> >>>>>    "time":16620.0,
> >>>>>    "prepare":{
> >>>>>      "time":0.0,
> >>>>>      "query":{
> >>>>>        "time":0.0},
> >>>>>      "facet":{
> >>>>>        "time":0.0},
> >>>>>      "facet_module":{
> >>>>>        "time":0.0},
> >>>>>      "mlt":{
> >>>>>        "time":0.0},
> >>>>>      "highlight":{
> >>>>>        "time":0.0},
> >>>>>      "stats":{
> >>>>>        "time":0.0},
> >>>>>      "expand":{
> >>>>>        "time":0.0},
> >>>>>      "terms":{
> >>>>>        "time":0.0},
> >>>>>      "block-expensive-queries":{
> >>>>>        "time":0.0},
> >>>>>      "slow-query-logger":{
> >>>>>        "time":0.0},
> >>>>>      "debug":{
> >>>>>        "time":0.0}},
> >>>>>    "process":{
> >>>>>      "time":16619.0,
> >>>>>      "query":{
> >>>>>        "time":16619.0},
> >>>>>      "facet":{
> >>>>>        "time":0.0},
> >>>>>      "facet_module":{
> >>>>>        "time":0.0},
> >>>>>      "mlt":{
> >>>>>        "time":0.0},
> >>>>>      "highlight":{
> >>>>>        "time":0.0},
> >>>>>      "stats":{
> >>>>>        "time":0.0},
> >>>>>      "expand":{
> >>>>>        "time":0.0},
> >>>>>      "terms":{
> >>>>>        "time":0.0},
> >>>>>      "block-expensive-queries":{
> >>>>>        "time":0.0},
> >>>>>      "slow-query-logger":{
> >>>>>        "time":0.0},
> >>>>>      "debug":{
> >>>>>        "time":0.0}}}}}
> >>>>>
> >>>>>
> >>>>> ----snap----
> >>>>>
> >>>>> Can we use the query _root_:abc in the ping request handler? I tried
> >>>>> this query, and it returns results within a few milliseconds, and the
> >>>>> nodes are able to recover without any issue.
> >>>>>
> >>>>> We want to use the _root_ field for querying because it is available
> >>>>> in all our clusters with the definition below:
> >>>>>
> >>>>> <field name="_root_" type="string" omitNorms="true" indexed="true"
> >>>>>        termOffsets="false" stored="false" termPayloads="false"
> >>>>>        termPositions="false" docValues="false" termVectors="false"/>
> >>>>>
> >>>>> Could you please let me know if using _root_ for querying in the
> >>>>> PingRequestHandler will cause any problem?
> >>>>>
> >>>>> <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
> >>>>>   <lst name="invariants">
> >>>>>     <str name="qt">/select</str><!-- handler to delegate to -->
> >>>>>     <str name="q">_root_:abc</str>
> >>>>>   </lst>
> >>>>> </requestHandler>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards,
> >>>>> Dinesh Naik
> >>>
> >>>
> >>
> >
> >
> > --
> > Best Regards,
> > Dinesh Naik
>
>

-- 
Best Regards,
Dinesh Naik
