[ 
https://issues.apache.org/jira/browse/SOLR-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995734#comment-15995734
 ] 

Mikhail Khludnev commented on SOLR-9867:
----------------------------------------

{code}
   [junit4]   2> 35073 INFO  (qtp1031286021-158) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections 
params={replicationFactor=2&maxShardsPerNode=4&collection.configName=testCloudExamplePrompt&name=testCloudExamplePrompt&action=CREATE&numShards=2&wt=json}
 status=0 QTime=5117
   [junit4]   2> 35108 INFO  (qtp1031286021-207) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node1 
x:testCloudExamplePrompt_shard2_replica1] o.a.s.h.SolrConfigHandler Executed 
config commands successfully and persisted to ZK 
[{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}]
   [junit4]   2> 35111 INFO  (qtp1031286021-207) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node1 
x:testCloudExamplePrompt_shard2_replica1] o.a.s.h.SolrConfigHandler Waiting up 
to 30 secs for 4 replicas to set the property overlay to be of version 0 for 
collection testCloudExamplePrompt
   [junit4]   2> 35112 INFO  (Thread-81) [n:localhost:32820_solr    ] 
o.a.s.c.SolrCore config update listener called for core 
testCloudExamplePrompt_shard2_replica2
   [junit4]   2> 35115 INFO  
(solrHandlerExecutor-81-thread-1-processing-n:localhost:32820_solr 
x:testCloudExamplePrompt_shard2_replica1 s:shard2 c:testCloudExamplePrompt 
r:core_node1) [n:localhost:32820_solr c:testCloudExamplePrompt s:shard2 
r:core_node1 x:testCloudExamplePrompt_shard2_replica1] 
o.a.s.h.SolrConfigHandler Time elapsed : 0 secs, maxWait 30
   [junit4]   2> 35115 INFO  (Thread-81) [n:localhost:32820_solr    ] 
o.a.s.c.SolrCore core reload testCloudExamplePrompt_shard2_replica2
{code}
The collection has been created, the config param update has been sent, and the
ZK listener {{(Thread-81)}} starts a core reload.
{code}
   [junit4]   2> 39099 INFO  (qtp1031286021-155) [n:localhost:32820_solr    ] 
o.a.s.m.SolrMetricManager Closing metric reporters for 
registry=solr.core.testCloudExamplePrompt.shard2.replica2, tag=149464127
   [junit4]   2> 39099 INFO  (qtp1031286021-155) [n:localhost:32820_solr    ] 
o.a.s.m.SolrMetricManager Closing metric reporters for 
registry=solr.collection.testCloudExamplePrompt.shard2.leader, tag=149464127
   [junit4]   2> 39106 INFO  
(zkCallback-16-thread-1-processing-n:localhost:32820_solr) 
[n:localhost:32820_solr    ] o.a.s.c.c.ZkStateReader A cluster state change: 
[WatchedEvent state:SyncConnected type:NodeDataChanged 
path:/collections/testCloudExamplePrompt/state.json] for collection 
[testCloudExamplePrompt] has occurred - updating... (live nodes size: [1])
   [junit4]   2> 39106 INFO  (qtp1031286021-221) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores 
params={deleteInstanceDir=true&core=testCloudExamplePrompt_shard1_replica2&qt=/admin/cores&deleteDataDir=true&action=UNLOAD&wt=javabin&version=2}
 status=0 QTime=107
   [junit4]   2> 39107 INFO  (Thread-81) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node3 
x:testCloudExamplePrompt_shard2_replica2] o.a.s.m.r.SolrJmxReporter JMX 
monitoring for 'solr.core.testCloudExamplePrompt.shard1.replica2' (registry 
'solr.core.testCloudExamplePrompt.shard1.replica2') enabled at server: 
com.sun.jmx.mbeanserver.JmxMBeanServer@167c2a09
   [junit4]   2> 39108 INFO  (qtp1031286021-187) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores 
params={deleteInstanceDir=true&core=testCloudExamplePrompt_shard2_replica1&qt=/admin/cores&deleteDataDir=true&action=UNLOAD&wt=javabin&version=2}
 status=0 QTime=77
   [junit4]   2> 39109 WARN  
(zkCallback-16-thread-1-processing-n:localhost:32820_solr) 
[n:localhost:32820_solr    ] o.a.s.c.LeaderElector Our node is no longer in 
line to be leader
   [junit4]   2> 39109 WARN  
(zkCallback-16-thread-2-processing-n:localhost:32820_solr) 
[n:localhost:32820_solr    ] o.a.s.c.LeaderElector Our node is no longer in 
line to be leader
   [junit4]   2> 39113 WARN  (Thread-81) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node3 
x:testCloudExamplePrompt_shard2_replica2] o.a.s.c.ZkController listener throws 
error
   [junit4]   2> org.apache.solr.common.SolrException: Unable to reload core 
[testCloudExamplePrompt_shard1_replica2]
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1197)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.lambda$getConfListener$18(SolrCore.java:2953)
   [junit4]   2>        at 
org.apache.solr.cloud.ZkController.lambda$fireEventListeners$4(ZkController.java:2350)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:748)
   [junit4]   2> Caused by: java.lang.NullPointerException
   [junit4]   2>        at 
org.apache.solr.metrics.SolrMetricManager.loadShardReporters(SolrMetricManager.java:1032)
   [junit4]   2>        at 
org.apache.solr.metrics.SolrCoreMetricManager.loadReporters(SolrCoreMetricManager.java:89)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:897)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.reload(SolrCore.java:648)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1184)
   [junit4]   2>        ... 3 more
   [junit4]   2> 39114 INFO  (Thread-81) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node3 
x:testCloudExamplePrompt_shard2_replica2] o.a.s.c.SolrCore config update 
listener called for core testCloudExamplePrompt_shard2_replica1
   [junit4]   2> 39114 INFO  (Thread-81) [n:localhost:32820_solr 
c:testCloudExamplePrompt s:shard2 r:core_node3 
x:testCloudExamplePrompt_shard2_replica2] o.a.s.c.SolrCore config update 
listener called for core testCloudExamplePrompt_shard1_replica1
   [junit4]   2> 39122 INFO  (qtp1031286021-155) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores 
params={deleteInstanceDir=true&core=testCloudExamplePrompt_shard2_replica2&qt=/admin/cores&deleteDataDir=true&action=UNLOAD&wt=javabin&version=2}
 status=0 QTime=89
   [junit4]   2> 39122 INFO  (qtp1031286021-208) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores 
params={deleteInstanceDir=true&core=testCloudExamplePrompt_shard1_replica1&qt=/admin/cores&deleteDataDir=true&action=UNLOAD&wt=javabin&version=2}
 status=0 QTime=116
   [junit4]   2> 39840 INFO  (qtp1031286021-207) [n:localhost:32820_solr    ] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections 
params={name=testCloudExamplePrompt&action=DELETE&wt=json} status=0 QTime=867
{code}
The core unload {{(qtp1031286021-155)}} closed the metrics registry, and right 
after that the reloading core tried to load its reporters into it. This causes 
the NPE, which leaks the "old" core; I don't know why yet. 
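
To illustrate the ordering described above, here is a minimal sketch of an unload racing a reload. The names ({{ReporterRaceSketch}}, {{loadReporterUnsafe}}, {{loadReporterSafe}}) are hypothetical and this is not Solr's {{SolrMetricManager}}; I am also only assuming that some state torn down by the unload is what ends up null at {{SolrMetricManager.java:1032}}.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-core metrics registry; not Solr code. */
final class ReporterRaceSketch {
    // Nulled by close(), mimicking state that the unload path tears down.
    private volatile Map<String, String> reporters = new ConcurrentHashMap<>();

    /** Called by the unload path. */
    void close() {
        reporters = null;
    }

    /** Unguarded load: throws NullPointerException if close() ran first. */
    void loadReporterUnsafe(String name) {
        reporters.put(name, "reporter");
    }

    /** Guarded load: reads the field once and refuses instead of throwing. */
    boolean loadReporterSafe(String name) {
        Map<String, String> current = reporters; // single read of the volatile field
        if (current == null) {
            return false; // registry already closed by a concurrent unload
        }
        current.put(name, "reporter");
        return true;
    }

    public static void main(String[] args) {
        ReporterRaceSketch registry = new ReporterRaceSketch();
        registry.close(); // the unload finishes first
        System.out.println("guarded load succeeded: " + registry.loadReporterSafe("shard"));
        try {
            registry.loadReporterUnsafe("shard"); // the reload arrives second
        } catch (NullPointerException e) {
            System.out.println("unguarded load hit an NPE after close");
        }
    }
}
```

A guard like this only avoids the exception; it would not by itself explain or fix the leaked core.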

I don't consider this NPE a blocker for the commit. WDYT?

> The Solr examples can not always be started after being stopped due to race 
> with loading core.
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9867
>                 URL: https://issues.apache.org/jira/browse/SOLR-9867
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mark Miller
>            Assignee: Mikhail Khludnev
>            Priority: Critical
>             Fix For: 6.6, master (7.0)
>
>         Attachments: SDF init and doFilter in parallel.png, 
> SOLR-9867-ignore-whitespace.patch, SOLR-9867.patch, SOLR-9867.patch, 
> SOLR-9867.patch, SOLR-9867.patch, SOLR-9867.patch, SOLR-9867-test.patch, 
> stdout_90
>
>
> I'm having trouble when I start up the schemaless example after shutting down.
> I first tracked this down to the fact that the run example tool is getting an 
> error when it tries to create the SolrCore (again, it already exists) and so 
> it deletes the core's instance dir, which leads to tlog and index lock errors 
> in Solr.
> The reason it seems to be trying to create the core when it already exists is 
> that the run example tool uses a core status call to check existence and 
> because the core is loading, we don't consider it as existing. I added a 
> check to look for core.properties.
> That seemed to let me start up, but my first requests failed because the core 
> was still loading. It appears CoreContainer#getCore is supposed to be 
> blocking so you don't have this problem, but there must be an issue, because 
> it is not blocking.
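
The {{core.properties}} check mentioned in the description could look roughly like the sketch below. The class and method names are hypothetical and this is not the code from the attached patches; it only assumes that a core's instance directory contains a {{core.properties}} file whether or not the core has finished loading.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not taken from the run example tool. */
final class CoreExistsSketch {
    /** True if instanceDir holds a core.properties file, regardless of load state. */
    static boolean coreExistsOnDisk(Path instanceDir) {
        return Files.isRegularFile(instanceDir.resolve("core.properties"));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("core-sketch");
        System.out.println("before: " + coreExistsOnDisk(dir));
        Files.createFile(dir.resolve("core.properties"));
        System.out.println("after: " + coreExistsOnDisk(dir));
    }
}
```

Unlike a core status call, a file check like this does not depend on whether the core is still loading, so it would not trigger a second create (and the instance dir deletion that follows a failed create).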



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
