Darren Spehr created SOLR-6995:
----------------------------------
Summary: On a new cluster setup, new nodes never move out of the
'down' state.
Key: SOLR-6995
URL: https://issues.apache.org/jira/browse/SOLR-6995
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.10.3
Reporter: Darren Spehr
Priority: Blocker
This is related to a question I posted on
[Stack Overflow|http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs]
a day ago.
When deploying a new cluster, new nodes never proceed to an 'active' state. I
have tried both a custom (yet simple) deployment and the example framework
that ships with the Solr distribution. After inspecting the logs and comparing
them to the output of a 4.10.1 startup, it looks like the nodes never move on
to leader election.
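One way to make the log comparison less manual is to extract the states each core publishes. The sketch below is only an illustration: it assumes the log contains ZkController lines of the form `publishing core=... state=... collection=...` (as in the 4.10.3 output in this report) and that a healthy startup eventually publishes `state=active` in the same format. The function names are hypothetical.

```python
import re

# Matches ZkController lines such as:
#   "... publishing core=ttPoiMDS state=down collection=ttPoiMDS"
PUBLISH_RE = re.compile(r"publishing core=(\S+) state=(\S+) collection=(\S+)")


def published_states(log_text):
    """Return {core: [state, ...]} in the order the states were published."""
    # Log lines may be wrapped (as they are in this report), so collapse
    # all whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    states = {}
    for core, state, _collection in PUBLISH_RE.findall(flat):
        states.setdefault(core, []).append(state)
    return states


def stuck_cores(log_text):
    """Return the cores whose last published state is not 'active'."""
    return sorted(
        core
        for core, seq in published_states(log_text).items()
        if seq[-1] != "active"
    )


if __name__ == "__main__":
    # Adapted from the 4.10.3 log in this report (wrapped across two lines).
    sample = (
        "4767 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController -\n"
        "publishing core=ttPoiMDS state=down collection=ttPoiMDS\n"
    )
    print(published_states(sample))
    print(stuck_cores(sample))
```

Running the same function over a 4.10.1 log and a 4.10.3 log would show whether the sequence of published states differs between the two versions.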
For an apples-to-apples comparison I moved the solr/collection1 directory to
solr/ttPoiMDS and modified core.properties to match. I also removed the 'conf'
directory, assuming that Solr would pick up the configuration from our
ZooKeeper cluster.
Here is the startup command:
{code}
java
-Dcollection.configName=ttPoiMDS
-Djava.util.logging.config.file=etc/logging.properties
-DzkHost=our_zk_host_1:<zk_port>
-Djetty.port=8983
-jar start.jar
{code}
Here is the output in the logs:
{panel}
0 [main] INFO org.eclipse.jetty.server.Server – jetty-8.1.10.v20130312
32 [main] INFO org.eclipse.jetty.deploy.providers.ScanningAppProvider –
Deployment monitor /site/pkgs/solr/solr-4.10.3/example/contexts at interval 0
38 [main] INFO org.eclipse.jetty.deploy.DeploymentManager – Deployable
added: /site/pkgs/solr/solr-4.10.3/example/contexts/solr-jetty-context.xml
114 [main] INFO org.eclipse.jetty.webapp.WebInfConfiguration – Extract
jar:file:/site/pkgs/solr/solr-4.10.3/example/webapps/solr.war!/ to
/site/pkgs/solr/solr-4.10.3/example/solr-webapp/webapp
1527 [main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor – NO
JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
1597 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init()
1616 [main] INFO org.apache.solr.core.SolrResourceLoader – JNDI not
configured for solr (NoInitialContextEx)
1616 [main] INFO org.apache.solr.core.SolrResourceLoader – solr home
defaulted to 'solr/' (could not find system property or JNDI)
1617 [main] INFO org.apache.solr.core.SolrResourceLoader – new
SolrResourceLoader for directory: 'solr/'
1748 [main] INFO org.apache.solr.core.ConfigSolr – Loading container
configuration from /site/pkgs/solr/solr-4.10.3/example/solr/solr.xml
1884 [main] INFO org.apache.solr.core.CoresLocator – Config-defined core root
directory: /site/pkgs/solr/solr-4.10.3/example/solr
1890 [main] INFO org.apache.solr.core.CoreContainer – New CoreContainer
378496804
1890 [main] INFO org.apache.solr.core.CoreContainer – Loading cores into
CoreContainer [instanceDir=solr/]
1905 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting socketTimeout to: 0
1905 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting urlScheme to: null
1909 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting connTimeout to: 0
1911 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting maxConnectionsPerHost to: 20
1912 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting corePoolSize to: 0
1912 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting maximumPoolSize to: 2147483647
1912 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting maxThreadIdleTime to: 5
1912 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting sizeOfQueue to: -1
1912 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory –
Setting fairnessPolicy to: false
2054 [main] INFO org.apache.solr.update.UpdateShardHandler – Creating
UpdateShardHandler HTTP client with params:
socketTimeout=0&connTimeout=0&retry=false
2056 [main] INFO org.apache.solr.logging.LogWatcher – SLF4J impl is
org.slf4j.impl.Log4jLoggerFactory
2057 [main] INFO org.apache.solr.logging.LogWatcher – Registering Log
Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
2058 [main] INFO org.apache.solr.core.CoreContainer – Host Name:
2059 [main] INFO org.apache.solr.core.ZkContainer – Zookeeper
client=our_zk_host_1:2195
2140 [main] INFO org.apache.solr.common.cloud.ConnectionManager – Waiting for
client to connect to ZooKeeper
2220 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager –
Watcher org.apache.solr.common.cloud.ConnectionManager@36dadde6
name:ZooKeeperConnection Watcher:our_zk_host_1:2195 got event WatchedEvent
state:SyncConnected type:None path:null path:null type:None
2220 [main] INFO org.apache.solr.common.cloud.ConnectionManager – Client is
connected to ZooKeeper
2293 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/queue
2447 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/collection-queue-work
2609 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/collection-map-running
2758 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/collection-map-completed
2919 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/collection-map-failure
3093 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/live_nodes
3173 [main] INFO org.apache.solr.cloud.ZkController – Register node as live
in ZooKeeper:/live_nodes/our_solr_host:8983_solr
3201 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/live_nodes/our_solr_host:8983_solr
3371 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer_elect
3477 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer_elect/election
3577 [main] INFO org.apache.solr.cloud.Overseer – Overseer (id=null) closing
3659 [main] INFO org.apache.solr.cloud.ElectionContext – I am going to be the
leader our_solr_host:8983_solr
3662 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer_elect/leader
3763 [main] INFO org.apache.solr.cloud.Overseer – Overseer
(id=93155855038349319-our_solr_host:8983_solr-n_0000000000) starting
3887 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/overseer/queue-work
4409 [main] INFO org.apache.solr.cloud.OverseerAutoReplicaFailoverThread –
Starting OverseerAutoReplicaFailoverThread
autoReplicaFailoverWorkLoopDelay=10000
autoReplicaFailoverWaitAfterExpiration=30000
autoReplicaFailoverBadNodeExpiration=60000
4438
[OverseerCollectionProcessor-93155855038349319-our_solr_host:8983_solr-n_0000000000]
INFO org.apache.solr.cloud.OverseerCollectionProcessor – Process current
queue of collection creations
4456 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath:
/clusterstate.json
4559 [main] INFO org.apache.solr.common.cloud.ZkStateReader – Updating
cluster state from ZooKeeper...
4731
[OverseerStateUpdate-93155855038349319-our_solr_host:8983_solr-n_0000000000]
INFO org.apache.solr.cloud.Overseer – Starting to work on the main queue
4736 [main] INFO org.apache.solr.core.CoresLocator – Looking for core
definitions underneath /site/pkgs/solr/solr-4.10.3/example/solr
4744 [main] INFO org.apache.solr.core.CoresLocator – Found core ttPoiMDS in
/site/pkgs/solr/solr-4.10.3/example/solr/ttPoiMDS/
4744 [main] INFO org.apache.solr.core.CoresLocator – Found 1 core definitions
4767 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
publishing core=ttPoiMDS state=down collection=ttPoiMDS
4767 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
numShards not found on descriptor - reading it from system property
4796 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
look for our core node name
4797 [zkCallback-2-thread-1] INFO org.apache.solr.cloud.DistributedQueue –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
4897
[OverseerStateUpdate-93155855038349319-our_solr_host:8983_solr-n_0000000000]
INFO org.apache.solr.cloud.Overseer – Update state numShards=null message={
"operation":"state",
"shard":null,
"roles":null,
"state":"down",
"core":"ttPoiMDS",
"collection":"ttPoiMDS",
"node_name":"our_solr_host:8983_solr",
"base_url":"http://our_solr_host:8983/solr"}
4899
[OverseerStateUpdate-93155855038349319-our_solr_host:8983_solr-n_0000000000]
INFO org.apache.solr.cloud.Overseer – Assigning new node to shard shard=shard1
5052 [zkCallback-2-thread-1] INFO org.apache.solr.common.cloud.ZkStateReader
– A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
5800 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
waiting to find shard id in clusterstate for ttPoiMDS
5801 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
Check for collection zkNode:ttPoiMDS
5924 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.cloud.ZkController –
Collection zkNode exists
5925 [coreLoadExecutor-6-thread-1] INFO
org.apache.solr.common.cloud.ZkStateReader – Load collection config
from:/collections/ttPoiMDS
5970 [coreLoadExecutor-6-thread-1] INFO
org.apache.solr.common.cloud.ZkStateReader – path=/collections/ttPoiMDS
configName=ttPoiMDS specified config exists in ZooKeeper
5970 [coreLoadExecutor-6-thread-1] INFO
org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for
directory: '/site/pkgs/solr/solr-4.10.3/example/solr/ttPoiMDS/'
6054 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.core.SolrConfig –
Adding specified lib dirs to ClassLoader
6055 [coreLoadExecutor-6-thread-1] WARN
org.apache.solr.core.SolrResourceLoader – Can't find (or read) directory to
add to classloader: ../../lib/mq (resolved as:
/site/pkgs/solr/solr-4.10.3/example/solr/ttPoiMDS/../../lib/mq).
6139 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.core.SolrConfig –
Using Lucene MatchVersion: 4.10.3
6238 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.core.Config – Loaded
SolrConfig: solrconfig.xml
6320 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.schema.IndexSchema –
Reading Solr Schema from /configs/ttPoiMDS/schema.xml
6347 [coreLoadExecutor-6-thread-1] INFO org.apache.solr.schema.IndexSchema –
[ttPoiMDS] Schema name=poiMDS
6582 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
user.dir=/site/pkgs/solr/solr-4.10.3/example
6582 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init() done
6614 [main] INFO org.eclipse.jetty.server.AbstractConnector – Started
[email protected]:8983
{panel}
Startup then seems to hang. The Solr UI is available, but it shows that no
cores are available. In the cloud view the configuration from ZooKeeper is
present.
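For anyone trying to confirm the same symptom, the following is a minimal sketch that inspects a copy of /clusterstate.json and lists replicas that are not 'active' and shards that have no leader. It assumes the 4.x layout of collection -> "shards" -> shard -> "replicas" -> replica, with a "state" property on each replica and "leader":"true" on the leader. The sample document and the replica name core_node1 are hypothetical, not taken from our cluster.

```python
import json


def non_active_replicas(clusterstate):
    """Return (collection, shard, replica, state) for replicas not 'active'."""
    found = []
    for coll_name, coll in clusterstate.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state")
                if state != "active":
                    found.append((coll_name, shard_name, replica_name, state))
    return found


def shards_without_leader(clusterstate):
    """Return (collection, shard) pairs where no replica is marked as leader."""
    found = []
    for coll_name, coll in clusterstate.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {}).values()
            # Assumption: the leader replica carries "leader": "true".
            if not any(r.get("leader") == "true" for r in replicas):
                found.append((coll_name, shard_name))
    return found


if __name__ == "__main__":
    # Hypothetical clusterstate.json content matching the symptom described.
    sample = json.loads("""
    {"ttPoiMDS": {"shards": {"shard1": {"replicas": {"core_node1": {
        "state": "down",
        "core": "ttPoiMDS",
        "node_name": "our_solr_host:8983_solr",
        "base_url": "http://our_solr_host:8983/solr"}}}}}}
    """)
    print(non_active_replicas(sample))
    print(shards_without_leader(sample))
```

On a healthy single-node cluster both lists would be expected to be empty once startup completes.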