SolrCloud: Specifying shardId not working correctly, although the failures are
inconsistent.
--------------------------------------------------------------------------------------------
Key: SOLR-3376
URL: https://issues.apache.org/jira/browse/SOLR-3376
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Reporter: Erick Erickson
I'm seeing some odd results when specifying the "shardId" parameter. I'm trying the
4-node, 2-shard example from the Wiki and specifying shardIds like this:
dir       shardId  start order  running ZK  port
example   1        1            y           8983
example2  2        2            y           7574
example3  1        3            y           8900
example4  2        4            y           7500
And I'm waiting a bit between starting various examples to let ZK settle down.
Once all of them are started, I was looking at
http://localhost:8983/solr/#/~cloud?view=graph to check out what that looks
like (pretty cool IMO, especially since I didn't have to do it). The problem
was that shard 2 reported only a single instance, while shard 1 showed the two
instances I was expecting. I'm running with 3 embedded ZK instances, just for
yucks. Interestingly the node that didn't show up was the only node that was
NOT running ZK.
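
The comparison described above (expected shard membership versus what the cloud graph shows) can be sketched as a small check. This is only an illustration: the function name missing_nodes is hypothetical, the expected layout is copied from the table in this report, and the observed layout is typed in by hand rather than read from Solr or ZK.

```python
# Expected layout, copied from the table in the report: jetty port -> shardId.
EXPECTED = {8983: 1, 7574: 2, 8900: 1, 7500: 2}


def missing_nodes(expected, observed):
    """Return {shard_id: sorted ports expected in that shard but not observed}.

    `observed` maps shard_id -> iterable of ports seen in that shard.
    It is a hand-entered assumption here, not something fetched from Solr.
    """
    missing = {}
    for port, shard in expected.items():
        if port not in observed.get(shard, ()):
            missing.setdefault(shard, []).append(port)
    return {shard: sorted(ports) for shard, ports in missing.items()}


if __name__ == "__main__":
    # Illustrative observation only: shard 1 shows two nodes, shard 2 shows one.
    observed = {1: [8983, 8900], 2: [7574]}
    print(missing_nodes(EXPECTED, observed))
```

An empty result would mean every node from the table showed up in the shard it was assigned to.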
When I removed all the "shardId" parameters, nuked zoo_data from all
directories and just started them up (with numShards=2 on the bootstrap ZK
node), all 4 nodes showed up just fine.
When starting with shardId specified and trying to go straight to the admin
interface on the node that wasn't showing up, I'd get odd errors like "This
interface requires that you activate the admin request handlers, add the
following configuration to your solrconfig.xml:". I also couldn't search
directly on that machine: "http://localhost:7574/solr/select?q=*:*" returns a
404 error.
Command starting server that's giving me trouble:
  java -Xmx1G -Djetty.port=7500 -DzkHost=localhost:9983,localhost:8574,localhost:9900 -DshardId=2 -jar start.jar
Command for one that works fine:
  java -Xmx1G -Djetty.port=8900 -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -DshardId=1 -jar start.jar
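
To make the difference between the two commands easier to see, here is a minimal sketch that rebuilds them from three inputs (jetty port, shardId, whether the node runs embedded ZK). The helper name start_command is hypothetical. Only the two commands quoted above are known to match. Commands for the other two nodes are not shown in this report, and the bootstrap node would need additional flags (such as numShards) that are left out here.

```python
# ZK ensemble string exactly as quoted in the report.
ZK_HOST = "localhost:9983,localhost:8574,localhost:9900"


def start_command(jetty_port, shard_id, run_zk):
    """Return the java command line for one node as a single string.

    Hypothetical helper: it covers only the flags that appear in the
    two commands quoted in the report.
    """
    args = ["java", "-Xmx1G", f"-Djetty.port={jetty_port}"]
    if run_zk:
        # -DzkRun makes this Solr node start an embedded ZooKeeper.
        args.append("-DzkRun")
    args.append(f"-DzkHost={ZK_HOST}")
    args.append(f"-DshardId={shard_id}")
    args += ["-jar", "start.jar"]
    return " ".join(args)


if __name__ == "__main__":
    print(start_command(7500, 2, run_zk=False))  # the node giving trouble
    print(start_command(8900, 1, run_zk=True))   # a node that works fine
```

The only differences between the failing and working commands are the port, the shardId, and the presence of -DzkRun.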
Sami Siren reports similar issues via e-mail conversation. Sami says
that ZK 3.5 apparently (without exhaustive tests) fixed the problem for him,
but when I tried ZK 3.5 I saw the same issue. Of course with all the recent
stuff with Ivy, I may have screwed up when/where the JARs were.
So then I went back to ZK 3.4 and couldn't reproduce the problem. Which seems
highly suspicious to me. It was failing every time before with 3.4, so it
sounds like gremlins.
And then I re-did the stuff with ZK 3.5 and it works fine there too now.
Siiiiggggh. Mostly this is a placeholder to ensure we try this. I guarantee
that sys admins will want to assign specific machines to specific shards, so
this'll get used.