/clusterstate.json seems to clearly state that all 3 nodes are alive, have ranges, and are active.
Still, it would seem that java is still not properly installed. ZooKeeper is dropping zookeeper.out in the /bin directory, which says this, among other things:

Server environment:java.home=/usr/local/java/jdk1.7.0_40/jre
Server environment:java.class.path=/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/classes:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../conf:
Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib

There is no /usr/java/... It's really a mystery where zookeeper is getting these values; everything else seems right.
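
One way to narrow down where those values come from is to print the same system properties from a tiny standalone program, run with the same java binary that starts ZooKeeper. The sketch below is only an illustration: the class name JvmEnvProbe and its methods are made up for this message and are not part of ZooKeeper or Solr. As far as I know, on Linux HotSpot JVMs the /usr/java/packages/lib/... entry is part of the JVM's built-in default for java.library.path, so it can appear even when no /usr/java directory exists; that is an assumption worth checking against the output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of ZooKeeper or Solr): prints a few of the
 * JVM system properties that ZooKeeper logs at startup.
 */
public class JvmEnvProbe {

    // Property names that show up in zookeeper.out as
    // "Server environment:<name>=<value>".
    public static final String[] KEYS = {
        "java.version", "java.home", "java.class.path", "java.library.path"
    };

    /** Collects the selected system properties in a stable order. */
    public static Map<String, String> collect() {
        Map<String, String> values = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            values.put(key, System.getProperty(key));
        }
        return values;
    }

    /** Formats the properties one per line, mimicking the ZooKeeper log layout. */
    public static String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : collect().entrySet()) {
            sb.append("Server environment:")
              .append(e.getKey()).append('=').append(e.getValue())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

If the java.home and java.library.path lines printed here match the ones in zookeeper.out, the values are coming from the JVM itself rather than from anything in the ZooKeeper configuration.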
But, for me, here's the amazing chunk of traces (cleaned up a bit):

Accepted socket connection from /127.0.0.1:39065
Client attempting to establish new session at /127.0.0.1:39065
Established session 0x1421197e6e90002 with negotiated timeout 15000 for client /127.0.0.1:39065
Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0x1 zxid:0xc0 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0x3 zxid:0xc1 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:delete cxid:0xe zxid:0xc2 txntype:-1 reqpath:n/a Error Path:/live_nodes/127.0.1.1:7590_solr Error:KeeperErrorCode = NoNode for /live_nodes/127.0.1.1:7590_solr
Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:delete cxid:0x9f zxid:0xcd txntype:-1 reqpath:n/a Error Path:/collections/collection1/leaders/shard3 Error:KeeperErrorCode = NoNode for /collections/collection1/leaders/shard3
2013-10-31 21:01:19,344 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0xa0 zxid:0xce txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0xaa zxid:0xd1 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
Accepted socket connection from /10.1.10.180:55528
Client attempting to establish new session at /10.1.10.180:55528
Established session 0x1421197e6e90003 with negotiated timeout 10000 for client /10.1.10.180:55528
WARN Exception causing close of session 0x1421197e6e90003 due to java.io.IOException: Connection reset by peer
Closed socket connection for client /10.1.10.180:55528 which had sessionid 0x1421197e6e90003

Sockets from 10.1.10.180 are my windoz box shipping solr documents. I am not sure how I am using 55528 unless that's a solrj behavior. Connection reset by peer would suggest something in my code, but my code is a clone of code supplied in a Solr training course. Must be good. Right?

I also have no clue what is /127.0.0.1:39065 -- that's not one of my nodes.

The quest continues.

On Fri, Nov 1, 2013 at 9:21 AM, Jack Park <jackp...@topicquests.org> wrote:
> Alan,
> That was brilliant!
> My test harness was behind a couple of notches.
>
> Hah! So, now we open yet another can of strange looking creatures, namely:
>
> No live SolrServers available to handle this
> request:[http://127.0.1.1:8983/solr/collection1]
> at
> org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:347)
>
> 3 times, once for each URL I passed into the server. Here is the code:
>
> String zkurl = "10.1.10.178:2181";
> String solrurla = "10.1.10.178:8983";
> String solrurlb = "10.1.10.178:7574";
> String solrurlc = "10.1.10.178:7590";
>
> LBHttpSolrServer sv = new LBHttpSolrServer(solrurla,solrurlb,solrurlc);
> CloudSolrServer server = new CloudSolrServer(zkurl,sv);
> server.setDefaultCollection("collection1");
>
> I am struggling to imagine how 10.1.10.178 got translated to 127.0.1.1
> and the port assignments ignored for each URL passed in.
>
> That error message seems well known to search engines. One suggestion
> is to check the zookeeper logs. According to the zookeeper's log4j
> properties, there should be a zookeeper.log in the zookeeper
> directory. There is no such log. I went to /etc/zookeeper/Version_2
> and looked at log.1 (binary) but could see hints that this might be
> where the 127.0.1.1 is coming from: zookeeper sending such an error
> message back.
> This would suggest that, somehow or other, my nodes are
> not properly registering themselves, though no error messages were
> tossed when each node was booted.
>
> solr.log for node1 only reflects queries from the admin page.
>
> That's what I am working on now.
> Thanks!
>
> On Fri, Nov 1, 2013 at 6:03 AM, Alan Woodward <a...@flax.co.uk> wrote:
>> Unknown document router errors are usually caused by using different solr
>> and solrj versions - which version of solr and solrj are you using?
>>
>> Alan Woodward
>> www.flax.co.uk
>>
>>
>> On 1 Nov 2013, at 04:19, Jack Park wrote:
>>
>>> After digging deeper (slow for a *nix newbee), I uncovered issues with
>>> the java installation. A step in installation of Oracle Java has it
>>> that you -install "java" with the path to <dir>/bin/java. That done,
>>> zookeeper seems to be running.
>>>
>>> I booted three cores (on the same box) -- this is the simple one-box
>>> 3-node cloud test, and used the test code from the Lucidworks course
>>> to send over and read some documents. That failed with this:
>>> Unknown document router '{name=compositeId}'
>>>
>>> Lots more research.
>>> Closer...
>>>
>>> On Thu, Oct 31, 2013 at 5:44 PM, Jack Park <jackp...@topicquests.org> wrote:
>>>> Latest zookeeper is installed on an Ubuntu server box.
>>>> Java is 1.7 latest build.
>>>> whereis points to java just fine.
>>>> /etc/zookeeper is empty.
>>>>
>>>> boot zookeeper from /bin as sudo ./zkServer.sh start
>>>> Console says "Started"
>>>> /etc/zookeeper now has a .pid file
>>>> In another console, ./zkServer.sh status returns:
>>>> "It's probably not running"
>>>>
>>>> An interesting fact: the log4j.properties file says there should be a
>>>> zookeeper.log file in "."; there is no log file. When I do a text
>>>> search in the zookeeper source code for where it picks up the
>>>> log4j.properties, nothing is found.
>>>>
>>>> Fascinating, what? This must be a common beginner's question, not
>>>> well covered in web-search for my context. Does it ring any bells?
>>>>
>>>> Many thanks.
>>>> Jack
>>