I think SolrCloud's behavior has changed from what the wiki describes, but I was hoping someone could confirm. I checked JIRA and found a couple of issues requesting partial results if one server goes down, but that doesn't seem to be the issue here. I also checked CHANGES.txt and don't see anything that seems to apply.

I'm running "Example B: Simple two shard cluster with shard replicas" from the wiki at https://wiki.apache.org/solr/SolrCloud and everything starts out as expected. However, things get a little wonky when I get to the part about failover behavior.

I added data to the shard running on 7475. If I kill 7500, a query to any of the other servers works fine. But if I kill 7475, rather than getting zero results on a search to 8983 or 8900, I get a 503 error:

<response>
   <lst name="responseHeader">
      <int name="status">503</int>
      <int name="QTime">5</int>
      <lst name="params">
         <str name="q">*:*</str>
      </lst>
   </lst>
   <lst name="error">
      <str name="msg">no servers hosting shard:</str>
      <int name="code">503</int>
   </lst>
</response>
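
For anyone trying to reproduce this, a minimal sketch (Python, standard library only) of how to tell this error apart from an empty but successful result set. The response text is the one pasted above; the function name summarize_response is just an illustrative choice, not anything from Solr:

```python
import xml.etree.ElementTree as ET

# The 503 response quoted above, copied verbatim.
SAMPLE_RESPONSE = """<response>
   <lst name="responseHeader">
      <int name="status">503</int>
      <int name="QTime">5</int>
      <lst name="params">
         <str name="q">*:*</str>
      </lst>
   </lst>
   <lst name="error">
      <str name="msg">no servers hosting shard:</str>
      <int name="code">503</int>
   </lst>
</response>"""


def summarize_response(xml_text):
    """Return (status, error_message) from a Solr XML response.

    error_message is None when the response has no <lst name="error"> block.
    """
    root = ET.fromstring(xml_text)
    status_el = root.find("./lst[@name='responseHeader']/int[@name='status']")
    status = int(status_el.text)
    msg_el = root.find("./lst[@name='error']/str[@name='msg']")
    message = msg_el.text if msg_el is not None else None
    return status, message


if __name__ == "__main__":
    status, message = summarize_response(SAMPLE_RESPONSE)
    print("status:", status)
    print("error:", message)
```

A response with zero hits would carry no error block, so the second value would come back as None; here it carries the "no servers hosting shard:" message.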

I don't see any errors in the consoles.

Also, if I kill 8983, which hosts the ZooKeeper server, everything dies rather than staying in a steady state; the other servers continually show:

Nov 03, 2012 11:39:34 AM org.apache.zookeeper.ClientCnxn$SendThread startConnect
INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9983
Nov 03, 2012 11:39:35 AM org.apache.zookeeper.ClientCnxn$SendThread run
WARNING: Session 0x13ac6cf87890002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)

Nov 03, 2012 11:39:35 AM org.apache.zookeeper.ClientCnxn$SendThread startConnect

over and over again, and a call to any of the servers shows a connection error to 8983.
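
A quick way to see which processes are still accepting connections after each kill is a plain TCP probe. This is only a sketch in Python: the port numbers are the ones mentioned in this message (9983 being the ZooKeeper port from the log above), the helper name is_listening is made up for illustration, and an open port only shows that something is listening, not that Solr or ZooKeeper is healthy:

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from this message; adjust for a different layout.
    ports = {
        "Solr 8983": 8983,
        "ZooKeeper 9983": 9983,
        "Solr 8900": 8900,
        "Solr 7475": 7475,
        "Solr 7500": 7500,
    }
    for label, port in ports.items():
        state = "up" if is_listening("localhost", port) else "down"
        print(label, state)
```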

This is the current 4.0.0 release, running on Windows 7.

If this is the proper behavior and the wiki needs updating, fine; I just need to know. Otherwise, if anybody has any clues as to what I may be missing, I'd be grateful. :)

Thanks...

---  Nick
