[jira] [Updated] (SOLR-6923) AutoAddReplicas should consult live nodes also to see if a state has changed

2015-02-28 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-6923:

  Component/s: SolrCloud
Fix Version/s: 4.10.5

I am marking this for 4.10.5 whenever that happens. I fixed the bug I reported 
in my last comment with SOLR-7178.

 AutoAddReplicas should consult live nodes also to see if a state has changed
 

 Key: SOLR-6923
 URL: https://issues.apache.org/jira/browse/SOLR-6923
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Varun Thacker
Assignee: Mark Miller
 Fix For: 4.10.5, 5.0, Trunk

 Attachments: SOLR-6923.patch


 - I did the following 
 {code}
 ./solr start -e cloud -noprompt
 kill -9 <pid-of-node2>   # not the node that is running ZK
 {code}
 - /live_nodes reflects that the node is gone.
 - This is the only message logged on the node1 server after killing node2:
 {code}
 45812 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:9983] WARN  org.apache.zookeeper.server.NIOServerCnxn  – caught end of stream exception
 EndOfStreamException: Unable to read additional data from client sessionid 0x14ac40f26660001, likely client has closed socket
     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
     at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
     at java.lang.Thread.run(Thread.java:745)
 {code}
 - The graph shows node2 in the 'Gone' state.
 - clusterstate.json keeps showing the replica as 'active'
 {code}
 {collection1:{
     shards:{shard1:{
         range:8000-7fff,
         state:active,
         replicas:{
           core_node1:{
             state:active,
             core:collection1,
             node_name:169.254.113.194:8983_solr,
             base_url:http://169.254.113.194:8983/solr,
             leader:true},
           core_node2:{
             state:active,
             core:collection1,
             node_name:169.254.113.194:8984_solr,
             base_url:http://169.254.113.194:8984/solr}}}},
     maxShardsPerNode:1,
     router:{name:compositeId},
     replicationFactor:1,
     autoAddReplicas:false,
     autoCreated:true}}
 {code}
 One immediate problem I can see is that AutoAddReplicas doesn't work, since 
 clusterstate.json never changes. Other features may be affected by this as 
 well.
 On first thought, I think we can handle this as follows: the shard leader 
 could listen for changes on /live_nodes and, if it has replicas on a node 
 that has gone away, mark them as 'down' in clusterstate.json?
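
To make the proposal above concrete, here is a minimal sketch in Python. It is not Solr code: the function name `mark_replicas_down` and the dict layout are assumptions made for illustration, loosely modelled on the clusterstate.json shown in the description. It only shows the idea of comparing each replica's node_name against the current live nodes and marking the replica 'down' when its node is missing.

```python
from copy import deepcopy


def mark_replicas_down(cluster_state, live_nodes):
    """Return a copy of cluster_state in which every replica hosted on a
    node that is missing from live_nodes has its state set to 'down'.

    Hypothetical helper for illustration only; not part of Solr.
    """
    updated = deepcopy(cluster_state)
    for collection in updated.values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                if replica.get("node_name") not in live_nodes:
                    replica["state"] = "down"
    return updated


# Illustrative data modelled on the clusterstate.json in the description.
state = {
    "collection1": {
        "shards": {
            "shard1": {
                "state": "active",
                "replicas": {
                    "core_node1": {
                        "state": "active",
                        "node_name": "169.254.113.194:8983_solr",
                    },
                    "core_node2": {
                        "state": "active",
                        "node_name": "169.254.113.194:8984_solr",
                    },
                },
            }
        }
    }
}

# After kill -9 on node2, only node1 remains under /live_nodes.
live = {"169.254.113.194:8983_solr"}
fixed = mark_replicas_down(state, live)
```

The sketch returns a modified copy rather than mutating its input, so the stale state and the corrected state can be compared side by side.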



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6923) AutoAddReplicas should consult live nodes also to see if a state has changed

2015-01-13 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-6923:
--
Assignee: Anshum Gupta

 AutoAddReplicas should consult live nodes also to see if a state has changed
 

 Key: SOLR-6923
 URL: https://issues.apache.org/jira/browse/SOLR-6923
 Project: Solr
  Issue Type: Bug
Reporter: Varun Thacker
Assignee: Anshum Gupta
 Fix For: 5.0, Trunk

 Attachments: SOLR-6923.patch





[jira] [Updated] (SOLR-6923) AutoAddReplicas should consult live nodes also to see if a state has changed

2015-01-12 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-6923:

Attachment: SOLR-6923.patch

A simple patch that checks against live nodes before short-circuiting.

SharedFSAutoReplicaFailoverTest passes. 
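
For readers who have not opened the patch, the idea of checking live nodes before short-circuiting can be sketched as follows. This is not the attached patch: the function name `can_short_circuit` and its parameters are hypothetical, and the sketch assumes the failover check skips its work when the cluster state version has not changed since the previous pass. Under that assumption, the skip is only safe when the set of live nodes is unchanged too.

```python
def can_short_circuit(last_version, version, last_live_nodes, live_nodes):
    """Return True only when neither the cluster state version nor the set
    of live nodes has changed since the previous pass.

    Hypothetical helper for illustration only; not the code in the patch.
    """
    return last_version == version and set(last_live_nodes) == set(live_nodes)


node1 = "169.254.113.194:8983_solr"
node2 = "169.254.113.194:8984_solr"

# Nothing changed: skipping the failover check is safe.
unchanged = can_short_circuit(7, 7, {node1, node2}, {node1, node2})

# node2 was killed: the state version is the same, but the live nodes differ,
# so the check must not be skipped.
node_lost = can_short_circuit(7, 7, {node1, node2}, {node1})
```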

 AutoAddReplicas should consult live nodes also to see if a state has changed
 

 Key: SOLR-6923
 URL: https://issues.apache.org/jira/browse/SOLR-6923
 Project: Solr
  Issue Type: Bug
Reporter: Varun Thacker
 Attachments: SOLR-6923.patch





[jira] [Updated] (SOLR-6923) AutoAddReplicas should consult live nodes also to see if a state has changed

2015-01-12 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-6923:
---
Fix Version/s: Trunk
   5.0

 AutoAddReplicas should consult live nodes also to see if a state has changed
 

 Key: SOLR-6923
 URL: https://issues.apache.org/jira/browse/SOLR-6923
 Project: Solr
  Issue Type: Bug
Reporter: Varun Thacker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6923.patch





[jira] [Updated] (SOLR-6923) AutoAddReplicas should consult live nodes also to see if a state has changed

2015-01-11 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-6923:

Summary: AutoAddReplicas should consult live nodes also to see if a state 
has changed  (was: kill -9 doesn't change the replica state in 
clusterstate.json)

 AutoAddReplicas should consult live nodes also to see if a state has changed
 

 Key: SOLR-6923
 URL: https://issues.apache.org/jira/browse/SOLR-6923
 Project: Solr
  Issue Type: Bug
Reporter: Varun Thacker
