[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882316#comment-13882316
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1561499 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1561499 ]

SOLR-5577: Harden leaking Timer thread.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882315#comment-13882315
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1561497 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1561497 ]

SOLR-5577: Harden leaking Timer thread.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876925#comment-13876925
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1559844 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1559844 ]

SOLR-5577: Protect against NPE race with close - spotted by Greg Chanan

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876941#comment-13876941
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1559847 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1559847 ]

SOLR-5577: Protect against NPE race with close - spotted by Greg Chanan

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876013#comment-13876013
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1559587 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1559587 ]

SOLR-5577: Harden leaking Timer thread.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876014#comment-13876014
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1559588 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1559588 ]

SOLR-5577: Harden leaking Timer thread.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876015#comment-13876015
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1559589 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1559589 ]

SOLR-5577: Harden leaking Timer thread.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869681#comment-13869681
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557769 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1557769 ]

SOLR-5577: Remove CHANGES entry from 4.7

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869703#comment-13869703
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557773 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1557773 ]

SOLR-5577: Remove CHANGES entry from 4.7

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867853#comment-13867853
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557137 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1557137 ]

SOLR-5577: timer thread should be run as daemon

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867852#comment-13867852
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557136 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1557136 ]

SOLR-5577: timer thread should be run as daemon

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867865#comment-13867865
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557140 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557140 ]

SOLR-5577: Likely ZooKeeper expiration should not slow down updates a given 
amount, but instead cut off updates after a given time.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867913#comment-13867913
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557154 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557154 ]

SOLR-5577: Don't start a timer on disconnect if we are already closed

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867920#comment-13867920
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557157 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557157 ]

SOLR-5577: don't start timer thread in constructor

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867916#comment-13867916
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557155 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557155 ]

SOLR-5577: fix possible race on canceling timer on close

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867926#comment-13867926
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557160 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557160 ]

SOLR-5577: don't start timer thread on expiration

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867927#comment-13867927
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1557161 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_6'
[ https://svn.apache.org/r1557161 ]

SOLR-5577: timer thread should be run as daemon

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-06 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863229#comment-13863229
 ] 

Ramkumar Aiyengar commented on SOLR-5577:
-

Mark, any reason why we start the disconnect timer when we get an EXPIRED? 
Shouldn't we just set cancel the timer and set likelyExpired to true?

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863257#comment-13863257
 ] 

Mark Miller commented on SOLR-5577:
---

Thanks Ramkumar! I used some refactoring logic to extract a method for setting 
connected=false - didn't catch that one. Not necessary.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863332#comment-13863332
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1556002 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1556002 ]

SOLR-5577: don't start timer thread on expiration

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863335#comment-13863335
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1556003 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1556003 ]

SOLR-5577: don't start timer thread on expiration

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862629#comment-13862629
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1555610 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1555610 ]

SOLR-5577: don't start timer thread in constructor

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862630#comment-13862630
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1555611 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1555611 ]

SOLR-5577: don't start timer thread in constructor

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861794#comment-13861794
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1555204 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1555204 ]

SOLR-5577: fix possible race on canceling timer on close

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2014-01-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861807#comment-13861807
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1555207 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1555207 ]

SOLR-5577: fix possible race on canceling timer on close

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858108#comment-13858108
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1553912 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553912 ]

SOLR-5577: Likely ZooKeeper expiration should not slow down updates a given 
amount, but instead cut off updates after a given time.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858109#comment-13858109
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1553914 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553914 ]

SOLR-5577: Likely ZooKeeper expiration should not slow down updates a given 
amount, but instead cut off updates after a given time.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858191#comment-13858191
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1553982 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553982 ]

SOLR-5577: Don't start a timer on disconnect if we are already closed

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858190#comment-13858190
 ] 

ASF subversion and git services commented on SOLR-5577:
---

Commit 1553981 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553981 ]

SOLR-5577: Don't start a timer on disconnect if we are already closed

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-26 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857233#comment-13857233
 ] 

Mark Miller commented on SOLR-5577:
---

I think perhaps we should wait up to 80 or 90% of the session time-out before 
we start rejecting updates.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: SOLR-5577.patch


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856720#comment-13856720
 ] 

Mark Miller commented on SOLR-5577:
---

You would have to be dc'd from zk for the length of the session timeout before 
a new leader election would start - which is why it should be safe to accept 
updates for about up to the session timeout after losing the zk connection. We 
just have to fix the lazy impl. 

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-24 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856403#comment-13856403
 ] 

Ramkumar Aiyengar commented on SOLR-5577:
-

Currently when we get disconnected from ZK, do we know if that is due to the 
session expiration? From what I can see, and I am probably wrong, we get to 
know that the session expired only when we next try to reconnect. If so, and if 
we were the leader, the session expiry would have triggered a re-election 
before we reconnect, and we shouldn't be processing them as the leader in the 
first place -- so we might end up with two nodes thinking they are the leader 
for the period from the disconnect to when we try to reconnect and get a 
session expiry.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855835#comment-13855835
 ] 

Mark Miller commented on SOLR-5577:
---

Our model should be able to handle this better.

Some off the cough remarks:

* Our model should be fine with turning off updates only after the connection 
with zk is lost for a while, rather than the moment it's noticed. 

* Even if we didn't want to relax the above, we should be able to handle this 
case better - if the issue is that ZooKeeper is actually unavailable, we won't 
get new leaders or anything anyway, so no reason to be too concerned about 
turning off updates. Not sure how easy that is to detect though.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855874#comment-13855874
 ] 

Anshum Gupta commented on SOLR-5577:


A shard might update it's state while zk is away (and no one else knows about 
it) and perhaps we're trying to avoid any such cases by rejecting the updates 
for as long as zk is unavailable. There might be concerns about consistency if 
we get any less strict, or so I think.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855877#comment-13855877
 ] 

Ramkumar Aiyengar commented on SOLR-5577:
-

The proposal here is to only relax updating the collection for a while when ZK 
connection is lost. If a shard updates its state, wouldn't the Overseer have to 
process the state update? That would still continue to fail till ZK comes back 
up.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855908#comment-13855908
 ] 

Mark Miller commented on SOLR-5577:
---

bq.  Solr document updates were slowed down during this time window.

That's interesting - slowed down? Not rejected? That is more surprising to 
me...need to do some code review.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855910#comment-13855910
 ] 

Mark Miller commented on SOLR-5577:
---

bq.  If a shard updates its state, wouldn't the Overseer have to process the 
state update? 

Correct. And there is already some window as well - I'm not sure it matters 
that the window is a little larger due to the the possibly very small change in 
probability of it being an issue. We have to think about it carefully, but for 
the most part, this is just a preventative measure to make sure some node is 
not going rogue for a long period of time with a cached, stale cluster state 
and no connection to ZooKeeper for some reason, but perhaps a connection to 
other nodes.

I think we always intended to think about ways to relax it, but when putting 
things together initially, it was faster/easier to just lock this down as much 
as possible to start.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855912#comment-13855912
 ] 

Mark Miller commented on SOLR-5577:
---

P.S. I don't know that it's the right solution for the issue yet either - just 
spit balling at this point.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855916#comment-13855916
 ] 

Mark Miller commented on SOLR-5577:
---

bq. That's interesting - slowed down? Not rejected? That is more surprising to 
me...need to do some code review.

Okay, bad memory. Looking at the code, this makes sense. Those updates wait up 
to the session expiration before they would end up erroring...a slowdown makes 
sense.

We have to be able to improve this.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855919#comment-13855919
 ] 

Mark Miller commented on SOLR-5577:
---

So I think we already kind of have this window setup for this case - up to the 
session timeout, which I think makes sense. The problem is in how it's 
implemented. It shouldn't hold up updates for that long, it should simply only 
accept them for that long.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org