[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService

2014-12-01 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230319#comment-14230319
 ] 

Brandon Williams commented on CASSANDRA-8188:
-

Also fine with it in 2.0, and backported.

 don't block SocketThread for MessagingService
 -

 Key: CASSANDRA-8188
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8188
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: yangwei
Assignee: yangwei
 Fix For: 2.0.12, 2.1.2

 Attachments: 
 0001-don-t-block-SocketThread-for-MessagingService.patch, handshake.stack.txt


 We have two datacenters, A and B.
 The node in A cannot complete the version handshake with nodes in B. The logs on A are as follows:
 {noformat}
 INFO [HANDSHAKE-/B] 2014-10-24 04:29:49,075 OutboundTcpConnection.java (line 395) Cannot handshake version with B
 TRACE [WRITE-/B] 2014-10-24 11:02:49,044 OutboundTcpConnection.java (line 368) unable to connect to /B
 java.net.ConnectException: Connection refused
     at sun.nio.ch.Net.connect0(Native Method)
     at sun.nio.ch.Net.connect(Net.java:364)
     at sun.nio.ch.Net.connect(Net.java:356)
     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
     at java.nio.channels.SocketChannel.open(SocketChannel.java:184)
     at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:134)
     at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:119)
     at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:299)
     at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
 {noformat}
 
 The jstack output from the nodes in B shows SocketThread blocked in 
 inputStream.readInt, so it no longer accepts new sockets. The stack trace is as follows:
 {noformat}
 java.lang.Thread.State: RUNNABLE
     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
     at sun.nio.ch.IOUtil.read(IOUtil.java:197)
     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
     - locked 0x0007963747e8 (a java.lang.Object)
     at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203)
     - locked 0x000796374848 (a java.lang.Object)
     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
     - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream)
     at java.io.InputStream.read(InputStream.java:101)
     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
     - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream)
     at java.io.DataInputStream.readInt(DataInputStream.java:387)
     at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:879)
 {noformat}

 On the nodes in B, tcpdump shows retransmission of SYN,ACK during the TCP 
 three-way handshake, because the TCP implementation drops the final ACK when 
 the backlog queue is full.
 On the nodes in B, ss -tl shows Recv-Q 51 and Send-Q 50.
 
 On the nodes in B, netstat -s shows that “SYNs to LISTEN sockets dropped” and 
 “times the listen queue of a socket overflowed” are both increasing.
 This patch sets a read timeout of 2 * 
 OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME on the accepted socket.
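
 For readers following along, here is a minimal sketch of the idea behind the 
 patch; it is not the attached patch itself. It uses plain java.net sockets 
 rather than Cassandra's classes, and the class name, the readHeader and demo 
 helpers, and the 100 ms value standing in for WAIT_FOR_VERSION_MAX_TIME are 
 all assumptions made for illustration. A peer that connects and then stays 
 silent only holds the accept loop for the timeout, after which the next 
 queued connection is served.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class AcceptTimeoutSketch {
    // Hypothetical stand-in for OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME.
    // The value is an assumption, kept small so the demo finishes quickly.
    static final int WAIT_FOR_VERSION_MAX_TIME_MS = 100;

    /**
     * Reads the 4-byte header from an accepted socket.
     * Returns null if the peer sends nothing before the read timeout expires.
     */
    static Integer readHeader(Socket accepted) throws IOException {
        // Without this call, readInt() below could block the accept loop forever.
        accepted.setSoTimeout(2 * WAIT_FOR_VERSION_MAX_TIME_MS);
        DataInputStream in = new DataInputStream(accepted.getInputStream());
        try {
            return in.readInt();
        } catch (SocketTimeoutException e) {
            return null;
        }
    }

    /** Connects one silent peer and one that sends a header, then runs two accept iterations. */
    static String demo() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            StringBuilder seen = new StringBuilder();
            try (Socket silent = new Socket(server.getInetAddress(), server.getLocalPort());
                 Socket talker = new Socket(server.getInetAddress(), server.getLocalPort())) {
                // The second peer sends its header right away; the first never writes.
                talker.getOutputStream().write(new byte[] {0, 0, 0, 42});
                talker.getOutputStream().flush();
                for (int i = 0; i < 2; i++) {
                    try (Socket accepted = server.accept()) {
                        Integer header = readHeader(accepted);
                        seen.append(header == null ? "timeout" : "header=" + header).append(';');
                    }
                }
            }
            return seen.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

 Note that this sketch reads through Socket.getInputStream(), where SO_TIMEOUT 
 applies. Whether a stream obtained another way from a SocketChannel honors 
 the timeout depends on how the stream is created, so that is worth checking 
 against the actual code path.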



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService

2014-11-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223362#comment-14223362
 ] 

Jonathan Ellis commented on CASSANDRA-8188:
---

I'd be okay with adding this to 2.0.12.  Brandon?

 don't block SocketThread for MessagingService
 -

 Key: CASSANDRA-8188
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8188
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: yangwei
Assignee: yangwei
 Fix For: 2.1.2

 Attachments: 
 0001-don-t-block-SocketThread-for-MessagingService.patch, handshake.stack.txt




[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService

2014-11-20 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219550#comment-14219550
 ] 

Chris Burroughs commented on CASSANDRA-8188:


I'm investigating an issue with our dual-DC clusters.  There was a network 
outage, after which each cluster spat out 'Cannot handshake version with ' for 
minutes (after network connectivity was restored). I didn't get a stack trace 
in time, but this appears similar.  Is there something that makes this ticket 
2.1.x specific?

 don't block SocketThread for MessagingService
 -

 Key: CASSANDRA-8188
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8188
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: yangwei
Assignee: yangwei
 Fix For: 2.1.2

 Attachments: 0001-don-t-block-SocketThread-for-MessagingService.patch




[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService

2014-10-27 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185796#comment-14185796
 ] 

Vijay commented on CASSANDRA-8188:
--

I had the same solution as part of 
https://issues.apache.org/jira/secure/attachment/12623900/0001-CASSANDRA-6590.patch,
 but [~brandon.williams] was seeing some weirdness. I was not able to replicate 
that, though.

 don't block SocketThread for MessagingService
 -

 Key: CASSANDRA-8188
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8188
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: yangwei
Assignee: yangwei
 Attachments: 0001-don-t-block-SocketThread-for-MessagingService.patch

