[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243864#comment-14243864
 ] 

Rakesh R commented on ZOOKEEPER-2091:
-

bq.I am confused how it reads the returning result without finishing sending 
the entire packet?
As I remember the situation happened in one of the cluster due to the heavy 
system resource usage like cpu/memory was high(due to map reduce job execution).
It was continuously hitting the below exception and all the operations was 
failing.
{code}
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1127)
java.io.IOException: Nothing in the queue, but got 425467
at 
org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:788)
{code}

Now with the patch, the cluster is running and no issue reported till now.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


ZooKeeper-trunk-ibm6 - Build # 704 - Still Failing

2014-12-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-ibm6/704/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 361384 lines...]
[junit] 2014-12-12 09:32:18,574 [myid:] - INFO  [main:MBeanRegistry@119] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11221]
[junit] 2014-12-12 09:32:18,575 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-12-12 09:32:18,576 [myid:] - INFO  [main:JMXEnv@142] - 
ensureOnly:[]
[junit] 2014-12-12 09:32:18,578 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2014-12-12 09:32:18,578 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2014-12-12 09:32:18,579 [myid:] - INFO  
[main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 32 worker threads, and 64 
kB direct buffers.
[junit] 2014-12-12 09:32:18,579 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2014-12-12 09:32:18,580 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2014-12-12 09:32:18,580 [myid:] - INFO  [main:ZooKeeperServer@781] 
- minSessionTimeout set to 6000
[junit] 2014-12-12 09:32:18,580 [myid:] - INFO  [main:ZooKeeperServer@790] 
- maxSessionTimeout set to 6
[junit] 2014-12-12 09:32:18,581 [myid:] - INFO  [main:ZooKeeperServer@152] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test5240311933835984256.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test5240311933835984256.junit.dir/version-2
[junit] 2014-12-12 09:32:18,582 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test5240311933835984256.junit.dir/version-2/snapshot.b
[junit] 2014-12-12 09:32:18,585 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test5240311933835984256.junit.dir/version-2/snapshot.b
[junit] 2014-12-12 09:32:18,587 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-12-12 09:32:18,588 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:57199
[junit] 2014-12-12 09:32:18,589 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@835] - Processing stat command from 
/127.0.0.1:57199
[junit] 2014-12-12 09:32:18,589 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@684] - Stat command output
[junit] 2014-12-12 09:32:18,590 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for client 
/127.0.0.1:57199 (no session established for client)
[junit] 2014-12-12 09:32:18,590 [myid:] - INFO  [main:JMXEnv@224] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2014-12-12 09:32:18,593 [myid:] - INFO  [main:JMXEnv@241] - 
expect:InMemoryDataTree
[junit] 2014-12-12 09:32:18,594 [myid:] - INFO  [main:JMXEnv@245] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2014-12-12 09:32:18,594 [myid:] - INFO  [main:JMXEnv@241] - 
expect:StandaloneServer_port
[junit] 2014-12-12 09:32:18,594 [myid:] - INFO  [main:JMXEnv@245] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2014-12-12 09:32:18,594 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@55] - Memory used 5433
[junit] 2014-12-12 09:32:18,595 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@60] - Number of threads 40
[junit] 2014-12-12 09:32:18,595 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@65] - FINISHED TEST METHOD testQuota
[junit] 2014-12-12 09:32:18,595 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2014-12-12 09:32:18,628 [myid:] - INFO  [main:ZooKeeper@968] - 
Session: 0x14a3dd7760f closed
[junit] 2014-12-12 09:32:18,628 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@529] - EventThread shut down
[junit] 2014-12-12 09:32:18,628 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2014-12-12 09:32:18,629 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2014-12-12 09:32:18,629 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method

ZooKeeper-trunk - Build # 2529 - Failure

2014-12-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/2529/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 330090 lines...]
[junit] 2014-12-12 11:10:53,928 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@835] - Processing stat command from 
/127.0.0.1:40400
[junit] 2014-12-12 11:10:53,929 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@684] - Stat command output
[junit] 2014-12-12 11:10:53,929 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for client 
/127.0.0.1:40400 (no session established for client)
[junit] 2014-12-12 11:10:53,930 [myid:] - INFO  [main:JMXEnv@224] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2014-12-12 11:10:53,932 [myid:] - INFO  [main:JMXEnv@241] - 
expect:InMemoryDataTree
[junit] 2014-12-12 11:10:53,932 [myid:] - INFO  [main:JMXEnv@245] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2014-12-12 11:10:53,932 [myid:] - INFO  [main:JMXEnv@241] - 
expect:StandaloneServer_port
[junit] 2014-12-12 11:10:53,932 [myid:] - INFO  [main:JMXEnv@245] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2014-12-12 11:10:53,933 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@55] - Memory used 83951
[junit] 2014-12-12 11:10:53,933 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@60] - Number of threads 24
[junit] 2014-12-12 11:10:53,933 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@65] - FINISHED TEST METHOD testQuota
[junit] 2014-12-12 11:10:53,933 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2014-12-12 11:10:53,973 [myid:] - INFO  [main:ZooKeeper@968] - 
Session: 0x14a3e31b8ba closed
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@529] - EventThread shut down
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2014-12-12 11:10:53,974 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2014-12-12 11:10:53,975 [myid:] - INFO  [main:ZooKeeperServer@443] 
- shutting down
[junit] 2014-12-12 11:10:53,976 [myid:] - INFO  
[main:SessionTrackerImpl@231] - Shutting down
[junit] 2014-12-12 11:10:53,976 [myid:] - INFO  
[main:PrepRequestProcessor@971] - Shutting down
[junit] 2014-12-12 11:10:53,976 [myid:] - INFO  
[main:SyncRequestProcessor@191] - Shutting down
[junit] 2014-12-12 11:10:53,976 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@155] - PrepRequestProcessor exited loop!
[junit] 2014-12-12 11:10:53,977 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2014-12-12 11:10:53,977 [myid:] - INFO  
[main:FinalRequestProcessor@476] - shutdown of request processor complete
[junit] 2014-12-12 11:10:53,978 [myid:] - INFO  [main:MBeanRegistry@119] - 
Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree]
[junit] 2014-12-12 11:10:53,978 [myid:] - INFO  [main:MBeanRegistry@119] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11221]
[junit] 2014-12-12 11:10:53,979 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-12-12 11:10:53,980 [myid:] - INFO  [main:JMXEnv@142] - 
ensureOnly:[]
[junit] 2014-12-12 11:10:53,984 [myid:] - INFO  [main:ClientBase@545] - 
fdcount after test is: 46 at start it was 33
[junit] 2014-12-12 11:10:53,985 [myid:] - INFO  [main:ClientBase@547] - 
sleeping for 20 secs
[junit] 2014-12-12 11:10:53,986 [myid:] - INFO  [main:ZKTestCase$1@65] - 
SUCCEEDED testQuota
[junit] 2014-12-12 11:10:53,986 [myid:] - INFO  [main:ZKTestCase$1@60] - 
FINISHED testQuota
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.702 sec
[junit] 2014-12-12 11:10:54,000 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@157] - SessionTrackerImpl exited loop!
[junit] 2014-12-12 11:10:54,000 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@157] - SessionTrackerImpl exited loop!
[junit] 2014-12-12 11:10:54,001 

Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread Rakesh R


 On Dec. 11, 2014, 7:37 p.m., Rakesh R wrote:
  src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java, line 108
  https://reviews.apache.org/r/27244/diff/28/?file=789550#file789550line108
 
  bootstrap.setOption(connectTimeoutMillis, 
  System.getProperty(connectTimeoutMillis));
  
  What do you think about providing 'clientConnectTimeoutMillis' as a 
  configurable item. By default its 10secs.
 
 Hongchao Deng wrote:
 Sounds good. But currently the timeout thing is handled outside in 
 ClientCnxn sendThread. This SSL thing will gradually drive transformation to 
 Netty style connection :)

ok. agreed.


 On Dec. 11, 2014, 7:37 p.m., Rakesh R wrote:
  src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java, line 99
  https://reviews.apache.org/r/27244/diff/28/?file=789550#file789550line99
 
  Could you tell me when this situation can come? Is this added for safer 
  side?
 
 Hongchao Deng wrote:
 I will remove it. I had this code for safe check when I didn't know 
 ClientCnxn so well.

ok. fine


 On Dec. 11, 2014, 7:37 p.m., Rakesh R wrote:
  src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java, line 63
  https://reviews.apache.org/r/27244/diff/28/?file=789550#file789550line63
 
  Can you give thread name for this thread pool like,
  
  Executors.newCachedThreadPool(new ThreadFactoryBuilder()
   .setNameFormat(ZooKeeper-NIOBoss-%d).build()), 
  Executors.newCachedThreadPool(new ThreadFactoryBuilder()
  .setNameFormat(ZooKeeper-NIOWorker-%d).build())
 
 Hongchao Deng wrote:
 ThreadFactoryBuilder is from Guava... We didn't have it yet.

oh yes. Good if we can name the thread, otw leave it:)


- Rakesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64767
---


On Dec. 12, 2014, 12:14 a.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 12:14 a.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




ZooKeeper_branch35_jdk7 - Build # 139 - Still Failing

2014-12-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_jdk7/139/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 336860 lines...]
 [exec] mv -f .deps/zktest_mt-TestWatchers.Tpo 
.deps/zktest_mt-TestWatchers.Po
 [exec] g++ -DHAVE_CONFIG_H -I. 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c  
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/include 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/generated  
-DUSE_STATIC_LIB -DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -MT zktest_mt-TestClient.o -MD -MP -MF 
.deps/zktest_mt-TestClient.Tpo -c -o zktest_mt-TestClient.o `test -f 
'tests/TestClient.cc' || echo 
'/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/'`tests/TestClient.cc
 [exec] mv -f .deps/zktest_mt-TestClient.Tpo .deps/zktest_mt-TestClient.Po
 [exec] g++ -DHAVE_CONFIG_H -I. 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c  
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/include 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/generated  
-DUSE_STATIC_LIB -DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -MT zktest_mt-ZooKeeperQuorumServer.o -MD -MP -MF 
.deps/zktest_mt-ZooKeeperQuorumServer.Tpo -c -o 
zktest_mt-ZooKeeperQuorumServer.o `test -f 'tests/ZooKeeperQuorumServer.cc' || 
echo 
'/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/'`tests/ZooKeeperQuorumServer.cc
 [exec] mv -f .deps/zktest_mt-ZooKeeperQuorumServer.Tpo 
.deps/zktest_mt-ZooKeeperQuorumServer.Po
 [exec] g++ -DHAVE_CONFIG_H -I. 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c  
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/include 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/generated  
-DUSE_STATIC_LIB -DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -MT zktest_mt-TestReadOnlyClient.o -MD -MP -MF 
.deps/zktest_mt-TestReadOnlyClient.Tpo -c -o zktest_mt-TestReadOnlyClient.o 
`test -f 'tests/TestReadOnlyClient.cc' || echo 
'/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/'`tests/TestReadOnlyClient.cc
 [exec] mv -f .deps/zktest_mt-TestReadOnlyClient.Tpo 
.deps/zktest_mt-TestReadOnlyClient.Po
 [exec] g++ -DHAVE_CONFIG_H -I. 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c  
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/include 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests 
-I/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/generated  
-DUSE_STATIC_LIB -DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -MT zktest_mt-PthreadMocks.o -MD -MP -MF 
.deps/zktest_mt-PthreadMocks.Tpo -c -o zktest_mt-PthreadMocks.o `test -f 
'tests/PthreadMocks.cc' || echo 
'/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/'`tests/PthreadMocks.cc
 [exec] mv -f .deps/zktest_mt-PthreadMocks.Tpo 
.deps/zktest_mt-PthreadMocks.Po
 [exec] /bin/bash ./libtool  --tag=CXX   --mode=link g++ -DUSE_STATIC_LIB 
-DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -static-libtool-libs -Wl,--wrap -Wl,calloc 
-Wl,--wrap -Wl,free -Wl,--wrap -Wl,flush_send_queue -Wl,--wrap -Wl,get_xid 
-Wl,--wrap -Wl,deliverWatchers -Wl,--wrap -Wl,activateWatcher -Wl,--wrap 
-Wl,pthread_mutex_lock -Wl,--wrap -Wl,pthread_mutex_trylock -Wl,--wrap 
-Wl,pthread_mutex_unlock  -o zktest-mt zktest_mt-TestDriver.o 
zktest_mt-LibCMocks.o zktest_mt-LibCSymTable.o zktest_mt-MocksBase.o 
zktest_mt-ZKMocks.o zktest_mt-Util.o zktest_mt-ThreadingUtil.o 
zktest_mt-TestZookeeperInit.o zktest_mt-TestZookeeperClose.o 
zktest_mt-TestReconfig.o zktest_mt-TestReconfigServer.o 
zktest_mt-TestClientRetry.o zktest_mt-TestOperations.o zktest_mt-TestMulti.o 
zktest_mt-TestWatchers.o zktest_mt-TestClient.o 
zktest_mt-ZooKeeperQuorumServer.o zktest_mt-TestReadOnlyClient.o 
zktest_mt-PthreadMocks.o libzkmt.la libhashtable.la -lpthread 
-L/usr/lib/x86_64-linux-gnu -lcppunit -ldl 
 [exec] libtool: link: g++ -DUSE_STATIC_LIB -DTHREADED 
-DZKSERVER_CMD=\/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/zkServer.sh\
 -DZOO_IPV6_ENABLED -g -O2 -Wl,--wrap -Wl,calloc -Wl,--wrap -Wl,free -Wl,--wrap 
-Wl,flush_send_queue -Wl,--wrap -Wl,get_xid -Wl,--wrap -Wl,deliverWatchers 
-Wl,--wrap 

[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Hongchao Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244534#comment-14244534
 ] 

Hongchao Deng commented on ZOOKEEPER-2091:
--

Did you try only loop writing in ClientCnxnSocketNIO#SendPacket() and see if it 
works? I can tell that's a problem.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread Hongchao Deng

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/
---

(Updated Dec. 12, 2014, 7:16 p.m.)


Review request for zookeeper.


Repository: zookeeper-git


Description
---

ZOOKEEPER-2069


Diffs (updated)
-

  build.xml bb5ff4f 
  src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
  src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
  src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
  src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
  src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
  src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
  src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
  src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java PRE-CREATION 
  src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
PRE-CREATION 
  src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/27244/diff/


Testing
---

1. use LinkedBlockingDeque.


Thanks,

Hongchao Deng



[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244659#comment-14244659
 ] 

Camille Fournier commented on ZOOKEEPER-2091:
-

I agree that it seems like only sending the full packet in 
ClientCnxnSocketNIO#SendPacket() would fix this issue. We don't do this in doIO 
presumably because we might only send part of the buffer, then allow another 
read to come in before we send the rest of the packet so as not to block on the 
complete send of the pending outgoing packet? Is that correct for implementing 
the nonblocking socket, can someone verify?

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Hongchao Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244672#comment-14244672
 ] 

Hongchao Deng commented on ZOOKEEPER-2091:
--

bq. then allow another read to come in

From what I understand in the code, reading the returning result for the 
packet sent needs to be complete, a.k.a. after the packet sent completely.
{code}
  if (!incomingBuffer.hasRemaining()) {
...
...
sendThread.readResponse(incomingBuffer);
...
  }
{code}
incomingBuffer needs to be filled.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


ZooKeeper-trunk-openjdk7 - Build # 658 - Failure

2014-12-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/658/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 329349 lines...]
[junit] 2014-12-12 20:25:45,846 [myid:] - INFO  [main:MBeanRegistry@119] - 
Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11221]
[junit] 2014-12-12 20:25:45,846 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-12-12 20:25:45,847 [myid:] - INFO  [main:JMXEnv@142] - 
ensureOnly:[]
[junit] 2014-12-12 20:25:45,848 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2014-12-12 20:25:45,848 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2014-12-12 20:25:45,848 [myid:] - INFO  
[main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 32 worker threads, and 64 
kB direct buffers.
[junit] 2014-12-12 20:25:45,849 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2014-12-12 20:25:45,849 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2014-12-12 20:25:45,849 [myid:] - INFO  [main:ZooKeeperServer@781] 
- minSessionTimeout set to 6000
[junit] 2014-12-12 20:25:45,849 [myid:] - INFO  [main:ZooKeeperServer@790] 
- maxSessionTimeout set to 6
[junit] 2014-12-12 20:25:45,849 [myid:] - INFO  [main:ZooKeeperServer@152] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test188742289405531704.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test188742289405531704.junit.dir/version-2
[junit] 2014-12-12 20:25:45,851 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test188742289405531704.junit.dir/version-2/snapshot.b
[junit] 2014-12-12 20:25:45,853 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test188742289405531704.junit.dir/version-2/snapshot.b
[junit] 2014-12-12 20:25:45,855 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-12-12 20:25:45,855 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:33893
[junit] 2014-12-12 20:25:45,857 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@835] - Processing stat command from 
/127.0.0.1:33893
[junit] 2014-12-12 20:25:45,857 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@684] - Stat command output
[junit] 2014-12-12 20:25:45,857 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for client 
/127.0.0.1:33893 (no session established for client)
[junit] 2014-12-12 20:25:45,857 [myid:] - INFO  [main:JMXEnv@224] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2014-12-12 20:25:45,859 [myid:] - INFO  [main:JMXEnv@241] - 
expect:InMemoryDataTree
[junit] 2014-12-12 20:25:45,859 [myid:] - INFO  [main:JMXEnv@245] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2014-12-12 20:25:45,860 [myid:] - INFO  [main:JMXEnv@241] - 
expect:StandaloneServer_port
[junit] 2014-12-12 20:25:45,860 [myid:] - INFO  [main:JMXEnv@245] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2014-12-12 20:25:45,860 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@55] - Memory used 85675
[junit] 2014-12-12 20:25:45,860 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@60] - Number of threads 24
[junit] 2014-12-12 20:25:45,860 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@65] - FINISHED TEST METHOD testQuota
[junit] 2014-12-12 20:25:45,861 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2014-12-12 20:25:45,925 [myid:] - INFO  [main:ZooKeeper@968] - 
Session: 0x14a402db7bc closed
[junit] 2014-12-12 20:25:45,925 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2014-12-12 20:25:45,925 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@529] - EventThread shut down
[junit] 2014-12-12 20:25:45,925 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2014-12-12 20:25:45,925 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread 

Make JDK 7 the minimum support version

2014-12-12 Thread Hongchao Deng
Hi all,

I have been working on Netty + SSL and one of the changes I do is change
outgoingQueue to be LinkedBlockingDeque, which requires JDK 6.

As Flavio (fpj) and Patrick (phunt) suggested, there are JIRAs discussing
JDK7: ZOOKEEPER-2046, ZOOKEEPER-1963. It seems many people are interested
in an upgrade to JDK 7.

As we know, JDK 6 stops public updates. Making JDK 7 the minimum of
ZooKeeper going forward will drive us up-to-date and less work on
supporting older versions of JDK.

I am raising this topic here to gather public responses. The question is
simple: should we make JDK 7 the minimum support version? Your advice,
ideas, suggestions are highly welcome and appreciated!

-- 
*- Hongchao Deng*
*Software Engineer*


[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244861#comment-14244861
 ] 

Camille Fournier commented on ZOOKEEPER-2091:
-

I'm all the way in the bottom of this code and don't even remember the 
invariants. Why don't we wait till all the bytes are flushed to the buffer if 
we don't read anything until that's done?

FWIW, I looked into the depth's of Netty's NIO socket handling. In particular 
to verify that Netty sockets will sometimes switch to reading before all of the 
outgoing bytes are written, which they will. I thought it was interesting 
though that they also configure a writeSpinCount that lets them try to flush 
all bytes a configurable number of times before going back to the selector to 
do other IO. Might be something worth considering. Or, you know, not doing this 
socket stuff ourselves at all and just using netty :)
https://github.com/netty/netty/blob/0eb059bf58642f3c06144e1ea4a9d6c7632eb4d5/transport/src/main/java/io/netty/channel/nio/AbstractNioByteChannel.java#L198

In conclusion, I think that it may not matter much either way but probably we 
should only add this to the SendPacket method.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Hongchao Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244878#comment-14244878
 ] 

Hongchao Deng commented on ZOOKEEPER-2091:
--

Thanks [~fournc] for the in-depth thinking. I do have another JIRA, 
ZOOKEEPER-2069 to drive Netty things.

Back to this issue. The problem I can find here is that SendPacket() assumes a 
successful write. The other problem is that Sasl process implementation is too 
hackish. It stops/puts aside the nonblocking write logic. My point is that this 
is a good chance to fix it.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244892#comment-14244892
 ] 

Camille Fournier commented on ZOOKEEPER-2091:
-

I agree [~hdeng] but I'm not sure what conclusion you're driving to. From my 
observation, we can do one of two obvious things:
1) Only change SendPacket, not doIO, which I *think* I agree should solve the 
observed problem with less impact than the current patch

2) Something more drastic to actually fix the hack that is our current Sasl 
hacks, which I don't have a patch to do and I'm not sure if we have any other 
open tickets that will resolve this.

If we think solution 1 solves the issue, I think it is a simple enough fix to 
go ahead and use while we look into Netty etc.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread fpj


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 1472
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line1472
 
  I'm now confused by this change. Are you getting rid of enableWrite? 
  saslCompleted is fairly different from enableWrite.
 
 Hongchao Deng wrote:
 So the guy who make the SASL change do it in a hackish way. It simply 
 blocks the entire NIO thread and send it directly through socket. But when it 
 finishes SASL, it calls enableWrite(). I can definitely use the old name 
 enableWrite(). But do you think it is better to drive the change here? Since 
 Netty client doesn't need to enable write here.
 
 Hongchao Deng wrote:
 Hi Flavio. Maybe you mean enableWrite() is also used by CilentCnxnNIO and 
 worried that I change the name as well. I only change the interface to 
 outside. Inside ClientCnxnNIO there is still a enableWrite function which is 
 used inside the class's other places.

Yes, you're right, I did confuse the calls.


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 134
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line134
 
  It sounds like making pendingQueue and outgoingQueue are optimizations 
  that are independent from this issue. Shouldn't we make this change in a 
  different jira?
 
 Hongchao Deng wrote:
 outgoingQueue isn't optimization... My previous patch used a semaphore 
 which will be waken up by WakeupCnxn(). Your suggestion is use a blocking 
 outgoingQueue. What do you think? I am fine since I know both changes well 
 and I can do it in separate JIRAs.

My concern here is that it ended up implying more changes than just removing 
the semaphore. For example, the patch removes this synchronized block (not sure 
it is going to format it right):

for (Packet p : outgoingQueue) {
   conLossPacket(p);
}
outgoingQueue.clear();

and checking the other synchronized blocks removed, it isn't entirely 
straightforward that you can remove all of them.


- fpj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64807
---


On Dec. 12, 2014, 7:16 p.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 7:16 p.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread Hongchao Deng


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 134
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line134
 
  It sounds like making pendingQueue and outgoingQueue are optimizations 
  that are independent from this issue. Shouldn't we make this change in a 
  different jira?
 
 Hongchao Deng wrote:
 outgoingQueue isn't optimization... My previous patch used a semaphore 
 which will be waken up by WakeupCnxn(). Your suggestion is use a blocking 
 outgoingQueue. What do you think? I am fine since I know both changes well 
 and I can do it in separate JIRAs.
 
 fpj wrote:
 My concern here is that it ended up implying more changes than just 
 removing the semaphore. For example, the patch removes this synchronized 
 block (not sure it is going to format it right):
 
 for (Packet p : outgoingQueue) {
conLossPacket(p);
 }
 outgoingQueue.clear();
 
 and checking the other synchronized blocks removed, it isn't entirely 
 straightforward that you can remove all of them.

I can add those synchronized back. However, findbugs will complain that so I 
will add rules too.

On Netty side I can assure your concern:
1. In ClientCnxn, primeConnection(), cleanup(), queuePacket(). 
primeConnection(), cleanup() is guaranteed to be start and end of life cycles. 
queuePacket() is appending packets to outgoingQueue and waking up the blocking 
queue.


What do you think?


- Hongchao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64807
---


On Dec. 12, 2014, 7:16 p.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 7:16 p.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




Re: Make JDK 7 the minimum support version

2014-12-12 Thread Flavio Junqueira
+1 for supporting only 7 or later.

 
 On 12 Dec 2014, at 21:28, Hongchao Deng hd...@cloudera.com wrote:
 
 Hi all,
 
 I have been working on Netty + SSL and one of the changes I do is change
 outgoingQueue to be LinkedBlockingDeque, which requires JDK 6.
 
 As Flavio (fpj) and Patrick (phunt) suggested, there are JIRAs discussing
 JDK7: ZOOKEEPER-2046, ZOOKEEPER-1963. It seems many people are interested
 in an upgrade to JDK 7.
 
 As we know, JDK 6 stops public updates. Making JDK 7 the minimum of
 ZooKeeper going forward will drive us up-to-date and less work on
 supporting older versions of JDK.
 
 I am raising this topic here to gather public responses. The question is
 simple: should we make JDK 7 the minimum support version? Your advice,
 ideas, suggestions are highly welcome and appreciated!
 
 -- 
 *- Hongchao Deng*
 *Software Engineer*



Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread fpj


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 134
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line134
 
  It sounds like making pendingQueue and outgoingQueue are optimizations 
  that are independent from this issue. Shouldn't we make this change in a 
  different jira?
 
 Hongchao Deng wrote:
 outgoingQueue isn't optimization... My previous patch used a semaphore 
 which will be waken up by WakeupCnxn(). Your suggestion is use a blocking 
 outgoingQueue. What do you think? I am fine since I know both changes well 
 and I can do it in separate JIRAs.
 
 fpj wrote:
 My concern here is that it ended up implying more changes than just 
 removing the semaphore. For example, the patch removes this synchronized 
 block (not sure it is going to format it right):
 
 for (Packet p : outgoingQueue) {
conLossPacket(p);
 }
 outgoingQueue.clear();
 
 and checking the other synchronized blocks removed, it isn't entirely 
 straightforward that you can remove all of them.
 
 Hongchao Deng wrote:
 I can add those synchronized back. However, findbugs will complain that 
 so I will add rules too.
 
 On Netty side I can assure your concern:
 1. In ClientCnxn, primeConnection(), cleanup(), queuePacket(). 
 primeConnection(), cleanup() is guaranteed to be start and end of life 
 cycles. queuePacket() is appending packets to outgoingQueue and waking up the 
 blocking queue.
 
 
 What do you think?

I'm not suggesting that you bring back the synchronized blocks. I'm saying that 
I haven't checked carefully and it isn't straightforward that you can remove 
them. For example, the one in ClientCnxnSocketNIO line 163 has multiple 
accesses to outgoingQueue, so removing the synchronized block doesn't 
necessarily gives us the same behavior. Can you actually justify the removal of 
each one of those synchronized blocks? If you aren't sure, then I'll check it 
carefully.


- fpj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64807
---


On Dec. 12, 2014, 7:16 p.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 7:16 p.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




Re: Make JDK 7 the minimum support version

2014-12-12 Thread Patrick Hunt
+1 Makes sense to me.

I'm assuming we're talking 3.5+ here, right? We would continue to
support jdk6 in 3.4?

Patrick

On Fri, Dec 12, 2014 at 3:42 PM, Flavio Junqueira
fpjunque...@yahoo.com.invalid wrote:
 +1 for supporting only 7 or later.


 On 12 Dec 2014, at 21:28, Hongchao Deng hd...@cloudera.com wrote:

 Hi all,

 I have been working on Netty + SSL and one of the changes I do is change
 outgoingQueue to be LinkedBlockingDeque, which requires JDK 6.

 As Flavio (fpj) and Patrick (phunt) suggested, there are JIRAs discussing
 JDK7: ZOOKEEPER-2046, ZOOKEEPER-1963. It seems many people are interested
 in an upgrade to JDK 7.

 As we know, JDK 6 stops public updates. Making JDK 7 the minimum of
 ZooKeeper going forward will drive us up-to-date and less work on
 supporting older versions of JDK.

 I am raising this topic here to gather public responses. The question is
 simple: should we make JDK 7 the minimum support version? Your advice,
 ideas, suggestions are highly welcome and appreciated!

 --
 *- Hongchao Deng*
 *Software Engineer*



Re: Make JDK 7 the minimum support version

2014-12-12 Thread Camille Fournier
+1 for 3.5+

On Fri, Dec 12, 2014 at 6:53 PM, Patrick Hunt ph...@apache.org wrote:

 +1 Makes sense to me.

 I'm assuming we're talking 3.5+ here, right? We would continue to
 support jdk6 in 3.4?

 Patrick

 On Fri, Dec 12, 2014 at 3:42 PM, Flavio Junqueira
 fpjunque...@yahoo.com.invalid wrote:
  +1 for supporting only 7 or later.
 
 
  On 12 Dec 2014, at 21:28, Hongchao Deng hd...@cloudera.com wrote:
 
  Hi all,
 
  I have been working on Netty + SSL and one of the changes I do is change
  outgoingQueue to be LinkedBlockingDeque, which requires JDK 6.
 
  As Flavio (fpj) and Patrick (phunt) suggested, there are JIRAs
 discussing
  JDK7: ZOOKEEPER-2046, ZOOKEEPER-1963. It seems many people are
 interested
  in an upgrade to JDK 7.
 
  As we know, JDK 6 stops public updates. Making JDK 7 the minimum of
  ZooKeeper going forward will drive us up-to-date and less work on
  supporting older versions of JDK.
 
  I am raising this topic here to gather public responses. The question is
  simple: should we make JDK 7 the minimum support version? Your advice,
  ideas, suggestions are highly welcome and appreciated!
 
  --
  *- Hongchao Deng*
  *Software Engineer*
 



Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread Hongchao Deng


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 134
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line134
 
  It sounds like making pendingQueue and outgoingQueue are optimizations 
  that are independent from this issue. Shouldn't we make this change in a 
  different jira?
 
 Hongchao Deng wrote:
 outgoingQueue isn't optimization... My previous patch used a semaphore 
 which will be waken up by WakeupCnxn(). Your suggestion is use a blocking 
 outgoingQueue. What do you think? I am fine since I know both changes well 
 and I can do it in separate JIRAs.
 
 fpj wrote:
 My concern here is that it ended up implying more changes than just 
 removing the semaphore. For example, the patch removes this synchronized 
 block (not sure it is going to format it right):
 
 for (Packet p : outgoingQueue) {
conLossPacket(p);
 }
 outgoingQueue.clear();
 
 and checking the other synchronized blocks removed, it isn't entirely 
 straightforward that you can remove all of them.
 
 Hongchao Deng wrote:
 I can add those synchronized back. However, findbugs will complain that 
 so I will add rules too.
 
 On Netty side I can assure your concern:
 1. In ClientCnxn, primeConnection(), cleanup(), queuePacket(). 
 primeConnection(), cleanup() is guaranteed to be start and end of life 
 cycles. queuePacket() is appending packets to outgoingQueue and waking up the 
 blocking queue.
 
 
 What do you think?
 
 fpj wrote:
 I'm not suggesting that you bring back the synchronized blocks. I'm 
 saying that I haven't checked carefully and it isn't straightforward that you 
 can remove them. For example, the one in ClientCnxnSocketNIO line 163 has 
 multiple accesses to outgoingQueue, so removing the synchronized block 
 doesn't necessarily gives us the same behavior. Can you actually justify the 
 removal of each one of those synchronized blocks? If you aren't sure, then 
 I'll check it carefully.

That would be great if you can check it. Actually I just found a possible race.

Here's my version of analysis:
1. primeConnection: The start of life cycle. No one else can touch 
outgoingQueue except queuePacket(). primeConn prepends packets while 
queuePacket appends.
2. !!cleanup: The end of life cycle. No one else except queuePacket(). A 
possible race with queuePacket() in check closing and add packets. I will fix 
it shortly.
3. queuePacket: ...
4. findSendablePacket and doIO: this is the only place that remove items from 
outgoing Queue
  - the check in findSendablePacket() guarantees not empty. Because no where 
will remove items
  - enumeration will give a consistent current view. Adding new packets doesn't 
count in.
  - found non-priming packet, remove and addFirst wouldn't have race because 
here only queuePacket() will be called. And queuePacket() appends only.
5. last check in doTransport: this is duplicate..


- Hongchao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64807
---


On Dec. 12, 2014, 7:16 p.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 7:16 p.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




RE: Make JDK 7 the minimum support version

2014-12-12 Thread Hongchao Deng
Right. We are discussing on ZK 3.5+

- Hongchao Deng

 From: ph...@apache.org
 Date: Fri, 12 Dec 2014 15:53:17 -0800
 Subject: Re: Make JDK 7 the minimum support version
 To: dev@zookeeper.apache.org
 CC: u...@zookeeper.apache.org
 
 +1 Makes sense to me.
 
 I'm assuming we're talking 3.5+ here, right? We would continue to
 support jdk6 in 3.4?
 
 Patrick
 
 On Fri, Dec 12, 2014 at 3:42 PM, Flavio Junqueira
 fpjunque...@yahoo.com.invalid wrote:
  +1 for supporting only 7 or later.
 
 
  On 12 Dec 2014, at 21:28, Hongchao Deng hd...@cloudera.com wrote:
 
  Hi all,
 
  I have been working on Netty + SSL and one of the changes I do is change
  outgoingQueue to be LinkedBlockingDeque, which requires JDK 6.
 
  As Flavio (fpj) and Patrick (phunt) suggested, there are JIRAs discussing
  JDK7: ZOOKEEPER-2046, ZOOKEEPER-1963. It seems many people are interested
  in an upgrade to JDK 7.
 
  As we know, JDK 6 stops public updates. Making JDK 7 the minimum of
  ZooKeeper going forward will drive us up-to-date and less work on
  supporting older versions of JDK.
 
  I am raising this topic here to gather public responses. The question is
  simple: should we make JDK 7 the minimum support version? Your advice,
  ideas, suggestions are highly welcome and appreciated!
 
  --
  *- Hongchao Deng*
  *Software Engineer*
 
  

Re: Review Request 27244: ZOOKEEPER-2069

2014-12-12 Thread Hongchao Deng


 On Dec. 11, 2014, 11:16 p.m., fpj wrote:
  src/java/main/org/apache/zookeeper/ClientCnxn.java, line 134
  https://reviews.apache.org/r/27244/diff/28/?file=789547#file789547line134
 
  It sounds like making pendingQueue and outgoingQueue are optimizations 
  that are independent from this issue. Shouldn't we make this change in a 
  different jira?
 
 Hongchao Deng wrote:
 outgoingQueue isn't optimization... My previous patch used a semaphore 
 which will be waken up by WakeupCnxn(). Your suggestion is use a blocking 
 outgoingQueue. What do you think? I am fine since I know both changes well 
 and I can do it in separate JIRAs.
 
 fpj wrote:
 My concern here is that it ended up implying more changes than just 
 removing the semaphore. For example, the patch removes this synchronized 
 block (not sure it is going to format it right):
 
 for (Packet p : outgoingQueue) {
conLossPacket(p);
 }
 outgoingQueue.clear();
 
 and checking the other synchronized blocks removed, it isn't entirely 
 straightforward that you can remove all of them.
 
 Hongchao Deng wrote:
 I can add those synchronized back. However, findbugs will complain that 
 so I will add rules too.
 
 On Netty side I can assure your concern:
 1. In ClientCnxn, primeConnection(), cleanup(), queuePacket(). 
 primeConnection(), cleanup() is guaranteed to be start and end of life 
 cycles. queuePacket() is appending packets to outgoingQueue and waking up the 
 blocking queue.
 
 
 What do you think?
 
 fpj wrote:
 I'm not suggesting that you bring back the synchronized blocks. I'm 
 saying that I haven't checked carefully and it isn't straightforward that you 
 can remove them. For example, the one in ClientCnxnSocketNIO line 163 has 
 multiple accesses to outgoingQueue, so removing the synchronized block 
 doesn't necessarily gives us the same behavior. Can you actually justify the 
 removal of each one of those synchronized blocks? If you aren't sure, then 
 I'll check it carefully.
 
 Hongchao Deng wrote:
 That would be great if you can check it. Actually I just found a possible 
 race.
 
 Here's my version of analysis:
 1. primeConnection: The start of life cycle. No one else can touch 
 outgoingQueue except queuePacket(). primeConn prepends packets while 
 queuePacket appends.
 2. !!cleanup: The end of life cycle. No one else except queuePacket(). A 
 possible race with queuePacket() in check closing and add packets. I will fix 
 it shortly.
 3. queuePacket: ...
 4. findSendablePacket and doIO: this is the only place that remove items 
 from outgoing Queue
   - the check in findSendablePacket() guarantees not empty. Because no 
 where will remove items
   - enumeration will give a consistent current view. Adding new packets 
 doesn't count in.
   - found non-priming packet, remove and addFirst wouldn't have race 
 because here only queuePacket() will be called. And queuePacket() appends 
 only.
 5. last check in doTransport: this is duplicate..

I just realize between cleanup() and queuePacket() there is race regardless of 
the synchronized..
Even though cleanup() finishes, it doesn't guarantee that state is CLOSED or 
closeSession packet was sent. Thus queuePacket() doesn't call conLossPacket().


- Hongchao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27244/#review64807
---


On Dec. 12, 2014, 7:16 p.m., Hongchao Deng wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27244/
 ---
 
 (Updated Dec. 12, 2014, 7:16 p.m.)
 
 
 Review request for zookeeper.
 
 
 Repository: zookeeper-git
 
 
 Description
 ---
 
 ZOOKEEPER-2069
 
 
 Diffs
 -
 
   build.xml bb5ff4f 
   src/java/main/org/apache/zookeeper/ClientCnxn.java b4ece07 
   src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 5ca0ba7 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java adb27ee 
   src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java PRE-CREATION 
   src/java/main/org/apache/zookeeper/ZooKeeperTestable.java 775d1a2 
   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java dbc1080 
   src/java/test/org/apache/zookeeper/test/ClientTest.java dbe595c 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteBase.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteHammerTest.java 
 PRE-CREATION 
   src/java/test/org/apache/zookeeper/test/NettyNettySuiteTest.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27244/diff/
 
 
 Testing
 ---
 
 1. use LinkedBlockingDeque.
 
 
 Thanks,
 
 Hongchao Deng
 




[jira] [Commented] (ZOOKEEPER-2091) Possible logic error in ClientCnxnSocketNIO

2014-12-12 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245244#comment-14245244
 ] 

Rakesh R commented on ZOOKEEPER-2091:
-

bq.Did you try only loop writing in ClientCnxnSocketNIO#SendPacket() and see if 
it works? I can tell that's a problem.

What could be the case [of missing pendingQueue.add(p)| 
https://issues.apache.org/jira/browse/ZOOKEEPER-2091?focusedCommentId=14227396page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14227396]
 ?. Unfortunately now I didn't get chance to put one more patch into the 
cluster where it hits this problem.

 Possible logic error in ClientCnxnSocketNIO
 ---

 Key: ZOOKEEPER-2091
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2091
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.6
Reporter: Cheng
Assignee: Rakesh R
 Fix For: 3.5.1

 Attachments: ZOOKEEPER-2091.patch


 When SASL authentication is enabled, the ZooKeeper client will finally call 
 ClientCnxnSocketNIO#sendPacket(Packet p) to send a packet to server:
 @Override
 void sendPacket(Packet p) throws IOException {
 SocketChannel sock = (SocketChannel) sockKey.channel();
 if (sock == null) {
 throw new IOException(Socket is null!);
 }
 p.createBB();
 ByteBuffer pbb = p.bb;
 sock.write(pbb);
 }
 One problem I can see is that the sock is non-blocking, so when the sock's 
 output buffer is full(theoretically), only part of the Packet is sent out and 
 the communication will break.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)