How about if we revert https://issues.apache.org/jira/browse/ZOOKEEPER-1437 and cut 3.4.5 RC0 today? Robert, Julio, et. al. could help verify the "fix". We can re-introduce 1437 in 3.4.6 and significantly reduce the risk until that's had more time to bake (say cut 3.4.6 a few months down the line).
Patrick On Tue, Sep 25, 2012 at 10:05 PM, Mahadev Konar <[email protected]> wrote: > Thanks for point this out JL and Robert. We have a jira for this > https://issues.apache.org/jira/browse/ZOOKEEPER-1477 and we will be > doing a 3.4.5 ASAP to address this. > > thanks > mahadev > > On Tue, Sep 25, 2012 at 3:37 PM, JL <[email protected]> wrote: >> This may be related. We are not using SSL, but the underlying cause may be >> the same: socket disconnection or failure to deliver data from the >> underlying channel. >> >> We experienced connection drops (after about ~3000ms) with Curator 1.1.18 >> and ZK 3.4.4. on openjdk 6 (on Linux), and JDK 7u7 on OS X. >> >> Oracle JVM 1.6.0_32 on Linux and v. 1.6.0_35 on OS X, seem to work. >> >> Here's the error we get over and over. >> >> E 09-25 17:01:35.408 http-bio-8081-exec-7-EventThread >> c.n.c.f.i.CuratorFrameworkImpl:486 :::] Background operation retry gave up >> org.apache.zookeeper.KeeperException$ConnectionLossException: >> KeeperErrorCode = ConnectionLoss >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >> ~[zookeeper-3.4.4.jar:3.4.4-1386507] >> at >> com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:448) >> ~[curator-framework-1.1.18.jar:na] >> at >> com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49) >> [curator-framework-1.1.18.jar:na] >> at >> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:606) >> [zookeeper-3.4.4.jar:3.4.4-1386507] >> at >> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) >> [zookeeper-3.4.4.jar:3.4.4-1386507] >> >> >> Cheers, >> >> -Julio >> >> Sep 25, 2012 02:45:30 PM, [email protected] wrote: >> >> =========================================== >> >> I'm having some trouble with the 3.4.4 Java client on Linux (openjdk 6 >> or 7, debian or ubuntu). It doesn't seem to successfully connect to a >> server -- this goes for zkCli as well as custom code that uses the >> client jar. On OSX, it *does* connect. >> >> I've collected the DEBUG-level log output for the process of >> connecting to my development Zookeeper node (also running 3.4.4), >> waiting until a Connected event arrives, and sending getChildren("/"). >> The Linux version logs that it will not use SASL and then keeps >> deferring the getChildren request "until SASL authentication >> completes". The exact same (fat) jar running on OSX complains a few >> times about being "unable to locate a login configuration" but doesn't >> wait. >> >> Using the 3.4.3 client library or earlier does successfully connect, >> logging only the one message about JAAS that was changed as a result >> of ZOOKEEPER-1510. >> >> On Linux, the output of the 3.4.4 client is: >> >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:host.name=robert.corp.socrata.com >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:java.version=1.6.0_24 >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:java.vendor=Sun Microsystems Inc. >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre >> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >> environment:java.class.path=socrata-zookeeper-assembly-0.0.1.jar >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:java.library.path=/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk-amd64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:java.io.tmpdir=/tmp >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:java.compiler= >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:os.name=Linux >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:os.arch=amd64 >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:os.version=3.2.0-3-amd64 >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:user.name=robertm >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:user.home=/home/robertm >> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >> environment:user.dir=/home/robertm/socrata/jvm/socrata-zookeeper >> INFO [main] 2012-09-25 14:02:24,523 org.apache.zookeeper.ZooKeeper >> Initiating client connection, connectString=10.0.2.106:2181 >> sessionTimeout=40000 >> watcher=com.socrata.zookeeper.ZooKeeperProvider$ZKWatcher@165973ea >> DEBUG [main] 2012-09-25 14:02:24,527 org.apache.zookeeper.ClientCnxn >> zookeeper.disableAutoWatchReset is false >> INFO [main] 2012-09-25 14:02:24,545 com.socrata.Main Waiting for >> connection... >> DEBUG [main] 2012-09-25 14:02:24,548 com.socrata.zookeeper.ZooKeeperProvider >> Waiting for connected-state... >> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,576 >> org.apache.zookeeper.ClientCnxn Opening socket connection to server >> mike.local/10.0.2.106:2181. Will not attempt to authenticate using SASL >> (unknown error) >> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,584 >> org.apache.zookeeper.ClientCnxn Socket connection established to >> mike.local/10.0.2.106:2181, initiating session >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,586 >> org.apache.zookeeper.ClientCnxn Session establishment request sent on >> mike.local/10.0.2.106:2181 >> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,600 >> org.apache.zookeeper.ClientCnxn Session establishment complete on server >> mike.local/10.0.2.106:2181, sessionid = 0x139ff2e85b60005, negotiated >> timeout = 40000 >> DEBUG [main-EventThread] 2012-09-25 14:02:24,614 >> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Connected) >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,636 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,923 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,924 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,924 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,260 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,260 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,265 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >> request:: '/,F response:: v{} until SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,265 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,266 >> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >> clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: >> null request:: null response:: nulluntil SASL authentication completes. >> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,266 >> org.apache.zookeeper.ClientCnxn Client session timed out, have not heard >> from server in 26668ms for sessionid 0x139ff2e85b60005, closing socket >> connection and attempting reconnect >> DEBUG [main-EventThread] 2012-09-25 14:02:51,377 >> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Disconnected) >> ConnectionLost >> >> On OSX, here is the output: >> >> INFO [main] 2012-09-25 14:03:16,286 org.apache.zookeeper.ZooKeeper Client >> environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT >> INFO [main] 2012-09-25 14:03:16,287 org.apache.zookeeper.ZooKeeper Client >> environment:host.name=hydra.engr.corp.socrata.com >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.version=1.6.0_35 >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.vendor=Apple Inc. >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.class.path=socrata-zookeeper-assembly-0.0.1.jar >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.io.tmpdir=/var/folders/zh/884zqwg14033gkv3bm2ylb6h0000gn/T/ >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:java.compiler= >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:os.name=Mac OS X >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:os.arch=x86_64 >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:os.version=10.7.4 >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:user.name=paradise >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:user.home=/Users/paradise >> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >> environment:user.dir=/Users/paradise/tmp >> INFO [main] 2012-09-25 14:03:16,289 org.apache.zookeeper.ZooKeeper >> Initiating client connection, connectString=10.0.2.106:2181 >> sessionTimeout=40000 >> watcher=com.socrata.zookeeper.ZooKeeperProvider$ZKWatcher@5e785d65 >> DEBUG [main] 2012-09-25 14:03:16,293 org.apache.zookeeper.ClientCnxn >> zookeeper.disableAutoWatchReset is false >> INFO [main] 2012-09-25 14:03:16,315 com.socrata.Main Waiting for >> connection... >> DEBUG [main] 2012-09-25 14:03:16,319 com.socrata.zookeeper.ZooKeeperProvider >> Waiting for connected-state... >> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,335 >> org.apache.zookeeper.ClientCnxn Opening socket connection to server >> 10.0.2.106/10.0.2.106:2181. Will not attempt to authenticate using SASL >> (Unable to locate a login configuration) >> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,344 >> org.apache.zookeeper.ClientCnxn Socket connection established to >> 10.0.2.106/10.0.2.106:2181, initiating session >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,346 >> org.apache.zookeeper.ClientCnxn Session establishment request sent on >> 10.0.2.106/10.0.2.106:2181 >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,347 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,351 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,368 >> org.apache.zookeeper.ClientCnxn Session establishment complete on server >> 10.0.2.106/10.0.2.106:2181, sessionid = 0x139ff2e85b60006, negotiated >> timeout = 40000 >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,371 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,371 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-EventThread] 2012-09-25 14:03:16,385 >> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Connected) >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,418 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,418 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,431 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,438 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,443 >> org.apache.zookeeper.ClientCnxn Reading reply sessionid:0x139ff2e85b60006, >> packet:: clientPath:/ serverPath:/ finished:false header:: 1,12 >> replyHeader:: 1,8292982,0 request:: '/,F response:: >> v{'ro,'row-index,'zkbtest,'consumers,'reindex,'hotstandby,'bigdir,'vs,'orestes,'eurybates,'shardedcly,'row-locks,'id-counter,'zookeeper,'cly,'locks,'rwlocks,'tickets,'brokers},s{0,0,0,0,0,61,0,0,0,19,8292893} >> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,444 >> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >> configuration: java.lang.SecurityException: Unable to locate a login >> configuration >> OK(Set(cly, row-locks, hotstandby, locks, tickets, bigdir, zkbtest, >> row-index, reindex, id-counter, eurybates, vs, rwlocks, shardedcly, brokers, >> consumers, zookeeper, orestes, ro),0,0,0,0,0,61,0,0,0,19,8292893) >> >> The server only logs this: >> >> 2012-09-25 14:02:29,671 [myid:] - INFO >> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - >> Accepted socket connection from /10.0.2.107:39645 >> 2012-09-25 14:02:29,675 [myid:] - INFO >> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@839] - Client >> attempting to establish new session at /10.0.2.107:39645 >> 2012-09-25 14:02:29,685 [myid:] - INFO [SyncThread:0:ZooKeeperServer@595] - >> Established session 0x139ff2e85b60005 with negotiated timeout 40000 for >> client /10.0.2.107:39645 >> 2012-09-25 14:02:56,354 [myid:] - WARN >> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end >> of stream exception >> EndOfStreamException: Unable to read additional data from client sessionid >> 0x139ff2e85b60005, likely client has closed socket
