Not a bad idea Pat. Lets wait until EOD today. Eugene and myself will be taking a look during the day and we can see if we should be able to fix it quickly or we should revert.
thanks mahadev On Wed, Sep 26, 2012 at 6:35 AM, Patrick Hunt <[email protected]> wrote: > How about if we revert > https://issues.apache.org/jira/browse/ZOOKEEPER-1437 and cut 3.4.5 RC0 > today? Robert, Julio, et. al. could help verify the "fix". We can > re-introduce 1437 in 3.4.6 and significantly reduce the risk until > that's had more time to bake (say cut 3.4.6 a few months down the > line). > > Patrick > > On Tue, Sep 25, 2012 at 10:05 PM, Mahadev Konar <[email protected]> > wrote: >> Thanks for point this out JL and Robert. We have a jira for this >> https://issues.apache.org/jira/browse/ZOOKEEPER-1477 and we will be >> doing a 3.4.5 ASAP to address this. >> >> thanks >> mahadev >> >> On Tue, Sep 25, 2012 at 3:37 PM, JL <[email protected]> wrote: >>> This may be related. We are not using SSL, but the underlying cause may be >>> the same: socket disconnection or failure to deliver data from the >>> underlying channel. >>> >>> We experienced connection drops (after about ~3000ms) with Curator 1.1.18 >>> and ZK 3.4.4. on openjdk 6 (on Linux), and JDK 7u7 on OS X. >>> >>> Oracle JVM 1.6.0_32 on Linux and v. 1.6.0_35 on OS X, seem to work. >>> >>> Here's the error we get over and over. >>> >>> E 09-25 17:01:35.408 http-bio-8081-exec-7-EventThread >>> c.n.c.f.i.CuratorFrameworkImpl:486 :::] Background operation retry gave up >>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>> KeeperErrorCode = ConnectionLoss >>> at >>> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >>> ~[zookeeper-3.4.4.jar:3.4.4-1386507] >>> at >>> com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:448) >>> ~[curator-framework-1.1.18.jar:na] >>> at >>> com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49) >>> [curator-framework-1.1.18.jar:na] >>> at >>> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:606) >>> [zookeeper-3.4.4.jar:3.4.4-1386507] >>> at >>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) >>> [zookeeper-3.4.4.jar:3.4.4-1386507] >>> >>> >>> Cheers, >>> >>> -Julio >>> >>> Sep 25, 2012 02:45:30 PM, [email protected] wrote: >>> >>> =========================================== >>> >>> I'm having some trouble with the 3.4.4 Java client on Linux (openjdk 6 >>> or 7, debian or ubuntu). It doesn't seem to successfully connect to a >>> server -- this goes for zkCli as well as custom code that uses the >>> client jar. On OSX, it *does* connect. >>> >>> I've collected the DEBUG-level log output for the process of >>> connecting to my development Zookeeper node (also running 3.4.4), >>> waiting until a Connected event arrives, and sending getChildren("/"). >>> The Linux version logs that it will not use SASL and then keeps >>> deferring the getChildren request "until SASL authentication >>> completes". The exact same (fat) jar running on OSX complains a few >>> times about being "unable to locate a login configuration" but doesn't >>> wait. >>> >>> Using the 3.4.3 client library or earlier does successfully connect, >>> logging only the one message about JAAS that was changed as a result >>> of ZOOKEEPER-1510. >>> >>> On Linux, the output of the 3.4.4 client is: >>> >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:host.name=robert.corp.socrata.com >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:java.version=1.6.0_24 >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:java.vendor=Sun Microsystems Inc. >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre >>> INFO [main] 2012-09-25 14:02:24,521 org.apache.zookeeper.ZooKeeper Client >>> environment:java.class.path=socrata-zookeeper-assembly-0.0.1.jar >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:java.library.path=/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk-amd64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:java.io.tmpdir=/tmp >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:java.compiler= >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:os.name=Linux >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:os.arch=amd64 >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:os.version=3.2.0-3-amd64 >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:user.name=robertm >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:user.home=/home/robertm >>> INFO [main] 2012-09-25 14:02:24,522 org.apache.zookeeper.ZooKeeper Client >>> environment:user.dir=/home/robertm/socrata/jvm/socrata-zookeeper >>> INFO [main] 2012-09-25 14:02:24,523 org.apache.zookeeper.ZooKeeper >>> Initiating client connection, connectString=10.0.2.106:2181 >>> sessionTimeout=40000 >>> watcher=com.socrata.zookeeper.ZooKeeperProvider$ZKWatcher@165973ea >>> DEBUG [main] 2012-09-25 14:02:24,527 org.apache.zookeeper.ClientCnxn >>> zookeeper.disableAutoWatchReset is false >>> INFO [main] 2012-09-25 14:02:24,545 com.socrata.Main Waiting for >>> connection... >>> DEBUG [main] 2012-09-25 14:02:24,548 >>> com.socrata.zookeeper.ZooKeeperProvider Waiting for connected-state... >>> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,576 >>> org.apache.zookeeper.ClientCnxn Opening socket connection to server >>> mike.local/10.0.2.106:2181. Will not attempt to authenticate using SASL >>> (unknown error) >>> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,584 >>> org.apache.zookeeper.ClientCnxn Socket connection established to >>> mike.local/10.0.2.106:2181, initiating session >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,586 >>> org.apache.zookeeper.ClientCnxn Session establishment request sent on >>> mike.local/10.0.2.106:2181 >>> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,600 >>> org.apache.zookeeper.ClientCnxn Session establishment complete on server >>> mike.local/10.0.2.106:2181, sessionid = 0x139ff2e85b60005, negotiated >>> timeout = 40000 >>> DEBUG [main-EventThread] 2012-09-25 14:02:24,614 >>> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Connected) >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:24,636 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,923 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,924 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:37,924 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,260 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,260 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,261 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,265 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:/ serverPath:/ finished:false header:: 0,12 replyHeader:: 0,0,0 >>> request:: '/,F response:: v{} until SASL authentication completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,265 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> DEBUG [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,266 >>> org.apache.zookeeper.ClientCnxnSocketNIO deferring non-priming packet: >>> clientPath:null serverPath:null finished:false header:: -2,11 >>> replyHeader:: null request:: null response:: nulluntil SASL authentication >>> completes. >>> INFO [main-SendThread(mike.local:2181)] 2012-09-25 14:02:51,266 >>> org.apache.zookeeper.ClientCnxn Client session timed out, have not heard >>> from server in 26668ms for sessionid 0x139ff2e85b60005, closing socket >>> connection and attempting reconnect >>> DEBUG [main-EventThread] 2012-09-25 14:02:51,377 >>> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Disconnected) >>> ConnectionLost >>> >>> On OSX, here is the output: >>> >>> INFO [main] 2012-09-25 14:03:16,286 org.apache.zookeeper.ZooKeeper Client >>> environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT >>> INFO [main] 2012-09-25 14:03:16,287 org.apache.zookeeper.ZooKeeper Client >>> environment:host.name=hydra.engr.corp.socrata.com >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.version=1.6.0_35 >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.vendor=Apple Inc. >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.class.path=socrata-zookeeper-assembly-0.0.1.jar >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.io.tmpdir=/var/folders/zh/884zqwg14033gkv3bm2ylb6h0000gn/T/ >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:java.compiler= >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:os.name=Mac OS X >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:os.arch=x86_64 >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:os.version=10.7.4 >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:user.name=paradise >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:user.home=/Users/paradise >>> INFO [main] 2012-09-25 14:03:16,288 org.apache.zookeeper.ZooKeeper Client >>> environment:user.dir=/Users/paradise/tmp >>> INFO [main] 2012-09-25 14:03:16,289 org.apache.zookeeper.ZooKeeper >>> Initiating client connection, connectString=10.0.2.106:2181 >>> sessionTimeout=40000 >>> watcher=com.socrata.zookeeper.ZooKeeperProvider$ZKWatcher@5e785d65 >>> DEBUG [main] 2012-09-25 14:03:16,293 org.apache.zookeeper.ClientCnxn >>> zookeeper.disableAutoWatchReset is false >>> INFO [main] 2012-09-25 14:03:16,315 com.socrata.Main Waiting for >>> connection... >>> DEBUG [main] 2012-09-25 14:03:16,319 >>> com.socrata.zookeeper.ZooKeeperProvider Waiting for connected-state... >>> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,335 >>> org.apache.zookeeper.ClientCnxn Opening socket connection to server >>> 10.0.2.106/10.0.2.106:2181. Will not attempt to authenticate using SASL >>> (Unable to locate a login configuration) >>> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,344 >>> org.apache.zookeeper.ClientCnxn Socket connection established to >>> 10.0.2.106/10.0.2.106:2181, initiating session >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,346 >>> org.apache.zookeeper.ClientCnxn Session establishment request sent on >>> 10.0.2.106/10.0.2.106:2181 >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,347 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,351 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> INFO [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,368 >>> org.apache.zookeeper.ClientCnxn Session establishment complete on server >>> 10.0.2.106/10.0.2.106:2181, sessionid = 0x139ff2e85b60006, negotiated >>> timeout = 40000 >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,371 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,371 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-EventThread] 2012-09-25 14:03:16,385 >>> com.socrata.zookeeper.ZooKeeperProvider ConnectionStateChanged(Connected) >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,417 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,418 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,418 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,431 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,438 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,443 >>> org.apache.zookeeper.ClientCnxn Reading reply sessionid:0x139ff2e85b60006, >>> packet:: clientPath:/ serverPath:/ finished:false header:: 1,12 >>> replyHeader:: 1,8292982,0 request:: '/,F response:: >>> v{'ro,'row-index,'zkbtest,'consumers,'reindex,'hotstandby,'bigdir,'vs,'orestes,'eurybates,'shardedcly,'row-locks,'id-counter,'zookeeper,'cly,'locks,'rwlocks,'tickets,'brokers},s{0,0,0,0,0,61,0,0,0,19,8292893} >>> DEBUG [main-SendThread(10.0.2.106:2181)] 2012-09-25 14:03:16,444 >>> org.apache.zookeeper.client.ZooKeeperSaslClient Could not retrieve login >>> configuration: java.lang.SecurityException: Unable to locate a login >>> configuration >>> OK(Set(cly, row-locks, hotstandby, locks, tickets, bigdir, zkbtest, >>> row-index, reindex, id-counter, eurybates, vs, rwlocks, shardedcly, >>> brokers, consumers, zookeeper, orestes, ro),0,0,0,0,0,61,0,0,0,19,8292893) >>> >>> The server only logs this: >>> >>> 2012-09-25 14:02:29,671 [myid:] - INFO >>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - >>> Accepted socket connection from /10.0.2.107:39645 >>> 2012-09-25 14:02:29,675 [myid:] - INFO >>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@839] - Client >>> attempting to establish new session at /10.0.2.107:39645 >>> 2012-09-25 14:02:29,685 [myid:] - INFO [SyncThread:0:ZooKeeperServer@595] >>> - Established session 0x139ff2e85b60005 with negotiated timeout 40000 for >>> client /10.0.2.107:39645 >>> 2012-09-25 14:02:56,354 [myid:] - WARN >>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end >>> of stream exception >>> EndOfStreamException: Unable to read additional data from client sessionid >>> 0x139ff2e85b60005, likely client has closed socket
