woot! that's some impressive remote detective work. Ganglia is the culprit here. It blindly tries "mntr" for every stat and falls back to "stat". An easy remedy...
I guess I misspoke as this error is happening in all our staging environments. I seem to remember not and did not revalidate before sending my earlier email. thanks for quick response and solving my issue, lex On Tue, Jan 17, 2012 at 9:11 PM, Patrick Hunt <[email protected]> wrote: > On Tue, Jan 17, 2012 at 6:19 PM, Ted Dunning <[email protected]> > wrote: > > I think that Pat is right: > > > > 1835955314 = 6D6E7472 = 'mntr' > > > > Nice. :-) "mntr" is new in 3.4, it's not available/known in 3.3 > http://zookeeper.apache.org/doc/r3.4.2/zookeeperAdmin.html#sc_zkCommands > > Patrick > > > On Wed, Jan 18, 2012 at 2:00 AM, Patrick Hunt <[email protected]> wrote: > > > >> Hi Alexius, it looks like localhost (127.0.0.1) is making a socket > >> connection and then sending a request with a supposed size of 1.8GB. > >> (the first 4 bytes of each zk request are the length) > >> > >> It's likely that this is a "4letterword" attempt gone wrong (or some > >> other host incorrectly pinging this port). Which application are you > >> pointing at the ZK server's client port? Could this be an invalid > >> monitoring request? (say someone trying to do a "stat" but spelling it > >> wrong?) > >> > >> Patrick > >> > >> On Tue, Jan 17, 2012 at 5:00 PM, Alexius Ludeman <[email protected]> > >> wrote: > >> > hi, > >> > > >> > I am getting the error from all 5 zookeeperd, "Exception causing > close of > >> > session 0x0 due to java.io.IOException: Len error 1835955314" every > >> > microsecond(?). > >> > > >> > I have a 5 node cluster on ubuntu maverick in aws. I've backported > >> > zookeeperd and now am running 3.3.4+dfsg1-1. My apps are > >> > using zookeeper-3.3.3.jar, but within the next 3-4 weeks I will be > >> > upgrading to zookeeper-3.3.4.jar. This problem also existed with > >> > zookeeperd on 3.3.3, in hopes of upgrading would solve my problem. > >> > > >> > I have stopped the whole world, rm -rf /var/lib/zookeeper and > restarted > >> > without any change to this log message. I have several other > >> environments > >> > without this error showing, same versions of client and server but > only 3 > >> > zookeeperd, and much fewer clients. > >> > > >> > Our use of zookeeper is: appName is a key and the value is a > server:port. > >> > After the app reads in its required appName(s), it puts watches on > these > >> > appNames looking for changes. So we have a bunch of apps with many > >> > watches. The client count is below. > >> > > >> > I would be grateful if anyone has some insight on how to debug > further. > >> > > >> > thanks > >> > lex > >> > > >> > zookeeper-0.p2:/var/log/zookeeper/zookeeper.log: > >> > > >> > 2012-01-17 22:23:59,097 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251][] - Accepted socket > >> > connection from /127.0.0.1:38653 > >> > 2012-01-17 22:23:59,097 - INFO [Thread-2309197:NIOServerCnxn@1435][] > - > >> > Closed socket connection for client /127.0.0.1:38652 (no session > >> > established for client) > >> > 2012-01-17 22:23:59,097 - WARN [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639][] - Exception causing close > of > >> > session 0x0 due to java.io.IOException: Len error 1835955314 > >> > 2012-01-17 22:23:59,097 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435][] - Closed socket connection > >> for > >> > client /127.0.0.1:38653 (no session established for client) > >> > 2012-01-17 22:23:59,098 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251][] - Accepted socket > >> > connection from /127.0.0.1:38654 > >> > 2012-01-17 22:23:59,098 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1237][] - Processing stat command > >> from / > >> > 127.0.0.1:38654 > >> > 2012-01-17 22:23:59,098 - INFO > >> > [Thread-2309198:NIOServerCnxn$StatCommand@1153][] - Stat command > output > >> > 2012-01-17 22:23:59,098 - INFO [Thread-2309198:NIOServerCnxn@1435][] > - > >> > Closed socket connection for client /127.0.0.1:38654 (no session > >> > established for client) > >> > 2012-01-17 22:23:59,098 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251][] - Accepted socket > >> > connection from /127.0.0.1:38655 > >> > 2012-01-17 22:23:59,099 - WARN [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639][] - Exception causing close > of > >> > session 0x0 due to java.io.IOException: Len error 1835955314 > >> > 2012-01-17 22:23:59,099 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435][] - Closed socket connection > >> for > >> > client /127.0.0.1:38655 (no session established for client) > >> > 2012-01-17 22:23:59,099 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251][] - Accepted socket > >> > connection from /127.0.0.1:38656 > >> > 2012-01-17 22:23:59,099 - INFO [NIOServerCxn.Factory: > >> > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1237][] - Processing stat command > >> from / > >> > 127.0.0.1:38656 > >> > > >> > > >> > Connected to zookeeper-0.p2 > >> > envi > >> > > >> > Environment: > >> > zookeeper.version=3.3.4--1, built on 12/09/2011 00:03 GMT > >> > host.name=zookeeper-0.p2.use.color.internal > >> > java.version=1.6.0_20 > >> > java.vendor=Sun Microsystems Inc. > >> > java.home=/usr/lib/jvm/java-6-openjdk/jre > >> > > >> > java.class.path=/etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/zookeeper.jar > >> > > >> > java.library.path=/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/jni:/lib:/usr/lib > >> > java.io.tmpdir=/tmp > >> > java.compiler=<NA> > >> > os.name=Linux > >> > os.arch=amd64 > >> > os.version=2.6.35-24-virtual > >> > user.name=zookeeper > >> > user.home=/var/lib/zookeeper > >> > user.dir=/ > >> > > >> > Trying 10.118.145.79...Connected to zookeeper-0.p2 > >> > stats > >> > > >> > Zookeeper version: 3.3.4--1, built on 12/09/2011 00:03 GMT > >> > Clients: > >> > /10.40.50.163:45526[1](queued=0,recved=2819,sent=2820) > >> > /10.124.143.97:46512[0](queued=0,recved=1,sent=0) > >> > /10.40.37.34:33308[1](queued=0,recved=2795,sent=2796) > >> > /10.205.49.18:35339[1](queued=0,recved=2671,sent=2672) > >> > /10.205.49.18:42419[1](queued=0,recved=2718,sent=2719) > >> > /10.80.82.184:43535[1](queued=0,recved=2819,sent=2820) > >> > /10.84.133.213:52533[1](queued=0,recved=2817,sent=2844) > >> > > >> > Latency min/avg/max: 0/0/200 > >> > Received: 7330810 > >> > Sent: 7331408 > >> > Outstanding: 0 > >> > Zxid: 0x100014e31 > >> > Mode: follower > >> > Node count: 197 > >> > > >> > Trying 10.34.37.214...Connected to zookeeper-1.p2 > >> > stat > >> > > >> > Zookeeper version: 3.3.4--1, built on 12/09/2011 00:03 GMT > >> > Clients: > >> > /10.79.6.245:38198[1](queued=0,recved=2680,sent=2681) > >> > /10.98.119.91:37301[1](queued=0,recved=2689,sent=2690) > >> > /10.79.35.125:59744[1](queued=0,recved=2681,sent=2682) > >> > /10.39.81.28:38047[1](queued=0,recved=2666,sent=2667) > >> > /10.124.143.97:48969[0](queued=0,recved=1,sent=0) > >> > > >> > Latency min/avg/max: 0/0/81 > >> > Received: 4984011 > >> > Sent: 4984267 > >> > Outstanding: 0 > >> > Zxid: 0x100014e31 > >> > Mode: follower > >> > Node count: 197 > >> > > >> > Trying 10.38.85.114...Connected to zookeeper-2.p2 > >> > > >> > stat > >> > Zookeeper version: 3.3.4--1, built on 12/09/2011 00:03 GMT > >> > Clients: > >> > /10.84.133.213:51499[1](queued=0,recved=2655,sent=2656) > >> > /10.124.143.97:39412[0](queued=0,recved=1,sent=0) > >> > /10.39.81.28:56431[1](queued=0,recved=2715,sent=2716) > >> > /10.204.183.195:40180[1](queued=0,recved=2664,sent=2665) > >> > /10.32.51.156:59390[1](queued=0,recved=2634,sent=2635) > >> > /10.79.6.245:42161[1](queued=0,recved=2731,sent=2732) > >> > /10.118.14.138:40355[1](queued=0,recved=2643,sent=2644) > >> > > >> > Latency min/avg/max: 0/0/33 > >> > Received: 6650278 > >> > Sent: 6651369 > >> > Outstanding: 0 > >> > Zxid: 0x100014e31 > >> > Mode: leader > >> > Node count: 197 > >> > > >> > Trying 10.101.42.61...Connected to zookeeper-3.p2 > >> > > >> > stat > >> > Zookeeper version: 3.3.4--1, built on 12/09/2011 00:03 GMT > >> > Clients: > >> > /10.204.230.212:57523[1](queued=0,recved=2719,sent=2720) > >> > /10.85.19.23:52916[1](queued=0,recved=2836,sent=2837) > >> > /10.79.35.125:54742[1](queued=0,recved=2734,sent=2735) > >> > /10.124.143.97:60909[0](queued=0,recved=1,sent=0) > >> > /10.204.230.212:57918[1](queued=0,recved=2674,sent=2675) > >> > /10.100.90.101:35977[1](queued=0,recved=2680,sent=2681) > >> > /10.98.119.91:42147[1](queued=0,recved=2745,sent=2746) > >> > > >> > Latency min/avg/max: 0/0/123 > >> > Received: 8430215 > >> > Sent: 8430724 > >> > Outstanding: 0 > >> > Zxid: 0x100014e31 > >> > Mode: follower > >> > Node count: 197 > >> > > >> > Trying 10.204.59.55...Connected to zookeeper-4.p2 > >> > > >> > stat > >> > Zookeeper version: 3.3.4--1, built on 12/09/2011 00:03 GMT > >> > Clients: > >> > /10.100.90.101:34437[1](queued=0,recved=2728,sent=2729) > >> > /10.204.183.195:49961[1](queued=0,recved=2706,sent=2707) > >> > /10.124.143.97:34797[0](queued=0,recved=1,sent=0) > >> > > >> > Latency min/avg/max: 0/0/19 > >> > Received: 46410 > >> > Sent: 46411 > >> > Outstanding: 0 > >> > Zxid: 0x100014e31 > >> > Mode: follower > >> > Node count: 197 > >> >
