Rakesh Kumar Singh created ZOOKEEPER-2592: ---------------------------------------------
Summary: Zookeeper is not recoverable once running system( machine on which zookeeper is running) is out of space Key: ZOOKEEPER-2592 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2592 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.5.2, 3.5.1 Reporter: Rakesh Kumar Singh Priority: Critical Zookeeper is not recoverable once running system( machine on which zookeeper is running) is out of space Steps to reproduce:- 1. Install zookeeper on standalone mode and start zookeeper 2. Make the machine physical memory full 3. Connect through client to zookeeper and trying create some znodes with some data. 4. After sometime creating further znode will not happened as complete memory is occupied 5. Now start creating space in that machine 6. Again connect through a client. Connection is fine. Now try to execute any command like "ls / " it fails even though now space is more than 11gb Client log:- BLR1000007042:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin # df -h Filesystem Size Used Avail Use% Mounted on /dev/xvda2 36G 24G 11G 70% / udev 1.9G 116K 1.9G 1% /dev tmpfs 1.9G 0 1.9G 0% /dev/shm BLR1000007042:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin # ./zkCli.sh Connecting to localhost:2181 2016-09-19 22:50:20,227 [myid:] - INFO [main:Environment@109] - Client environment:zookeeper.version=3.5.1-alpha--1, built on 08/18/2016 08:20 GMT 2016-09-19 22:50:20,231 [myid:] - INFO [main:Environment@109] - Client environment:host.name=BLR1000007042 2016-09-19 22:50:20,231 [myid:] - INFO [main:Environment@109] - Client environment:java.version=1.7.0_79 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.vendor=Oracle Corporation 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.home=/usr/java/jdk1.7.0_79/jre 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.class.path=/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../build/classes:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../build/lib/*.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/slf4j-log4j12-1.7.5.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/slf4j-api-1.7.5.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/servlet-api-2.5-20081211.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/netty-3.7.0.Final.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/log4j-1.2.16.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jline-2.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jetty-util-6.1.26.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jetty-6.1.26.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/javacc.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jackson-mapper-asl-1.9.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jackson-core-asl-1.9.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/commons-cli-1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/ant-eclipse-1.0-jvm1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../zookeeper-3.5.1-alpha.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../src/java/lib/ant-eclipse-1.0-jvm1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../conf:/usr/java/jdk1.7.0_79/lib 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.io.tmpdir=/tmp 2016-09-19 22:50:20,234 [myid:] - INFO [main:Environment@109] - Client environment:java.compiler=<NA> 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:os.name=Linux 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:os.arch=amd64 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:os.version=3.0.76-0.11-default 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:user.name=root 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:user.home=/root 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:user.dir=/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin 2016-09-19 22:50:20,235 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.free=52MB 2016-09-19 22:50:20,237 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.max=227MB 2016-09-19 22:50:20,238 [myid:] - INFO [main:Environment@109] - Client environment:os.memory.total=57MB 2016-09-19 22:50:20,241 [myid:] - INFO [main:ZooKeeper@716] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3865db85 Welcome to ZooKeeper! 2016-09-19 22:50:20,264 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1138] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) 2016-09-19 22:50:20,270 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@980] - Socket connection established, initiating session, client: /127.0.0.1:47801, server: localhost/127.0.0.1:2181 JLine support is enabled [INFO] Unable to bind key for unsupported operation: backward-delete-word [INFO] Unable to bind key for unsupported operation: backward-delete-word [INFO] Unable to bind key for unsupported operation: down-history [INFO] Unable to bind key for unsupported operation: up-history [INFO] Unable to bind key for unsupported operation: up-history [INFO] Unable to bind key for unsupported operation: down-history [INFO] Unable to bind key for unsupported operation: up-history [INFO] Unable to bind key for unsupported operation: down-history [INFO] Unable to bind key for unsupported operation: up-history [INFO] Unable to bind key for unsupported operation: down-history [INFO] Unable to bind key for unsupported operation: up-history [INFO] Unable to bind key for unsupported operation: down-history [zk: localhost:2181(CONNECTING) 0] ls / 2016-09-19 22:50:35,280 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1251] - Client session timed out, have not heard from server in 15011ms for sessionid 0x0, closing socket connection and attempting reconnect Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for / at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2255) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2283) at org.apache.zookeeper.cli.LsCommand.exec(LsCommand.java:93) at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:674) at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:577) at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:360) at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:320) at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:280) BLR1000007042:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin # -{color:blue} Server log 2016-09-19 22:34:13,380 [myid:] - INFO [main:QuorumPeerConfig@114] - Reading configuration from: /home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../conf/zoo.cfg 2016-09-19 22:34:13,386 [myid:] - INFO [main:QuorumPeerConfig@316] - clientPortAddress is 0.0.0.0/0.0.0.0:2181 2016-09-19 22:34:13,386 [myid:] - INFO [main:QuorumPeerConfig@320] - secureClientPort is not set 2016-09-19 22:34:13,389 [myid:] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3 2016-09-19 22:34:13,389 [myid:] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0 2016-09-19 22:34:13,390 [myid:] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled. 2016-09-19 22:34:13,390 [myid:] - WARN [main:QuorumPeerMain@122] - Either no config or no quorum defined in config, running in standalone mode 2016-09-19 22:34:13,402 [myid:] - INFO [main:QuorumPeerConfig@114] - Reading configuration from: /home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../conf/zoo.cfg 2016-09-19 22:34:13,402 [myid:] - INFO [main:QuorumPeerConfig@316] - clientPortAddress is 0.0.0.0/0.0.0.0:2181 2016-09-19 22:34:13,402 [myid:] - INFO [main:QuorumPeerConfig@320] - secureClientPort is not set 2016-09-19 22:34:13,403 [myid:] - INFO [main:ZooKeeperServerMain@113] - Starting server 2016-09-19 22:34:13,416 [myid:] - INFO [main:Environment@109] - Server environment:zookeeper.version=3.5.1-alpha--1, built on 08/18/2016 08:20 GMT 2016-09-19 22:34:13,416 [myid:] - INFO [main:Environment@109] - Server environment:host.name=BLR1000007042 2016-09-19 22:34:13,416 [myid:] - INFO [main:Environment@109] - Server environment:java.version=1.7.0_79 2016-09-19 22:34:13,417 [myid:] - INFO [main:Environment@109] - Server environment:java.vendor=Oracle Corporation 2016-09-19 22:34:13,417 [myid:] - INFO [main:Environment@109] - Server environment:java.home=/usr/java/jdk1.7.0_79/jre 2016-09-19 22:34:13,419 [myid:] - INFO [main:Environment@109] - Server environment:java.class.path=/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../build/classes:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../build/lib/*.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/slf4j-log4j12-1.7.5.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/slf4j-api-1.7.5.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/servlet-api-2.5-20081211.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/netty-3.7.0.Final.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/log4j-1.2.16.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jline-2.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jetty-util-6.1.26.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jetty-6.1.26.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/javacc.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jackson-mapper-asl-1.9.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/jackson-core-asl-1.9.11.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/commons-cli-1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../lib/ant-eclipse-1.0-jvm1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../zookeeper-3.5.1-alpha.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../src/java/lib/ant-eclipse-1.0-jvm1.2.jar:/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin/../conf:/usr/java/jdk1.7.0_79/lib 2016-09-19 22:34:13,420 [myid:] - INFO [main:Environment@109] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib 2016-09-19 22:34:13,420 [myid:] - INFO [main:Environment@109] - Server environment:java.io.tmpdir=/tmp 2016-09-19 22:34:13,420 [myid:] - INFO [main:Environment@109] - Server environment:java.compiler=<NA> 2016-09-19 22:34:13,420 [myid:] - INFO [main:Environment@109] - Server environment:os.name=Linux 2016-09-19 22:34:13,420 [myid:] - INFO [main:Environment@109] - Server environment:os.arch=amd64 2016-09-19 22:34:13,421 [myid:] - INFO [main:Environment@109] - Server environment:os.version=3.0.76-0.11-default 2016-09-19 22:34:13,421 [myid:] - INFO [main:Environment@109] - Server environment:user.name=root 2016-09-19 22:34:13,421 [myid:] - INFO [main:Environment@109] - Server environment:user.home=/root 2016-09-19 22:34:13,421 [myid:] - INFO [main:Environment@109] - Server environment:user.dir=/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/bin 2016-09-19 22:34:13,421 [myid:] - INFO [main:Environment@109] - Server environment:os.memory.free=51MB 2016-09-19 22:34:13,422 [myid:] - INFO [main:Environment@109] - Server environment:os.memory.max=889MB 2016-09-19 22:34:13,422 [myid:] - INFO [main:Environment@109] - Server environment:os.memory.total=57MB 2016-09-19 22:34:13,424 [myid:] - INFO [main:ZooKeeperServer@858] - minSessionTimeout set to 4000 2016-09-19 22:34:13,424 [myid:] - INFO [main:ZooKeeperServer@867] - maxSessionTimeout set to 40000 2016-09-19 22:34:13,424 [myid:] - INFO [main:ZooKeeperServer@156] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/zoo_log/version-2 snapdir /home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2 2016-09-19 22:34:13,453 [myid:] - INFO [main:Slf4jLog@67] - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2016-09-19 22:34:13,477 [myid:] - INFO [main:Slf4jLog@67] - jetty-6.1.26 2016-09-19 22:34:13,510 [myid:] - INFO [main:Slf4jLog@67] - Started SelectChannelConnector@0.0.0.0:8080 2016-09-19 22:34:13,514 [myid:] - INFO [main:JettyAdminServer@105] - Started AdminServer on address 0.0.0.0, port 8080 and command URL /commands 2016-09-19 22:34:13,521 [myid:] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 1 selector thread(s), 8 worker threads, and 64 kB direct buffers. 2016-09-19 22:34:13,523 [myid:] - INFO [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:2181 2016-09-19 22:34:13,537 [myid:] - INFO [main:FileTxnSnapLog@298] - Snapshotting: 0x0 to /home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2/snapshot.0 2016-09-19 22:34:13,567 [myid:] - INFO [main:ContainerManager@64] - Using checkIntervalMs=60000 maxPerMinute=10000 2016-09-19 22:35:41,907 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /0:0:0:0:0:0:0:1:49485 2016-09-19 22:35:41,917 [myid:] - INFO [NIOWorkerThread-1:ZooKeeperServer@964] - Client attempting to establish new session at /0:0:0:0:0:0:0:1:49485 2016-09-19 22:35:41,919 [myid:] - INFO [SyncThread:0:FileTxnLog@200] - Creating new log file: log.1 2016-09-19 22:35:41,952 [myid:] - INFO [SyncThread:0:ZooKeeperServer@678] - Established session 0x100632436270000 with negotiated timeout 30000 for client /0:0:0:0:0:0:0:1:49485 2016-09-19 22:40:21,211 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /10.18.221.194:34892 2016-09-19 22:40:21,218 [myid:] - INFO [NIOWorkerThread-8:ZooKeeperServer@964] - Client attempting to establish new session at /10.18.221.194:34892 2016-09-19 22:40:21,221 [myid:] - INFO [SyncThread:0:ZooKeeperServer@678] - Established session 0x100632436270001 with negotiated timeout 30000 for client /10.18.221.194:34892 2016-09-19 22:40:40,298 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@649] - Processed session termination for sessionid: 0x100632436270001 2016-09-19 22:40:40,301 [myid:] - INFO [NIOWorkerThread-3:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port2181,name1=Connections,name2=10.18.221.194,name3=0x100632436270001] 2016-09-19 22016-09-19 22:43:47,733 [myid:] - INFO [SyncThread:0:ZooKeeperServer@498] - shutting down 2016-09-19 22:44:39,892 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:47796 2016-09-19 22:44:39,898 [myid:] - INFO [NIOWorkerThread-2:ZooKeeperServer@964] - Client attempting to establish new session at /127.0.0.1:47796 2016-09-19 22:45:15,883 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /0:0:0:0:0:0:0:1:49493 2016-09-19 22:45:15,890 [myid:] - INFO [NIOWorkerThread-3:ZooKeeperServer@964] - Client attempting to establish new session at /0:0:0:0:0:0:0:1:49493 2016-09-19 22:45:16,000 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxn@606] - Closed socket connection for client /127.0.0.1:47796 which had sessionid 0x100632436270012 2016-09-19 22:45:46,000 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxn@606] - Closed socket connection for client /0:0:0:0:0:0:0:1:49493 which had sessionid 0x100632436270013 2016-09-19 22:47:42,512 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /0:0:0:0:0:0:0:1:49494 2016-09-19 22:47:42,519 [myid:] - INFO [NIOWorkerThread-4:ZooKeeperServer@964] - Client attempting to establish new session at /0:0:0:0:0:0:0:1:49494 2016-09-19 22:48:16,001 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxn@606] - Closed socket connection for client /0:0:0:0:0:0:0:1:49494 which had sessionid 0x100632436270014 2016-09-19 22:50:20,268 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:47801 2016-09-19 22:50:20,275 [myid:] - INFO [NIOWorkerThread-5:ZooKeeperServer@964] - Client attempting to establish new session at /127.0.0.1:47801 2016-09-19 22:50:56,000 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxn@606] - Closed socket connection for client /127.0.0.1:47801 which had sessionid 0x100632436270015 {color} -- This message was sent by Atlassian JIRA (v6.3.4#6332)