Josh Elser created ACCUMULO-3148:
------------------------------------

             Summary: TabletServer didn't get Session expired in HalfDeadTServerIT
                 Key: ACCUMULO-3148
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3148
             Project: Accumulo
          Issue Type: Bug
          Components: test
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 1.6.1, 1.7.0


Been seeing spurious failures with HalfDeadTServerIT where the tserver doesn't 
get the ZK session expiration:

{noformat}
2014-09-15 09:39:59,201 [tserver.TabletServer] DEBUG: ScanSess tid 172.31.33.94:35957 !0 0 entries in 0.07 secs, nbTimes = [63 63 63.00 1]
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
2014-09-15 09:40:20,088 [tserver.TabletServer] FATAL: Lost tablet server lock (reason = LOCK_DELETED), exiting.
2014-09-15 09:40:20,088 [zookeeper.ZooCache] WARN : Zookeeper error, will retry
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/d0b9b8e7-9869-4b00-9ae7-317f5231f2c1/tables/1/conf/table.iterator.minc.vers.opt.maxVersions
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
        at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:261)
        at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:153)
        at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:277)
        at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:224)
        at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.get(ZooCachePropertyAccessor.java:114)
        at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.getProperties(ZooCachePropertyAccessor.java:144)
        at org.apache.accumulo.server.conf.TableConfiguration.getProperties(TableConfiguration.java:108)
        at org.apache.accumulo.core.conf.AccumuloConfiguration.iterator(AccumuloConfiguration.java:69)
        at org.apache.accumulo.core.conf.ConfigSanityCheck.validate(ConfigSanityCheck.java:40)
        at org.apache.accumulo.server.conf.ServerConfigurationFactory.getTableConfiguration(ServerConfigurationFactory.java:155)
        at org.apache.accumulo.server.conf.ServerConfiguration.getTableConfiguration(ServerConfiguration.java:69)
        at org.apache.accumulo.tserver.TabletServer.getTableConfiguration(TabletServer.java:3983)
        at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1277)
        at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1256)
        at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1112)
        at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1089)
        at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2935)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:745)
2014-09-15 09:40:20,090 [tserver.TabletServer] WARN : Check for long GC pauses not called in a timely fashion. Expected every 5.0 seconds but was 16.3 seconds since last check
2014-09-15 09:40:20,477 [datanode.DataNode] ERROR: 127.0.0.1:57185:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:42146 dst: /127.0.0.1:57185
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:467)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:771)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:718)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

It looks like the tserver killed itself after the connection loss but before 
it could reconnect to ZooKeeper and receive the session expiration.
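
To make the suspected ordering concrete, here is a minimal sketch of the race. 
It is not Accumulo or ZooKeeper code: the class, the Event enum and 
exitReason() are all hypothetical names, and the only assumption modeled is 
that the server exits on whichever fatal event it sees first.

```java
import java.util.Arrays;
import java.util.List;

public class HalfDeadOrderingSketch {

    // Hypothetical event names for illustration; not Accumulo or ZooKeeper types.
    enum Event { CONNECTION_LOSS, LOCK_DELETED, SESSION_EXPIRED }

    // Returns the first event that makes the simulated server exit,
    // or null if nothing fatal arrives. A connection loss alone is
    // treated as retryable, so it never causes an exit here.
    static Event exitReason(List<Event> events) {
        for (Event e : events) {
            if (e == Event.LOCK_DELETED || e == Event.SESSION_EXPIRED) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Ordering the test expects: the session expiration is seen first.
        List<Event> expected =
            Arrays.asList(Event.CONNECTION_LOSS, Event.SESSION_EXPIRED);
        // Ordering suspected in the failing run: the lock loss wins the race,
        // so the expiration is never observed before the process exits.
        List<Event> suspected =
            Arrays.asList(Event.CONNECTION_LOSS, Event.LOCK_DELETED, Event.SESSION_EXPIRED);

        System.out.println("expected run exits on:  " + exitReason(expected));
        System.out.println("suspected run exits on: " + exitReason(suspected));
    }
}
```

If that is what happened, the test would need to tolerate either exit reason, 
or the tserver would need to wait for the reconnect before acting on the lost 
lock. Both are only possibilities at this point.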



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
