[ https://issues.apache.org/jira/browse/ACCUMULO-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Elser resolved ACCUMULO-1916.
----------------------------------
Resolution: Not A Problem
> Hung TServer during CI with agitation
> -------------------------------------
>
> Key: ACCUMULO-1916
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1916
> Project: Accumulo
> Issue Type: Bug
> Components: tserver
> Environment: hdp-2.0 (apache hadoop 2.2.0), Accumulo 1.5.1-SNAPSHOT (11/14/2013 timeframe)
> Reporter: Josh Elser
> Attachments: jstack.1, jstack.2, jstack.3
>
>
> Ran continuous ingest on a 6-node system for ~18hrs with full agitation (datanode, tserver, and master/gc).
> Checked on the system and noticed that queries were still running but ingest was hung. A single tabletserver had gotten into a state where every new WAL file it tried to create failed to be replicated:
> {noformat}
> 2013-11-21 10:45:45,998 [tabletserver.LargestFirstMemoryManager] DEBUG: COMPACTING 9;7c905f;7c387d total = 1,787,421,387 ingestMemory = 1,787,421,387
> 2013-11-21 10:45:45,998 [tabletserver.LargestFirstMemoryManager] DEBUG: chosenMem = 57,180,830 chosenIT = 0.15 load 57,187,348
> 2013-11-21 10:45:46,000 [tabletserver.NativeMap] DEBUG: Allocated native map 0x000000000151c0e0
> 2013-11-21 10:45:46,000 [tabletserver.Tablet] DEBUG: MinC initiate lock 0.00 secs
> 2013-11-21 10:45:46,001 [tabletserver.MinorCompactor] DEBUG: Begin minor compaction /accumulo/tables/9/t-0000skc/F0001bmp.rf_tmp 9;7c905f;7c387d
> 2013-11-21 10:45:46,038 [tabletserver.TabletServer] DEBUG: UpSess 192.168.56.172:50599 23,482 in 0.765s, at=[0 3 0.05 63] ft=0.619s(pt=0.013s lt=0.413s ct=0.193s)
> 2013-11-21 10:45:46,252 [tabletserver.LargestFirstMemoryManager] DEBUG: BEFORE compactionThreshold = 0.834 maxObserved = 1,815,829,128
> 2013-11-21 10:45:46,253 [tabletserver.LargestFirstMemoryManager] DEBUG: AFTER compactionThreshold = 0.834
> 2013-11-21 10:45:46,900 [tabletserver.TabletServer] DEBUG: gc ParNew=287.26(+0.07) secs ConcurrentMarkSweep=19.96(+0.00) secs freemem=622,040,112(+393,782,976) totalmem=1,021,313,024
> 2013-11-21 10:45:47,965 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:47,968 [hdfs.DFSClient] WARN : Error while syncing
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:47,999 [log.DfsLogger] WARN : Exception syncing
> java.lang.reflect.InvocationTargetException
> 2013-11-21 10:45:48,002 [log.TabletServerLogger] ERROR: Unexpected error writing to log, retrying attempt 1
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> at org.apache.accumulo.server.tabletserver.log.DfsLogger$LoggerOperation.await(DfsLogger.java:178)
> at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.write(TabletServerLogger.java:279)
> at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:362)
> at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1552)
> at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1461)
> at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
> at com.sun.proxy.$Proxy10.applyUpdates(Unknown Source)
> at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2080)
> at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2066)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:156)
> at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:478)
> at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:208)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.accumulo.server.tabletserver.log.DfsLogger$LogSyncingTask.run(DfsLogger.java:116)
> ... 1 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:48,004 [log.DfsLogger] WARN : Exception syncing
> java.lang.reflect.InvocationTargetException
> 2013-11-21 10:45:49,511 [log.TabletServerLogger] ERROR: Unexpected error writing to log, retrying attempt 1
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> at org.apache.accumulo.server.tabletserver.log.DfsLogger$LoggerOperation.await(DfsLogger.java:178)
> at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.write(TabletServerLogger.java:279)
> at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:362)
> at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1552)
> at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1461)
> at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
> at com.sun.proxy.$Proxy10.applyUpdates(Unknown Source)
> at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2080)
> at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2066)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:156)
> at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:478)
> at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:208)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.accumulo.server.tabletserver.log.DfsLogger$LogSyncingTask.run(DfsLogger.java:116)
> ... 1 more
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2013-11-21 10:45:49,609 [log.DfsLogger] ERROR:
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/192.168.56.172+9997/087cf8a0-c1c1-41e2-a2cb-343afb9dd9e8 could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> {noformat}
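> A note on the nested exceptions above: the stack shows DfsLogger$LogSyncingTask calling the WAL stream's sync method via Method.invoke, so the real HDFS error only surfaces as the cause of a java.lang.reflect.InvocationTargetException (presumably the sync method is resolved reflectively so the same code works against different Hadoop versions). A minimal sketch of that pattern, assuming Hadoop's FSDataOutputStream; the class and method names here are illustrative, not Accumulo's actual code:
> {noformat}
> import java.lang.reflect.InvocationTargetException;
> import java.lang.reflect.Method;
>
> import org.apache.hadoop.fs.FSDataOutputStream;
>
> public class ReflectiveSyncSketch {
>
>   // Prefer Hadoop 2's hflush(); fall back to the older sync().
>   static Method findSyncMethod(FSDataOutputStream out) throws NoSuchMethodException {
>     try {
>       return out.getClass().getMethod("hflush");
>     } catch (NoSuchMethodException e) {
>       return out.getClass().getMethod("sync");
>     }
>   }
>
>   static void sync(FSDataOutputStream walOut) throws Exception {
>     Method syncMethod = findSyncMethod(walOut);
>     try {
>       syncMethod.invoke(walOut);
>     } catch (InvocationTargetException e) {
>       // e.getCause() is the real failure: here, the NameNode's
>       // "could only be replicated to 0 nodes" RemoteException.
>       throw new RuntimeException(e);
>     }
>   }
> }
> {noformat}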
> After each round of failures, the tabletserver appears to have given up on that file name and tried again with a new file:
> {noformat}
> 2013-11-21 10:45:49,610 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2013-11-21 10:45:49,610 [log.DfsLogger] DEBUG: Found CREATE enum CREATE
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: Found synch enum SYNC_BLOCK
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: CreateFlag set: [CREATE, SYNC_BLOCK]
> 2013-11-21 10:45:49,611 [log.DfsLogger] DEBUG: creating /accumulo/wal/192.168.56.172+9997/c44258a0-2f42-42e7-a43b-8b1553722ad3 with SYNCH_BLOCK flag
> 2013-11-21 10:45:49,617 [crypto.CryptoModuleFactory] DEBUG: About to instantiate crypto module NullCryptoModule
> 2013-11-21 10:45:49,625 [hdfs.DFSClient] WARN : DataStreamer Exception
> ..... same exceptions as before but with the new file name.....
> {noformat}
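> The open() path above resolves the CreateFlag values reflectively and then creates the new WAL file with [CREATE, SYNC_BLOCK]. For reference, a minimal sketch of an equivalent create against the Hadoop 2 FileSystem API; the path, buffer size, replication, and block size are placeholders, not the values Accumulo used:
> {noformat}
> import java.util.EnumSet;
>
> import org.apache.hadoop.fs.CreateFlag;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.permission.FsPermission;
>
> public class WalCreateSketch {
>   public static FSDataOutputStream openWal(FileSystem fs, Path walPath) throws Exception {
>     // SYNC_BLOCK asks HDFS to sync each completed block to disk.
>     return fs.create(walPath,
>         FsPermission.getDefault(),
>         EnumSet.of(CreateFlag.CREATE, CreateFlag.SYNC_BLOCK),
>         4096,               // buffer size (placeholder)
>         (short) 3,          // replication (placeholder)
>         128L * 1024 * 1024, // block size (placeholder)
>         null);              // no Progressable
>   }
> }
> {noformat}
> Note that the create itself succeeds; it is the DataStreamer's later addBlock() call to the NameNode (visible at the bottom of each stack above) that fails when no datanode can be chosen, which is why every new file name hits the same exception.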
> Tried to get a heap dump from the tabletserver, but it ended up OOME'ing. I did get some stack traces, which I'll attach.
> Restarting this tabletserver appears to have resolved the issue.
--
This message was sent by Atlassian JIRA
(v6.1#6144)