[
https://issues.apache.org/jira/browse/FLUME-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809980#comment-13809980
]
Gopinathan A commented on FLUME-2228:
-------------------------------------
[~justlooks] Actually there is no issue with Flume or HDFS.
This error occurred because the NameNode was in safe mode; after a restart or
failover, the NN generally takes some time to receive block reports from the
DataNodes, and it stays in safe mode until then.
The error clears on its own after a short while, as shown in the log below.
{noformat}
Name node is in safe mode.
The reported blocks 3722 has reached the threshold 0.9990 of total blocks 3722.
Safe mode will be turned off automatically in 4 seconds.
{noformat}
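
For anyone reproducing this during a failover test, safe-mode status can be
checked (and waited out) from the command line with the standard `hdfs dfsadmin`
options; a minimal sketch:
{noformat}
# Report whether the NameNode is currently in safe mode
hdfs dfsadmin -safemode get

# Block until the NameNode leaves safe mode on its own
hdfs dfsadmin -safemode wait
{noformat}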
> HDFS namenode failover cause flume error
> ----------------------------------------
>
> Key: FLUME-2228
> URL: https://issues.apache.org/jira/browse/FLUME-2228
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.3.0
> Environment: redhat6.2,flume 1.3.0
> Reporter: alex kim
> Priority: Critical
>
> I tested Flume in an HDFS NameNode HA environment.
> Here is my Flume config:
> agent.sources = seqGenSrc
> agent.channels = memoryChannel
> agent.sinks = HDFSSink
> agent.sources.seqGenSrc.type = exec
> agent.sources.seqGenSrc.command = tail -f /tmp/mytest
> agent.sources.seqGenSrc.channels = memoryChannel
> agent.sinks.HDFSSink.type = hdfs
> agent.sinks.HDFSSink.hdfs.path = /myflume
> agent.sinks.HDFSSink.channel = memoryChannel
> agent.channels.memoryChannel.type = memory
> agent.channels.memoryChannel.capacity = 100
> I ran this on the command line, writing one record every 0.5 seconds:
> # for i in `seq 1000`;do echo yes$i >> /tmp/mytest;sleep .5;done
> While the file was being written, I restarted the active NN to trigger an
> automatic NN failover: the active NN became standby, and the standby became
> active.
> Here is my Flume log output:
> 30 Oct 2013 13:55:13,192 WARN
> [SinkRunner-PollingRunner-DefaultSinkProcessor]
> (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
> Cannot add block to /myflume/FlumeData.1383111641112.tmp. Name node is in
> safe mode.
> The reported blocks 3722 has reached the threshold 0.9990 of total blocks
> 3722. Safe mode will be turned off automatically in 4 seconds.
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2254)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2175)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
> at org.apache.hadoop.ipc.Client.call(Client.java:1225)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
> at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1176)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1029)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
> 30 Oct 2013 13:55:18,194 ERROR
> [SinkRunner-PollingRunner-DefaultSinkProcessor]
> (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:82) -
> Unexpected error while checking replication factor
> java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at
> org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:147)
> at
> org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:68)
> at
> org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:454)
> at
> org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:389)
> at
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
> at
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> at java.lang.Thread.run(Thread.java:722)
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
> Cannot add block to /myflume/FlumeData.1383111641112.tmp. Name node is in
> safe mode.
--
This message was sent by Atlassian JIRA
(v6.1#6144)