David Stendardi created FLUME-2375:
--------------------------------------
Summary: HDFS sink fails to recover from datanode unavailability
Key: FLUME-2375
URL: https://issues.apache.org/jira/browse/FLUME-2375
Project: Flume
Issue Type: Bug
Affects Versions: v1.4.0
Reporter: David Stendardi
Hello!
We are running flume-ng with version cdh-4.5-1.4. When a datanode used by
flume-ng goes down, we get the following exceptions:
{code}
30 Apr 2014 01:10:38,130 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
{code}
These exceptions are logged but not rethrown, and
AbstractHDFSWriter.isUnderReplicated still returns false, so the writer
continues to try writing to the node.
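
To illustrate the behaviour described above, here is a minimal, self-contained sketch of the pattern. It is not the Flume source: the class ReplicationCheckSketch, the two stream classes, and both method variants are hypothetical names made up for this example. The only parts taken from the report are the reflective call to a getNumCurrentReplicas method, the logged message, and the fact that the failure ends up as a "false" result. The second variant shows one possible alternative, assuming the caller would rather see the failure than a silent "false".

```java
import java.lang.reflect.Method;

// Hypothetical sketch; names below are illustrative, not Flume classes.
class ReplicationCheckSketch {

    /** Stand-in for a stream whose replica lookup fails (e.g. a datanode is gone). */
    public static class FailingStream {
        public int getNumCurrentReplicas() {
            throw new IllegalStateException("datanode unavailable");
        }
    }

    /** Stand-in for a healthy stream that reports a fixed replica count. */
    public static class HealthyStream {
        private final int replicas;
        public HealthyStream(int replicas) { this.replicas = replicas; }
        public int getNumCurrentReplicas() { return replicas; }
    }

    /** Looks up the replica count reflectively, as the stack trace suggests. */
    private static int numCurrentReplicas(Object stream)
            throws ReflectiveOperationException {
        Method m = stream.getClass().getMethod("getNumCurrentReplicas");
        return (Integer) m.invoke(stream);
    }

    /** Pattern from the report: the error is logged, then reported as "false". */
    static boolean isUnderReplicatedSwallowing(Object stream, int desired) {
        try {
            return numCurrentReplicas(stream) < desired;
        } catch (ReflectiveOperationException e) {
            System.err.println("Unexpected error while checking replication factor: " + e);
            return false; // caller cannot tell "healthy" from "check failed"
        }
    }

    /** Assumed alternative: rethrow so the caller can react to the failure. */
    static boolean isUnderReplicatedPropagating(Object stream, int desired) {
        try {
            return numCurrentReplicas(stream) < desired;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("replication check failed", e);
        }
    }

    public static void main(String[] args) {
        Object bad = new FailingStream();
        System.out.println("swallowing: " + isUnderReplicatedSwallowing(bad, 3));
        try {
            isUnderReplicatedPropagating(bad, 3);
        } catch (IllegalStateException e) {
            System.out.println("propagating: " + e.getMessage());
        }
    }
}
```

With the FailingStream stand-in, the first variant returns false even though the replica count could not be read, which matches the symptom above: the caller sees no reason to stop writing. Whether rethrowing is the right fix for the real sink is a separate question; the sketch only makes the difference visible.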