Tycho Lamerigts created FLUME-2566:
--------------------------------------

             Summary: BucketWriter tries to close file endlessly
                 Key: FLUME-2566
                 URL: https://issues.apache.org/jira/browse/FLUME-2566
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.5.1
            Reporter: Tycho Lamerigts


The following scenario causes BucketWriter to go into an endless loop trying to 
close a file:

On the first call to close(), there is a timeout (due to HDFS being temporarily 
overloaded):

{noformat}
12:53:57.363 [hdfs-hdfs_sink_3-roll-timer-0] WARN  o.a.flume.sink.hdfs.BucketWriter - failed to close() HDFSWriter for file (/rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp). Exception follows.
java.io.IOException: Callable timed out after 10000 ms on file: /rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:736) ~[flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:417) ~[flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:476) [flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:471) [flume-hdfs-sink.jar:na]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_20]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_20]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_20]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_20]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_20]
{noformat}

BucketWriter schedules an operation to retry the close. This retry, and every 
retry after it, fails because by then the channel has already been closed and 
refuses to flush or close again. Instead, it throws an exception that causes 
BucketWriter to schedule yet another close retry, which fails in the same way.

{noformat}
14:32:58.793 [hdfs-hdfs_sink_3-roll-timer-0] WARN  o.a.flume.sink.hdfs.BucketWriter - Closing file: /rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp failed. Will retry again in 180 seconds.
java.nio.channels.ClosedChannelException: null
        at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1527) ~[hadoop-hdfs.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1843) ~[hadoop-hdfs.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1803) ~[hadoop-hdfs.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1788) ~[hadoop-hdfs.jar:na]
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:120) ~[hadoop-common.jar:na]
        at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:139) ~[flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:341) ~[flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:335) ~[flume-hdfs-sink.jar:na]
        at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:722) ~[flume-hdfs-sink.jar:na]
{noformat}
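
The failure mode described above can be reproduced in miniature with the 
sketch below. It is not Flume's code and not a proposed patch: the Writer 
interface, FlakyWriter, closeWithRetries and the treatClosedAsSuccess flag are 
all hypothetical names, and a bounded loop stands in for the timer that 
reschedules the close every 180 seconds. It only illustrates why retrying can 
never succeed once the stream is closed, and one possible guard (treating 
ClosedChannelException as "already closed").

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

class CloseRetrySketch {

    /** Hypothetical stand-in for the HDFS writer; not Flume's real interface. */
    interface Writer {
        void close() throws IOException;
    }

    /**
     * Simulates the reported behaviour: the first close() does close the
     * stream, but the caller only sees a timeout. Later calls are rejected.
     */
    static class FlakyWriter implements Writer {
        private boolean closed = false;
        int calls = 0;

        @Override
        public void close() throws IOException {
            calls++;
            if (closed) {
                // Same exception type as in the second stack trace.
                throw new ClosedChannelException();
            }
            closed = true;
            throw new IOException("Callable timed out (simulated)");
        }
    }

    /**
     * Retries close() up to maxAttempts times.
     * Returns the attempt number on which the close was considered done,
     * or -1 if every attempt failed.
     */
    static int closeWithRetries(Writer writer, int maxAttempts,
                                boolean treatClosedAsSuccess) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                writer.close();
                return attempt;
            } catch (ClosedChannelException e) {
                // Assumption: an already-closed stream needs no further retry.
                if (treatClosedAsSuccess) {
                    return attempt;
                }
            } catch (IOException e) {
                // Timeout or other I/O failure: fall through and retry.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Without the guard, every retry hits ClosedChannelException.
        System.out.println(closeWithRetries(new FlakyWriter(), 5, false));
        // With the guard, the first retry ends the loop.
        System.out.println(closeWithRetries(new FlakyWriter(), 5, true));
    }
}
```

Without the guard, the loop uses up all of its attempts; in the real sink 
there is no such bound, which matches the endless retries reported here. 
Whether swallowing ClosedChannelException is safe in BucketWriter itself 
(for example, with respect to renaming the .tmp file) is not something this 
sketch addresses.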



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
