[ https://issues.apache.org/jira/browse/FLUME-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997254#comment-13997254 ]
Brock Noland commented on FLUME-2245:
-------------------------------------
Hi,
Yes, I was able to test the BucketWriter change (and only that), and
I found that it fixed this issue.
Note: I used kill -STOP on the DN to reproduce.
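
For anyone who wants to script that kind of pause, here is a minimal sketch. It is not from the ticket: the helper names freeze and demo are made up for illustration, and the demo freezes a throwaway child process rather than a real DataNode. It assumes a POSIX system, where SIGSTOP and SIGCONT are what kill -STOP and kill -CONT send.

```python
import os
import signal
import subprocess
import sys
import time


def freeze(pid: int, seconds: float) -> None:
    """Suspend process `pid` with SIGSTOP, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)      # same signal as `kill -STOP <pid>`
    try:
        time.sleep(seconds)           # the target makes no progress in this window
    finally:
        os.kill(pid, signal.SIGCONT)  # same signal as `kill -CONT <pid>`


def demo(seconds: float = 0.2) -> bool:
    """Freeze a stand-in child process (hypothetical, not a DataNode)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        freeze(child.pid, seconds)
        # True if the child is still running after being resumed.
        return child.poll() is None
    finally:
        child.kill()
        child.wait()
```

To try this against a real DataNode you would pass its PID to freeze and choose a pause long enough to outlast the sink's timeout; both values depend on your setup and are not given in this ticket.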
> HDFS files with errors unable to close
> --------------------------------------
>
> Key: FLUME-2245
> URL: https://issues.apache.org/jira/browse/FLUME-2245
> Project: Flume
> Issue Type: Bug
> Reporter: Juhani Connolly
> Attachments: flume.log.1133, flume.log.file
>
>
> This is running on a snapshot of Flume-1.5 with the git hash
> 99db32ccd163daf9d7685f0e8485941701e1133d
> When a datanode goes unresponsive for a significant amount of time (for
> example, during a long GC pause), an append failure occurs, followed by
> repeated timeouts in the log and a failure to close the stream. The relevant
> section of the logs is attached (where it first starts appearing).
> The same log repeats periodically, consistently running into a
> TimeoutException.
> Restarting Flume (or presumably just the HDFSSink) solves the issue.
> Probable cause in comments.