[ https://issues.apache.org/jira/browse/FLUME-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638282#comment-16638282 ]
ASF subversion and git services commented on FLUME-2973:
--------------------------------------------------------
Commit 1b4378396a441629fa0332d4814e053345c58ffb in flume's branch
refs/heads/trunk from [~emajor]
[ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=1b43783 ]
FLUME-2973 Deadlock in hdfs sink
This PR is based on Yan Jian's fix and his test improvements.
Also contains the deadlock reproduction contributed by @adenes.
I have made minimal changes to those contributions.
Denes's test was used for checking the fix.
Yan's fix also includes an optimization: it first calls the callback function
that removes the BucketWriter from the cache.
This is useful and should help to avoid some errors.
This closes #226
Reviewers: Peter Turcsanyi, Ferenc Szabo
(Endre Major, Yan Jian, Denes Arvay via Ferenc Szabo)
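
For readers without the patch at hand, here is a minimal sketch of the ordering the commit message describes: the close callback that removes the writer from the cache runs first, and only afterwards is the writer's own lock taken. The names CloseOrderSketch, Writer, cacheLock, cache and doClose are hypothetical stand-ins, and running the callback outside the writer's monitor is an assumption about why the reordering helps. This is not the code from the commit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

class CloseOrderSketch {
  // Hypothetical stand-ins for the sink's writer cache and its lock.
  static final Object cacheLock = new Object();
  static final Map<String, Writer> cache = new HashMap<>();
  static int callbackRuns = 0;

  static class Writer {
    final String path;
    final AtomicBoolean closed = new AtomicBoolean(false);
    boolean fileClosed = false;

    Writer(String path) {
      this.path = path;
    }

    void close() {
      // Step 1: run the close callback first, while NOT holding the
      // writer's monitor, so the cache lock is never requested from
      // inside the writer lock. The flag makes the callback run once.
      if (closed.compareAndSet(false, true)) {
        synchronized (cacheLock) {
          cache.remove(path);
          callbackRuns++;
        }
      }
      // Step 2: only now take the writer's own lock to close the file.
      doClose();
    }

    private synchronized void doClose() {
      fileClosed = true;
    }
  }

  // Registers a new writer in the cache under the cache lock.
  static Writer open(String path) {
    Writer w = new Writer(path);
    synchronized (cacheLock) {
      cache.put(path, w);
    }
    return w;
  }
}
```

In this sketch no code path requests cacheLock while holding the Writer monitor, and a repeated close() does not run the callback a second time.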
> Deadlock in hdfs sink
> ---------------------
>
> Key: FLUME-2973
> URL: https://issues.apache.org/jira/browse/FLUME-2973
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.7.0
> Reporter: Denes Arvay
> Assignee: Denes Arvay
> Priority: Critical
> Labels: hdfssink
> Fix For: 1.9.0
>
> Attachments: FLUME-2973-1.patch, FLUME-2973-min2.patch,
> FLUME-2973.patch
>
>
> Automatic close of BucketWriters (when the open file count reaches
> {{hdfs.maxOpenFiles}}) and the file rolling thread can end up in a deadlock.
> When creating a new {{BucketWriter}}, {{HDFSEventSink}} locks
> {{HDFSEventSink.sfWritersLock}}, and the {{close()}} called in
> {{HDFSEventSink.WriterLinkedHashMap.removeEldestEntry}} tries to lock the
> {{BucketWriter}} instance.
> On the other hand, if the file is being rolled in
> {{BucketWriter.close(boolean)}}, it locks the {{BucketWriter}} instance first
> and in the close callback it tries to lock the {{sfWritersLock}}.
> The chance of this deadlock is higher when the value of
> {{hdfs.maxOpenFiles}} is low (1).
> Script to reproduce:
> https://gist.github.com/adenes/96503a6e737f9604ab3ee9397a5809ff
> (put it in
> {{flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs}})
> Deadlock usually occurs before ~30 iterations.
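
The lock-order inversion described above can be illustrated with a small self-contained sketch. The two locks below are hypothetical stand-ins (cacheLock for sfWritersLock, writerLock for the BucketWriter monitor), and LockOrderSketch is not part of Flume. To keep the example from hanging, the second acquisition uses a timed tryLock instead of a blocking lock, and latches force the unlucky interleaving deterministically.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {

  /** Returns {sinkGotSecondLock, rollGotSecondLock}. */
  static boolean[] run() {
    // Hypothetical stand-ins for sfWritersLock and the BucketWriter monitor.
    ReentrantLock cacheLock = new ReentrantLock();
    ReentrantLock writerLock = new ReentrantLock();
    boolean[] got = new boolean[2];
    CountDownLatch bothHoldFirst = new CountDownLatch(2);
    CountDownLatch bothTried = new CountDownLatch(2);

    // Sink thread: cache lock first, then the writer (eviction path).
    Thread sink = new Thread(() ->
        attempt(cacheLock, writerLock, bothHoldFirst, bothTried, got, 0));
    // Roll thread: writer first, then the cache lock (close callback path).
    Thread roll = new Thread(() ->
        attempt(writerLock, cacheLock, bothHoldFirst, bothTried, got, 1));

    sink.start();
    roll.start();
    try {
      sink.join();
      roll.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return got;
  }

  static void attempt(ReentrantLock first, ReentrantLock second,
                      CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                      boolean[] got, int idx) {
    first.lock();
    try {
      // Wait until both threads hold their first lock.
      bothHoldFirst.countDown();
      bothHoldFirst.await();
      // With a blocking lock() here, both threads would wait forever.
      boolean acquired = second.tryLock(100, TimeUnit.MILLISECONDS);
      if (acquired) {
        second.unlock();
      }
      got[idx] = acquired;
      // Keep holding the first lock until both attempts have finished.
      bothTried.countDown();
      bothTried.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      first.unlock();
    }
  }
}
```

Neither thread obtains its second lock in this interleaving, which is the cycle that turns into a deadlock when both acquisitions block without a timeout.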