Mike Percy created FLUME-1219:
---------------------------------

             Summary: Race conditions in BucketWriter / HDFSEventSink
                 Key: FLUME-1219
                 URL: https://issues.apache.org/jira/browse/FLUME-1219
             Project: Flume
          Issue Type: Bug
            Reporter: Mike Percy


BucketWriter has several race conditions that came up during my performance 
testing over the weekend. One issue that caused data loss was the lack of 
atomic close() and open() semantics related to the "retry" mechanism after the 
abort() call in HDFSEventSink.process().
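
As a rough illustration of what atomic close() and open() semantics could look like, here is a minimal sketch. SketchBucketWriter and its reopen() method are hypothetical stand-ins, not the real Flume classes, and an in-memory buffer stands in for an HDFS file. The idea is only that the retry path closes and reopens while holding one lock, so no append() can land in between.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BucketWriter; not the real Flume class.
class SketchBucketWriter {
    private final List<String> closedFiles = new ArrayList<>();
    private StringBuilder current = null; // null means "no file open"

    // Every public method synchronizes on the same monitor, so an
    // append() can never observe the gap between close and open.
    public synchronized void open() {
        if (current == null) {
            current = new StringBuilder();
        }
    }

    public synchronized void append(String event) {
        if (current == null) {
            throw new IllegalStateException("writer is closed");
        }
        current.append(event).append('\n');
    }

    public synchronized void close() {
        if (current != null) {
            closedFiles.add(current.toString());
            current = null;
        }
    }

    // The retry path: close and open as one atomic step.
    public synchronized void reopen() {
        close(); // reentrant: this thread already holds the monitor
        open();
    }

    // Returns a copy of the contents of every file closed so far.
    public synchronized List<String> closedFiles() {
        return new ArrayList<>(closedFiles);
    }
}
```

With this shape, a caller that needs to recover after a failed write would call reopen() instead of issuing separate close() and open() calls.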

Another issue is the lack of clearly delineated responsibilities for calling 
open(), flush(), close(), etc. For example, HDFSEventSink.start() calls open(), 
HDFSEventSink.process() calls abort(), which calls open(), and 
BucketWriter.append() also calls close() and open().
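
One possible way to delineate the responsibilities, sketched under assumptions: a single owner (the sink, in this sketch) is the only caller of open() and close(), and append() fails fast rather than opening or closing anything itself. LifecycleWriter and its State enum are hypothetical names for illustration, not part of Flume.

```java
// Hypothetical writer with an explicit lifecycle; names are illustrative only.
class LifecycleWriter {
    enum State { CLOSED, OPEN }

    private State state = State.CLOSED;
    private int openCount = 0;

    // Only the owner (the sink, in this sketch) is expected to call open().
    public synchronized void open() {
        if (state == State.OPEN) {
            throw new IllegalStateException("already open");
        }
        state = State.OPEN;
        openCount++;
    }

    // append() never opens or closes anything; it fails fast instead.
    public synchronized void append(String event) {
        if (state != State.OPEN) {
            throw new IllegalStateException("not open");
        }
        // A real writer would write the event to the open file here.
    }

    // close() is idempotent so the owner can call it from any cleanup path.
    public synchronized void close() {
        state = State.CLOSED;
    }

    public synchronized int openCount() {
        return openCount;
    }
}
```

Failing fast on a misplaced open() or append() makes a stray lifecycle call visible as an exception instead of a silently reopened file.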

There is another race condition related to the JVM shutdown hooks, which causes 
.tmp files not to be renamed.
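
A minimal sketch of one way to make the final rename safe when both the normal stop path and a shutdown hook may try to close the same file. It assumes a local file and java.nio.file as a stand-in for HDFS; SketchTmpFile, closeAndRename() and registerShutdownHook() are hypothetical names, not Flume APIs. Shutdown hooks run concurrently and in no guaranteed order, so the close is synchronized and idempotent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a file that is written as "<name>.tmp"
// and renamed to "<name>" on close. Not a Flume class.
class SketchTmpFile {
    private final Path tmpPath;
    private final Path finalPath;
    private boolean closed = false;

    SketchTmpFile(Path finalPath) throws IOException {
        this.finalPath = finalPath;
        this.tmpPath = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
        Files.createFile(tmpPath);
    }

    // Synchronized and idempotent: a second caller blocks until the first
    // rename has finished, then returns false without touching the file.
    public synchronized boolean closeAndRename() throws IOException {
        if (closed) {
            return false;
        }
        Files.move(tmpPath, finalPath);
        closed = true;
        return true;
    }

    // Registers a JVM shutdown hook that goes through the same guarded path
    // as a normal close, so the two cannot race on the rename.
    public void registerShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                closeAndRename();
            } catch (IOException e) {
                System.err.println("rename failed during shutdown: " + e);
            }
        }));
    }
}
```

Whichever path reaches closeAndRename() first performs the rename; the other becomes a no-op.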

These APIs need to be refactored and their responsibilities need to be 
clarified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
