-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7659/
-----------------------------------------------------------

(Updated Nov. 14, 2012, 8 a.m.)


Review request for Flume and Mike Percy.


Changes
-------

Added a unit test.

The test includes some commentary on a corner case. If an event is processed 
just as the close is happening, the following can occur:
- process() gets a handle to the BucketWriter
- (in the timer thread) that BucketWriter times out, gets closed, and is 
removed from the map
- (in the sink runner/test thread) append is called on the BucketWriter, 
which opens a new file containing a single append
- a new BucketWriter is created for the following events, which will be 
written to a different location.

This results in an extra file with a single append, so while not disastrous, 
it's not exactly a great situation (though likely very rare).

The only way I can see to fix it is to introduce a synchronization block around 
the map removal in the callback and the map.get/map.insert in the process() 
call. A synchronization block was one of the things we had been wanting to 
avoid with this alternate implementation...
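
For discussion, here is a minimal sketch of what that synchronization could 
look like. It is not the code in the diff: SketchWriter, SketchWriterRegistry, 
closeIdle() and peek() are hypothetical names, and the writer is a stand-in 
that only tracks state. The sketch also assumes the lock is held across the 
append itself, since otherwise the writer could still be closed between the 
lookup and the append.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for BucketWriter; it only tracks open/closed state. */
class SketchWriter {
    private boolean closed = false;
    private int appends = 0;

    void append(String event) {
        // The sketch rejects late appends instead of reopening a file.
        if (closed) {
            throw new IllegalStateException("append on closed writer");
        }
        appends++;
    }

    void close() { closed = true; }
    boolean isClosed() { return closed; }
    int appendCount() { return appends; }
}

/** Hypothetical registry that guards the writer map with a single lock. */
class SketchWriterRegistry {
    private final Object lock = new Object();
    private final Map<String, SketchWriter> writers = new HashMap<>();

    /** process() path: lookup, insert-if-missing and append under one lock. */
    void process(String path, String event) {
        synchronized (lock) {
            SketchWriter writer = writers.get(path);
            if (writer == null) {
                writer = new SketchWriter();
                writers.put(path, writer);
            }
            writer.append(event);
        }
    }

    /** Timeout callback path: remove and close under the same lock. */
    void closeIdle(String path) {
        synchronized (lock) {
            SketchWriter writer = writers.remove(path);
            if (writer != null) {
                writer.close();
            }
        }
    }

    /** Test helper: returns the current writer for a path, or null. */
    SketchWriter peek(String path) {
        synchronized (lock) {
            return writers.get(path);
        }
    }
}
```

With both paths sharing one lock, an event either reaches the old writer 
before it is closed or goes to a fresh one, so the stray single-append file 
should not appear. The cost is that every process() call contends on the lock.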


Description
-------

Added lastWrite to BucketWriter to record when it was last updated.

Added a thread to HDFSEventSink which checks the last update of each active 
BucketWriter and closes it once the configurable timeout 
hdfs.closeIdleTimeout has passed.
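
As a rough illustration of the idea (not the code in the diff), one pass of 
such a checker might look like the sketch below. IdleTrackedWriter and 
IdleCloser are hypothetical names, the timeout is assumed to be in 
milliseconds, and the current time is passed in explicitly so the logic can be 
exercised without a real timer thread.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical writer that remembers when it was last written to. */
class IdleTrackedWriter {
    private volatile long lastWriteMillis;
    private volatile boolean closed = false;

    IdleTrackedWriter(long nowMillis) { this.lastWriteMillis = nowMillis; }

    void append(long nowMillis) { this.lastWriteMillis = nowMillis; }
    long lastWrite() { return lastWriteMillis; }
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

/** Hypothetical checker; a background thread would call closeIdle() periodically. */
class IdleCloser {
    private final Map<String, IdleTrackedWriter> writers = new ConcurrentHashMap<>();
    private final long timeoutMillis; // 0 disables the feature

    IdleCloser(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void register(String path, IdleTrackedWriter writer) { writers.put(path, writer); }
    int size() { return writers.size(); }

    /** Closes and removes every writer idle for at least the timeout; returns the count. */
    int closeIdle(long nowMillis) {
        if (timeoutMillis <= 0) {
            return 0; // feature disabled
        }
        int closedCount = 0;
        Iterator<Map.Entry<String, IdleTrackedWriter>> it = writers.entrySet().iterator();
        while (it.hasNext()) {
            IdleTrackedWriter writer = it.next().getValue();
            if (nowMillis - writer.lastWrite() >= timeoutMillis) {
                writer.close();
                it.remove();
                closedCount++;
            }
        }
        return closedCount;
    }
}
```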


This addresses bug FLUME-1660.
    https://issues.apache.org/jira/browse/FLUME-1660


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b 

Diff: https://reviews.apache.org/r/7659/diff/


Testing
-------

Testing was performed on a local machine, confirming that files are closed 
correctly and that the configuration setting behaves as expected, including 
disabling the feature (by using the default value of 0 for 
hdfs.closeIdleTimeout).


There is one unrelated test failure whose cause I'm not sure of (if anyone 
knows what's causing this, please let me know):

Failed tests:   testInOut(org.apache.flume.test.agent.TestFileChannel): 
Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. 
expected:<1> but was:<0>

All other tests pass.


Thanks,

Juhani Connolly
