-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5436/
-----------------------------------------------------------

(Updated June 21, 2012, 12:52 a.m.)


Review request for Flume and Brock Noland.


Changes
-------

Incorporated review feedback and removed a debug assertion that was causing
unnecessary failures when logging was at debug level.


Description
-------

The file channel uses a disk-serialized in-memory checkpointing mechanism. When
the channel is full and the capacity is large, these checkpoints take a long
time to serialize and deserialize. For example, a channel with 1M entries could
take many minutes to boot up. Similarly, booting up a largely full channel
would require the replay of all log events to reconstruct the correct state.
Due to these latency issues and the failure interaction of the channel with the
LifeCycleSupervisor, the system could easily get into an unusable state, as is
evident from the FLUME-1232 issue.

This patch modifies the checkpointing mechanism as follows:
* The FlumeEventQueue itself represents a checkpoint that is maintained as a 
memory mapped file.
* During checkpointing, a marker is introduced in the active logs, which is
used to skip records during replay.
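
As a rough illustration of the first point (this is not the code in the patch),
a queue whose state lives in a memory-mapped file might look like the sketch
below. The class name MappedQueueSketch, the header layout and the method names
are all assumptions made for the example. The queue holds long "pointers" in a
fixed number of slots, and because every update goes straight to the mapped
region, the file itself is the checkpoint and reopening it restores the queue
without any deserialization pass.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative only: names and file layout are assumptions, not the patch's.
final class MappedQueueSketch implements AutoCloseable {
  private static final int CAPACITY_OFFSET = 0; // fixed when the file is created
  private static final int HEAD_OFFSET = 4;     // slot index of the oldest entry
  private static final int SIZE_OFFSET = 8;     // number of live entries
  private static final int HEADER_BYTES = 16;   // padded so slots stay 8-byte aligned

  private final int capacity;
  private final RandomAccessFile file;
  private final MappedByteBuffer buffer;

  MappedQueueSketch(File path, int capacity) throws IOException {
    this.capacity = capacity;
    this.file = new RandomAccessFile(path, "rw");
    boolean fresh = file.length() == 0;
    long bytes = HEADER_BYTES + 8L * capacity;
    if (fresh) {
      file.setLength(bytes);
    }
    buffer = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, bytes);
    if (fresh) {
      // Write the header explicitly rather than relying on zero-filled pages.
      buffer.putInt(CAPACITY_OFFSET, capacity);
      buffer.putInt(HEAD_OFFSET, 0);
      buffer.putInt(SIZE_OFFSET, 0);
    } else if (buffer.getInt(CAPACITY_OFFSET) != capacity) {
      // The slot layout depends on the capacity, so it cannot change in place.
      file.close();
      throw new IOException("capacity changed; checkpoint must be rebuilt");
    }
  }

  private int slot(int index) {
    return HEADER_BYTES + 8 * index;
  }

  // Appends a pointer at the tail; returns false when the queue is full.
  boolean addTail(long pointer) {
    int size = buffer.getInt(SIZE_OFFSET);
    if (size == capacity) {
      return false;
    }
    int head = buffer.getInt(HEAD_OFFSET);
    buffer.putLong(slot((head + size) % capacity), pointer);
    buffer.putInt(SIZE_OFFSET, size + 1);
    return true;
  }

  // Removes and returns the oldest pointer, or null when the queue is empty.
  Long removeHead() {
    int size = buffer.getInt(SIZE_OFFSET);
    if (size == 0) {
      return null;
    }
    int head = buffer.getInt(HEAD_OFFSET);
    long value = buffer.getLong(slot(head));
    buffer.putInt(HEAD_OFFSET, (head + 1) % capacity);
    buffer.putInt(SIZE_OFFSET, size - 1);
    return value;
  }

  int size() {
    return buffer.getInt(SIZE_OFFSET);
  }

  // Asks the OS to flush the mapped pages to disk.
  void checkpoint() {
    buffer.force();
  }

  @Override
  public void close() throws IOException {
    buffer.force();
    file.close();
  }
}
```

The sketch is not thread-safe on its own and stores nothing about the log
marker. It only shows why a mapped layout avoids the serialize/deserialize
cost, and why a layout sized up front makes changing the capacity awkward.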

To ensure correctness, a reader/writer lock is used: consumers operating
against the channel take the reader lock, while checkpointing takes the writer
lock. Some limitations of this approach are:

* The total number of active log files is now limited to a maximum of 1024. 
* Dynamic resizing of the channel capacity is no longer allowed unless the
checkpoint is rebuilt from scratch, which can cause a significant delay in
startup.
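
The locking scheme can be sketched roughly as follows. This is a simplified
illustration and not the patch itself: the class CheckpointLockSketch and its
methods are hypothetical, and an in-memory ConcurrentLinkedQueue stands in for
the real queue. The point is that the "reader" lock is really a shared lock.
Many puts and takes can hold it at once, even though they modify the queue,
while a checkpoint takes the exclusive writer lock so that it sees a quiescent
state.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: a stand-in for the channel, not the patch's classes.
final class CheckpointLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // The queue handles its own thread safety among concurrent lock sharers.
  private final ConcurrentLinkedQueue<Long> queue = new ConcurrentLinkedQueue<>();

  // Channel operations share the read lock, so they never block each other.
  void put(long pointer) {
    lock.readLock().lock();
    try {
      queue.add(pointer);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Returns the oldest pointer, or null when the queue is empty.
  Long take() {
    lock.readLock().lock();
    try {
      return queue.poll();
    } finally {
      lock.readLock().unlock();
    }
  }

  // A checkpoint holds the write lock, which excludes every put and take.
  // Here it only reports the queue size as seen at the checkpoint.
  int checkpoint() {
    lock.writeLock().lock();
    try {
      return queue.size();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

One consequence of this arrangement is that a slow checkpoint stalls all
channel operations for its duration, which is why keeping the checkpoint step
cheap matters.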


This addresses bug FLUME-1232.
    https://issues.apache.org/jira/browse/FLUME-1232


Diffs (updated)
-----

  
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Checkpoint.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannelConfiguration.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java 1351988
  /trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java 1351988
  /trunk/flume-ng-core/src/main/java/org/apache/flume/sink/LoggerSink.java 1351988

Diff: https://reviews.apache.org/r/5436/diff/


Testing
-------

Ran all tests. Did some manual testing. Will be doing more manual testing and 
cleanup as necessary while the review is underway.


Thanks,

Arvind Prabhakar
