[ 
https://issues.apache.org/jira/browse/FLUME-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099057#comment-13099057
 ] 

E. Sammer commented on FLUME-761:
---------------------------------

Port the Flume HDFS sink functionality over to Flume NG.

The interesting features are file rotation, output bucketing, and support for 
append (flush).

A minimal implementation would support file rotation. Rotation should be 
configurable based on both time interval (specified in seconds) and size. 
Ideally, we do not create files unless there are events to output (i.e. lazy file 
creation). It should be possible to specify rotation for time and size 
together, meaning rotate on whichever happens first.
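
As a rough illustration of the "whichever happens first" rule, the bookkeeping for one open file might look something like the sketch below. The class name RotationPolicy, its fields, and the convention that 0 disables a trigger are all assumptions made for this example, not existing Flume code. Time is passed in explicitly so the logic is easy to test.

```java
/** Hypothetical sketch: tracks one open file and decides when it should roll. */
final class RotationPolicy {
    private final long rollIntervalSeconds; // assumption: 0 disables time-based rotation
    private final long rollSizeBytes;       // assumption: 0 disables size-based rotation
    private long openedAtMillis = -1;       // -1 means no file has been created yet
    private long bytesWritten;

    RotationPolicy(long rollIntervalSeconds, long rollSizeBytes) {
        this.rollIntervalSeconds = rollIntervalSeconds;
        this.rollSizeBytes = rollSizeBytes;
    }

    /** Records a write; the first write stands in for lazy file creation. */
    void recordWrite(long eventBytes, long nowMillis) {
        if (openedAtMillis < 0) {
            openedAtMillis = nowMillis;
        }
        bytesWritten += eventBytes;
    }

    /** True once either the time limit or the size limit has been reached. */
    boolean shouldRotate(long nowMillis) {
        if (openedAtMillis < 0) {
            return false; // nothing written, so there is no file to rotate
        }
        boolean timeUp = rollIntervalSeconds > 0
                && nowMillis - openedAtMillis >= rollIntervalSeconds * 1000L;
        boolean sizeUp = rollSizeBytes > 0 && bytesWritten >= rollSizeBytes;
        return timeUp || sizeUp;
    }

    /** Called after the file is closed; the next write starts a fresh file. */
    void reset() {
        openedAtMillis = -1;
        bytesWritten = 0;
    }
}
```

Because the clock only starts on the first write, an idle sink never produces empty files under this sketch.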

Output bucketing is a feature supported by Flume today that allows interpolation 
of event attributes in output paths. For instance, an output path of 
/logs/%{year}/%{month}/%{day}/ should become /logs/2011/01/01/ for an event 
with the attributes year=2011, month=01, day=01. This implies we must keep 
multiple writers open concurrently, each with separate bookkeeping on rotation 
time and output size.
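
The interpolation step could be sketched roughly as follows. BucketPath and expand are hypothetical names chosen for this example, and failing on a missing attribute is just one possible choice; the comment above does not say how that case should be handled.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of %{name} interpolation in output paths. */
final class BucketPath {
    // Matches %{name} and captures the attribute name.
    private static final Pattern PLACEHOLDER = Pattern.compile("%\\{([^}]+)\\}");

    private BucketPath() {}

    /** Replaces each %{name} in the template with the matching attribute value. */
    static String expand(String template, Map<String, String> attributes) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(template, last, m.start()); // literal text before the placeholder
            String value = attributes.get(m.group(1));
            if (value == null) {
                // Assumption: a missing attribute is treated as an error here.
                throw new IllegalArgumentException("No attribute named " + m.group(1));
            }
            out.append(value);
            last = m.end();
        }
        out.append(template, last, template.length()); // trailing literal text
        return out.toString();
    }
}
```

A sink could then use the expanded path as the key of a map of open writers, so that every distinct bucket gets its own writer and its own rotation bookkeeping.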

Support for append should be orthogonal to file rotation. In other words we 
should still allow the user to specify a rotation policy (time and size) but we 
should call flush with a given frequency, probably specified in terms of the 
number of events. A fully durable configuration would flush after each event 
(i.e. flushInterval=1). We should only enable append support if the underlying 
HDFS install supports it. If the user specifies a flush policy and HDFS doesn't 
support append, we should warn, but continue.
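
One way to keep flushing independent of rotation is to count events separately, as in the sketch below. FlushPolicy and onEvent are hypothetical names, and appendSupported is assumed to be determined elsewhere and passed in; how to detect append support on a given HDFS install is not covered here.

```java
/** Hypothetical sketch: decides when to flush, independent of file rotation. */
final class FlushPolicy {
    private final int flushInterval;       // events between flushes; assumption: 0 means no flush policy
    private final boolean appendSupported; // assumption: detected elsewhere and passed in
    private int eventsSinceFlush;
    private boolean warned;

    FlushPolicy(int flushInterval, boolean appendSupported) {
        this.flushInterval = flushInterval;
        this.appendSupported = appendSupported;
    }

    /** Call once per event written; returns true when the caller should flush now. */
    boolean onEvent() {
        if (flushInterval <= 0) {
            return false; // no flush policy configured
        }
        if (!appendSupported) {
            if (!warned) {
                // Warn once, then keep writing without flushing.
                System.err.println("Flush policy set but HDFS append is unavailable; ignoring it");
                warned = true;
            }
            return false;
        }
        eventsSinceFlush++;
        if (eventsSinceFlush >= flushInterval) {
            eventsSinceFlush = 0;
            return true;
        }
        return false;
    }
}
```

With flushInterval=1 every call returns true, which corresponds to the fully durable configuration described above.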

> Implement HDFS Flume NG sink
> ----------------------------
>
>                 Key: FLUME-761
>                 URL: https://issues.apache.org/jira/browse/FLUME-761
>             Project: Flume
>          Issue Type: Sub-task
>          Components: Build, Docs, Master, Node, Shell, Sinks+Sources, 
> Technical Debt, Test, Web
>            Reporter: E. Sammer
>            Assignee: E. Sammer
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
