Flume output bucketing
----------------------
Key: FLUME-1072
URL: https://issues.apache.org/jira/browse/FLUME-1072
Project: Flume
Issue Type: Question
Components: Sinks+Sources
Affects Versions: v0.9.3
Reporter: Nguyen
Hi all,
Could you please help me understand why Flume does not direct log events to
particular directories based on the value of an event field?
Example:
collectorSink("hdfs://namenode/flume/webdata/%H00/", "%{host}-")
1. A Flume collector receives a message to be logged to HDFS; the source is
SyslogTcp and the sink is HDFS.
2. At 16:00 the Flume process crashes, so SyslogNG buffers the log events on
the local disk.
3. At 19:00 the Flume process restarts, and SyslogNG sends the buffered data to
Flume. The log events therefore arrive with a delay.
4. I expect Flume to direct log events to directories based on the value of an
event field, so the log events from 16:00 should be written to the directory
/flume/webdata/1600.
5. Instead, the directory /flume/webdata/1900 is created for these log events.
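The difference between the expected and the observed directory can be shown
with a minimal sketch. This is not Flume's implementation: the function
expand_bucket, the host name web01, and the calendar date are hypothetical and
exist only for illustration. The sketch expands the %H escape of the sink path
once with the time the event was originally logged and once with the time it
reached the restarted collector. Which of the two timestamps Flume actually
uses is not established here.

```python
from datetime import datetime


def expand_bucket(template: str, when: datetime, host: str) -> str:
    """Expand a small, hypothetical subset of escape sequences.

    %{host} is replaced by the host name, %H by the two-digit hour of `when`.
    """
    return (template
            .replace("%{host}", host)
            .replace("%H", when.strftime("%H")))


# Hours taken from the scenario above; the date and host name are invented.
logged_at = datetime(2012, 1, 1, 16, 0)   # event buffered by SyslogNG at 16:00
arrived_at = datetime(2012, 1, 1, 19, 0)  # event delivered to Flume at 19:00

template = "hdfs://namenode/flume/webdata/%H00/"

expected = expand_bucket(template, logged_at, "web01")
observed = expand_bucket(template, arrived_at, "web01")

print(expected)  # hdfs://namenode/flume/webdata/1600/
print(observed)  # hdfs://namenode/flume/webdata/1900/
```

The same template yields a different directory depending only on which
timestamp is passed in, which matches the gap between steps 4 and 5.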
Could you please tell me why Flume does not direct the output of log events as
described in the documentation?
Thank you