[
https://issues.apache.org/jira/browse/FLUME-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hari Shreedharan resolved FLUME-2539.
-------------------------------------
Resolution: Invalid
Unfortunately, that is not possible. HDFS grants the lease on each file to a
specific client, in this case a single Flume agent, so each file can only be
written from a single JVM.
Theoretically, multiple sinks within the same agent could write to the same
file, but we don't do that: locking the file between sinks would reduce
parallelism.
If you want to reduce the number of files, you could run a later process (an
MR job, for example) to combine them.
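As a minimal sketch of that combine step (the HDFS paths below are
illustrative, not from this ticket), the Hadoop FileSystem shell's getmerge
command can concatenate one hour's rolled files without writing an MR job:

```shell
# Illustrative paths: merge all of one hour's rolled FlumeData files into a
# single local file, then put the merged result back into HDFS.
hadoop fs -getmerge /flume/events/2014-11-12/10 /tmp/events-10.merged
hadoop fs -put /tmp/events-10.merged /flume/merged/2014-11-12/events-10
```

This runs outside Flume, so it works no matter how many agents wrote the
original files.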
Moreover, this is a question for the user@ mailing list, not JIRA, so I am
closing this as Invalid.
> Append HDFS files from multiple FLUME instances
> -----------------------------------------------
>
> Key: FLUME-2539
> URL: https://issues.apache.org/jira/browse/FLUME-2539
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Affects Versions: v1.4.0
> Environment: apache-flume-1.4.0-cdh4.5.0, zookeeper-3.4.5-cdh4.5.0
> Reporter: Arun Gujjar
>
> I have multiple Flume instances running in one of the test environments.
> Each instance creates its own FlumeData file. We have a file rotation policy
> set to 1 hour, so one file gets created every hour. Since we have 2
> instances, two files are created every hour. Could you please tell me if
> there is any way I can have both instances append to one file?
> Basically, I need just one file created every hour, and both instances
> should write their data into that same file.
> Please suggest how I can set up this configuration.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)