[jira] [Resolved] (FLUME-1486) Ability to configure a staging directory for data

Mike Percy (JIRA) Wed, 19 Dec 2012 19:21:15 -0800

     [ 
https://issues.apache.org/jira/browse/FLUME-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mike Percy resolved FLUME-1486.
-------------------------------

    Resolution: Won't Fix

Please reopen if FLUME-1702 is not a sufficient solution.
                
> Ability to configure a staging directory for data 
> --------------------------------------------------
>
>                 Key: FLUME-1486
>                 URL: https://issues.apache.org/jira/browse/FLUME-1486
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>            Reporter: Ricky Saltzer
>
> It would be nice to be able to configure a staging directory for files being 
> written to HDFS. Once the file stream is complete the file would then be 
> moved to the configured "final" directory. 
> One example use case where this helps is log files that are being analyzed 
> by Hive. We could have a Hive table that points to an HDFS folder containing 
> a bunch of log files. As it stands, if Flume is writing a tmp file into that 
> directory and you fire up a MapReduce job, and the file finishes being 
> written while the job is running (thus changing its filename), then the job 
> will fail because it can't find that file. 
> The current workaround is to use virtual columns to exclude tmp files, but 
> this is tedious to do for every query. It would be nice to have a directory 
> that Flume can write the files into; once it finishes streaming data to a 
> file and closes it for writing, it can move the file to the final directory. 
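
The staging behaviour requested above can be sketched roughly as follows. This is
only an illustration on a local filesystem, not Flume or HDFS code: the function
names, directory names, and file name are all hypothetical, and the local
os.replace call stands in for whatever rename operation the target filesystem
would provide.

```python
import os
import tempfile


def write_via_staging(staging_dir, final_dir, name, lines):
    """Write lines to staging_dir/name, then move the closed file into final_dir.

    Hypothetical helper for illustration; not part of Flume.
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(final_dir, exist_ok=True)
    staged = os.path.join(staging_dir, name)
    final = os.path.join(final_dir, name)
    with open(staged, "w") as f:
        for line in lines:
            f.write(line + "\n")
    # The file is closed at this point. Only now is it moved, so a reader
    # listing final_dir never sees a partially written or renamed file.
    os.replace(staged, final)
    return final


def demo():
    """Run the sketch in a throwaway directory; return (final, staging) listings."""
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "staging")
        final = os.path.join(root, "final")
        write_via_staging(staging, final, "events.log", ["a", "b"])
        return sorted(os.listdir(final)), sorted(os.listdir(staging))
```

Because the move happens only after the file is closed, a job that lists the
final directory sees either the complete file or nothing, which is the property
the Hive use case above depends on.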

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
