[ 
https://issues.apache.org/jira/browse/FLUME-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496540#comment-13496540
 ] 

Mike Percy commented on FLUME-1702:
-----------------------------------

Whoops, I didn't notice you had already filed this JIRA, Brock. Adding the
description from the duplicate ticket:

We should add the capability to the HDFS sink to specify a prefix for the .tmp 
files. I believe this needs to be configurable and disabled by default.
However, we should document that we recommend "_" or "." as a prefix for the 
temp files, because Hadoop's default FileInputFormat skips files whose names 
begin with "_" or "." (hidden files).
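
To make the effect of the prefix concrete, here is a minimal sketch in plain Java with no Hadoop or Flume dependency. It applies the same rule as the default hidden-file filter (skip names that start with "_" or ".") to a directory listing. The class name, the inFlightName helper, and the file names are hypothetical and only for illustration; they are not the sink's actual code or configuration keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HiddenPrefixSketch {

    // Same rule as the default hidden-file filter: a file is skipped
    // when its name starts with "_" or ".".
    static boolean isVisibleToJob(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    // Hypothetical helper: name of the in-flight file for a configured prefix.
    // An empty prefix models the proposed "disabled by default" behaviour.
    static String inFlightName(String prefix, String baseName) {
        return prefix + baseName + ".tmp";
    }

    // Keep only the names a job using the default filter would read.
    static List<String> visibleFiles(List<String> names) {
        return names.stream()
                .filter(HiddenPrefixSketch::isVisibleToJob)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "events.1.log",                     // finished file
                inFlightName("", "events.2.log"),   // no prefix (default)
                inFlightName("_", "events.3.log"),  // recommended prefix
                inFlightName(".", "events.4.log")); // recommended prefix
        System.out.println(visibleFiles(listing));
    }
}
```

With no prefix, the in-flight file events.2.log.tmp still shows up in the filtered listing; with "_" or "." it does not, which is the reason for the recommendation.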
                
> HDFSEventSink should write to a hidden file as opposed to a .tmp file
> ---------------------------------------------------------------------
>
>                 Key: FLUME-1702
>                 URL: https://issues.apache.org/jira/browse/FLUME-1702
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Brock Noland
>
> Currently we write to a .tmp file. The problem is that if MR jobs are being 
> run on the directory we are writing to, it is common for an MR job to list 
> the directory and pick up a .tmp file; in the meantime the .tmp file is 
> renamed, causing the job to fail when it runs.
> With Java MR you can use a PathFilter to avoid this; however, a custom 
> solution is required for Pig, Hive, etc.
> Perhaps we should write to a hidden file so that MR never tries to process 
> data in flight.
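
The list-then-rename failure described in the issue can be reproduced in miniature. The sketch below is an illustration only: it uses a local temporary directory and java.nio.file as a stand-in for HDFS, and the class, method, and file names are hypothetical. It lists the directory with the hidden-file rule applied, renames the in-flight file as the sink would on close, and then tries to read what was listed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class TmpRenameRaceSketch {

    // Returns true if the "job" fails because a file it listed was renamed
    // before it could be read. The local file system stands in for HDFS here.
    static boolean readFailsAfterRename(String inFlightName) {
        try {
            Path dir = Files.createTempDirectory("flume-sketch");
            Path inFlight = Files.write(dir.resolve(inFlightName), "event\n".getBytes());

            // Step 1: the job lists its input, skipping "_" and "." names.
            List<Path> inputs = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                for (Path p : stream) {
                    String name = p.getFileName().toString();
                    if (!name.startsWith("_") && !name.startsWith(".")) {
                        inputs.add(p);
                    }
                }
            }

            // Step 2: the sink closes the file and renames it to its final name.
            Files.move(inFlight, dir.resolve("events.log"));

            // Step 3: the job opens the paths it listed in step 1.
            try {
                for (Path p : inputs) {
                    Files.readAllBytes(p);
                }
                return false;
            } catch (NoSuchFileException e) {
                return true; // the listed path no longer exists
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsAfterRename("events.log.tmp"));
        System.out.println(readFailsAfterRename("_events.log.tmp"));
    }
}
```

With the plain .tmp name, the file is listed and the later read fails; with the "_" prefix, the file is never listed, so the rename cannot affect the job.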

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
