[jira] [Comment Edited] (FLUME-1702) HDFSEventSink should write to a hidden file as opposed to a .tmp file

darkz (JIRA) Mon, 27 Mar 2017 04:18:07 -0700

    [ 
https://issues.apache.org/jira/browse/FLUME-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943005#comment-15943005
 ]


darkz edited comment on FLUME-1702 at 3/27/17 11:16 AM:
--------------------------------------------------------

I selected the .tmp data in Hive, and it threw an error:

Failed with exception 
java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: 
org.codehaus.jackson.JsonParseException: Illegal character ((CTRL-CHAR, code 
1)): only regular white space (\r, \n, \t) is allowed between tokens
 at [Source: java.io.ByteArrayInputStream@7730ef88; line: 1, column: 2]

I think the compressed file with the '.tmp' suffix is still in use and is not 
yet a complete compressed file, so the Hadoop codec could not recognize its 
content.

After all: yes, I use the "." prefix to skip the ".tmp" file, but the Flume 
documentation does not mention it...
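
To illustrate why the "." prefix helps, here is a minimal Java sketch of the 
hidden-file convention, assuming the job's input listing skips names that start 
with "." or "_" (as Hadoop's FileInputFormat does by default). The class, 
method, and file names are hypothetical, and the check is reimplemented on 
plain strings rather than taken from Hadoop. In the Flume HDFS sink the prefix 
is set with hdfs.inUsePrefix.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: mimics the "hidden file" rule on plain file names.
// It does not call any Hadoop or Flume API.
public class HiddenFileFilterSketch {

    // A file is visible to a job only if its name does not start
    // with "." or "_" (assumed hidden-file convention).
    static boolean isVisibleToJobs(String fileName) {
        return !fileName.startsWith(".") && !fileName.startsWith("_");
    }

    // Keep only the names a job would actually read.
    static List<String> visibleFiles(List<String> names) {
        return names.stream()
                .filter(HiddenFileFilterSketch::isVisibleToJobs)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up directory listing: one closed file, one in-flight file
        // with only the ".tmp" suffix, one in-flight file with a "." prefix.
        List<String> listing = List.of(
                "events.1490612287000.gz",
                "events.1490612288000.gz.tmp",
                ".events.1490612289000.gz.tmp");
        System.out.println(visibleFiles(listing));
    }
}
```

With this listing, only the in-flight file carrying the "." prefix is dropped; 
the file with just the ".tmp" suffix is still handed to the job, which matches 
the failure described above.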



was (Author: darkz):
Yes, I use the "." prefix to skip the ".tmp" file, but the Flume documentation 
does not mention it...

> HDFSEventSink should write to a hidden file as opposed to a .tmp file
> ---------------------------------------------------------------------
>
>                 Key: FLUME-1702
>                 URL: https://issues.apache.org/jira/browse/FLUME-1702
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Brock Noland
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.4.0
>
>         Attachments: bugFLUME-1702.patch, bugFLUME-1702.patch
>
>
> Currently we write to a .tmp file. The problem is that if MR jobs are being 
> run on the directory we are writing to, then it's common for an MR job to 
> list the directory, get a .tmp file, and then in the meantime the .tmp file 
> is renamed, causing the job to fail when run.
> Using Java MR you can use a PathFilter to avoid this; however, a custom 
> solution is required for Pig, Hive, etc.
> Perhaps we should write to a hidden file so that MR never tries to process 
> data in flight.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
