[ 
https://issues.apache.org/jira/browse/FLUME-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Sun updated FLUME-2701:
----------------------------
    Attachment: webhdfs.2.patch

> Adding WebHDFS support
> ----------------------
>
>                 Key: FLUME-2701
>                 URL: https://issues.apache.org/jira/browse/FLUME-2701
>             Project: Flume
>          Issue Type: New Feature
>            Reporter: Mark Sun
>         Attachments: webhdfs.1.patch, webhdfs.2.patch
>
>
> I'm using HttpFs as an HDFS web gateway to receive data from Flume agents in 
> another datacenter over the Internet or a WAN. In my case, a gateway is 
> necessary to minimize the footprint required to access HDFS, but the WebHDFS 
> API does not support hsync(), which Flume requires.
> HDFS syncs all data and metadata to DataNode (DN) disk before a file is 
> closed, and this also holds through the WebHDFS API. It seems to me that we 
> can rely on this guarantee to keep data safe when hsync() is unavailable. 
> Personally, I guess it’s much easier than adding hsync() support to 
> WebHDFS/HttpFs.
> Basically, the idea is to keep the transaction open until a roll occurs if 
> the scheme of the HDFS URI is “webhdfs”.
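
A minimal sketch of the scheme check described above, in Java since Flume is
written in Java. The class and method names (WebHdfsSyncPolicy,
deferCommitUntilRoll) and the example host names are hypothetical and are not
taken from the attached patches. The sketch only decides whether a sink path
uses the "webhdfs" scheme; how the transaction is actually held open until
the file rolls is left to the patch itself.

```java
// Hypothetical helper for illustration only; not part of Flume or of
// webhdfs.1.patch / webhdfs.2.patch.
final class WebHdfsSyncPolicy {
    private WebHdfsSyncPolicy() {}

    /**
     * Returns true when the path's URI scheme is "webhdfs", meaning hsync()
     * is assumed to be unavailable and the caller should keep the
     * transaction open until the file is rolled (closed).
     *
     * The scheme is read with plain string handling rather than
     * java.net.URI, because a configured sink path may contain escape
     * sequences such as %Y that are not valid percent-encoding.
     */
    static boolean deferCommitUntilRoll(String hdfsPath) {
        if (hdfsPath == null) {
            return false;
        }
        int sep = hdfsPath.indexOf("://");
        if (sep <= 0) {
            // No scheme present, e.g. a bare path such as /flume/events.
            return false;
        }
        return hdfsPath.substring(0, sep).equalsIgnoreCase("webhdfs");
    }
}
```

With this helper, a path such as webhdfs://gateway.example.com:14000/flume/events
would select the close-based behaviour, while hdfs:// paths would keep using
hsync() as before.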



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
