[jira] [Commented] (FLUME-1045) Proposal to support disk based spooling

Inder SIngh (Commented) (JIRA) Fri, 23 Mar 2012 23:36:08 -0700

    [ 
https://issues.apache.org/jira/browse/FLUME-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237459#comment-13237459
 ]


Inder SIngh commented on FLUME-1045:
------------------------------------

Arvind,

Thanks, I appreciate your prompt feedback. I understand that it doesn't 
directly fit Flume's abstractions in their current state. I'm not sure whether 
thinking about it as a transient sink makes it any better. We were 
contemplating replacing an existing system with Flume. In its current state, 
there were concerns around the operability of the system when using a memory 
channel.

Building a file channel that supports transaction semantics across multiple 
source and sink threads is challenging, and I believe it is a work in progress. 
This could act as a good alternative for folks using the memory channel, and it 
avoids the ripple effect of a sink/agent being down. Worst case, we could use 
it as a stop-gap solution until a high-throughput file channel is available.

Please share your thoughts. If you agree, I believe the concerns you 
highlighted could be addressed incrementally with your input; otherwise, 
please advise on the correct route to take here.

                
> Proposal to support disk based spooling
> ---------------------------------------
>
>                 Key: FLUME-1045
>                 URL: https://issues.apache.org/jira/browse/FLUME-1045
>             Project: Flume
>          Issue Type: New Feature
>    Affects Versions: v1.0.0
>            Reporter: Inder SIngh
>            Priority: Minor
>              Labels: patch
>         Attachments: FLUME-1045-1.patch, FLUME-1045-2.patch
>
>
> 1. Problem Description 
> A sink that is unavailable at any stage in the pipeline backs off and 
> retries after a while. Channels associated with such sinks start buffering 
> data, with the caveat that if you are using a memory channel it can result in 
> a domino effect on the entire pipeline. There could be legitimate downtime, 
> e.g. the HDFS sink being down for NameNode maintenance or Hadoop upgrades. 
> 2. Why not use a durable channel (JDBC, FileChannel)?
> We want high throughput and support for sink downtime as a first-class use case.
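
To make the buffering problem in the description concrete, here is a minimal 
sketch of the general idea of spilling to disk once a bounded in-memory buffer 
fills up. It is not the attached patch and not Flume's API: the class 
SpoolingBuffer, its methods, and the one-event-per-line spool file are all 
hypothetical, and events are assumed to be strings without newlines.

```python
import collections
import os
import tempfile


class SpoolingBuffer:
    """Hypothetical bounded buffer that spills overflow events to a spool file."""

    def __init__(self, capacity, spool_path):
        self.capacity = capacity      # max events held in memory
        self.spool_path = spool_path  # file that receives overflow events
        self.memory = collections.deque()
        self.spooled = 0              # events currently waiting on disk

    def put(self, event):
        # Once anything is on disk, keep appending to disk so that
        # arrival order is preserved when the buffer is drained.
        if self.spooled == 0 and len(self.memory) < self.capacity:
            self.memory.append(event)
            return
        with open(self.spool_path, "a", encoding="utf-8") as f:
            f.write(event + "\n")
        self.spooled += 1

    def drain(self):
        """Return all buffered events in arrival order and clear the buffer."""
        events = list(self.memory)
        self.memory.clear()
        if self.spooled:
            with open(self.spool_path, encoding="utf-8") as f:
                events.extend(line.rstrip("\n") for line in f)
            os.remove(self.spool_path)
            self.spooled = 0
        return events


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "spool.log")
    buf = SpoolingBuffer(capacity=2, spool_path=path)
    for event in ["e1", "e2", "e3", "e4"]:  # sink is "down": nothing drains
        buf.put(event)
    print(len(buf.memory), buf.spooled)     # in-memory vs. on-disk counts
    print(buf.drain())                      # sink is back: replay in order
```

The sketch ignores transactions, concurrency, and crash recovery, which are 
exactly the hard parts discussed above for a real file channel.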
