[ 
https://issues.apache.org/jira/browse/CHUKWA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544779#comment-13544779
 ] 

Eric Yang commented on CHUKWA-678:
----------------------------------

The list of sending events should be a ring buffer.  If the sending action 
still fails after multiple retries, it should discard that data.  The failure 
can be logged either by recursively injecting an error entry into the ring 
buffer or by logging it locally.  For handling pipeline failure, our general 
rule of thumb is to throw an exception back to the agent when one of the 
writes fails to commit.  If one or more of the writers fail in the writing 
action, we throw an exception.  The chunk will then be retried, which means 
multiple data sinks can receive duplicated data.  We have a unique sequence 
number in our metadata, so the de-dupe can happen synchronously (in the 
writer) or asynchronously (in an out-of-band MapReduce process).  We provide 
a single commit status result from the pipeline writer, instead of sending a 
List of results back to the agent.  This makes sure the retry and de-dupe 
logic can be implemented correctly.
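
To make the proposal concrete, here is a rough sketch of how the pieces could 
fit together.  It is not the Chukwa API: Chunk, Writer, Pipeline, RingBuffer, 
DedupingWriter, FlakyWriter and drain() are all hypothetical names made up for 
illustration, and the de-dupe shown is the synchronous (in writer) variant 
only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these types are the real Chukwa classes.
public class AgentSendSketch {

    /** Hypothetical chunk: a unique sequence number plus a payload. */
    static final class Chunk {
        final long seqId;
        final String data;
        Chunk(long seqId, String data) { this.seqId = seqId; this.data = data; }
    }

    /** Hypothetical writer stage; throws when the write cannot be committed. */
    interface Writer {
        void add(Chunk chunk) throws Exception;
    }

    /** Synchronous de-dupe: a sequence number already stored is skipped. */
    static final class DedupingWriter implements Writer {
        private final Set<Long> seen = new HashSet<>();
        final List<Chunk> stored = new ArrayList<>();
        public void add(Chunk chunk) {
            if (seen.add(chunk.seqId)) stored.add(chunk);
        }
    }

    /** Fails a fixed number of times, then accepts every chunk. */
    static final class FlakyWriter implements Writer {
        private int failuresLeft;
        final List<Chunk> stored = new ArrayList<>();
        FlakyWriter(int failures) { this.failuresLeft = failures; }
        public void add(Chunk chunk) throws Exception {
            if (failuresLeft > 0) { failuresLeft--; throw new Exception("write failed"); }
            stored.add(chunk);
        }
    }

    /** One commit status for the whole pipeline: returns normally or throws. */
    static final class Pipeline implements Writer {
        private final List<Writer> stages;
        Pipeline(List<Writer> stages) { this.stages = stages; }
        public void add(Chunk chunk) throws Exception {
            for (Writer w : stages) w.add(chunk); // first failure propagates to the caller
        }
    }

    /** Fixed-capacity buffer; when full, the oldest chunk is dropped. */
    static final class RingBuffer {
        private final ArrayDeque<Chunk> queue = new ArrayDeque<>();
        private final int capacity;
        RingBuffer(int capacity) { this.capacity = capacity; }
        void offer(Chunk c) {
            if (queue.size() == capacity) queue.pollFirst();
            queue.addLast(c);
        }
        Chunk poll() { return queue.pollFirst(); }
    }

    /** Sends every buffered chunk; discards and logs locally after maxAttempts failures. */
    static int drain(RingBuffer buffer, Writer pipeline, int maxAttempts, List<String> errorLog) {
        int committed = 0;
        Chunk c;
        while ((c = buffer.poll()) != null) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                try {
                    pipeline.add(c);
                    ok = true;
                } catch (Exception e) {
                    // Any stage failing means the whole chunk is retried.
                }
            }
            if (ok) committed++;
            else errorLog.add("discarded chunk seq=" + c.seqId);
        }
        return committed;
    }

    /** Returns {chunks committed, chunks stored by the first stage, chunks discarded}. */
    static int[] runDemo() {
        DedupingWriter first = new DedupingWriter();
        FlakyWriter second = new FlakyWriter(1); // fails once, then succeeds
        Writer pipeline = new Pipeline(Arrays.<Writer>asList(first, second));
        RingBuffer buffer = new RingBuffer(8);
        buffer.offer(new Chunk(1L, "a"));
        buffer.offer(new Chunk(2L, "b"));
        List<String> errors = new ArrayList<>();
        int committed = drain(buffer, pipeline, 3, errors);
        return new int[] { committed, first.stored.size(), errors.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runDemo()));
    }
}
```

In runDemo() the first chunk reaches the first stage twice, because the second 
stage fails on the first attempt and the whole chunk is retried.  The sequence 
number is what lets the first stage drop the duplicate, and the agent only 
ever sees one outcome per chunk: the call returns or it throws.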
                
> Make use of ChukwaWriter in agent
> ---------------------------------
>
>                 Key: CHUKWA-678
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-678
>             Project: Chukwa
>          Issue Type: Sub-task
>          Components: Data Collection
>         Environment: MacOSX, Java 6
>            Reporter: shreyas subramanya
>
> The Chukwa agent sends out data chunks to various destinations through the 
> combination of the Connector and ChukwaSender interfaces. For sending chunks 
> to the collector, we have an HTTP implementation of these interfaces. The 
> collector writes out the received chunks to various destinations through 
> classes implementing the ChukwaWriter interface. Optionally, multiple 
> destinations can be chosen by specifying PipelineStageWriter.
> The proposal is to:
> 1. Use ChukwaWriter to send out data chunks to multiple destinations from the 
> agent. Further, PipelineStageWriter can be made the default, with the 
> pipeline configuration specified in the agent config file.
> 2. Implement (or modify) Pipelineable writers for HBase, HTTP, HDFS and 
> WebHDFS.
> 3. Do away with the Connector interface and have a single non-configurable 
> connector object as part of the agent. This class initializes the configured 
> writer, waits for data chunks and passes the chunks to Writer.add()/send(). 
> The connection protocol for each destination is handled by the init() of the 
> individual writers.
> Considerations:
> 1. In case of Pipelineable writers, we need a way to merge the results of 
> each pipeline stage before the agent commits the chunk.
> 2. Handling pipeline failure

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
