[
https://issues.apache.org/jira/browse/CHUKWA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ari Rabkin updated CHUKWA-369:
------------------------------
Status: Patch Available (was: In Progress)
I've now tested this fairly extensively, at data rates up to 200 MB/sec, with
up to 256 agents and 20 collectors. It's looking very good and I want to commit it.
- Substantial tests are included.
- The asynchronous ack mechanism is controlled by a conf option and defaults to
off. So if you're hesitant about it, you don't need to use it, and everything
should remain the way it was.
- Even if it's turned on, collectors can still respond with an immediate ack
if they happen to write synchronously (e.g., if the collector is writing to
HBase or the local filesystem).
- I tried pretty hard to code this in such a way that we can easily evolve and
adapt the code to support other reliability strategies in the future.
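
To illustrate the two points above about the ack option, here is a minimal
sketch (not code from the patch) of how a collector might choose between an
immediate and a delayed ack. The property name `example.asyncAcks.enabled`,
the `AckKind` enum, and the `chooseAck` method are all hypothetical names made
up for this example; the real option name and classes are in the attached
patch. Treating the pre-patch behaviour as an immediate ack is also an
assumption.

```java
import java.util.Properties;

class AckSketch {
    // Hypothetical option name; the real Chukwa key is not given in this message.
    static final String ASYNC_ACK_KEY = "example.asyncAcks.enabled";

    enum AckKind { IMMEDIATE, DELAYED }

    /**
     * Decide how a collector acknowledges a chunk.
     * The option defaults to off, and a synchronous writer can always
     * acknowledge immediately even when the option is on.
     */
    static AckKind chooseAck(Properties conf, boolean writerIsSynchronous) {
        boolean asyncEnabled =
            Boolean.parseBoolean(conf.getProperty(ASYNC_ACK_KEY, "false"));
        if (!asyncEnabled || writerIsSynchronous) {
            return AckKind.IMMEDIATE;
        }
        return AckKind.DELAYED;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Option unset: behaves as if the feature did not exist.
        System.out.println(chooseAck(conf, false));

        conf.setProperty(ASYNC_ACK_KEY, "true");
        // Option on, asynchronous writer: ack is deferred.
        System.out.println(chooseAck(conf, false));
        // Option on, synchronous writer: immediate ack is still allowed.
        System.out.println(chooseAck(conf, true));
    }
}
```

Keeping the decision in one small method is one way to leave room for other
reliability strategies later, since a new strategy would only add a case here.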
> proposed reliability mechanism
> ------------------------------
>
> Key: CHUKWA-369
> URL: https://issues.apache.org/jira/browse/CHUKWA-369
> Project: Hadoop Chukwa
> Issue Type: New Feature
> Components: data collection
> Affects Versions: 0.3.0
> Reporter: Ari Rabkin
> Assignee: Ari Rabkin
> Fix For: 0.3.0
>
> Attachments: CHUKWA-369.patch, delayedAcks.patch
>
>
> We like to say that Chukwa is a system for reliable log collection. It isn't
> quite, since we don't handle collector crashes. Here's a proposed
> reliability mechanism.