[ https://issues.apache.org/jira/browse/CHUKWA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750672#action_12750672 ]
Ari Rabkin commented on CHUKWA-369:
-----------------------------------
- You do not need to get an acknowledgment from the same collector you sent to.
The "ack" is really just a confirmation that the file in question rotated OK
and was of sufficient length when it rotated.
- Collectors don't need to do anything special on rotation.
- There's no long-running TCP connection between agent and collector. But my
current implementation does assume that an agent will continue to use a single
collector until it gets an IOException. For now, I'm not using timeouts;
instead, the agent relies on getting an IOException from a down collector.
This is simpler, but would require modification if we started doing dynamic
load-balancing across collectors.
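
To make the last point concrete, here is a minimal sketch of the "stick to one
collector until an IOException" behavior. It is not the code in
delayedAcks.patch: the class StickyCollectorSender, the Collector interface,
and the round-robin failover order are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch; these names are illustrative, not Chukwa's classes. */
public class StickyCollectorSender {

    /** Hypothetical stand-in for one post of data to a collector. */
    public interface Collector {
        void post(String chunk) throws IOException;
    }

    private final List<Collector> collectors;
    private int current = 0; // index of the collector the agent is stuck to

    public StickyCollectorSender(List<Collector> collectors) {
        if (collectors.isEmpty()) {
            throw new IllegalArgumentException("need at least one collector");
        }
        this.collectors = collectors;
    }

    public int currentIndex() {
        return current;
    }

    /**
     * Sends a chunk to the current collector. No timeout is used; the only
     * failure signal is an IOException, which moves the agent on to the next
     * collector (assumed round-robin order). Each collector is tried at most
     * once per call; if all of them fail, the last IOException is rethrown.
     */
    public void send(String chunk) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < collectors.size(); attempt++) {
            try {
                collectors.get(current).post(chunk);
                return; // success: keep using this collector next time
            } catch (IOException e) {
                last = e;
                current = (current + 1) % collectors.size();
            }
        }
        throw last;
    }
}
```

In this sketch a collector that hangs without throwing would block the agent
indefinitely, which is the cost of not using timeouts. Dynamic load-balancing
would also mean replacing the single "current" index with a per-send choice.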
> proposed reliability mechanism
> ------------------------------
>
> Key: CHUKWA-369
> URL: https://issues.apache.org/jira/browse/CHUKWA-369
> Project: Hadoop Chukwa
> Issue Type: New Feature
> Components: data collection
> Affects Versions: 0.3.0
> Reporter: Ari Rabkin
> Assignee: Ari Rabkin
> Fix For: 0.3.0
>
> Attachments: delayedAcks.patch
>
>
> We like to say that Chukwa is a system for reliable log collection. It isn't,
> quite, since we don't handle collector crashes. Here's a proposed
> reliability mechanism.