[
https://issues.apache.org/jira/browse/FLUME-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186253#comment-13186253
]
[email protected] commented on FLUME-927:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3487/#review4385
-----------------------------------------------------------
Ship it!
Nice patch, Prasad. LGTM.
- jmhsieh
On 2012-01-13 18:10:49, Prasad Mujumdar wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/3487/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2012-01-13 18:10:49)
bq.
bq.
bq. Review request for Mingjie Lai and jmhsieh.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. When the WAL decorator starts its subsink, it waits one second for the
subsink to become active. If the subsink doesn't start in that interval, the
decorator goes ahead and marks it for stop, leaving the agent idle.
bq. The agent sink contains a retry sink that keeps retrying the open until
it succeeds. The WAL decorator forcing it to close after one second makes this
retry mechanism useless and forces the user to restart the agent.
bq. The patch waits for the subsink to become active; only exceptions in the
subsink abort the wait.
bq.
bq.
bq. This addresses bug FLUME-927.
bq. https://issues.apache.org/jira/browse/FLUME-927
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. flume-core/src/main/java/com/cloudera/flume/agent/durability/NaiveFileWALDeco.java 3a88ab8
bq. flume-core/src/main/java/com/cloudera/flume/handlers/debug/DelayDecorator.java 15a9066
bq. flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALDeco.java 8dd45fa
bq.
bq. Diff: https://reviews.apache.org/r/3487/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Added a new test case. Will run the full regression test suite.
bq.
bq.
bq. Thanks,
bq.
bq. Prasad
bq.
bq.
> A Flume agent started before collectors in E2E mode could fail to connect to
> the collector
> ------------------------------------------------------------------------------------------
>
> Key: FLUME-927
> URL: https://issues.apache.org/jira/browse/FLUME-927
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4, v0.9.5
> Reporter: Prasad Mujumdar
> Assignee: Prasad Mujumdar
> Fix For: v0.9.5
>
>
> The write-ahead log (WAL) mechanism expects the agent sink to be active within
> 1 second. After that, it assumes that the agent couldn't connect to the
> collector and shuts it down. The AgentSink has a retry mechanism that handles
> network problems, an unavailable collector, etc. for a configurable amount of
> time. The hard-coded 1-second timeout in the WAL decorator invalidates this
> retry mechanism.
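
For illustration only, the behaviour described above (wait for the subsink to become active with no fixed deadline, and abort only when the subsink's open throws) might be sketched as follows. This is not the code from the patch: the class SubsinkWaitSketch, the Subsink interface and the startAndAwait method are hypothetical names invented for this sketch, and it uses only java.util.concurrent rather than any Flume classes.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class SubsinkWaitSketch {

    /** Hypothetical stand-in for a subsink whose open() may retry for a long time. */
    public interface Subsink {
        void open() throws Exception;
    }

    /**
     * Opens the subsink on a background thread and blocks until open() returns.
     * There is no fixed deadline: only an exception thrown by open() (or an
     * interrupt of the calling thread) ends the wait early.
     */
    public static void startAndAwait(Subsink sink) throws IOException, InterruptedException {
        CountDownLatch finished = new CountDownLatch(1);
        AtomicReference<Exception> failure = new AtomicReference<>();

        Thread opener = new Thread(() -> {
            try {
                sink.open();              // may take far longer than one second
            } catch (Exception e) {
                failure.set(e);           // remember why the open failed
            } finally {
                finished.countDown();     // wake the waiter on success or failure
            }
        }, "subsink-opener");
        opener.setDaemon(true);
        opener.start();

        finished.await();                 // unbounded wait instead of a 1-second timeout

        Exception e = failure.get();
        if (e != null) {
            throw new IOException("subsink failed to open", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // A subsink that needs 1.5 seconds to open is still treated as started.
        startAndAwait(() -> Thread.sleep(1500));
        System.out.println("slow subsink became active");

        // A subsink whose open throws aborts the wait.
        try {
            startAndAwait(() -> { throw new IOException("collector unreachable"); });
        } catch (IOException expected) {
            System.out.println("wait aborted: " + expected.getCause().getMessage());
        }
    }
}
```

The design point is that the waiter no longer second-guesses the subsink's own retry policy: how long to keep trying is decided in one place, inside open().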