[
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191998#comment-14191998
]
ASF GitHub Bot commented on STORM-329:
--------------------------------------
Github user clockfly commented on the pull request:
https://github.com/apache/storm/pull/268#issuecomment-61286844
About performance test:
===========================
I tested the performance of the new patch.
It shows no significant difference from storm-0.9.2.
About the STORM-404 chained crash issue (one worker causes another worker to
crash)
========================
With this patch, the reconnection is successfully aborted and a new
connection is established.
```
2014-10-31T23:00:11.738+0800 b.s.m.n.Client [INFO] Reconnect started for
Netty-Client-IDHV22-04/192.168.1.54:6703... [30]
2014-10-31T23:00:12.738+0800 b.s.m.n.Client [INFO] Closing Netty Client
Netty-Client-IDHV22-04/192.168.1.54:6703
2014-10-31T23:00:12.739+0800 b.s.m.n.Client [INFO] Waiting for pending
batchs to be sent with Netty-Client-IDHV22-04/192.168.1.54:6703..., timeout:
600000ms, pendings: 0
2014-10-31T23:00:32.754+0800 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (30). Pinning to 29
2014-10-31T23:00:32.754+0800 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [30]
2014-10-31T23:00:32.754+0800 b.s.m.n.Client [INFO] New Netty Client,
connect to IDHV22-01, 6702, config: , buffer_size: 5242880
2014-10-31T23:00:32.754+0800 b.s.m.n.Client [INFO] Reconnect started for
Netty-Client-IDHV22-01/192.168.1.51:6702... [0]
2014-10-31T23:00:32.755+0800 b.s.m.n.Client [INFO] connection established
to a remote host Netty-Client-IDHV22-01/192.168.1.51:6702, [id: 0x4f7eb44b,
/192.168.1.51:56592 => IDHV22-01/192.168.1.51:6702]
```
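The `baseSleepTimeMs`, `maxSleepTimeMs`, and `maxRetries` values in the log above describe a bounded exponential backoff between reconnect attempts. As a rough illustration only, assuming plain doubling with a cap: `BoundedBackoff` and `sleepMs` are hypothetical names, and Storm's `StormBoundedExponentialBackoffRetry` may compute its delays differently.

```java
// Hypothetical helper; not Storm's retry implementation.
final class BoundedBackoff {
    // Delay before the given retry: doubles from baseMs, never exceeds maxMs.
    static long sleepMs(long baseMs, long maxMs, int retry) {
        long delay = baseMs;
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 0; i < retry && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }
}
```

With the logged values (base 100 ms, cap 1000 ms), this sketch would reach the cap by the fifth attempt and stay there for the remaining retries.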
> Add Option to Config Message handling strategy when connection timeout
> ----------------------------------------------------------------------
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
> Issue Type: Improvement
> Affects Versions: 0.9.2-incubating
> Reporter: Sean Zhong
> Priority: Minor
> Labels: Netty
> Fix For: 0.9.2-incubating
>
> Attachments: storm-329.patch
>
>
> This is to address a [concern brought
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986]
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are
> blocking. My biggest concern around the blocking is in the case of a worker
> crashing. If a single worker crashes this can block the entire topology from
> executing until that worker comes back up. In some cases I can see that being
> something that you would want. In other cases I can see speed being the
> primary concern and some users would like to get partial data fast, rather
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max
> limit to the buffering that is allowed, before we block, or throw data away
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were
> supposed to be delivered to that worker?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is resumed?
> 3. Should we configure a buffer limit, try to buffer the messages first, and
> block once the limit is reached?
> 4. Should we neither block nor buffer too much, but instead drop the
> messages and rely on the built-in Storm failover mechanism?
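Options 3 and 4 above can be pictured as a bounded buffer with a configurable overflow policy. The following is a minimal sketch, not code from the patch: `PendingMessageBuffer`, `OverflowPolicy`, and `maxBlockMs` are hypothetical names, and the bounded wait under `BLOCK` is an assumption made so the example cannot block forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical policy names; these are not Storm configuration values.
enum OverflowPolicy { BLOCK, DROP }

// Hypothetical buffer for messages addressed to a worker whose connection is down.
final class PendingMessageBuffer {
    private final BlockingQueue<byte[]> pending;
    private final OverflowPolicy policy;
    private final long maxBlockMs;
    private final AtomicLong dropped = new AtomicLong();

    PendingMessageBuffer(int capacity, OverflowPolicy policy, long maxBlockMs) {
        this.pending = new ArrayBlockingQueue<>(capacity);
        this.policy = policy;
        this.maxBlockMs = maxBlockMs;
    }

    // Returns true if the message was buffered, false if it was discarded.
    boolean offer(byte[] message) {
        // Below the limit: buffer without blocking.
        if (pending.offer(message)) {
            return true;
        }
        // Limit reached: under BLOCK, wait up to maxBlockMs for space.
        if (policy == OverflowPolicy.BLOCK) {
            try {
                if (pending.offer(message, maxBlockMs, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        }
        // DROP policy, or the bounded wait expired: discard and count it.
        dropped.incrementAndGet();
        return false;
    }

    // Called by the sender once the connection is back; returns null when empty.
    byte[] poll() { return pending.poll(); }

    int size() { return pending.size(); }

    long droppedCount() { return dropped.get(); }
}
```

In this sketch a discarded message is only counted; recovering it would be left to the built-in Storm failover mechanism mentioned in option 4. Option 1 would correspond to an unbounded queue, and option 2 to a capacity so small that senders wait almost immediately.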