[
https://issues.apache.org/jira/browse/STORM-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573171#comment-14573171
]
ASF GitHub Bot commented on STORM-763:
--------------------------------------
Github user eshioji commented on the pull request:
https://github.com/apache/storm/pull/568#issuecomment-108978724
@revans2 I was able to bring the performance back to where it was, if not a
bit higher:
| | 0.9.5-SNAPSHOT | STORM-763 |
|-----|----------------|-----------|
| 10 | 23.8 | 23.9 |
| 1K | 78.6 | 82.5 |
| 2K | 75.6 | 74.5 |
| 10K | 71.1 | 72.9 |
The change brings back batching between calls, but does it a bit
differently: instead of using a background process, it uses Netty's Channel
interest notification callback. In theory this has an added benefit, because
pending messages are flushed as soon as space becomes available, rather than
potentially waiting up to FLUSH_CHECK_INTERVAL_MS. I also brought back
@miguno's graceful shutdown. Let me know if this looks good.
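
For readers following the thread, here is a minimal sketch of the callback-driven flush described above. It is an illustration only, not the code from the pull request: `FakeChannel` and `BatchingClient` are hypothetical names, and the channel is a plain in-memory stand-in so the example runs without Netty. In Netty 3 (the version Storm 0.9.x builds on) the corresponding hook would presumably be `channelInterestChanged` on a `SimpleChannelUpstreamHandler`, checked together with `Channel.isWritable()`; that mapping is an assumption, not something stated in the comment.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for a network channel with a bounded write buffer. */
class FakeChannel {
    private final int capacity;   // max messages the write buffer can hold
    private int buffered = 0;     // messages written but not yet drained
    final List<String> sent = new ArrayList<>();

    FakeChannel(int capacity) { this.capacity = capacity; }

    boolean isWritable() { return buffered < capacity; }

    void write(String msg) { buffered++; sent.add(msg); }

    /**
     * Simulates the socket draining up to n buffered messages.
     * Returns true if the channel went from unwritable to writable,
     * which is when a real event loop would fire the callback.
     */
    boolean drain(int n) {
        boolean wasWritable = isWritable();
        buffered = Math.max(0, buffered - n);
        return !wasWritable && isWritable();
    }
}

/** Hypothetical client that queues messages while the channel is full. */
class BatchingClient {
    private final FakeChannel channel;
    private final Queue<String> pending = new ArrayDeque<>();

    BatchingClient(FakeChannel channel) { this.channel = channel; }

    /** Queues the message, then writes as much as the channel accepts. */
    synchronized void send(String msg) {
        pending.add(msg);
        flushPending();
    }

    /** Invoked on a writability change instead of from a polling timer. */
    synchronized void onWritabilityChanged() {
        flushPending();
    }

    synchronized int pendingCount() { return pending.size(); }

    // Writes queued messages in order until the channel fills up again.
    private void flushPending() {
        while (channel.isWritable() && !pending.isEmpty()) {
            channel.write(pending.poll());
        }
    }
}
```

The point of the design is that no timer is involved: queued messages leave as soon as the channel reports free space, so the worst-case delay is no longer tied to a polling interval such as FLUSH_CHECK_INTERVAL_MS.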
One question: I noticed that you plan to make 0.9.5 the last 0.9.x release.
This PR is against 0.9.x; should I open it against 0.10.x (or master)
instead?
PS
@miguno No worries at all, I'm glad I can give back, however small it may
be :)
> nimbus reassigned worker A to another machine, but other worker's netty
> client can't connect to the new worker A
> -----------------------------------------------------------------------------------------------------------------
>
> Key: STORM-763
> URL: https://issues.apache.org/jira/browse/STORM-763
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 0.9.4
> Environment: Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux
> java version "1.7.0_03"
> storm 0.9.4
> cluster 50+ machines
> Reporter: 3in
>
> My topology has 50+ workers; it can't emit 50000 thousand tuples in ten
> minutes.
> Sometimes one worker is reassigned to another machine by nimbus because of
> a task heartbeat timeout:
> 2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor
> my_topology-22-1428243953:[440 440] not alive
> 2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor
> my_topology-22-1428243953:[90 90] not alive
> 2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor
> my_topology-22-1428243953:[510 510] not alive
> 2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor
> my_topology-22-1428243953:[160 160] not alive
> I can see in the Storm UI that the reassigned worker has already started,
> but the other workers write error logs all the time:
> 2015-04-08T16:56:43.091+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> 2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:45.715+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> 2015-04-08T16:56:45.716+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:46.277+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> 2015-04-08T16:56:46.278+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> 2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> 2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] dropping 1 message(s)
> destined for Netty-Client-host_19/192.168.163.19:5700
> 2015-04-08T16:56:46.835+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-host_19/192.168.163.19:5700 is unavailable
> The worker on the destination host is already started, and I can telnet to
> 192.168.163.19 5700.
> So why can't the netty client connect to that ip:port?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)