[
https://issues.apache.org/jira/browse/STORM-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhuo Liu updated STORM-1657:
----------------------------
Description:
There are still lots of complaints from users and industry saying that Storm is
hard to debug, mainly because Storm runs executors from different components
(spout, bolt) in a single JVM process.
The original motivation was to keep inter-component communication within a
single process, rather than over a network socket, as much as possible.
For a small topology this makes sense, since all executors can be placed
within one or a few workers. But for a larger topology with more executors in
each component, executors are distributed round-robin across different workers,
so the saving in network transfer is marginal. The trade-off is that we
lose a lot of debuggability, which makes troubleshooting difficult and
keeps some potential users from adopting Storm.
With RAS (STORM-894), it should be convenient for us to add a new strategy
that restricts each worker to running executors (>= 1) from a single component.
This will add a few workers but will not waste any memory or CPU resources.
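To make the intended placement concrete, here is a minimal sketch of the
grouping rule only. It does not use the RAS scheduler API: the class
SingleComponentWorkerSketch, the Worker type, and the maxExecutorsPerWorker
cap are all hypothetical names introduced for illustration, and real
scheduling would also have to account for memory and CPU requests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not part of Storm or RAS. All names here are assumptions.
public class SingleComponentWorkerSketch {

    /** A hypothetical worker: one component id plus the executor indices it hosts. */
    public static final class Worker {
        public final String componentId;
        public final List<Integer> executors = new ArrayList<>();

        Worker(String componentId) {
            this.componentId = componentId;
        }
    }

    /**
     * Packs executors into workers so that no worker mixes components.
     *
     * @param parallelism           component id -> number of executors
     * @param maxExecutorsPerWorker assumed per-worker cap (stands in for resource limits)
     */
    public static List<Worker> assign(Map<String, Integer> parallelism,
                                      int maxExecutorsPerWorker) {
        if (maxExecutorsPerWorker < 1) {
            throw new IllegalArgumentException("maxExecutorsPerWorker must be >= 1");
        }
        List<Worker> workers = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : parallelism.entrySet()) {
            Worker current = null; // never carried over to the next component
            for (int i = 0; i < entry.getValue(); i++) {
                if (current == null || current.executors.size() == maxExecutorsPerWorker) {
                    current = new Worker(entry.getKey());
                    workers.add(current);
                }
                current.executors.add(i);
            }
        }
        return workers;
    }

    public static void main(String[] args) {
        Map<String, Integer> parallelism = new LinkedHashMap<>();
        parallelism.put("spout", 2);
        parallelism.put("bolt", 5);
        for (Worker w : assign(parallelism, 3)) {
            System.out.println(w.componentId + " -> " + w.executors);
        }
    }
}
```

Because a new worker is opened whenever the component changes, a worker may
be left partly empty. That is the "few extra workers" cost mentioned above,
paid in exchange for logs and heap dumps that belong to a single component.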
With this strategy, applied on a per-topology basis, it will be easier for
users to find which component in a topology is having problems by checking its
worker log, GC log and heap dump, without being distracted by noise from other
components' instances.
was:
There are still lots of complaints from users and industry saying that Storm is
hard to debug, mainly because Storm put executors from different components
(spout, bolt) to run in a single JVM process.
The original motivation for doing that is to allow inter-component
communication to happen within a single process rather than over network socket
as much as possible.
For small topology, this seems make sense, where we can place executors even
within one or several worker. But for larger topology with more executors in
each component, the saving in network transfer by using round-robin executor
distribution to different workers is marginal. And the trade-off is that we
will lose lots of debuggability, which makes problem-cracking difficult and
prevents some potential users to start using Storm.
With RAS (STORM-894), it should be convenient for us to add a new strategy to
allow one worker to only run executors (>= 1) from a single component, which
will increase a few workers but not waste any memory /CPU resources.
By using this strategy (per-topology basis), it will be easier for users to
find which component in a topology is having problems by checking their worker
log, gc log and heap dump, without being bothered by noise made by other
components' instances.
> A new strategy in RAS that allows one worker to run executors from one
> component
> --------------------------------------------------------------------------------
>
> Key: STORM-1657
> URL: https://issues.apache.org/jira/browse/STORM-1657
> Project: Apache Storm
> Issue Type: Story
> Reporter: Zhuo Liu