James Xu created STORM-150:
------------------------------
Summary: Replace jar distribution strategy with BitTorrent
Key: STORM-150
URL: https://issues.apache.org/jira/browse/STORM-150
Project: Apache Storm (Incubating)
Issue Type: Improvement
Reporter: James Xu
Priority: Minor
https://github.com/nathanmarz/storm/issues/435
Consider using http://turn.github.com/ttorrent/
----------
ptgoetz: I've been looking into implementing this, but have a design question
that boils down to this:
Should the client (storm jar command) be the initial seeder, or should we wait
and let nimbus do the seeding?
The benefit of doing the seeding client-side is that we would only have to
transfer a small .torrent through the nimbus thrift API. But I can imagine
situations where the network environment would prevent BitTorrent clients from
connecting back to the machine that's submitting the topology. That would
leave the submission stalled indefinitely, since none of the cluster nodes
would be able to connect to the seeder.
The alternative would be to use the current technique of uploading the jar to
nimbus, and have nimbus generate and distribute the .torrent file, and provide
the initial seed. If the cluster is properly configured, we're pretty much
guaranteed connectivity between nimbus and supervisor nodes.
I'm leaning toward the latter approach, but would be interested in others'
opinions.
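
For context on why the client-side option would only need a small upload: a
.torrent carries one 20-byte SHA-1 digest per fixed-size piece of the payload,
so its size tracks the piece count rather than the jar size. A minimal Python
sketch, assuming a 256 KiB piece length and a zero-filled stand-in for the jar
(both are illustrative choices, not taken from Storm or ttorrent):

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # assumed piece size; real torrents choose their own


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> bytes:
    """Concatenate the SHA-1 digest of each fixed-size piece of `data`."""
    digests = []
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


# Hypothetical 10 MiB jar, zero-filled purely for illustration.
jar = bytes(10 * 1024 * 1024)
pieces = piece_hashes(jar)
print(f"jar: {len(jar)} bytes, piece hashes: {len(pieces)} bytes")
```

The same hashing cost applies whichever side builds the .torrent; the two
options differ only in who has to stay reachable as the first seed.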
----------
nathanmarz: @ptgoetz I think Nimbus should do the seeding. That ensures that
when the client finishes submitting, it can disconnect/go away without having
to worry about making the topology unlaunchable.
----------
jasonjckn: @nathanmarz How does this solve Nimbus's dependency on reliable
local disk state (as you talked about in person)?
What happens when zookeeper is offline for 1 hour? All the workers will die,
and nimbus will be continually restarting. The onus is still on nimbus to store
topology jars on local disk, so that when the workers and supervisors reboot
it can seed all of this again.
You -can- solve the local disk persistence problem with replicated state to the
non-elected nimbuses, but that's orthogonal to a distribution strategy. Yes
there is some replication going on in bittorrent, but it's not really a
protocol that delivers reliable persistence of state.
I think it's still a good feature if it gives us performant topology submit
times even with 500 workers, which currently takes us 3 minutes.
Particularly with the worker heartbeat start-up timeout of 120s, you want to be
able to start 500 workers, or even 1500 workers, within 120s. The current
distribution strategy does not scale that way.
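
A rough back-of-envelope model of the scaling concern above, in Python. It is
an idealized sketch, not a measurement: the jar size and bandwidth figures are
assumptions, the single-seed case assumes one uploader's bandwidth is shared
by every downloader, and the swarm case assumes the set of holders doubles
each round (it ignores piece-level pipelining and tracker overhead).

```python
import math


def single_seed_seconds(nodes: int, jar_mb: float, seed_mb_per_s: float) -> float:
    """Every node pulls the full jar from one uploader that shares its bandwidth."""
    return nodes * jar_mb / seed_mb_per_s


def swarm_seconds(nodes: int, jar_mb: float, node_mb_per_s: float) -> float:
    """Idealized swarm: each round, every holder uploads the jar to one new node."""
    rounds = math.ceil(math.log2(nodes + 1))  # holders double until all nodes have it
    return rounds * jar_mb / node_mb_per_s


# Assumed figures: 40 MB jar, 100 MB/s per uploader.
for n in (500, 1500):
    print(n, single_seed_seconds(n, 40, 100), swarm_seconds(n, 40, 100))
```

Under these assumptions the single-seed time grows linearly with the node
count, while the swarm time grows with its logarithm, which is the property
that matters against a fixed 120s start-up timeout.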
----------
nathanmarz: @jasonjckn On top of the bittorrent stuff we can ensure that a
topology is considered submitted only when the artifacts exist on at least N
nodes. Nimbus would only be the initial seed for topology files. Also, it
wouldn't have to be only Nimbus that acts as a seed; that work could be shared
by the supervisors. That's less relevant in the storm-mesos world, but you
could still fairly easily run multiple Nimbuses to get replication.
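
The "submitted only when the artifacts exist on at least N nodes" rule could
be sketched as below. This is a hypothetical Python illustration: the
SubmissionTracker class, its method names, and the node ids are invented for
the example and are not part of Storm.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionTracker:
    """Tracks which nodes hold a complete copy of a topology's artifacts."""
    min_replicas: int                      # the "N" in "at least N nodes"
    holders: set = field(default_factory=set)

    def report_complete(self, node_id: str) -> None:
        # Called once a node has finished downloading and verifying the artifacts.
        self.holders.add(node_id)

    def is_submitted(self) -> bool:
        # The topology counts as submitted only once N distinct nodes hold it.
        return len(self.holders) >= self.min_replicas


tracker = SubmissionTracker(min_replicas=3)
for node in ("nimbus-1", "supervisor-a", "supervisor-a", "supervisor-b"):
    tracker.report_complete(node)
print(tracker.is_submitted())
```

Using a set means a node that reports twice is only counted once, so the
threshold reflects distinct copies rather than distinct reports.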
----------
jasonjckn: This PR might be aided by "topology packages" #557, as it bundles
all the state that needs to be replicated.