Looking at tag-aware scheduling, it sounds like it could solve our problems.
Thank you for the help.
From: firstname.lastname@example.org At: 02/08/18 11:42:15 To: email@example.com
Cc: Mitchell Rathbun (BLOOMBERG/ 731 LEX )
Subject: Re: Limiting Available Machines for Topology
Placement is all done by the scheduler. The default schedulers do not have this
capability; they just try to spread things round robin around the cluster.
This works well for small clusters that are built for a very specific purpose.
Not so well for large clusters. I have not used tag aware scheduling but it
looks like it works and there are people using it in production.
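To make the difference concrete, here is a rough sketch of round-robin placement versus placement restricted by tags. It is a toy model, not Storm's scheduler API: the function names, the node names A/B/C, and the "site-a" tag are all made up for illustration.

```python
from itertools import cycle


def round_robin(executors, nodes):
    """Spread executors across all given nodes in turn, with no placement preference."""
    node_cycle = cycle(nodes)
    return {ex: next(node_cycle) for ex in executors}


def tag_aware(executors, node_tags, required_tag):
    """Round robin, but only over nodes that carry the required tag (hypothetical model)."""
    eligible = [node for node, tags in node_tags.items() if required_tag in tags]
    if not eligible:
        return {}  # no eligible node, so nothing is placed
    return round_robin(executors, eligible)


# Hypothetical cluster: only machine A carries the tag the topology asks for.
node_tags = {"A": {"site-a"}, "B": set(), "C": set()}
spread = round_robin(["e1", "e2", "e3", "e4"], list(node_tags))
pinned = tag_aware(["e1", "e2"], node_tags, "site-a")
```

With the default-style spread, e1 through e4 land on A, B, C, A. With the tag filter, both executors land on A.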
Resource Aware Scheduling is not going to work to force your topology to be run
on a very specific node. It has the option of accepting a hint for which
node(s) you want and which you don't. But those are just hints. We also
recently added in generic resources so you could define your own resources and
use that like tags. Generic resources is still a work in progress. We are
starting to roll it out to production, but we have not even updated the UI to
show the generic resources, so there may still be some bugs in it and it needs
The big thing to be careful of with what you are trying to do is fault
tolerance and failure domains. If there is really one and only one node that
your topology will work on, then if that node goes down you are done for. The
same applies to one and only one rack.
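The failure-domain concern can be sketched the same way. Again this is only an illustrative model under made-up names (eligible_nodes, the "site-a"/"site-b" tags), not anything Storm provides: a topology pinned to a single tagged node has no place to go once that node is down.

```python
def eligible_nodes(node_tags, alive, required_tag):
    """Return the live nodes that carry the required tag, sorted for stable output."""
    return sorted(
        node for node, tags in node_tags.items()
        if required_tag in tags and node in alive
    )


# Hypothetical cluster: A is the only node tagged "site-a".
node_tags = {"A": {"site-a"}, "B": {"site-b"}, "C": {"site-b"}}

all_up = eligible_nodes(node_tags, {"A", "B", "C"}, "site-a")  # only A qualifies
a_down = eligible_nodes(node_tags, {"B", "C"}, "site-a")       # nothing qualifies
```

An empty result means the topology simply stays unscheduled until A returns. That matches the "do not re-run elsewhere" requirement, but it also means there is no fault tolerance for that topology.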
On Wed, Feb 7, 2018 at 5:55 PM Arnaud BOS <arnaud.t...@gmail.com> wrote:
I **guess** you could use the “tag-aware scheduling” described here:
Or maybe bend the "resource aware scheduler" presented here:
to do what you want, but that sounds hackish.
Anyways, I've never used any of them so I'm just sending the links for further
Hope this helps.
On Thu, Feb 8, 2018, 12:12 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX)
Given a multi-node Storm cluster, is it possible to ensure that a topology
submitted on a specific machine runs only on that machine? More specifically,
given a cluster of machines A, B, and C, if a topology is submitted from
machine A, is there a way to guarantee that:
-The topology runs on machine A.
-If machine A crashes, the topology is not re-run on another machine.
I am guessing there isn't and that the answer to this is to run a leader nimbus
per machine (nimbus.seeds: ["localhost"]), but I wanted to see if there was a
way to do this that I am missing.