I'm looking at tag-aware scheduling, and it sounds like it could solve our problems. Thank you for the help.

From: email@example.com
At: 02/08/18 11:42:15
To: firstname.lastname@example.org
Cc: Mitchell Rathbun (BLOOMBERG/ 731 LEX )
Subject: Re: Limiting Available Machines for Topology

Placement is all done by the scheduler. The default schedulers do not have the capability. They just try to spread things round robin around the cluster. This works well for small clusters that are built for a very specific purpose. Not so well for large clusters.

I have not used tag aware scheduling but it looks like it works and there are people using it in production.

Resource Aware Scheduling is not going to work to force your topology to be run on a very specific node. It has the option of accepting a hint for which node(s) you want and which you don't. But those are just hints. We also recently added in generic resources so you could define your own resources and use that like tags. Generic resources is still a work in progress. We are starting to roll it out to production, but we have not even updated the UI to show the generic resources, so there may still be some bugs in it and it needs more polish.

The big thing to be careful of with what you are trying to do is fault tolerance and failure domains. If there is really one and only one node that your topology will work on, if that node goes down you are done for. Similarly for one and only one rack.

- Bobby

On Wed, Feb 7, 2018 at 5:55 PM Arnaud BOS <arnaud.t...@gmail.com> wrote:

I **guess** you could use the “tag-aware scheduling” described here:
https://inside.edited.com/taking-control-of-your-apache-storm-cluster-with-tag-aware-scheduling-b60aaaa5e37e

Or maybe bend the "resource aware scheduler" presented here:
https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.html
to do what you want, but that sounds hackish.

Anyways, I've never used any of them so I'm just sending the links for further reading. Hope this helps.
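
For reference, the node hints Bobby mentions for Resource Aware Scheduling might be passed in the topology configuration roughly as sketched below. This is only an illustration: it builds a plain Java map and does not call Storm. The two key strings are, to the best of my knowledge, the ones Storm 2.x defines in its `Config` class, so check them against your version. The class name, the method name, and the host names `machineA`, `machineB`, and `machineC` are made up for the example. It also assumes the cluster is actually running the Resource Aware Scheduler.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: it only assembles a conf map.
class FavoredNodesSketch {

    // Key names assumed from Storm 2.x Config; verify against your release.
    static final String FAVORED_KEY = "topology.scheduler.favored.nodes";
    static final String UNFAVORED_KEY = "topology.scheduler.unfavored.nodes";

    // Returns a topology conf fragment carrying the scheduling hints.
    // The scheduler treats these as preferences, not guarantees.
    static Map<String, Object> hintConf(List<String> favored, List<String> unfavored) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(FAVORED_KEY, favored);
        conf.put(UNFAVORED_KEY, unfavored);
        return conf;
    }

    public static void main(String[] args) {
        // Prefer machineA, and ask the scheduler to avoid machineB and machineC.
        Map<String, Object> conf =
            hintConf(List.of("machineA"), List.of("machineB", "machineC"));
        System.out.println(conf);
    }
}
```

Because these are hints, a map like this expresses a preference for machine A but would not by itself stop the topology from being placed elsewhere if A is unavailable.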

On Thu, Feb 8, 2018, 12:12 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <mrathb...@bloomberg.net> wrote:

Given a multiple node Storm cluster, is it possible to ensure that a topology submitted on a specific machine runs only on that machine? More specifically, given a cluster of machines A, B, and C, if a topology is submitted from machine A, is there a way to guarantee that:
- The topology runs on machine A.
- If machine A crashes, the topology is not re-run on another machine.

I am guessing there isn't and that the answer to this is to run a leader nimbus per machine (nimbus.seeds: ["localhost"]), but I wanted to see if there was a way to do this that I am missing.
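
To make the two requirements in the question concrete, here is a minimal sketch of the filtering rule a tag-based scheduler would need to apply. It is a plain-Java model, not Storm code and not the implementation from the linked article. The class, the method, and the tag names (`pinned-a`, `general`) are all hypothetical. The idea is that a topology asks for a tag, and only live hosts carrying that tag are candidates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of tag-based placement; it does not use any Storm API.
class PinnedPlacementSketch {

    // Returns the hosts that are alive and carry the required tag,
    // in alphabetical order. An empty list means "do not schedule".
    static List<String> eligibleHosts(Map<String, String> tagByHost,
                                      Set<String> aliveHosts,
                                      String requiredTag) {
        List<String> hosts = new ArrayList<>();
        // TreeMap gives a deterministic iteration order over host names.
        for (Map.Entry<String, String> e : new TreeMap<>(tagByHost).entrySet()) {
            if (aliveHosts.contains(e.getKey()) && e.getValue().equals(requiredTag)) {
                hosts.add(e.getKey());
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Machine A has its own tag; B and C share a different one.
        Map<String, String> tags = Map.of("A", "pinned-a", "B", "general", "C", "general");

        // All machines up: only A qualifies for a topology tagged "pinned-a".
        System.out.println(eligibleHosts(tags, Set.of("A", "B", "C"), "pinned-a"));

        // A is down: nothing qualifies, so the topology is not moved to B or C.
        System.out.println(eligibleHosts(tags, Set.of("B", "C"), "pinned-a"));
    }
}
```

The second call returning an empty list is the "not re-run on another machine" behaviour asked for, and it is also the single-node failure domain Bobby warns about: while A is down, the topology simply does not run.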