Yes, the config is supervisor.slots.ports. If you only have one node, I really have no idea why it would think there are so many free slots.

- Bobby

On Wed, Jun 6, 2018 at 10:02 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]> wrote:

> These slots are controlled by the config property supervisor.slots.ports,
> right? We only have one node per cluster currently (Nimbus, Supervisor, and
> Worker processes all run on the same machine).
>
> From: [email protected] At: 06/06/18 10:58:35
> To: Mitchell Rathbun (BLOOMBERG/ 731 LEX ) <[email protected]>
> Cc: [email protected]
>
> Subject: Re: Nimbus repeatedly crashing to issue with disk/ZooKeeper
> resources
>
> In the case of the EvenScheduler it is all of the free slots in the
> cluster. So it is however many slots are on all of the nodes in the
> cluster that don't have anything scheduled on them.
>
> It should be proportional to the number of nodes in your cluster.
>
> - Bobby
>
> On Wed, Jun 6, 2018 at 9:48 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]> wrote:
>
>> What determines the number of slots that we want to schedule on Nimbus
>> startup? Is it existing worker processes at the time Nimbus is brought up,
>> or is it a config property like supervisor.slots.ports?
>>
>> From: [email protected] At: 06/06/18 10:37:32
>> To: Mitchell Rathbun (BLOOMBERG/ 731 LEX ) <[email protected]>,
>> [email protected]
>> Subject: Re: Nimbus repeatedly crashing to issue with disk/ZooKeeper
>> resources
>>
>> The issue is that interleave-all is a recursive function.
>>
>> https://github.com/apache/storm/blob/e40d213de7067f7d3aa4d4992b81890d8ed6ff31/storm-core/src/clj/org/apache/storm/util.clj#L776-L784
>>
>> So the depth of the stack trace is the number of slots you want to
>> schedule on * 3 because of how the recursion happens.
>>
>> Sadly, in the latest code it is the same, but it is in Java, so it is
>> not * 3, but still bad.
>>
>> https://github.com/apache/storm/blob/3e098f12e2b09d4954eeeaaf807e4ff6006a6929/storm-server/src/main/java/org/apache/storm/utils/ServerUtils.java#L113-L130
>>
>> So if you want to file a JIRA for us to fix this, that would be great.
>> Even better if you could look at making interleaveAll no longer recursive.
>>
>> Thanks,
>>
>> Bobby
>>
>> On Tue, Jun 5, 2018 at 10:43 PM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]> wrote:
>>
>>> From: Mitchell Rathbun (BLOOMBERG/ 731 LEX) At: 06/05/18 23:42:02
>>> To: Mitchell Rathbun (BLOOMBERG/ 731 LEX ) <[email protected]>
>>> Subject: Nimbus repeatedly crashing to issue with disk/ZooKeeper
>>> resources
>>>
>>> Recently, our Nimbus crashed with a stack overflow error, and we are
>>> having some difficulty determining what the initial cause was. I have
>>> attached the stack trace to help with the debugging. This same stack trace
>>> occurred every time I ran Nimbus. I then deleted everything in the
>>> directory specified by storm.local.dir and removed everything in ZooKeeper
>>> under the storm.zookeeper.root path. I was then able to successfully run
>>> Nimbus. So this points to there being an issue with the data/state that
>>> Nimbus keeps. Has this issue been seen before, and how could the state
>>> reach a point that would prevent Nimbus from running at all? Is it possible
>>> that there was not enough disk/zk space, even though the logs don't really
>>> point to this being the issue?
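
For anyone picking up the suggestion in this thread to make interleaveAll non-recursive, here is a rough sketch of one way an iterative round-robin interleave could look. It is not Storm's actual code or the eventual fix: the class name InterleaveSketch is made up for illustration, and returning an empty list for null or empty input is an assumption that may differ from what the real method does. The idea is to keep one iterator per input list and loop until all of them are exhausted, so the stack depth stays constant regardless of how many slots there are.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical class name; not part of Apache Storm.
public class InterleaveSketch {

    /**
     * Round-robin interleave without recursion: take one element from each
     * list in turn, repeating until every list is exhausted.
     */
    public static <T> List<T> interleaveAll(List<List<T>> nodeList) {
        List<T> result = new ArrayList<>();
        if (nodeList == null) {
            // Assumption: return an empty list here rather than null.
            return result;
        }

        // One iterator per non-null input list, in the original order.
        List<Iterator<T>> iterators = new ArrayList<>();
        for (List<T> node : nodeList) {
            if (node != null) {
                iterators.add(node.iterator());
            }
        }

        // Each pass takes at most one element from every list.
        // Stop after the first pass that takes nothing.
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Iterator<T> it : iterators) {
                if (it.hasNext()) {
                    result.add(it.next());
                    progressed = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up slot names for three nodes with different slot counts.
        List<List<String>> slotsPerNode = Arrays.asList(
            Arrays.asList("a1", "a2", "a3"),
            Arrays.asList("b1"),
            Arrays.asList("c1", "c2"));
        System.out.println(interleaveAll(slotsPerNode)); // [a1, b1, c1, a2, c2, a3]
    }
}
```

Since the loop only uses heap-allocated iterators, a very long slot list costs memory and time but cannot overflow the stack the way the recursive version described above does.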
