> On Aug. 24, 2015, 8:43 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java, line 114
> > <https://reviews.apache.org/r/37739/diff/1/?file=1048842#file1048842line114>
> >
> > If 100 hosts per stage, would 30 have to fail to fail the stage (not 3) if set to 70%?

Don't ever let me accuse you of not reading the comments in a code review. Fixed.

> On Aug. 24, 2015, 8:43 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java, line 341
> > <https://reviews.apache.org/r/37739/diff/1/?file=1048842#file1048842line341>
> >
> > formatting

Fixed.

> On Aug. 24, 2015, 8:43 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java, line 396
> > <https://reviews.apache.org/r/37739/diff/1/?file=1048842#file1048842line396>
> >
> > successFactor already a Float, no need to down-reference to the primitive.

Fixed.

> On Aug. 24, 2015, 8:43 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java, lines 503-506
> > <https://reviews.apache.org/r/37739/diff/1/?file=1048842#file1048842line503>
> >
> > Thank you!

You're welcome :)

- Jonathan

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37739/#review96255
-----------------------------------------------------------

On Aug. 24, 2015, 7:30 p.m., Jonathan Hurley wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37739/
> -----------------------------------------------------------
>
> (Updated Aug. 24, 2015, 7:30 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez and Nate Cole.
>
>
> Bugs: AMBARI-12867
>     https://issues.apache.org/jira/browse/AMBARI-12867
>
>
> Repository: ambari
>
>
> Description
> -------
>
> On 1000 node RU I had 2.3.0.0-2557 installed with some 20 hosts down with heartbeat lost.
> Then I registered 2.3.2.0-2664 and when I proceeded to install, it would always get aborted with no logs in server or agents.
>
> Turns out that whenever we install, we do so in stages containing 100 hosts each. If any host fails or times out, the rest of the stages are aborted. So in this case the first stage had 1 host time out, which resulted in that stage and all other stages being aborted.
>
> I cannot install a version without all hosts being alive. The workaround seems to be to delete lost hosts from Ambari.
>
> The solution is to use the stage's success criteria to determine if the other stages in the request should be aborted.
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/Role.java 636df3f
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java 6133885
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProviderTest.java a56823b
>
> Diff: https://reviews.apache.org/r/37739/diff/
>
>
> Testing
> -------
>
> mvn clean test
>
>
> Thanks,
>
> Jonathan Hurley
>
>
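
For readers following the thread, here is a minimal sketch of the arithmetic behind the "70% of 100 hosts" question. It is not taken from the patch: the class name `StageSuccessSketch` and the method `stageSucceeded` are hypothetical, and the sketch assumes the threshold is inclusive (a stage that exactly meets the success factor counts as successful), which the thread does not state.

```java
public class StageSuccessSketch {

    /**
     * Hypothetical helper, not Ambari's API.
     * Returns true when the fraction of hosts that succeeded in a stage
     * is at least the given success factor (assumed inclusive).
     */
    static boolean stageSucceeded(int totalHosts, int failedHosts, float successFactor) {
        if (totalHosts <= 0) {
            return true; // an empty stage has nothing that can fail
        }
        int succeededHosts = totalHosts - failedHosts;
        return (float) succeededHosts / totalHosts >= successFactor;
    }

    public static void main(String[] args) {
        // With a success factor of 1.0, a single failed host fails the stage,
        // which matches the behavior described in the bug report.
        System.out.println(stageSucceeded(100, 1, 1.0f));  // false

        // With a success factor of 0.7, one failed host is tolerated.
        System.out.println(stageSucceeded(100, 1, 0.7f));  // true

        // 30 failures leave exactly 70% successful; 31 drop below the threshold.
        System.out.println(stageSucceeded(100, 30, 0.7f)); // true
        System.out.println(stageSucceeded(100, 31, 0.7f)); // false
    }
}
```

Under these assumptions, a request would abort its remaining stages only when a call like `stageSucceeded` returns false for a stage, rather than on the first host failure.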
