> On Feb. 11, 2016, 10:50 p.m., Navina Ramesh wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 57 > > <https://reviews.apache.org/r/43350/diff/3/?file=1239498#file1239498line57> > > > > Is there any particular reason for updating state within the > > runContainer method? > > Imo, if the method is called "runContainer", it should do things > > related to running a container. Updating state is very important in this > > workflow and I want to make sure that it is obvious when you read the code. > > Can you consider renaming the method or moving back the state update > > statement?
There are a couple subjective things in the balance here: 1. What is the most accurate description of what these 4 lines are accomplishing? To me, it's running the container on the specified host and the state update is just something that should always happen when you run a container. That is to say, the state update is not the primary action here, but a consequence. I'm open to other names if they're accurate. 2. It could have been named updateStateAndRunContainer, which is more descriptive, but more verbose too. I think we agree, though, that this code should not be duplicated in 3 places. That was the goal of the refactor. I'm happy to name it whatever you prefer. - Jake ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43350/#review118935 ----------------------------------------------------------- On Feb. 10, 2016, 3:57 p.m., Jake Maes wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/43350/ > ----------------------------------------------------------- > > (Updated Feb. 10, 2016, 3:57 p.m.) > > > Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan > (Data Infrastructure). > > > Repository: samza > > > Description > ------- > > * Throw a SamzaContainerStartException whenever the container start request > to the NM fails. > * Catch the exception in the Allocator and re-request for a container on > ANY-HOST > * Release the failed container > * Also, refactored some methods in the ContainerUtil and ContainerAllocators > to simplify the code. > > > Diffs > ----- > > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java > 9ee2daccc3c44202308637207a084def81b49c09 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java > 7c57a866114aabf76a36d3b7ca4c5810628e0c77 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java > ab3061eae2cfc2da6681ce2034492b165d0d8b96 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java > 91fae98c074e1648e7168fb8e76d6e1e656816fc > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java > ff22dbfe5537ba2a3c55c4e6063680ff4c9e55f4 > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java > bc5b606394351e4039d084914fe5898e6ff148a1 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerLaunchException.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java > a3562a1155cbf24989200063b3ebd472b295db37 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java > e2b45d7577dea5a4a71af22521c93a7fd75eaefc > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java > 269d82479650b3bc2890d250da0391d34104b1eb > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java > 8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java > e7441e512e73b5d09740f252dc52efece640bea4 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java > 4426ce6ac8808257dc00df50452df3d7555d6d0b > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java > 951e0f9504b87263eba891d367c9806248241c7d > > Diff: https://reviews.apache.org/r/43350/diff/ > > > Testing > ------- > > Added 2 unit tests and I'm doing some manual testing in parallel with this > review. The manual test is to kill the NM for one of the containers of a job > that uses Host Affinity. When the RM detects the outage, it should notify the > AM which will try to restart the container on the same host. It will get a > connection error and at that point should retry on a DIFFERENT host. > > > Thanks, > > Jake Maes > >