-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50523/#review146854
-----------------------------------------------------------




src/slave/containerizer/docker.cpp (lines 1028 - 1034)
<https://reviews.apache.org/r/50523/#comment213619>

    I feel like failing to recover devices warrants more than just a LOG 
message.
    
    Consider the situation where we fail to recover them, and we just continue. 
    
    From Mesos's perspective, these devices are now free to hand out to other 
containers.  This will cause conflicts -- especially in the case of GPUs.
    
    We need to think through the failure scenario here a bit more.



src/slave/containerizer/docker.cpp (line 1102)
<https://reviews.apache.org/r/50523/#comment213633>

    I don't think you need to save a temporary variable here. Just return the 
call to `docker->inspect` directly, i.e.:
    
    ```
    return docker->inspect(containerName, slave::DOCKER_INSPECT_DELAY)
      .then(defer(self(), [=](const Docker::Container& container) {
        ...
      }));
    ```



src/slave/containerizer/docker.cpp (line 1112)
<https://reviews.apache.org/r/50523/#comment213621>

    We typically write out full words for our variable names.
    
    I'd either change this to `nvidiaDevicePrefix` or just simply `prefix.



src/slave/containerizer/docker.cpp (line 1113)
<https://reviews.apache.org/r/50523/#comment213629>

    I probably wouldn't use an `Option` here, but rather just an (initially) 
empty set.



src/slave/containerizer/docker.cpp (line 1116)
<https://reviews.apache.org/r/50523/#comment213612>

    There should be a line break here.



src/slave/containerizer/docker.cpp (lines 1117 - 1120)
<https://reviews.apache.org/r/50523/#comment213615>

    Indentation is wrong here.



src/slave/containerizer/docker.cpp (line 1118)
<https://reviews.apache.org/r/50523/#comment213625>

    You should use `strings::remove(deviceString, prefix)` here.



src/slave/containerizer/docker.cpp (lines 1123 - 1125)
<https://reviews.apache.org/r/50523/#comment213616>

    You should use `numify()` with error checking here.



src/slave/containerizer/docker.cpp (line 1130)
<https://reviews.apache.org/r/50523/#comment213628>

    We should (at a minimum) LOG an error here.
    
    My preference though, would be to return an error at all error points in 
this function and then do the logging in the caller.



src/slave/containerizer/docker.cpp (lines 1137 - 1138)
<https://reviews.apache.org/r/50523/#comment213630>

    If you follow the suggesstion above, then this code is unnecessary.



src/slave/containerizer/docker.cpp (line 1145)
<https://reviews.apache.org/r/50523/#comment213631>

    If you follow the suggesstion above, then this would turn into:
    
    ```
    if (!gpus.empty()) {
    ```



src/slave/containerizer/docker.cpp (line 1148)
<https://reviews.apache.org/r/50523/#comment213636>

    With all of the other comments above being addressed, this line should 
become:
    
    ```
    return nvidiaGpuAllocator->allocate(gpus);
    ```



src/tests/containerizer/docker_containerizer_tests.cpp (line 3980)
<https://reviews.apache.org/r/50523/#comment213637>

    You don't need this (in fact, you shouldn't have this) for the docker 
containerizer. These flags are only relevant for the mesos containerizer.
    
    Instead you should have:
    `flags.containerizers = docker`.



src/tests/containerizer/docker_containerizer_tests.cpp (line 4105)
<https://reviews.apache.org/r/50523/#comment213638>

    It looks like you wait for recovery to complete, but then you don't 
actually verify that the recovery was successful.
    
    I think after recovery you should verify that the the container was 
actually recovered properly -- devices an all.


- Kevin Klues


On Aug. 24, 2016, 12:56 a.m., Rajat Phull wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50523/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 12:56 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Kevin Klues, Yubo Li, and Vikrama 
> Ditya.
> 
> 
> Bugs: MESOS-5795
>     https://issues.apache.org/jira/browse/MESOS-5795
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Updated docker recovery to account for GPU resources.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/docker.hpp f2a06065cf99fed934c2c1ffc47461ec8a97f50d 
>   src/slave/containerizer/docker.cpp 5c1ee8e467d1c54c60b67dc5275ef71e1bb90723 
>   src/tests/containerizer/docker_containerizer_tests.cpp 
> 42d4364b6fcbc94c7852721511001c103cb5a90d 
> 
> Diff: https://reviews.apache.org/r/50523/diff/
> 
> 
> Testing
> -------
> 
> GTEST_FILTER="DockerContainerizerTest.ROOT_NVIDIA_GPU_DOCKER_LaunchWithGpuRecovery"
>  make -j check
> 
> 
> Thanks,
> 
> Rajat Phull
> 
>

Reply via email to