Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-31 Thread Avinash sridharan


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.hpp, line 151
> > 
> >
> > Should have pointed in the earlier patches, why do we need `Info` to be 
> > a smart pointer. We can have it as an object. We are not going to share 
> > this information with any other class or thread.
> 
> Qian Zhang wrote:
> I think `Owned` is the suggested way, please see:
> 
> https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/isolators/cgroups/cpushare.hpp#L110
> 
> https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/mem.hpp#L129
> 
> https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/perf_event.hpp#L115

That's because they ended up using an `Info*` to begin with. I think there were 
historical reasons for using `Info*` instead of an Info object. In this case I 
don't see the need for storing the `Info` as a pointer since it is not being 
passed around.


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 136
> > 
> >
> > Why do we have this change as part of this patch? Can we move this out 
> > to a different patch?
> 
> Qian Zhang wrote:
> Because when I wrote this patch, I found I need a new method to parse 
> `NetworkInfo`, but I can not name it as `parse()` since there is already a 
> `parse()` method for parsing `NetworkConfig`, so I have to rename that one as 
> `parseNetworkConfig()`, and introduce the new one as `parseNetworkInfo()`.

Sure, but can we make it a separate patch ?


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 817
> > 
> >
> > Isn't this a problem? We are returning `Nothing` here, which implies 
> > that in the `recover` method the container information is assumed to have 
> > been stored correctly. However, there is NO container informationt that has 
> > been stored!! I thought we should never run into this issue, since we 
> > should be creating the directories before calling the CNI plugin?
> 
> Qian Zhang wrote:
> > which implies that in the recover method the container information is 
> assumed to have been stored correctly.
> 
> It does not imply the container info has been stored correctly, actually 
> it implies that for this kind of container (which has no networkInfoDir 
> created yet), we do not need to store anything for it (actually we have 
> nothing to restore from since there is no networkInfoDir for the container).

That doesn't sound right. This effectively means that we have a container 
without a IP address or even a veth associated with the network namespace? This 
seems like an error.


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 832-835
> > 
> >
> > What if the isolator dies right after creating the directories, but 
> > after invoking the plugin? If we just return `Nothing` we might end up 
> > leaking state information from the CNI plugin (IPAM). I think we should 
> > return `Error` here and ask the isolator to cleanup the containers and 
> > start over.
> 
> Qian Zhang wrote:
> If the plugin has been invoked, then the `networkDirs` will not be empty.

Well, the ip address and DNS information returned by the plugin would have been 
lost, because the plugin would be a zombie process (child died). In this case 
not sure if it makes sense to recover the containers.


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 878
> > 
> >
> > This error doesn't make sense. Maybe:
> > "Found multiple interfaces for a given CNI network. This should not be 
> > allowed" ?
> 
> Qian Zhang wrote:
> But it will miss the case that interfaces.get().size() == 0.

Do we throw an error in `isolate` if a container tries to join the same network 
with multiple `NetworkInfo` messages ? I agree with your comment though. If we 
are handling the case that a container can never have more than one interface 
attached to the network, we can replace the existing string by
"Unable to find any interface attached to " +  +" for this 
container" 


> On March 29, 2016, 5:35 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/spec.cpp, line 53
> > 
> >
> > Why not just return 

Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-31 Thread Jie Yu


> On March 30, 2016, 6:12 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 339
> > 
> >
> > I think this is not needed because if recover fails, slave will restart.
> 
> Qian Zhang wrote:
> Agree, but I see other isolators do the similar, e.g., 
> CgroupsCpushareIsolatorProcess::recover(), 
> CgroupsMemIsolatorProcess::recover(), etc., maybe we need to fix them as well?

yeah, those are not done correctly.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review126143
---


On March 31, 2016, 11:33 a.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 31, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-4759
> https://issues.apache.org/jira/browse/MESOS-4759
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> 873e0c52475f4868e611bd24a6782ad5eb261a99 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 1c8e231813c0579b79681c5d18b1f799a727ead7 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
> f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
> 611f3869402b9033081b7f9ecc1bdf006f61918b 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-31 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/
---

(Updated March 31, 2016, 7:33 p.m.)


Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.


Bugs: MESOS-4759
https://issues.apache.org/jira/browse/MESOS-4759


Repository: mesos


Description
---

Implemented recover() method of "network/cni" isolator.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
873e0c52475f4868e611bd24a6782ad5eb261a99 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
1c8e231813c0579b79681c5d18b1f799a727ead7 
  src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
  src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
611f3869402b9033081b7f9ecc1bdf006f61918b 
  src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
6a3c33645bab73edaf5af4d298a671852ea59c46 
  src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 

Diff: https://reviews.apache.org/r/45383/diff/


Testing
---


Thanks,

Qian Zhang



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-31 Thread Qian Zhang


> On March 31, 2016, 2:12 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 339
> > 
> >
> > I think this is not needed because if recover fails, slave will restart.

Agree, but I see other isolators do the similar, e.g., 
CgroupsCpushareIsolatorProcess::recover(), 
CgroupsMemIsolatorProcess::recover(), etc., maybe we need to fix them as well?


> On March 31, 2016, 2:12 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 887-890
> > 
> >
> > Or agent crashes right before removing the containerDir?

Yes! Let me add it in the comment.


> On March 31, 2016, 2:12 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 891-908
> > 
> >
> > This basically duplicates what's in the cleanup. I am wondering if we 
> > should make cleanup more robust to tolerate a container dir without network 
> > name dirs. In other words, remove the logic here, change cleanup so that it 
> > can handle empty network dirs case.

Yes, we can do it, but then we have to remove 
`CHECK(infos.contains(containerId))` from `_cleanup()` and make the logic in 
the `cleanup()` a little obscure. Anyway, I have updated the code accordingly, 
please take a look and let me know for any further comments :-)


> On March 31, 2016, 2:12 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 864
> > 
> >
> > OK, now I realized that `getNetworkInfoDir` is a little confusing with 
> > `getNetworkDir`. Can we `s/getNetworkInfoDir/getContainerDir/` in a 
> > subsequent patch?

Sure, I have renamed it in the patch: https://reviews.apache.org/r/45532/


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review126143
---


On March 30, 2016, 9:39 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 30, 2016, 9:39 p.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> 873e0c52475f4868e611bd24a6782ad5eb261a99 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 1c8e231813c0579b79681c5d18b1f799a727ead7 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
> f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
> 611f3869402b9033081b7f9ecc1bdf006f61918b 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-30 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review126143
---




src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 322)


`s/recoverInfo/_recover/`

We usually move `_xxx` right below `xxx` for readability. So please move 
the function to the correct place.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 324)


I think this is not needed because if recover fails, slave will restart.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 333)


Ditto.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 350)


Ditto.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 840)


OK, now I realized that `getNetworkInfoDir` is a little confusing with 
`getNetworkDir`. Can we `s/getNetworkInfoDir/getContainerDir/` in a subsequent 
patch?



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (lines 858 - 859)


Please align the parameter properly.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 862)


`networkDirs->empty()`



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (lines 863 - 866)


Or agent crashes right before removing the containerDir?



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (lines 867 - 884)


This basically duplicates what's in the cleanup. I am wondering if we 
should make cleanup more robust to tolerate a container dir without network 
name dirs. In other words, remove the logic here, change cleanup so that it can 
handle empty network dirs case.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 889)


instead of that, can you instead
```
s/getNetworkDirs/getNetworkNames/
```

In other words, return a list of names, rather than  dirs.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 894)


Ditto.

s/getInterfaceDirs/getInterfaces/


- Jie Yu


On March 30, 2016, 1:39 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 30, 2016, 1:39 p.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> 873e0c52475f4868e611bd24a6782ad5eb261a99 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 1c8e231813c0579b79681c5d18b1f799a727ead7 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
> f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
> 611f3869402b9033081b7f9ecc1bdf006f61918b 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-30 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review126104
---



Patch looks great!

Reviews applied: [44004, 44200, 44549, 44555, 44622, 44514, 44706, 45082, 45383]

Passed command: export OS='ubuntu:14.04' CONFIGURATION='--verbose' 
COMPILER='gcc' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker_build.sh

- Mesos ReviewBot


On March 30, 2016, 1:39 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 30, 2016, 1:39 p.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> 873e0c52475f4868e611bd24a6782ad5eb261a99 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 1c8e231813c0579b79681c5d18b1f799a727ead7 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
> f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
> 611f3869402b9033081b7f9ecc1bdf006f61918b 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-30 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/
---

(Updated March 30, 2016, 9:39 p.m.)


Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.


Repository: mesos


Description
---

Implemented recover() method of "network/cni" isolator.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
873e0c52475f4868e611bd24a6782ad5eb261a99 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
1c8e231813c0579b79681c5d18b1f799a727ead7 
  src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
  src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
611f3869402b9033081b7f9ecc1bdf006f61918b 
  src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
6a3c33645bab73edaf5af4d298a671852ea59c46 
  src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 

Diff: https://reviews.apache.org/r/45383/diff/


Testing
---


Thanks,

Qian Zhang



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-30 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/
---

(Updated March 30, 2016, 5:12 p.m.)


Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.


Repository: mesos


Description
---

Implemented recover() method of "network/cni" isolator.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
873e0c52475f4868e611bd24a6782ad5eb261a99 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
1c8e231813c0579b79681c5d18b1f799a727ead7 
  src/slave/containerizer/mesos/isolators/network/cni/paths.hpp 
f627ec9499a34ca104d2c1a4d28e1d2f4b849f64 
  src/slave/containerizer/mesos/isolators/network/cni/paths.cpp 
611f3869402b9033081b7f9ecc1bdf006f61918b 
  src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
6a3c33645bab73edaf5af4d298a671852ea59c46 
  src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 

Diff: https://reviews.apache.org/r/45383/diff/


Testing
---


Thanks,

Qian Zhang



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-30 Thread Qian Zhang


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.hpp, line 103
> > 
> >
> > Why do we need to use a hashmap over here? Why not a vector or a list? 
> > Hashmap's are expensive in terms of memory. Usually a container won't be 
> > associated with more than one network, so seems pretty inefficient?

This was suggested by Jie when I wrote the `prepare()` patch, you can check 
https://reviews.apache.org/r/44514/diff/5?file=1304896#file1304896line82 for 
details.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.hpp, line 139
> > 
> >
> > Maybe s/recoverInfo/_recover/

I'd like to name this method more meaningful, `recoverInfo` implies that this 
method will recover/rebuild the `Info` struct for the container.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.hpp, line 151
> > 
> >
> > Should have pointed in the earlier patches, why do we need `Info` to be 
> > a smart pointer. We can have it as an object. We are not going to share 
> > this information with any other class or thread.

I think `Owned` is the suggested way, please see:
https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/isolators/cgroups/cpushare.hpp#L110
https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/mem.hpp#L129
https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/perf_event.hpp#L115


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 136
> > 
> >
> > Why do we have this change as part of this patch? Can we move this out 
> > to a different patch?

Because when I wrote this patch, I found I need a new method to parse 
`NetworkInfo`, but I can not name it as `parse()` since there is already a 
`parse()` method for parsing `NetworkConfig`, so I have to rename that one as 
`parseNetworkConfig()`, and introduce the new one as `parseNetworkInfo()`.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 288-290
> > 
> >
> > Do we need this loop? `infos.clear` below should destroy the hashmap 
> > which in turn would destroy any of the residing objects (`networkinfos`)

Yes, I agree, and I will fix this in `_cleanup()` as well.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 817
> > 
> >
> > Isn't this a problem? We are returning `Nothing` here, which implies 
> > that in the `recover` method the container information is assumed to have 
> > been stored correctly. However, there is NO container informationt that has 
> > been stored!! I thought we should never run into this issue, since we 
> > should be creating the directories before calling the CNI plugin?

> which implies that in the recover method the container information is assumed 
> to have been stored correctly.

It does not imply the container info has been stored correctly, actually it 
implies that for this kind of container (which has no networkInfoDir created 
yet), we do not need to store anything for it (actually we have nothing to 
restore from since there is no networkInfoDir for the container).


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 878
> > 
> >
> > This error doesn't make sense. Maybe:
> > "Found multiple interfaces for a given CNI network. This should not be 
> > allowed" ?

But it will miss the case that interfaces.get().size() == 0.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/spec.cpp, line 53
> > 
> >
> > Why not just return `parse` ?

Either way should be OK? I see 
https://github.com/apache/mesos/blob/0.28.0/src/docker/spec.cpp#L110:L124 does 
the similar.


> On March 30, 2016, 1:35 a.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 832-835
> > 
> >
> > What if the isolator dies right after creating the directories, 

Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-29 Thread Avinash sridharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review125880
---




src/slave/containerizer/mesos/isolators/network/cni/cni.hpp (line 103)


Why do we need to use a hashmap over here? Why not a vector or a list? 
Hashmap's are expensive in terms of memory. Usually a container won't be 
associated with more than one network, so seems pretty inefficient?



src/slave/containerizer/mesos/isolators/network/cni/cni.hpp (line 139)


Maybe s/recoverInfo/_recover/



src/slave/containerizer/mesos/isolators/network/cni/cni.hpp (line 151)


Should have pointed in the earlier patches, why do we need `Info` to be a 
smart pointer. We can have it as an object. We are not going to share this 
information with any other class or thread.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 136)


Why do we have this change as part of this patch? Can we move this out to a 
different patch?



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (lines 288 - 290)


Do we need this loop? `infos.clear` below should destroy the hashmap which 
in turn would destroy any of the residing objects (`networkinfos`)



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 


I think we should have this in a separate patch?



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 809)


Isn't this a problem? We are returning `Nothing` here, which implies that 
in the `recover` method the container information is assumed to have been 
stored correctly. However, there is NO container informationt that has been 
stored!! I thought we should never run into this issue, since we should be 
creating the directories before calling the CNI plugin?



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (lines 824 - 827)


What if the isolator dies right after creating the directories, but after 
invoking the plugin? If we just return `Nothing` we might end up leaking state 
information from the CNI plugin (IPAM). I think we should return `Error` here 
and ask the isolator to cleanup the containers and start over.



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp (line 870)


This error doesn't make sense. Maybe:
"Found multiple interfaces for a given CNI network. This should not be 
allowed" ?



src/slave/containerizer/mesos/isolators/network/cni/spec.cpp (line 53)


Why not just return `parse` ?


- Avinash sridharan


On March 29, 2016, 11 a.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 29, 2016, 11 a.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> b1b7205f4f10b6dc256fcc4ecb3210105c1240b4 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 7cda5715814a0cfc4b394eb04437831e6dc44e3f 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-29 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/#review125905
---



Patch looks great!

Reviews applied: [44004, 44200, 44549, 44555, 44622, 44514, 44706, 45082, 45383]

Passed command: export OS='ubuntu:14.04' CONFIGURATION='--verbose' 
COMPILER='gcc' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker_build.sh

- Mesos ReviewBot


On March 29, 2016, 11 a.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45383/
> ---
> 
> (Updated March 29, 2016, 11 a.m.)
> 
> 
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented recover() method of "network/cni" isolator.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
> b1b7205f4f10b6dc256fcc4ecb3210105c1240b4 
>   src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
> 7cda5715814a0cfc4b394eb04437831e6dc44e3f 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/paths.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> 6a3c33645bab73edaf5af4d298a671852ea59c46 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 
> 
> Diff: https://reviews.apache.org/r/45383/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 45383: Implemented recover() method of "network/cni" isolator.

2016-03-29 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45383/
---

(Updated March 29, 2016, 7 p.m.)


Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.


Repository: mesos


Description
---

Implemented recover() method of "network/cni" isolator.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/cni/cni.hpp 
b1b7205f4f10b6dc256fcc4ecb3210105c1240b4 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
7cda5715814a0cfc4b394eb04437831e6dc44e3f 
  src/slave/containerizer/mesos/isolators/network/cni/paths.hpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/network/cni/paths.cpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
6a3c33645bab73edaf5af4d298a671852ea59c46 
  src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
5b5f904def9ef6dcc4462a03a2d024ad4eb3d946 

Diff: https://reviews.apache.org/r/45383/diff/


Testing
---


Thanks,

Qian Zhang