Re: Mesos (and Marathon) port mapping

2017-03-31 Thread Jie Yu
Thomas,

> - it is the hostports which are used to multiplex traffic into the container.
> My understanding is that, since each container is in its own network
> namespace, it has its own full range of container ports and that you use a
> direct mapping (hostport n <-> same container port n), is that correct?

Yes.

> - those ports which are divided into disjoint subsets are the ephemeral
> ports. The non-ephemeral ports are in a set shared between all containers,
> correct?


No. Non-ephemeral ports are allocated by the framework (non-ephemeral ports
are modeled as Resources in Mesos), so containers must have disjoint sets
of non-ephemeral ports.
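
To make that split concrete, here is a rough sketch of the agent-side configuration this implies; the flag names follow the Mesos port-mapping-isolator docs as best I recall, and the port ranges are made-up placeholders rather than values from this thread:

```
# Sketch only: the ranges below are illustrative assumptions.
# Non-ephemeral ports are advertised as a "ports" resource that frameworks
# allocate from (hence disjoint per container), while each container is
# handed its own slice of the ephemeral range.
mesos-agent \
  --isolation=network/port_mapping \
  --resources="ports:[31000-32000];ephemeral_ports:[32768-57344]" \
  --ephemeral_ports_per_container=1024
```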

> - the use case you described is when you cannot afford one IP per container and
> when you are using the Mesos containerizer: does it mean that network
> port mapping isolation makes no sense with the Docker containerizer or can it be
> somehow composed with it?


If you're looking for a private bridge + DNAT solution (like Docker
--net=bridge), you can follow these docs if you want to use it with the
Mesos containerizer. It's supported through a more standard interface
called CNI (https://github.com/containernetworking/cni):
https://github.com/apache/mesos/blob/master/docs/cni.md
https://github.com/apache/mesos/blob/master/docs/cni.md#a-port-mapper-plugin-for-cni-networks
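
As a rough sketch of what those docs describe, the CNI network config below delegates interface and IP setup to the stock bridge plugin and wraps it with the Mesos port-mapper plugin for the DNAT part; the file path, network name, bridge name, iptables chain, and subnet are made-up placeholders, and the authoritative field list is in the cni.md links above:

```
# Sketch only: path, names, chain, and subnet are assumptions.
cat > /etc/mesos/cni/port-mapped-bridge.json <<'EOF'
{
  "name": "port-mapped-bridge",
  "type": "mesos-cni-port-mapper",
  "chain": "MESOS-PORT-MAPPER",
  "delegate": {
    "type": "bridge",
    "bridge": "mesos-cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.0.0/16",
      "routes": [{ "dst": "0.0.0.0/0" }]
    }
  }
}
EOF
```

The port-mapper plugin handles the host-port DNAT and hands everything else to whatever plugin is named under "delegate".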

The IP-per-container limitation is not related to which containerizer you're
using. It's specific to the company's (Twitter's) environment. For instance,
we could not change the service discovery mechanism at that time, which
required all container IPs to be routable.

> I didn't quite understand why you cannot use NAT (in the same way Docker in
> BRIDGE mode does) and assign as many IP addresses as you want in a
> private network...


See my response above. If you're looking for docker --net=bridge support,
follow the two links above.

- Jie

On Fri, Mar 31, 2017 at 3:39 AM, Thomas HUMMEL wrote:

> Thanks for your answer,
>
> I've watched your talk. Very interesting.
>
> Let me check if I get everything straight:
>
> - it is the hostports which are used to multiplex traffic into the container.
> My understanding is that, since each container is in its own network
> namespace, it has its own full range of container ports and that you use a
> direct mapping (hostport n <-> same container port n), is that correct?
>
> - those ports which are divided into disjoint subsets are the ephemeral
> ports. The non-ephemeral ports are in a set shared between all containers,
> correct?
>
> - the use case you described is when you cannot afford one IP per container
> and when you are using the Mesos containerizer: does it mean that network
> port mapping isolation makes no sense with the Docker containerizer or can it
> be somehow composed with it?
>
> I didn't quite understand why you cannot use NAT (in the same way Docker
> in BRIDGE mode does) and assign as many IP addresses as you want in a
> private network...
>
> Thanks.
>
> --
>
> TH.
>
>
>
>


Re: Mesos (and Marathon) port mapping

2017-03-31 Thread Jie Yu
Tomek and Olivier,

Bridge network support (with port mapping) has been added in Mesos 1.2.
See this doc for more details on how to use it:
https://github.com/apache/mesos/blob/master/docs/cni.md#a-port-mapper-plugin-for-cni-networks

TL;DR: we developed a CNI port mapper plugin (DNAT) in the Mesos repo, and it
uses a delegation model in CNI. For the bridge CNI plugin, you can simply use
the default bridge plugin from the CNI repo
(https://github.com/containernetworking/cni). @avinash can explain more here.
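
For completeness, a hedged sketch of how the agent is pointed at the CNI plugins and network configs; the paths are placeholders and the exact --isolation composition depends on the deployment, so treat docs/cni.md as the authority here:

```
# Sketch only: paths are assumptions; see docs/cni.md for the full flag set.
mesos-agent \
  --isolation=filesystem/linux,docker/runtime,network/cni \
  --network_cni_config_dir=/etc/mesos/cni \
  --network_cni_plugins_dir=/opt/cni/bin
```

A task then joins a network by referencing the config's "name" in its NetworkInfo, and, as I understand the linked doc, the port-mapper plugin installs the DNAT rules for the task's declared port mappings.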



On Fri, Mar 31, 2017 at 3:40 AM, Olivier Sallou wrote:

>
>
> On 03/31/2017 10:23 AM, Tomek Janiszewski wrote:
>
> I have a question that is related to this topic. In the "docker support and
> current limitations" section [1] there is the following statement:
> > Only host network is supported. We will add bridge network support soon
> using CNI support in Mesos (MESOS-4641)
> The mentioned issue is resolved. Does this mean bridge network is working for
> the Mesos containerizer?
>
> [1]: https://github.com/apache/mesos/blob/master/docs/container-image.md#docker-support-and-current-limitations
>
> CNI support in the unified containerizer (Mesos) gives the possibility to
> assign an IP per container, so there is no port mapping (the ports you use will be
> used directly, as the container has its own IP address). There is no "bridge"
> network as in Docker (mapping of container port 80 to host port 3 for
> example).
>
> Olivier
>
>
> On Fri, 31 Mar 2017 at 02:04, Jie Yu wrote:
>
>> are you talking about the NAT feature of docker in BRIDGE mode?
>>
>>
>> Yes
>>
>> - regarding the "port mapping isolator giving network namespace": what
>> confuses me is that, given the previous answers, I thought that in that
>> case, the non-ephemeral port range was *shared* (as a resource) between
>> containers, which sounds to me like the opposite of the namespace concept (as
>> a slightly different example, two Docker containers have their own private
>> port 80, for instance).
>>
>>
>> The port mapping isolator is for the case where IP per container is not
>> possible (due to IPAM restrictions, etc.), but the user still wants to have
>> a network namespace per container (for isolation, getting statistics, etc.).
>>
>> Since all containers, even if they are in separate namespaces, share the
>> same IP, we have to use some other mechanism to tell which packet belongs
>> to which container. We use ports in that case. You can find more details
>> about the port mapping isolator in this talk I gave at MesosCon 2015:
>> https://www.youtube.com/watch?v=ZA96g1M4v8Y
>>
>> - Jie
>>
>> On Thu, Mar 30, 2017 at 2:13 AM, Thomas HUMMEL wrote:
>>
>>
>> On 03/29/2017 07:25 PM, Jie Yu wrote:
>>
>> Thomas,
>>
>> I think you are confusing port mapping for NAT purposes with the port
>> mapping isolator.
>> Those are two very different things. The port mapping isolator (unfortunate
>> naming), as described in the doc, gives you a network namespace per container
>> without requiring an IP per container. No NAT is involved. I think in your
>> case you should not use it, and it does not work with the DockerContainerizer.
>>
>> Thanks,
>>
>> I'm not sure I understand what you're saying:
>>
>> - are you talking about the NAT feature of docker in BRIDGE mode?
>>
>> - regarding the "port mapping isolator giving network namespace": what
>> confuses me is that, given the previous answers, I thought that in that
>> case, the non-ephemeral port range was *shared* (as a resource) between
>> containers, which sounds to me like the opposite of the namespace concept (as
>> a slightly different example, two Docker containers have their own private
>> port 80, for instance).
>>
>> What am I missing?
>>
>> Thanks
>>
>> --
>> TH
>>
>>
>>
> --
> Olivier Sallou
> IRISA / University of Rennes 1
> Campus de Beaulieu, 35000 RENNES - FRANCE
> Tel: 02.99.84.71.95
>
> gpg key id: 4096R/326D8438  (keyring.debian.org)
> Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
>
>
>


Re: Mesos (and Marathon) port mapping

2017-03-31 Thread Olivier Sallou


On 03/31/2017 10:23 AM, Tomek Janiszewski wrote:
> I have a question that is related to this topic. In the "docker support
> and current limitations" section [1] there is the following statement:
> > Only host network is supported. We will add bridge network support
> soon using CNI support in Mesos (MESOS-4641)
> The mentioned issue is resolved. Does this mean bridge network is working
> for the Mesos containerizer?
>
> [1]: https://github.com/apache/mesos/blob/master/docs/container-image.md#docker-support-and-current-limitations
CNI support in the unified containerizer (Mesos) gives the possibility to
assign an IP per container, so there is no port mapping (the ports you use will
be used directly, as the container has its own IP address). There is no
"bridge" network as in Docker (mapping of container port 80 to host
port 3 for example).

Olivier
>
> On Fri, 31 Mar 2017 at 02:04, Jie Yu wrote:
>
> are you talking about the NAT feature of docker in BRIDGE mode?
>
>
> Yes
>
> - regarding the "port mapping isolator giving network
> namespace": what confuses me is that, given the previous
> answers, I thought that in that case, the
> non-ephemeral port range was *shared* (as a resource) between
> containers, which sounds to me like the opposite of the
> namespace concept (as a slightly different example, two Docker
> containers have their own private port 80, for instance).
>
>
> The port mapping isolator is for the case where IP per container
> is not possible (due to IPAM restrictions, etc.), but the user still
> wants to have a network namespace per container (for isolation,
> getting statistics, etc.).
>
> Since all containers, even if they are in separate namespaces,
> share the same IP, we have to use some other mechanism to tell
> which packet belongs to which container. We use ports in that
> case. You can find more details about the port mapping isolator in
> this talk I gave at MesosCon 2015:
> https://www.youtube.com/watch?v=ZA96g1M4v8Y
>
> - Jie
>
> On Thu, Mar 30, 2017 at 2:13 AM, Thomas HUMMEL wrote:
>
>
> On 03/29/2017 07:25 PM, Jie Yu wrote:
>> Thomas,
>>
>> I think you are confusing port mapping for NAT
>> purposes with the port mapping isolator.
>> Those are two very different things. The port mapping isolator
>> (unfortunate naming), as described in the doc, gives you a
>> network namespace per container without requiring an IP per
>> container. No NAT is involved. I think in your case, you
>> should not use it, and it does not work with the DockerContainerizer.
> Thanks,
>
> I'm not sure I understand what you're saying:
>
> - are you talking about the NAT feature of docker in BRIDGE mode?
>
> - regarding the "port mapping isolator giving network
> namespace": what confuses me is that, given the previous
> answers, I thought that in that case, the non-ephemeral port
> range was *shared* (as a resource) between containers, which
> sounds to me like the opposite of the namespace concept (as a
> slightly different example, two Docker containers have their own
> private port 80, for instance).
>
> What am I missing?
>
> Thanks
>
> --
> TH
>
>

-- 
Olivier Sallou
IRISA / University of Rennes 1
Campus de Beaulieu, 35000 RENNES - FRANCE
Tel: 02.99.84.71.95

gpg key id: 4096R/326D8438  (keyring.debian.org)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438



Re: Mesos (and Marathon) port mapping

2017-03-31 Thread Thomas HUMMEL

Thanks for your answer,

I've watched your talk. Very interesting.

Let me check if I get everything straight:

- it is the hostports which are used to multiplex traffic into the
container. My understanding is that, since each container is in its own
network namespace, it has its own full range of container ports and that
you use a direct mapping (hostport n <-> same container port n), is that
correct?


- those ports which are divided into disjoint subsets are the ephemeral
ports. The non-ephemeral ports are in a set shared between all
containers, correct?


- the use case you described is when you cannot afford one IP per container
and when you are using the Mesos containerizer: does it mean that
network port mapping isolation makes no sense with the Docker containerizer
or can it be somehow composed with it?


I didn't quite understand why you cannot use NAT (in the same way Docker
in BRIDGE mode does) and assign as many IP addresses as you want in a
private network...


Thanks.

--

TH.





Re: Mesos (and Marathon) port mapping

2017-03-31 Thread Tomek Janiszewski
I have a question that is related to this topic. In the "docker support and
current limitations" section [1] there is the following statement:
> Only host network is supported. We will add bridge network support soon
using CNI support in Mesos (MESOS-4641)
The mentioned issue is resolved. Does this mean bridge network is working for
the Mesos containerizer?

[1]: https://github.com/apache/mesos/blob/master/docs/container-image.md#docker-support-and-current-limitations

On Fri, 31 Mar 2017 at 02:04, Jie Yu wrote:

> are you talking about the NAT feature of docker in BRIDGE mode?
>
>
> Yes
>
> - regarding the "port mapping isolator giving network namespace": what
> confuses me is that, given the previous answers, I thought that in that
> case, the non-ephemeral port range was *shared* (as a resource) between
> containers, which sounds to me like the opposite of the namespace concept (as
> a slightly different example, two Docker containers have their own private
> port 80, for instance).
>
>
> The port mapping isolator is for the case where IP per container is not
> possible (due to IPAM restrictions, etc.), but the user still wants to have
> a network namespace per container (for isolation, getting statistics, etc.).
>
> Since all containers, even if they are in separate namespaces, share the
> same IP, we have to use some other mechanism to tell which packet belongs
> to which container. We use ports in that case. You can find more details
> about the port mapping isolator in this talk I gave at MesosCon 2015:
> https://www.youtube.com/watch?v=ZA96g1M4v8Y
>
> - Jie
>
> On Thu, Mar 30, 2017 at 2:13 AM, Thomas HUMMEL wrote:
>
>
> On 03/29/2017 07:25 PM, Jie Yu wrote:
>
> Thomas,
>
> I think you are confusing port mapping for NAT purposes with the port
> mapping isolator.
> Those are two very different things. The port mapping isolator (unfortunate
> naming), as described in the doc, gives you a network namespace per container
> without requiring an IP per container. No NAT is involved. I think in your
> case you should not use it, and it does not work with the DockerContainerizer.
>
> Thanks,
>
> I'm not sure I understand what you're saying:
>
> - are you talking about the NAT feature of docker in BRIDGE mode?
>
> - regarding the "port mapping isolator giving network namespace": what
> confuses me is that, given the previous answers, I thought that in that
> case, the non-ephemeral port range was *shared* (as a resource) between
> containers, which sounds to me like the opposite of the namespace concept (as
> a slightly different example, two Docker containers have their own private
> port 80, for instance).
>
> What am I missing?
>
> Thanks
>
> --
> TH
>
>
>


Re: mesos container cluster came across health check coredump log

2017-03-31 Thread Alex Rukletsov
Cool, looking forward to it!

On Fri, Mar 31, 2017 at 4:30 AM, tommy xiao  wrote:

> Alex, yes, let me have a try.
>
> 2017-03-31 3:16 GMT+08:00 Alex Rukletsov :
>
>> This is https://issues.apache.org/jira/browse/MESOS-7210. Deshi, do you
>> want to send the patch? I or Haosdent can shepherd.
>>
>> A.
>>
>> On Thu, Mar 30, 2017 at 12:27 PM, tommy xiao  wrote:
>>
>>> interesting for the specified case.
>>>
>>> 2017-03-30 7:52 GMT+08:00 Jie Yu :
>>>
 + AlexR, haosdent

 For posterity, the root cause of this problem is that when the agent is
 running inside a docker container and the `--docker_mesos_image` flag is
 specified, the pid namespace of the executor container (which initiates the
 health check) is different from the root pid namespace. Therefore, getting
 the network namespace handle using `/proc/<pid>/ns/net` does not work,
 because the 'pid' here is in the root pid namespace (reported by the docker
 daemon).
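
To make that failure mode concrete, a rough sketch (the container names are placeholders; the pid is whatever the docker daemon reports):

```
# Placeholders (assumptions): adjust to your container names.
TASK_CONTAINER=mesos-task          # the nginx task container
AGENT_CONTAINER=mesos-agent        # the containerized agent/executor side

# The docker daemon reports the task's pid as seen in the ROOT pid namespace.
TASK_PID=$(docker inspect --format '{{.State.Pid}}' "$TASK_CONTAINER")

# On the host (root pid namespace) the namespace handle resolves:
readlink "/proc/${TASK_PID}/ns/net"        # e.g. net:[4026532281]

# Inside a container with its own pid namespace, that pid is not visible, so
# opening /proc/<pid>/ns/net fails -- which is what the health checker hits:
docker exec "$AGENT_CONTAINER" readlink "/proc/${TASK_PID}/ns/net"
# readlink: /proc/18596/ns/net: No such file or directory

# Sharing the host pid namespace (running that container with --pid=host)
# makes the pid visible again, which is the workaround reported in this thread.
```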

 Alex and haosdent, I think we should fix this issue. As suggested
 above, we can launch the executor container with --pid=host if
 `--docker_mesos_image` is specified.

 - Jie

 On Wed, Mar 29, 2017 at 3:56 AM, tommy xiao  wrote:

> It was resolved by adding --pid=host. Thanks a lot to the community folks
> for the support.
>
> 2017-03-29 9:52 GMT+08:00 tommy xiao :
>
>> My environment:
>>
>> Mesos 1.2, running containerized in Docker.
>>
>> I launched a sample nginx docker container with a Mesos native health check,
>>
>> and then got a core dump in the sandbox.
>>
>> I have dug into it a bit more; here is more information for your reference:
>>
>> In the mesos slave container, I can only see the task container's pid, but I
>> can't find the nginx process's pid.
>>
>> But in the host console, I can find the nginx pid. So how can I get the
>> pid inside the container?
>>
>>
>>
>>
>> 2017-03-28 13:49 GMT+08:00 tommy xiao :
>>
>>> https://issues.apache.org/jira/browse/MESOS-6184
>>>
>>> Can anyone give a hint?
>>>
>>> ```
>>>
>>> I0328 11:48:12.922181 48 exec.cpp:162] Version: 1.2.0
>>> I0328 11:48:12.929252 54 exec.cpp:237] Executor registered on agent
>>> a29dc3a5-3e3f-4058-8ab4-dd7de2ae58d1-S4
>>> I0328 11:48:12.931640 54 docker.cpp:850] Running docker -H
>>> unix:///var/run/docker.sock run --cpu-shares 10 --memory 33554432
>>> --env-file /tmp/gvqGyb -v /data/mesos/slaves/a29dc3a5-3e
>>> 3f-4058-8ab4-dd7de2ae58d1-S4/frameworks/d7ef5d2b-f924-42d9-a
>>> 274-c020afba6bce-/executors/0-hc-xychu-datamanmesos-2f3b
>>> 47f9ffc048539c7b22baa6c32d8f/runs/458189b8-2ff4-4337-ad3a-67321e96f5cb:/mnt/mesos/sandbox
>>> --net bridge --label=USER_NAME=xychu --label=GROUP_NAME=groupautotest
>>> --label=APP_ID=hc --label=VCLUSTER=clusterautotest
>>> --label=USER=xychu --label=CLUSTER=datamanmesos --label=SLOT=0
>>> --label=APP=hc -p 31000:80/tcp --name mesos-a29dc3a5-3e3f-4058-8ab4-
>>> dd7de2ae58d1-S4.458189b8-2ff4-4337-ad3a-67321e96f5cb nginx
>>> I0328 11:48:16.145714 53 health_checker.cpp:196] Ignoring failure as
>>> health check still in grace period
>>> W0328 11:48:26.289958 49 health_checker.cpp:202] Health check failed
>>> 1 times consecutively: HTTP health check failed: curl returned 
>>> terminated
>>> with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/
>>> include/process/posix/subprocess.hpp:190): Failed to execute
>>> Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: 
>>> Pid
>>> 18596 does not exist
>>>
>>>  *** Aborted at 1490672906 (unix time) try "date -d @1490672906" if you are using GNU date ***
>>>  PC: @ 0x7f26bfb485f7 __GI_raise
>>>  *** SIGABRT (@0x4a) received by PID 74 (TID 0x7f26ba152700) from PID 74; stack trace: ***
>>>  @ 0x7f26c0703100 (unknown)
>>>  @ 0x7f26bfb485f7 __GI_raise
>>>  @ 0x7f26bfb49ce8 __GI_abort
>>>  @ 0x7f26c315778e _Abort()
>>>  @ 0x7f26c31577cc _Abort()
>>>  @ 0x7f26c237a4b6 process::internal::childMain()
>>>  @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>  @ 0x7f26c2379e53 process::internal::defaultClone()
>>>  @ 0x7f26c237b951 process::internal::cloneChild()
>>>  @ 0x7f26c237954f process::subprocess()
>>>  @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>  @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>  @ 0x7f26c2331389 process::ProcessManager::resume()
>>>  @ 0x7f26c233a3f7