Sounds great!

>libprocess should always bind to 0.0.0.0

Does your proposal include this?
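
To be concrete about what I mean, here is a rough sketch of the bind/advertise behaviour as I read it from the original mail. It is plain Python for illustration, not libprocess code: `resolve_addresses` and the injectable `lookup` are made-up names, and honouring LIBPROCESS_IP when it is set follows Avinash's assumption in the thread rather than anything confirmed.

```python
import socket
from typing import Callable, Mapping, Tuple


def _hostname_lookup() -> str:
    # Guess a routable IP by resolving the local hostname.
    return socket.gethostbyname(socket.gethostname())


def resolve_addresses(
    env: Mapping[str, str],
    lookup: Callable[[], str] = _hostname_lookup,
) -> Tuple[str, str]:
    """Return (bind_ip, advertise_ip) for the semantics discussed here."""
    # Assumption: an explicit LIBPROCESS_IP still wins for binding;
    # otherwise always bind to the wildcard address.
    bind_ip = env.get("LIBPROCESS_IP", "0.0.0.0")

    advertise_ip = env.get("LIBPROCESS_ADVERTISE_IP")
    if advertise_ip is None:
        # The wildcard address cannot be advertised, so fall back to a
        # hostname lookup only when we are bound to 0.0.0.0.
        advertise_ip = lookup() if bind_ip == "0.0.0.0" else bind_ip
    return bind_ip, advertise_ip
```

With an empty environment this binds to 0.0.0.0 and advertises whatever the hostname lookup returns; with LIBPROCESS_ADVERTISE_IP set, the lookup is skipped entirely.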

On Mon, Oct 17, 2016 at 2:12 AM, Jie Yu <yujie....@gmail.com> wrote:

> OK, guys. Thanks for the input! Here is my proposal:
>
> 1) If the container uses host network, Mesos agent will set
> LIBPROCESS_ADVERTISE_IP to the agent IP. This is for the case where DNS is
> not configured properly on the host (we don't need to do that if DNS is
> configured properly). By doing this, libprocess will skip the hostname
> lookup and advertise LIBPROCESS_ADVERTISE_IP directly.
>
> 2) If the container uses non-host network and defines port mapping (e.g.,
> bridge), Mesos agent will not set any libprocess env variables. Given that
> there could be multiple mapped ports, Mesos agent doesn't know how to set
> LIBPROCESS_ADVERTISE_PORT. So it's the framework's responsibility to set
> LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT properly in this case
> (through CommandInfo.environment).
>
> 3) If the container uses non-host network and does not define port mapping
> (e.g., ip per container), Mesos agent will not set any libprocess env
> variables. In this case, both CNI isolator and docker engine will properly
> set up DNS in the container, so hostname lookup should work properly.
>
> - Jie
>
> On Sat, Oct 15, 2016 at 4:01 PM, tommy xiao <xia...@gmail.com> wrote:
>
> > good point, +1
> >
> > 2016-10-13 0:27 GMT+08:00 Jie Yu <yujie....@gmail.com>:
> >
> > > Stephan,
> > >
> > > I think the only time the framework needs to set LIBPROCESS_ADVERTISE_IP
> > > is when DNAT is necessary for the container (e.g., bridge). In that
> > > case, LIBPROCESS_ADVERTISE_IP should always be the agent ip and
> > > LIBPROCESS_ADVERTISE_PORT the relevant host port allocated for the
> > > container. For other cases, the framework should not do anything.
> > >
> > > - Jie
> > >
> > > On Wed, Oct 12, 2016 at 4:43 AM, Erb, Stephan <stephan....@blue-yonder.com> wrote:
> > >
> > > > >Framework should be the one that sets
> > > > >LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT appropriately if it
> > > > >tries to launch another Mesos framework so that Master can reach the new
> > > > >framework.
> > > >
> > > > As a framework/executor author this is not possible in all scenarios:
> > > > There is no way to discover IP addresses assigned via CNI before the first
> > > > StatusUpdate has been received. It is therefore not possible to set
> > > > LIBPROCESS_ADVERTISE_IP appropriately at launch time.
> > > >
> > > > Please see https://issues.apache.org/jira/browse/MESOS-6281 for details.
> > > >
> > > >
> > > > On 12/10/16 06:42, "Avinash Sridharan" <avin...@mesosphere.io> wrote:
> > > >
> > > >     Valid point. Makes sense to drive this decision from the user and the
> > > >     framework.
> > > >
> > > >     On Tue, Oct 11, 2016 at 9:32 PM, Jie Yu <yujie....@gmail.com> wrote:
> > > >
> > > >     > >
> > > >     > > While I believe this particular logic of setting LIBPROCESS_ADVERTISE_IP
> > > >     > > to agent IP can be done in the agent (it could look at the port mapping
> > > >     > > as well)
> > > >     >
> > > >     >
> > > >     > What if there are multiple port mappings? How can the agent decide which
> > > >     > port to use as LIBPROCESS_ADVERTISE_PORT?
> > > >     >
> > > >     > On Tue, Oct 11, 2016 at 9:27 PM, Avinash Sridharan <avin...@mesosphere.io> wrote:
> > > >     >
> > > >     > > Definitely a +1 for executor binding to 0.0.0.0, instead of doing a
> > > >     > > `gethostname` and `getaddrinfo`. But I am assuming these semantics would
> > > >     > > kick in only if LIBPROCESS_IP is not set, which should be the norm.
> > > >     > >
> > > >     > > +1 for LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT and the onus
> > > >     > > being on the frameworks to set these variables. I guess the framework can
> > > >     > > set LIBPROCESS_ADVERTISE_IP to the agent IP and
> > > >     > > LIBPROCESS_ADVERTISE_PORT to the host port when it specifies a
> > > >     > > port-mapping. While I believe this particular logic of
> > > >     > > setting LIBPROCESS_ADVERTISE_IP to agent IP can be done in the agent (it
> > > >     > > could look at the port mapping as well), when to actually set these
> > > >     > > variables (whether the executors even need to advertise their IP
> > > >     > > addresses) is a decision that the frameworks should be privy to and not
> > > >     > > left to the agent.
> > > >     > >
> > > >     > > On Tue, Oct 11, 2016 at 7:31 PM, haosdent <haosd...@gmail.com> wrote:
> > > >     > >
> > > >     > > > > libprocess should always bind to 0.0.0.0
> > > >     > > > + 1 for this
> > > >     > > >
> > > >     > > > On Wed, Oct 12, 2016 at 2:33 AM, Jie Yu <yujie....@gmail.com> wrote:
> > > >     > > >
> > > >     > > > > Hi folks,
> > > >     > > > >
> > > >     > > > > I was in the process of cleaning up some tech debt related to env
> > > >     > > > > variables in our code base. I created an epic ticket
> > > >     > > > > <https://issues.apache.org/jira/browse/MESOS-6341> to track it. I searched
> > > >     > > > > relevant tickets filed previously, and found MESOS-3740
> > > >     > > > > <https://issues.apache.org/jira/browse/MESOS-3740>. I did some digging on
> > > >     > > > > how we handle LIBPROCESS_IP currently, and here are my findings:
> > > >     > > > >
> > > >     > > > > 1) We always set LIBPROCESS_IP in the executor environment variables:
> > > >     > > > > https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L6793-L6796
> > > >     > > > >
> > > >     > > > > This is not an issue for an executor that runs on host network. However,
> > > >     > > > > if the executor wants to run on non-host network (e.g., overlay), this
> > > >     > > > > might be problematic, because libprocess for the executor will try to
> > > >     > > > > bind to LIBPROCESS_IP, but the IP is not valid inside the container.
> > > >     > > > >
> > > >     > > > > 2) As mentioned in MESOS-3740
> > > >     > > > > <https://issues.apache.org/jira/browse/MESOS-3740>, some users want to run
> > > >     > > > > a Mesos framework in a Mesos container. The old style framework driver
> > > >     > > > > assumes a 2 way communication channel between the framework and the Mesos
> > > >     > > > > master. In order for the master to reach the framework running inside a
> > > >     > > > > Mesos container, the framework's libprocess should advertise its ip and
> > > >     > > > > port properly. This gets tricky because the right values depend on the
> > > >     > > > > networking of the Mesos container:
> > > >     > > > >
> > > >     > > > > 2.a) If the container uses host network, libprocess should bind to
> > > >     > > > > 0.0.0.0, and advertise itself using the agent ip and the relevant port.
> > > >     > > > > 2.b) If the container has a routable ip (e.g., using calico or overlay),
> > > >     > > > > libprocess should still bind to 0.0.0.0, and advertise itself using the
> > > >     > > > > container ip and the relevant port. Currently, it binds to agent ip
> > > >     > > > > (which will fail), and advertises itself using agent ip and the port in
> > > >     > > > > the container (which will fail as well).
> > > >     > > > > 2.c) If the container has a private ip (e.g., bridge), libprocess should
> > > >     > > > > still bind to 0.0.0.0, and advertise itself using the agent ip and
> > > >     > > > > _mapped_ host port. Currently, it binds to agent ip (which will fail),
> > > >     > > > > and advertises itself using agent ip and the port in the container
> > > >     > > > > (which will fail as well).
> > > >     > > > >
> > > >     > > > > Therefore, the workaround
> > > >     > > > > <https://github.com/mesosphere/mesos/commit/b9c622b53b3ffcc27911fcdcefc37a52ebe33bdd>
> > > >     > > > > suggested in MESOS-3740 <https://issues.apache.org/jira/browse/MESOS-3740>
> > > >     > > > > is not ideal. It does not consider 2.b) and 2.c).
> > > >     > > > >
> > > >     > > > > Libprocess now supports both LIBPROCESS_IP and LIBPROCESS_ADVERTISE_IP, so
> > > >     > > > > the bind address does not have to be the address that is being advertised.
> > > >     > > > >
> > > >     > > > > For the 2.c) case, Mesos doesn't have a way to determine the advertise
> > > >     > > > > port (mapped port). This information is only known to the framework
> > > >     > > > > (which host port it'll use to serve as the mapped port for the
> > > >     > > > > libprocess).
> > > >     > > > >
> > > >     > > > > Given that, I think Mesos should not blindly set LIBPROCESS_IP to agent IP
> > > >     > > > > in executor environment variables. Framework should be the one that sets
> > > >     > > > > LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT appropriately if it
> > > >     > > > > tries to launch another Mesos framework so that Master can reach the new
> > > >     > > > > framework. If the framework just wants to launch a regular container that
> > > >     > > > > does not depend on libprocess, it should simply not set these env
> > > >     > > > > variables.
> > > >     > > > >
> > > >     > > > > Also, I think libprocess should always bind to 0.0.0.0, rather than doing
> > > >     > > > > a hostname lookup and binding to the IP found for the hostname.
> > > >     > > > > LIBPROCESS_ADVERTISE_IP can be used to override the ip address it wants
> > > >     > > > > to advertise to peers. If that's not specified, it'll try to do a
> > > >     > > > > hostname lookup to guess a routable ip.
> > > >     > > > >
> > > >     > > > > Thoughts?
> > > >     > > > > - Jie
> > > >     > > > >
> > > >     > > >
> > > >     > > >
> > > >     > > >
> > > >     > > > --
> > > >     > > > Best Regards,
> > > >     > > > Haosdent Huang
> > > >     > > >
> > > >     > >
> > > >     > >
> > > >     > >
> > > >     > > --
> > > >     > > Avinash Sridharan, Mesosphere
> > > >     > > +1 (323) 702 5245
> > > >     > >
> > > >     >
> > > >
> > > >
> > > >
> > > >     --
> > > >     Avinash Sridharan, Mesosphere
> > > >     +1 (323) 702 5245
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Deshi Xiao
> > Twitter: xds2000
> > E-mail: xiaods(AT)gmail.com
> >
>
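
For reference, the three cases in the proposal above could be sketched roughly like this. It is plain Python for illustration only: `Network`, `agent_libprocess_env` and `framework_libprocess_env` are hypothetical names, not Mesos APIs, and in the bridge case the framework would pass the resulting variables through CommandInfo.environment.

```python
from enum import Enum
from typing import Dict


class Network(Enum):
    HOST = "host"                          # case 1: host network
    PORT_MAPPED = "port-mapped"            # case 2: e.g. bridge
    IP_PER_CONTAINER = "ip-per-container"  # case 3: e.g. CNI


def agent_libprocess_env(network: Network, agent_ip: str) -> Dict[str, str]:
    """Env variables the agent would set under the proposal (hypothetical)."""
    if network is Network.HOST:
        # Lets libprocess skip the hostname lookup when DNS is misconfigured.
        return {"LIBPROCESS_ADVERTISE_IP": agent_ip}
    # Cases 2 and 3: the agent sets nothing. With port mapping it cannot know
    # which mapped port to advertise; with ip per container DNS is expected
    # to work inside the container.
    return {}


def framework_libprocess_env(agent_ip: str, mapped_host_port: int) -> Dict[str, str]:
    """Env variables a framework would set itself in case 2 (hypothetical)."""
    return {
        "LIBPROCESS_ADVERTISE_IP": agent_ip,
        "LIBPROCESS_ADVERTISE_PORT": str(mapped_host_port),
    }
```

Only the host-network case gets anything from the agent; the other two return an empty dict, which is the point of the proposal.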



-- 
Best Regards,
Haosdent Huang
