>Framework should be the one that sets
>LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT appropriately if it
>tries to launch another Mesos framework so that Master can reach the new
>framework.

As a framework/executor author, I cannot do this in all scenarios: there is
no way to discover IP addresses assigned via CNI before the first StatusUpdate
has been received. It is therefore not possible to set LIBPROCESS_ADVERTISE_IP
appropriately at launch time.

Please see https://issues.apache.org/jira/browse/MESOS-6281 for details.
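
To make the cases 2.a) to 2.c) from the quoted thread concrete, here is a rough
sketch in Python of how a framework might pick these variables per network
mode. The function name `libprocess_env`, the `NetworkMode` enum and all
parameters are made up for illustration and are not part of any Mesos API.
Setting LIBPROCESS_IP to 0.0.0.0 follows the proposal in the thread, not
necessarily current behaviour. The routable-IP branch shows the problem
described above: the container IP is simply not available at launch time.

```python
from enum import Enum
from typing import Dict, Optional


class NetworkMode(Enum):
    HOST = "host"          # case 2.a: container shares the agent's network
    ROUTABLE = "routable"  # case 2.b: container has its own routable IP (e.g. CNI)
    BRIDGE = "bridge"      # case 2.c: private IP behind a host port mapping


def libprocess_env(
    mode: NetworkMode,
    agent_ip: str,
    container_port: int,
    container_ip: Optional[str] = None,
    mapped_host_port: Optional[int] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build the libprocess env vars for a task."""
    # Bind to all interfaces in every case, as proposed in the thread.
    env = {"LIBPROCESS_IP": "0.0.0.0", "LIBPROCESS_PORT": str(container_port)}

    if mode is NetworkMode.HOST:
        advertise_ip, advertise_port = agent_ip, container_port
    elif mode is NetworkMode.ROUTABLE:
        if container_ip is None:
            # With CNI the IP is only reported in the first status update,
            # so a framework cannot fill this in when it launches the task.
            raise ValueError("container IP is not known at launch time")
        advertise_ip, advertise_port = container_ip, container_port
    else:  # NetworkMode.BRIDGE
        if mapped_host_port is None:
            raise ValueError("bridge mode needs the mapped host port")
        advertise_ip, advertise_port = agent_ip, mapped_host_port

    env["LIBPROCESS_ADVERTISE_IP"] = advertise_ip
    env["LIBPROCESS_ADVERTISE_PORT"] = str(advertise_port)
    return env
```

For example, `libprocess_env(NetworkMode.BRIDGE, "10.0.0.5", 5051,
mapped_host_port=31000)` advertises the agent IP with the mapped host port,
while the `ROUTABLE` mode without a `container_ip` raises, which is the gap
tracked in MESOS-6281.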


On 12/10/16 06:42, "Avinash Sridharan" <avin...@mesosphere.io> wrote:

    Valid point. Makes sense to drive this decision from the user and the
    framework.
    
    On Tue, Oct 11, 2016 at 9:32 PM, Jie Yu <yujie....@gmail.com> wrote:
    
    > >
    > > While I believe this particular logic of setting LIBPROCESS_ADVERTISE_IP
    > > to agent IP can be done in the agent (it could look at the port mapping
    > > as well)
    >
    >
    > What if there are multiple port mappings? How can the agent decide which
    > port to use as LIBPROCESS_ADVERTISE_PORT?
    >
    > On Tue, Oct 11, 2016 at 9:27 PM, Avinash Sridharan <avin...@mesosphere.io>
    > wrote:
    >
    > > Definitely a +1 for executor binding to 0.0.0.0, instead of doing a
    > > `gethostname` and `getaddrinfo`. But I am assuming this semantics would
    > > kick in only if LIBPROCESS_IP is not set, which should be the norm.
    > >
    > > +1 for LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT and the
    > > onus being on the frameworks to set these variables. I guess the
    > > framework can set the LIBPROCESS_ADVERTISE_IP to the agent IP and
    > > LIBPROCESS_ADVERTISE_PORT to the host port when it specifies a
    > > port-mapping. While I believe this particular logic of
    > > setting LIBPROCESS_ADVERTISE_IP to agent IP can be done in the agent (it
    > > could look at the port mapping as well), when to actually set these
    > > variables (whether the executors even need to advertise their IP
    > > addresses) is a decision that the frameworks should be privy to and not
    > > left to the agent.
    > >
    > > On Tue, Oct 11, 2016 at 7:31 PM, haosdent <haosd...@gmail.com> wrote:
    > >
    > > > > libprocess should always bind to 0.0.0.0
    > > > + 1 for this
    > > >
    > > > On Wed, Oct 12, 2016 at 2:33 AM, Jie Yu <yujie....@gmail.com> wrote:
    > > >
    > > > > Hi folks,
    > > > >
    > > > > I was in the process of cleaning up some tech debt related to env
    > > > > variables in our code base. I created an epic ticket
    > > > > <https://issues.apache.org/jira/browse/MESOS-6341> to track this. I
    > > > > searched relevant tickets filed previously, and found MESOS-3740
    > > > > <https://issues.apache.org/jira/browse/MESOS-3740>. I did some
    > > > > digging on how we handle LIBPROCESS_IP currently, and here are my
    > > > > findings:
    > > > >
    > > > > 1) We always set LIBPROCESS_IP in the executor environment variables:
    > > > > https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L6793-L6796
    > > > >
    > > > > This is not an issue for an executor that runs on host network.
    > > > > However, if the executor wants to run on non-host network (e.g.,
    > > > > overlay), this might be problematic, because libprocess for the
    > > > > executor will try to bind to LIBPROCESS_IP, but the IP is not valid
    > > > > inside the container.
    > > > >
    > > > > 2) As mentioned in MESOS-3740
    > > > > <https://issues.apache.org/jira/browse/MESOS-3740>, some users want
    > > > > to run a Mesos framework in a Mesos container. The old style
    > > > > framework driver assumes a 2 way communication channel between the
    > > > > framework and the Mesos master. In order for the master to reach the
    > > > > framework running inside a Mesos container, the framework's
    > > > > libprocess should advertise its ip and port properly. This problem
    > > > > gets tricky because of the networking for the Mesos container:
    > > > >
    > > > > 2.a) If the container uses host network, libprocess should bind to
    > > > > 0.0.0.0, and advertise itself using the agent ip and the relevant port
    > > > > 2.b) If the container has a routable ip (e.g., using calico or
    > > > > overlay), libprocess should still bind to 0.0.0.0, and advertise
    > > > > itself using the container ip and the relevant port. Currently, it
    > > > > binds to agent ip (which will fail), and advertises itself using agent
    > > > > ip and the port in the container (which will fail as well)
    > > > > 2.c) If the container has a private ip (e.g., bridge), libprocess
    > > > > should still bind to 0.0.0.0, and advertise itself using the agent ip
    > > > > and _mapped_ host port. Currently, it binds to agent ip (which will
    > > > > fail), and advertises itself using agent ip and the port in the
    > > > > container (which will fail as well)
    > > > >
    > > > > Therefore, the workaround
    > > > > <https://github.com/mesosphere/mesos/commit/b9c622b53b3ffcc27911fcdcefc37a52ebe33bdd>
    > > > > suggested in MESOS-3740
    > > > > <https://issues.apache.org/jira/browse/MESOS-3740> is not ideal. It
    > > > > does not consider 2.b) and 2.c).
    > > > >
    > > > > Libprocess now supports both LIBPROCESS_IP and
    > > > > LIBPROCESS_ADVERTISE_IP so the bind address does not have to be the
    > > > > address that is being advertised.
    > > > >
    > > > > For the 2.c) case, Mesos doesn't have a way to determine the advertise
    > > > > port (mapped port). This information is only known to the framework
    > > > > (which host port it'll use to serve as the mapped port for the
    > > > > libprocess).
    > > > >
    > > > > Given that, I think Mesos should not blindly set LIBPROCESS_IP to
    > > > > agent IP in executor environment variables. Framework should be the
    > > > > one that sets LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT
    > > > > appropriately if it tries to launch another Mesos framework so that
    > > > > Master can reach the new framework. If the framework just wants to
    > > > > launch a regular container that does not depend on libprocess, it
    > > > > should simply not set these env variables.
    > > > >
    > > > > Also, I think libprocess should always bind to 0.0.0.0, rather than
    > > > > doing a hostname lookup and binding to the IP found for the hostname.
    > > > > LIBPROCESS_ADVERTISE_IP can be used to override the ip address it
    > > > > wants to advertise to peers. If that's not specified, it'll try to do
    > > > > a hostname lookup to guess a routable ip.
    > > > >
    > > > > Thoughts?
    > > > > - Jie
    > > > >
    > > >
    > > >
    > > >
    > > > --
    > > > Best Regards,
    > > > Haosdent Huang
    > > >
    > >
    > >
    > >
    > > --
    > > Avinash Sridharan, Mesosphere
    > > +1 (323) 702 5245
    > >
    >
    
    
    
    -- 
    Avinash Sridharan, Mesosphere
    +1 (323) 702 5245
    

