[ 
https://issues.apache.org/jira/browse/MESOS-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133466#comment-15133466
 ] 

Till Toenshoff commented on MESOS-4598:
---------------------------------------

It seems we have two related problems, each of which has been solved locally:

1. Any executor forked by an agent should share the agent's {{LIBPROCESS_IP}} 
to prevent reverse DNS failures.
See 
https://github.com/apache/mesos/blame/master/src/slave/containerizer/containerizer.cpp#L265

2. The logger forked by an agent should not inherit the agent's 
{{LIBPROCESS_PORT}}, to prevent bind failures.
See your patch.

Initially, the logger avoided bind errors by being started with all 
LIBPROCESS_ variables unset when exec'ing. The above fix then keeps the IP, 
as per problem #1. 
I believe that #1 and #2 actually hold for ANY libprocess parent and child 
process.
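
To make the two rules concrete, here is a rough sketch in Python (not 
libprocess code; the helper name child_environment is made up purely for 
illustration) of how a parent could build the environment for a forked 
libprocess child: LIBPROCESS_IP is passed through, LIBPROCESS_PORT is dropped 
so the child does not try to bind the parent's port.

```python
def child_environment(parent_env):
    """Return a copy of parent_env for a forked libprocess child.

    Hypothetical helper, not part of Mesos or libprocess:
    keeps LIBPROCESS_IP (problem #1) and drops LIBPROCESS_PORT
    (problem #2). All other variables pass through unchanged.
    """
    env = dict(parent_env)            # never mutate the parent's mapping
    env.pop("LIBPROCESS_PORT", None)  # child must not reuse the parent's port
    return env


if __name__ == "__main__":
    # Example values only; any IP/port would do.
    parent = {
        "LIBPROCESS_IP": "10.0.0.1",
        "LIBPROCESS_PORT": "5051",
        "PATH": "/usr/bin",
    }
    print(child_environment(parent))
```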

While the above review request might be a fine workaround, I am not sure that 
we should regard it as a proper fix.

I believe the problem is a rather central one, not specific to the logger at 
all: any libprocess OS process forked by another libprocess OS process runs 
into these issues. We should consider a central solution.

There seem to be options such as:

- removing LIBPROCESS_PORT from the environment of a libprocess process once it 
has read that value;
this could be inserted at 
https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L872

- inheriting LIBPROCESS_IP by default when forking via libprocess's subprocess
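
As a rough illustration of the first option, here is a Python sketch (not 
libprocess code; the helper name consume_libprocess_port is hypothetical) of 
reading the port once and then removing it from the process environment, so 
that anything forked afterwards no longer inherits it while LIBPROCESS_IP 
stays in place.

```python
import os


def consume_libprocess_port(environ=os.environ):
    """Read LIBPROCESS_PORT once, then remove it from the environment.

    Hypothetical helper, not part of Mesos or libprocess. Returns the
    port as an int, or None if the variable is not set. Other variables,
    including LIBPROCESS_IP, are left untouched.
    """
    value = environ.pop("LIBPROCESS_PORT", None)
    return int(value) if value is not None else None


if __name__ == "__main__":
    # Example values only.
    env = {"LIBPROCESS_IP": "10.0.0.1", "LIBPROCESS_PORT": "5051"}
    port = consume_libprocess_port(env)
    print(port, env)
```

When called on os.environ (the default), removing the key also removes the 
variable from the real process environment, so child processes started later 
do not see it.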

Are my assumptions wrong? Any opinions or suggestions?

> Logrotate ContainerLogger should not remove IP from environment.
> ----------------------------------------------------------------
>
>                 Key: MESOS-4598
>                 URL: https://issues.apache.org/jira/browse/MESOS-4598
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.27.0
>            Reporter: Joseph Wu
>            Assignee: Joseph Wu
>              Labels: mesosphere
>
> The {{LogrotateContainerLogger}} starts libprocess-using subprocesses.  
> Libprocess initialization will attempt to resolve the IP from the hostname.  
> If a DNS service is not available, this step will fail, which terminates the 
> logger subprocess prematurely.
> Since the logger subprocesses live on the agent, they should use the same 
> {{LIBPROCESS_IP}} supplied to the agent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
