[
https://issues.apache.org/jira/browse/MESOS-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133466#comment-15133466
]
Till Toenshoff commented on MESOS-4598:
---------------------------------------
It seems we have two related problems, each of which has been solved locally:
1. Any executor forked by an agent should share the agent's {{LIBPROCESS_IP}} to
prevent reverse DNS failures.
See
https://github.com/apache/mesos/blame/master/src/slave/containerizer/containerizer.cpp#L265
2. The logger forked by an agent should not share the agent's
{{LIBPROCESS_PORT}}, to prevent bind failures.
See your patch.
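To illustrate why problem #2 leads to bind failures, here is a minimal sketch in Python. It is not libprocess code; the function name is hypothetical, and it only shows the general socket behaviour that a second bind to an address and port already in use is rejected, which is what a child hits when it inherits the parent's port.

```python
import socket


def second_bind_fails(host="127.0.0.1"):
    """Return True if binding a second socket to an in-use port is rejected."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Stand-in for the parent: let the OS pick a free port and listen on it.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Stand-in for the child that inherited the same port value.
            second.bind((host, port))
        except OSError:
            return True
        return False
    finally:
        first.close()
        second.close()
```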
So the problem here is that the logger initially avoided bind errors by
starting with the LIBPROCESS_ vars completely unset while exec'ing. The above
fix then keeps the IP, as per problem #1.
Currently, I believe that #1 and #2 are actually true for ANY libprocess parent
and child process.
While the above RR might be a fine workaround, I am not sure that we should
regard it as a proper fix.
The problem here is a rather central one I believe, not specific to the logger
at all.
Any libprocess OS process forked by another libprocess OS process runs into
these issues. We should consider a central solution.
There seem to be options such as:
- removing LIBPROCESS_PORT from the environment of a libprocess process once it
has read that value; this could be inserted at
https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L872
- inheriting LIBPROCESS_IP by default when forking via libprocess's subprocess
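As a rough sketch of how the two options could fit together, here is a small Python model of the environment handling. This is not libprocess code: the function names {{consume_port}} and {{child_environment}} are hypothetical, and plain dictionaries stand in for process environments.

```python
def consume_port(environ):
    """Read LIBPROCESS_PORT once, then drop it so children cannot inherit it.

    `environ` is a mutable mapping standing in for the process environment.
    Returns the port as an int, or None if the variable was not set.
    """
    port = environ.pop("LIBPROCESS_PORT", None)
    return int(port) if port is not None else None


def child_environment(parent, explicit=None):
    """Build the environment for a forked child (hypothetical helper).

    Starts from the caller-supplied environment, or from a copy of the
    parent's if none is given. LIBPROCESS_IP is inherited by default, and
    the parent's own LIBPROCESS_PORT value is never passed on.
    """
    child = dict(parent if explicit is None else explicit)

    # Option 2: inherit the parent's IP unless the caller set one explicitly.
    if "LIBPROCESS_IP" in parent:
        child.setdefault("LIBPROCESS_IP", parent["LIBPROCESS_IP"])

    # Guard for option 1: drop the port if it is just the parent's value.
    if child.get("LIBPROCESS_PORT") == parent.get("LIBPROCESS_PORT"):
        child.pop("LIBPROCESS_PORT", None)

    return child
```

For example, with a parent environment of {{LIBPROCESS_IP=10.0.0.5}} and {{LIBPROCESS_PORT=5051}}, {{child_environment(parent)}} keeps the IP and omits the port, while a caller that passes a different port explicitly keeps that port.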
Are my assumptions wrong? Any opinions or suggestions?
> Logrotate ContainerLogger should not remove IP from environment.
> ----------------------------------------------------------------
>
> Key: MESOS-4598
> URL: https://issues.apache.org/jira/browse/MESOS-4598
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.27.0
> Reporter: Joseph Wu
> Assignee: Joseph Wu
> Labels: mesosphere
>
> The {{LogrotateContainerLogger}} starts libprocess-using subprocesses.
> Libprocess initialization will attempt to resolve the IP from the hostname.
> If a DNS service is not available, this step will fail, which terminates the
> logger subprocess prematurely.
> Since the logger subprocesses live on the agent, they should use the same
> {{LIBPROCESS_IP}} supplied to the agent.