[ 
https://issues.apache.org/jira/browse/SLIDER-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373371#comment-16373371
 ] 

Lev Bronshtein edited comment on SLIDER-1259 at 2/22/18 8:20 PM:
-----------------------------------------------------------------

{panel}
For yarn.nodemanager.bind-host to work, it's got to be configured on each host 
to explicitly name the host. Is this how things get deployed?
{panel}
Yes, at least this is what we use in [https://github.com/bloomberg/chef-bach] 
to bind to the f-host interface rather than the host interface; in fact, all 
three properties are set. Perhaps the process can use this or another override 
setting and, if none is set, fall back to the canonical host name.
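
A rough sketch of that "override first, canonical name otherwise" idea follows. It is not Slider code: the function names, the order in which the properties are tried, and the rule that wildcard or unexpanded values are skipped are all my assumptions.

```python
import socket

# Assumed lookup order; nothing in this thread says which property should win.
OVERRIDE_KEYS = (
    "yarn.nodemanager.hostname",
    "yarn.nodemanager.bind-host",
    "yarn.nodemanager.address",
)


def host_part(value):
    """Strip whitespace and an optional trailing :port (IPv6 literals not handled)."""
    value = value.strip()
    return value.rsplit(":", 1)[0] if ":" in value else value


def is_usable(host):
    """Assumption: empty, wildcard, and unexpanded ${...} values do not name a host."""
    return bool(host) and host != "0.0.0.0" and "${" not in host


def resolve_public_hostname(config, fallback=socket.getfqdn):
    """Return the first usable override from config, else the canonical host name."""
    for key in OVERRIDE_KEYS:
        host = host_part(config.get(key, ""))
        if is_usable(host):
            return host
    return fallback()
```

With `{"yarn.nodemanager.bind-host": "f-bcpc-vm3.bcpc.example.com"}` this returns the f-prefixed name; with an empty dict it falls back to whatever `socket.getfqdn()` reports.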


was (Author: lbronshtein):
{quote}bq. For yarn.nodemanager.bind-host to work, it's got to be configured on 
each host to explicitly name the host. Is this how things get deployed?
{quote}

> Slider does not work in multi homed environments
> ------------------------------------------------
>
>                 Key: SLIDER-1259
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1259
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>    Affects Versions: Slider 0.92
>            Reporter: Lev Bronshtein
>            Assignee: Steve Loughran
>            Priority: Minor
>
> In an environment where Hadoop worker nodes bind the Node Manager to an 
> interface whose hostname differs from the one returned by socket.getfqdn(), 
> we are unable to introspect running jobs. In our test environment, for 
> example, the names are f-bcpc-vm3 and plain bcpc-vm3; the latter is the 
> hostname bound to the management interface, not the interface for 
> hadoop/production traffic.
>  
> For example, running *slider registry --name slider_poc --listexp* results in 
> the following output in the ResourceManager logs:
> {quote}2018-01-26 17:30:32,147 INFO 
> org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: ubuntu is 
> accessing unchecked 
> [http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports] which 
> is the app master GUI of application_1516910361403_0094 owned by ubuntu 
>  2018-01-26 17:31:13,639 WARN org.mortbay.log: 
> /proxy/application_1516910361403_0094/ws/v1/slider/publisher/exports: 
> java.net.ConnectException: Connection timed out (Connection timed out) 
> {quote}
>  
> Note how the redirect is to 
> [http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports], 
> whereas it should have been to 
> [http://f-bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports]. 
> Renaming the host to f-bcpc-vm3 results in the appropriate behavior.
>  
> Perhaps *hostname.py* can be instructed to look at one of the following 
> properties before registering:
>  *yarn.nodemanager.address*
>  *yarn.nodemanager.bind-host*
>  *yarn.nodemanager.hostname*
>  
> The hostname is used when the registration message is built in Register.py:
> register = {'responseId': int(id),
>   'timestamp': timestamp,
>   'label': self.config.getLabel(),
>   *'publicHostname': hostname.public_hostname(),*
>   'agentVersion': version,
>   'actualState': actualState,
>   'expectedState': expectedState,
>   'allocatedPorts': allocated_ports,
>   'logFolders': log_folders,
>   'tags': tags
>  }
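
To make the suggestion above concrete, here is a minimal sketch of how the three properties could be read from a Hadoop-style configuration file such as yarn-site.xml. It is an illustration only: the function name is hypothetical, it does not expand `${...}` references, and whether the agent can see the NodeManager's configuration at all is an assumption.

```python
import xml.etree.ElementTree as ET

# The three properties named in the issue description.
WANTED = (
    "yarn.nodemanager.address",
    "yarn.nodemanager.bind-host",
    "yarn.nodemanager.hostname",
)


def read_nodemanager_properties(xml_text):
    """Return the wanted properties found in a <configuration> document as a dict."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in WANTED and value:
            found[name] = value
    return found


# Hypothetical file contents, using the host name from this report.
SAMPLE = """
<configuration>
  <property>
    <name>yarn.nodemanager.bind-host</name>
    <value>f-bcpc-vm3.bcpc.example.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>f-bcpc-vm1.bcpc.example.com</value>
  </property>
</configuration>
"""

props = read_nodemanager_properties(SAMPLE)
```

Here `props` holds only the bind-host entry; the unrelated ResourceManager property is ignored.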



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
