There is an upstream discussion about this and about ensuring that names
for pods can be resolved.  In the short term the pod hostname cannot be
resolved through DNS.  Can the worker run without being able to resolve
the hostname?
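
If it cannot, one workaround to experiment with (an untested sketch, not
something I have verified on these images) is to have each pod map its own
IP to its service name at startup, so the daemons bind and register under a
name that every pod can already resolve:

# Hypothetical entrypoint fragment; SERVICE_NAME is a variable you would
# inject through the deployment config's env section.
echo "$(hostname -i) ${SERVICE_NAME}" >> /etc/hosts

Whether Hadoop then advertises that name instead of the pod hostname depends
on how the images template yarn-site.xml and hdfs-site.xml.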

On Sun, Mar 6, 2016 at 8:11 PM, HIGUCHI Daisuke
<[email protected]> wrote:
> Hello,
>
> I built a Spark on YARN cluster on the OpenShift Origin All-In-One VM.
>
> * All-In-One Virtual Machine (Version 1.1.3.1)
> * origin version
>   - origin v1.1.2
>   - kubernetes v1.2.0-alpha.4-851-g4a65fa1
>   - etcd 2.2.2
> * Spark on YARN version
>   - Spark 1.6.0
>   - Hadoop 2.6.0 (CDH 5.6.0)
>   - Oracle Java 1.8.0_74
>
> There are one HDFS/YARN master and one HDFS/YARN worker, each running in its own pod.
>
> [vagrant@origin ~]$ oc get pods
> NAME                        READY     STATUS    RESTARTS   AGE
> docker-registry-1-z3skh     1/1       Running   0          20d
> router-1-3jatj              1/1       Running   0          20d
> spark-yarn-master-1-sxegt   1/1       Running   0          31m
> spark-yarn-worker-1-pshqi   1/1       Running   0          31m
> [vagrant@origin ~]$
>
> [vagrant@origin ~]$ oc get svc
> NAME                CLUSTER_IP       EXTERNAL_IP   PORT(S)                                                                                          SELECTOR                             AGE
> docker-registry     172.30.93.207    <none>        5000/TCP                                                                                         docker-registry=default              20d
> kubernetes          172.30.0.1       <none>        443/TCP,53/UDP,53/TCP                                                                            <none>                               20d
> router              172.30.241.166   <none>        80/TCP                                                                                           router=router                        20d
> spark-yarn-master   172.30.242.57    <none>        8020/TCP,50070/TCP,50090/TCP,8030/TCP,8031/TCP,8032/TCP,8033/TCP,8088/TCP,10020/TCP,19888/TCP   deploymentconfig=spark-yarn-master   35m
> spark-yarn-worker   172.30.1.53      <none>        50010/TCP,50020/TCP,50075/TCP,8040/TCP,8042/TCP                                                  deploymentconfig=spark-yarn-worker   35m
> [vagrant@origin ~]$
>
> The spark-yarn-master container has the following hostnames and IP addresses:
>
> hostname: spark-yarn-master-1-sxegt (pod name)
> IP addr.: 172.17.0.11
>
> hostname: spark-yarn-master (service name)
> IP addr.: 172.30.242.57
>
> bash-4.2$ cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1       localhost
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> fe00::0 ip6-mcastprefix
> fe00::1 ip6-allnodes
> fe00::2 ip6-allrouters
> 172.17.0.11     spark-yarn-master-1-sxegt
> bash-4.2$
>
> bash-4.2$ ip -4 addr show dev eth0
> 50: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP  link-netnsid 0
>     inet 172.17.0.11/16 scope global eth0
>        valid_lft forever preferred_lft forever
> bash-4.2$
>
> bash-4.2$ hostname -f
> spark-yarn-master-1-sxegt
> bash-4.2$
>
> bash-4.2$ curl -v spark-yarn-master:8020
> * About to connect() to spark-yarn-master port 8020 (#0)
> *   Trying 172.30.242.57...
> * Connected to spark-yarn-master (172.30.242.57) port 8020 (#0)
>> GET / HTTP/1.1
>> User-Agent: curl/7.29.0
>> Host: spark-yarn-master:8020
>> Accept: */*
>>
> < HTTP/1.1 404 Not Found
> < Content-type: text/plain
> * no chunk, no close, no size. Assume close to signal end
> <
> It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
> * Closing connection 0
> bash-4.2$
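
A quick way to reproduce this split without Hadoop in the picture is getent
(it comes with glibc, so it should be in the base image); the service name
goes through the cluster DNS, while the pod name is only satisfied by the
local /etc/hosts entry and so fails from any other pod:

bash-4.2$ getent hosts spark-yarn-master            # answered by the cluster DNS
bash-4.2$ getent hosts spark-yarn-master-1-sxegt    # answered only by the local /etc/hosts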
>
> The spark-yarn-worker container has the following hostnames and IP addresses:
>
> hostname: spark-yarn-worker-1-pshqi (pod name)
> IP addr.: 172.17.0.12
>
> hostname: spark-yarn-worker (service name)
> IP addr.: 172.30.1.53
>
> bash-4.2$ cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1       localhost
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> fe00::0 ip6-mcastprefix
> fe00::1 ip6-allnodes
> fe00::2 ip6-allrouters
> 172.17.0.12     spark-yarn-worker-1-pshqi
> bash-4.2$
>
> bash-4.2$ ip -4 addr show dev eth0
> 52: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP  link-netnsid 0
>     inet 172.17.0.12/16 scope global eth0
>        valid_lft forever preferred_lft forever
> bash-4.2$
>
> bash-4.2$ hostname -f
> spark-yarn-worker-1-pshqi
> bash-4.2$
>
> bash-4.2$ curl -v spark-yarn-worker:8040
> * About to connect() to spark-yarn-worker port 8040 (#0)
> *   Trying 172.30.1.53...
> * Connected to spark-yarn-worker (172.30.1.53) port 8040 (#0)
>> GET / HTTP/1.1
>> User-Agent: curl/7.29.0
>> Host: spark-yarn-worker:8040
>> Accept: */*
>>
> < HTTP/1.1 404 Not Found
> < Content-type: text/plain
> * no chunk, no close, no size. Assume close to signal end
> <
> It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
> * Closing connection 0
> bash-4.2$
>
> The Spark YARN master and worker nodes can connect to each other by service name.
>
> On the master, to the worker (service name):
>
> bash-4.2$ hostname -f ; curl spark-yarn-worker:8040
> spark-yarn-master-1-sxegt
> It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
> bash-4.2$
>
> On the worker, to the master (service name):
>
> bash-4.2$ hostname -f ; curl spark-yarn-master:8020
> spark-yarn-worker-1-pshqi
> It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
> bash-4.2$
>
> But they cannot connect to each other by hostname (pod name).
>
> On the worker, to the master (hostname):
>
> bash-4.2$ hostname -f ; curl spark-yarn-master-1-sxegt:8020
> spark-yarn-worker-1-pshqi
> curl: (6) Could not resolve host: spark-yarn-master-1-sxegt; Name or service not known
> bash-4.2$
>
> On the master, to the worker (hostname):
>
> bash-4.2$ hostname -f ; curl spark-yarn-worker-1-pshqi:8040
> spark-yarn-master-1-sxegt
> curl: (6) Could not resolve host: spark-yarn-worker-1-pshqi; Name or service not known
> bash-4.2$
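
This can be pinned on the cluster DNS directly by querying the kubernetes
service IP from the table above (a sketch, assuming nslookup is available
in the image); I would expect the service name to return an A record and
the pod name to return NXDOMAIN:

bash-4.2$ nslookup spark-yarn-worker 172.30.0.1           # should return the cluster IP
bash-4.2$ nslookup spark-yarn-worker-1-pshqi 172.30.0.1   # should fail: NXDOMAIN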
>
> This problem could be solved in either of the following ways:
>
> * Make the hostname (pod name) resolvable through the OpenShift internal
> DNS, the same way the service name is.
> * Set the service name (or an arbitrary name) in /etc/hosts (or as the
> hostname) through some configuration.
>
> But I do not know how to do either. Could you please let me know?
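
For the first option, something along these lines may be worth trying (an
untested sketch; the name spark-yarn-worker-pod is hypothetical). A headless
service, one with clusterIP set to None, resolves straight to the IPs of the
pods behind its selector, so with one pod per deployment it effectively
gives that pod a stable, resolvable name:

oc create -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: spark-yarn-worker-pod     # hypothetical alias for the single worker pod
spec:
  clusterIP: None                 # headless: DNS returns the pod IP directly
  selector:
    deploymentconfig: spark-yarn-worker
  ports:
  - name: datanode
    port: 50010
EOF

For the second option, appending to /etc/hosts from the entrypoint (as in
the fragment near the top of this mail) is probably the simplest route,
assuming the container user is allowed to write to it.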
>
> Regards,
>         dai
> --
> HIGUCHI Daisuke <[email protected]>
>

_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
