Hello,

I built a Spark on YARN cluster on the OpenShift Origin All-In-One VM.

* All-In-One Virtual Machine (Version 1.1.3.1)
* origin version
  - origin v1.1.2
  - kubernetes v1.2.0-alpha.4-851-g4a65fa1
  - etcd 2.2.2
* Spark on YARN version
  - Spark 1.6.0
  - Hadoop 2.6.0 (CDH 5.6.0)
  - Oracle Java 1.8.0_74

There are one HDFS/YARN master and one HDFS/YARN worker, each running in its own pod.

[vagrant@origin ~]$ oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
docker-registry-1-z3skh     1/1       Running   0          20d
router-1-3jatj              1/1       Running   0          20d
spark-yarn-master-1-sxegt   1/1       Running   0          31m
spark-yarn-worker-1-pshqi   1/1       Running   0          31m
[vagrant@origin ~]$ 

[vagrant@origin ~]$ oc get svc
NAME                CLUSTER_IP       EXTERNAL_IP   PORT(S)                                                                                         SELECTOR                             AGE
docker-registry     172.30.93.207    <none>        5000/TCP                                                                                        docker-registry=default              20d
kubernetes          172.30.0.1       <none>        443/TCP,53/UDP,53/TCP                                                                           <none>                               20d
router              172.30.241.166   <none>        80/TCP                                                                                          router=router                        20d
spark-yarn-master   172.30.242.57    <none>        8020/TCP,50070/TCP,50090/TCP,8030/TCP,8031/TCP,8032/TCP,8033/TCP,8088/TCP,10020/TCP,19888/TCP   deploymentconfig=spark-yarn-master   35m
spark-yarn-worker   172.30.1.53      <none>        50010/TCP,50020/TCP,50075/TCP,8040/TCP,8042/TCP                                                 deploymentconfig=spark-yarn-worker   35m
[vagrant@origin ~]$ 

The spark-yarn-master container has the following hostnames and IP addresses:

hostname: spark-yarn-master-1-sxegt (pod name)
IP addr.: 172.17.0.11

hostname: spark-yarn-master (service name)
IP addr.: 172.30.242.57

bash-4.2$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.17.0.11     spark-yarn-master-1-sxegt
bash-4.2$

bash-4.2$ ip -4 addr show dev eth0
50: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP  link-netnsid 0
    inet 172.17.0.11/16 scope global eth0
       valid_lft forever preferred_lft forever
bash-4.2$

bash-4.2$ hostname -f
spark-yarn-master-1-sxegt
bash-4.2$ 

bash-4.2$ curl -v spark-yarn-master:8020
* About to connect() to spark-yarn-master port 8020 (#0)
*   Trying 172.30.242.57...
* Connected to spark-yarn-master (172.30.242.57) port 8020 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: spark-yarn-master:8020
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Content-type: text/plain
* no chunk, no close, no size. Assume close to signal end
< 
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
* Closing connection 0
bash-4.2$ 

The spark-yarn-worker container has the following hostnames and IP addresses:

hostname: spark-yarn-worker-1-pshqi (pod name)
IP addr.: 172.17.0.12

hostname: spark-yarn-worker (service name)
IP addr.: 172.30.1.53

bash-4.2$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.17.0.12     spark-yarn-worker-1-pshqi
bash-4.2$

bash-4.2$ ip -4 addr show dev eth0
52: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP  link-netnsid 0
    inet 172.17.0.12/16 scope global eth0
       valid_lft forever preferred_lft forever
bash-4.2$ 

bash-4.2$ hostname -f
spark-yarn-worker-1-pshqi
bash-4.2$ 

bash-4.2$ curl -v spark-yarn-worker:8040
* About to connect() to spark-yarn-worker port 8040 (#0)
*   Trying 172.30.1.53...
* Connected to spark-yarn-worker (172.30.1.53) port 8040 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: spark-yarn-worker:8040
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Content-type: text/plain
* no chunk, no close, no size. Assume close to signal end
< 
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
* Closing connection 0
bash-4.2$ 

The Spark on YARN master and worker nodes can reach each other by service name.

On the master, to the worker (service name):

bash-4.2$ hostname -f ; curl spark-yarn-worker:8040
spark-yarn-master-1-sxegt
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
bash-4.2$ 

On the worker, to the master (service name):

bash-4.2$ hostname -f ; curl spark-yarn-master:8020
spark-yarn-worker-1-pshqi
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
bash-4.2$ 

However, they cannot reach each other by hostname (pod name).

On the worker, to the master (hostname):

bash-4.2$ hostname -f ; curl spark-yarn-master-1-sxegt:8020
spark-yarn-worker-1-pshqi
curl: (6) Could not resolve host: spark-yarn-master-1-sxegt; Name or service not known
bash-4.2$ 

On the master, to the worker (hostname):

bash-4.2$ hostname -f ; curl spark-yarn-worker-1-pshqi:8040                    
spark-yarn-master-1-sxegt
curl: (6) Could not resolve host: spark-yarn-worker-1-pshqi; Name or service not known
bash-4.2$ 
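A Hadoop-side workaround might be to make the daemons bind to and advertise the service names rather than the pod hostnames, so the unresolvable pod names are never exchanged. A sketch (property names are from the Hadoop 2.6 defaults; the values reuse the service names above, and whether this fully avoids the pod names is an assumption on my part):

```xml
<!-- yarn-site.xml: point everything at the master service name -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>spark-yarn-master</value>
</property>

<!-- hdfs-site.xml: make clients use the datanode's advertised
     hostname instead of the IP/pod name it registered with -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```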

This problem could be resolved in either of the following ways:

* make the hostname (pod name) resolvable by the OpenShift internal DNS, 
just like the service name.
* put the service name (or an arbitrary name) into /etc/hosts (or set it 
as the hostname) through some configuration.
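For the second option, a minimal sketch of a pre-start step that appends an extra name for the pod's own IP to /etc/hosts before the Hadoop daemons launch (the function and variable names are hypothetical, and the demonstration writes to a scratch file instead of the real /etc/hosts):

```shell
#!/bin/bash
# Sketch: map an extra name (e.g. the service name) to the pod's own IP
# so local daemons can resolve it. In the pod, the target file would be
# /etc/hosts and the IP would come from `ip -4 addr show dev eth0`.
add_host_alias() {
  local ip="$1" name="$2" hosts="$3"
  # Append only if the name is not already present (idempotent).
  grep -qw "$name" "$hosts" 2>/dev/null || echo "$ip $name" >> "$hosts"
}

# Demonstration against a scratch file:
hosts_file="$(mktemp)"
add_host_alias 172.17.0.11 spark-yarn-master "$hosts_file"
add_host_alias 172.17.0.11 spark-yarn-master "$hosts_file"  # no duplicate
cat "$hosts_file"
```

In the image, the same function could be called from a wrapper around the container's entrypoint, with /etc/hosts as the target, before starting the Hadoop daemons.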

But I do not know how to do either of these.  Could you please let me know?

Regards,
        dai
-- 
HIGUCHI Daisuke <[email protected]>

_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
