Hello,
I have built a Spark on YARN cluster on the OpenShift Origin All-In-One VM.
* All-In-One Virtual Machine (Version 1.1.3.1)
* origin version
- origin v1.1.2
- kubernetes v1.2.0-alpha.4-851-g4a65fa1
- etcd 2.2.2
* Spark on YARN version
- Spark 1.6.0
- Hadoop 2.6.0 (CDH 5.6.0)
- Oracle Java 1.8.0_74
There is one HDFS/YARN master and one HDFS/YARN worker, each running in its own pod.
[vagrant@origin ~]$ oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
docker-registry-1-z3skh     1/1       Running   0          20d
router-1-3jatj              1/1       Running   0          20d
spark-yarn-master-1-sxegt   1/1       Running   0          31m
spark-yarn-worker-1-pshqi   1/1       Running   0          31m
[vagrant@origin ~]$
[vagrant@origin ~]$ oc get svc
NAME                CLUSTER_IP       EXTERNAL_IP   PORT(S)                                                                                         SELECTOR                             AGE
docker-registry     172.30.93.207    <none>        5000/TCP                                                                                        docker-registry=default              20d
kubernetes          172.30.0.1       <none>        443/TCP,53/UDP,53/TCP                                                                           <none>                               20d
router              172.30.241.166   <none>        80/TCP                                                                                          router=router                        20d
spark-yarn-master   172.30.242.57    <none>        8020/TCP,50070/TCP,50090/TCP,8030/TCP,8031/TCP,8032/TCP,8033/TCP,8088/TCP,10020/TCP,19888/TCP   deploymentconfig=spark-yarn-master   35m
spark-yarn-worker   172.30.1.53      <none>        50010/TCP,50020/TCP,50075/TCP,8040/TCP,8042/TCP                                                 deploymentconfig=spark-yarn-worker   35m
[vagrant@origin ~]$
The spark-yarn-master container has the following hostnames and IP addresses:
* spark-yarn-master-1-sxegt (pod name): 172.17.0.11
* spark-yarn-master (service name): 172.30.242.57
bash-4.2$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.17.0.11 spark-yarn-master-1-sxegt
bash-4.2$
bash-4.2$ ip -4 addr show dev eth0
50: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link-netnsid 0
inet 172.17.0.11/16 scope global eth0
valid_lft forever preferred_lft forever
bash-4.2$
bash-4.2$ hostname -f
spark-yarn-master-1-sxegt
bash-4.2$
bash-4.2$ curl -v spark-yarn-master:8020
* About to connect() to spark-yarn-master port 8020 (#0)
* Trying 172.30.242.57...
* Connected to spark-yarn-master (172.30.242.57) port 8020 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: spark-yarn-master:8020
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Content-type: text/plain
* no chunk, no close, no size. Assume close to signal end
<
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
* Closing connection 0
bash-4.2$
The spark-yarn-worker container has the following hostnames and IP addresses:
* spark-yarn-worker-1-pshqi (pod name): 172.17.0.12
* spark-yarn-worker (service name): 172.30.1.53
bash-4.2$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.17.0.12 spark-yarn-worker-1-pshqi
bash-4.2$
bash-4.2$ ip -4 addr show dev eth0
52: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link-netnsid 0
inet 172.17.0.12/16 scope global eth0
valid_lft forever preferred_lft forever
bash-4.2$
bash-4.2$ hostname -f
spark-yarn-worker-1-pshqi
bash-4.2$
bash-4.2$ curl -v spark-yarn-worker:8040
* About to connect() to spark-yarn-worker port 8040 (#0)
* Trying 172.30.1.53...
* Connected to spark-yarn-worker (172.30.1.53) port 8040 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: spark-yarn-worker:8040
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Content-type: text/plain
* no chunk, no close, no size. Assume close to signal end
<
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
* Closing connection 0
bash-4.2$
The Spark YARN master and worker can reach each other by service name. (The Hadoop "IPC port" message below means the TCP connection itself succeeded; only the HTTP request was rejected.)
On the master, to the worker (service name):
bash-4.2$ hostname -f ; curl spark-yarn-worker:8040
spark-yarn-master-1-sxegt
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
bash-4.2$
On the worker, to the master (service name):
bash-4.2$ hostname -f ; curl spark-yarn-master:8020
spark-yarn-worker-1-pshqi
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
bash-4.2$
However, they cannot reach each other by hostname (pod name).
On the worker, to the master (hostname):
bash-4.2$ hostname -f ; curl spark-yarn-master-1-sxegt:8020
spark-yarn-worker-1-pshqi
curl: (6) Could not resolve host: spark-yarn-master-1-sxegt; Name or service not known
bash-4.2$
On the master, to the worker (hostname):
bash-4.2$ hostname -f ; curl spark-yarn-worker-1-pshqi:8040
spark-yarn-master-1-sxegt
curl: (6) Could not resolve host: spark-yarn-worker-1-pshqi; Name or service not known
bash-4.2$
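A plain name lookup shows the same thing; I would expect only the service name to return a record (commands only, output omitted):

bash-4.2$ getent hosts spark-yarn-master           # service name: resolves to the cluster IP
bash-4.2$ getent hosts spark-yarn-master-1-sxegt   # pod name: no DNS record, lookup fails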
I think this problem could be resolved in either of the following ways (rough, untested sketches of both ideas follow below):
* have the hostname (pod name) resolve through the OpenShift internal DNS, the same way the service name does; or
* have the service name (or an arbitrary name) written into /etc/hosts (or set as the hostname) by some configuration.
But I do not know how to do either. Could you please let me know?
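For the first idea, the only mechanism I know of is a headless service combined with the beta hostname/subdomain pod annotations. I am not sure they are supported on kubernetes v1.2.0-alpha, so please treat this as an assumption; the service name "spark-yarn" and the image are placeholders, I show a bare Pod for brevity (in my case the annotations would really go on the DeploymentConfig's pod template), and note this would give the pod a stable DNS name rather than making the generated pod name (spark-yarn-master-1-sxegt) itself resolvable:

bash-4.2$ cat <<'EOF' | oc create -f -
# Headless service: clusterIP None makes DNS return pod IPs directly, and a
# pod annotated with a matching subdomain should get its own A record, e.g.
# spark-yarn-master.spark-yarn.default.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: spark-yarn                   # the subdomain annotation must match this
spec:
  clusterIP: None
  selector:
    app: spark-yarn
  ports:
  - port: 8020
---
apiVersion: v1
kind: Pod
metadata:
  name: spark-yarn-master
  labels:
    app: spark-yarn
  annotations:
    pod.beta.kubernetes.io/hostname: spark-yarn-master
    pod.beta.kubernetes.io/subdomain: spark-yarn
spec:
  containers:
  - name: master
    image: example/spark-yarn-master   # placeholder image
EOF

For the second idea, I imagine wrapping the container entrypoint so that it maps the pod's own IP to the service name in /etc/hosts before the Hadoop daemons start, so that they register themselves under a name the other pod can resolve. This assumes the container user may write to /etc/hosts, and start-master.sh is a placeholder for whatever the image really runs:

#!/bin/bash
# Map this pod's eth0 address to the (already resolvable) service name.
POD_IP=$(ip -4 addr show dev eth0 | awk '/inet /{sub(/\/.*$/, "", $2); print $2}')
echo "${POD_IP} spark-yarn-master" >> /etc/hosts
# Hand off to the real entrypoint (placeholder path).
exec /opt/spark-yarn/start-master.sh

Alternatively, it might be enough to point Hadoop itself at the service names in its configuration (fs.defaultFS on the clients, and I believe there is a yarn.nodemanager.hostname property for the NodeManager), but I have not verified this either.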
Regards,
dai
--
HIGUCHI Daisuke <[email protected]>