There is a discussion upstream about this, i.e. about ensuring that pod names can be resolved in DNS. In the short term it is not possible for other pods to resolve a pod's hostname. Can the worker run without being able to resolve the hostname?
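For background on the first workaround asked about below: later Kubernetes releases make pod hostnames resolvable through a headless Service combined with `hostname` and `subdomain` fields in the pod spec, giving each pod a DNS name of the form `<hostname>.<subdomain>.<namespace>.svc.cluster.local`. A minimal sketch, assuming a Kubernetes version where those fields exist (they are not in the alpha build quoted below; the names and image are illustrative):

```yaml
# Headless service: clusterIP: None means DNS returns pod IPs directly,
# and pods whose subdomain matches the service name get per-pod records.
apiVersion: v1
kind: Service
metadata:
  name: spark-yarn
spec:
  clusterIP: None
  selector:
    app: spark-yarn
  ports:
    - port: 8020
---
apiVersion: v1
kind: Pod
metadata:
  name: spark-yarn-master
  labels:
    app: spark-yarn
spec:
  hostname: spark-yarn-master   # stable hostname instead of the generated pod name
  subdomain: spark-yarn         # must match the headless service's name
  containers:
    - name: master
      image: example/spark-yarn-master   # illustrative image name
```

With this in place, `spark-yarn-master.spark-yarn.<namespace>.svc.cluster.local` would resolve to the pod's IP from other pods, so the YARN daemons could advertise a name the rest of the cluster can look up.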
On Sun, Mar 6, 2016 at 8:11 PM, HIGUCHI Daisuke <[email protected]> wrote:
> Hello,
>
> I built Spark on a YARN cluster on the OpenShift Origin All-In-One VM.
>
> * All-In-One Virtual Machine (Version 1.1.3.1)
> * origin version
>   - origin v1.1.2
>   - kubernetes v1.2.0-alpha.4-851-g4a65fa1
>   - etcd 2.2.2
> * Spark on YARN version
>   - Spark 1.6.0
>   - Hadoop 2.6.0 (CDH 5.6.0)
>   - Oracle Java 1.8.0_74
>
> There are one HDFS/YARN master and one HDFS/YARN worker, each in its own pod.
>
> [vagrant@origin ~]$ oc get pods
> NAME                        READY   STATUS    RESTARTS   AGE
> docker-registry-1-z3skh     1/1     Running   0          20d
> router-1-3jatj              1/1     Running   0          20d
> spark-yarn-master-1-sxegt   1/1     Running   0          31m
> spark-yarn-worker-1-pshqi   1/1     Running   0          31m
> [vagrant@origin ~]$
>
> [vagrant@origin ~]$ oc get svc
> NAME                CLUSTER_IP       EXTERNAL_IP   PORT(S)                                                                                        SELECTOR                             AGE
> docker-registry     172.30.93.207    <none>        5000/TCP                                                                                       docker-registry=default              20d
> kubernetes          172.30.0.1      <none>        443/TCP,53/UDP,53/TCP                                                                          <none>                               20d
> router              172.30.241.166   <none>        80/TCP                                                                                         router=router                        20d
> spark-yarn-master   172.30.242.57    <none>        8020/TCP,50070/TCP,50090/TCP,8030/TCP,8031/TCP,8032/TCP,8033/TCP,8088/TCP,10020/TCP,19888/TCP   deploymentconfig=spark-yarn-master   35m
> spark-yarn-worker   172.30.1.53      <none>        50010/TCP,50020/TCP,50075/TCP,8040/TCP,8042/TCP                                                deploymentconfig=spark-yarn-worker   35m
> [vagrant@origin ~]$
>
> The spark-yarn-master container has the following hostnames and IP addresses:
>
>   hostname: spark-yarn-master-1-sxegt (pod name)
>   IP addr.: 172.17.0.11
>
>   hostname: spark-yarn-master (service name)
>   IP addr.: 172.30.242.57
>
> bash-4.2$ cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1    localhost
> ::1          localhost ip6-localhost ip6-loopback
> fe00::0      ip6-localnet
> fe00::0      ip6-mcastprefix
> fe00::1      ip6-allnodes
> fe00::2      ip6-allrouters
> 172.17.0.11  spark-yarn-master-1-sxegt
> bash-4.2$
>
> bash-4.2$ ip -4 addr show dev eth0
> 50: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link-netnsid 0
>     inet 172.17.0.11/16 scope global eth0
>        valid_lft forever preferred_lft forever
> bash-4.2$
>
> bash-4.2$ hostname -f
> spark-yarn-master-1-sxegt
> bash-4.2$
>
> bash-4.2$ curl -v spark-yarn-master:8020
> * About to connect() to spark-yarn-master port 8020 (#0)
> *   Trying 172.30.242.57...
> * Connected to spark-yarn-master (172.30.242.57) port 8020 (#0)
> > GET / HTTP/1.1
> > User-Agent: curl/7.29.0
> > Host: spark-yarn-master:8020
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Content-type: text/plain
> * no chunk, no close, no size. Assume close to signal end
> <
> It looks like you are making an HTTP request to a Hadoop IPC port. This is
> not the correct port for the web interface on this daemon.
> * Closing connection 0
> bash-4.2$
>
> The spark-yarn-worker container has the following hostnames and IP addresses:
>
>   hostname: spark-yarn-worker-1-pshqi (pod name)
>   IP addr.: 172.17.0.12
>
>   hostname: spark-yarn-worker (service name)
>   IP addr.: 172.30.1.53
>
> bash-4.2$ cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1    localhost
> ::1          localhost ip6-localhost ip6-loopback
> fe00::0      ip6-localnet
> fe00::0      ip6-mcastprefix
> fe00::1      ip6-allnodes
> fe00::2      ip6-allrouters
> 172.17.0.12  spark-yarn-worker-1-pshqi
> bash-4.2$
>
> bash-4.2$ ip -4 addr show dev eth0
> 52: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link-netnsid 0
>     inet 172.17.0.12/16 scope global eth0
>        valid_lft forever preferred_lft forever
> bash-4.2$
>
> bash-4.2$ hostname -f
> spark-yarn-worker-1-pshqi
> bash-4.2$
>
> bash-4.2$ curl -v spark-yarn-worker:8040
> * About to connect() to spark-yarn-worker port 8040 (#0)
> *   Trying 172.30.1.53...
> * Connected to spark-yarn-worker (172.30.1.53) port 8040 (#0)
> > GET / HTTP/1.1
> > User-Agent: curl/7.29.0
> > Host: spark-yarn-worker:8040
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Content-type: text/plain
> * no chunk, no close, no size. Assume close to signal end
> <
> It looks like you are making an HTTP request to a Hadoop IPC port. This is
> not the correct port for the web interface on this daemon.
> * Closing connection 0
> bash-4.2$
>
> The Spark YARN master and worker can reach each other by service name.
>
> On the master, to the worker (service name):
>
> bash-4.2$ hostname -f ; curl spark-yarn-worker:8040
> spark-yarn-master-1-sxegt
> It looks like you are making an HTTP request to a Hadoop IPC port. This is
> not the correct port for the web interface on this daemon.
> bash-4.2$
>
> On the worker, to the master (service name):
>
> bash-4.2$ hostname -f ; curl spark-yarn-master:8020
> spark-yarn-worker-1-pshqi
> It looks like you are making an HTTP request to a Hadoop IPC port. This is
> not the correct port for the web interface on this daemon.
> bash-4.2$
>
> But they cannot reach each other by hostname (pod name).
>
> On the worker, to the master (hostname):
>
> bash-4.2$ hostname -f ; curl spark-yarn-master-1-sxegt:8020
> spark-yarn-worker-1-pshqi
> curl: (6) Could not resolve host: spark-yarn-master-1-sxegt; Name or service not known
> bash-4.2$
>
> On the master, to the worker (hostname):
>
> bash-4.2$ hostname -f ; curl spark-yarn-worker-1-pshqi:8040
> spark-yarn-master-1-sxegt
> curl: (6) Could not resolve host: spark-yarn-worker-1-pshqi; Name or service not known
> bash-4.2$
>
> This problem could be solved in either of the following ways:
>
> * have the hostname (pod name) resolved by the OpenShift internal DNS, the same as the service name; or
> * have the service name (or an arbitrary name) written into /etc/hosts (or set as the hostname) by some configuration.
>
> But I do not know how to do either. Could you please let me know?
>
> Regards,
> dai
> --
> HIGUCHI Daisuke <[email protected]>
>
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
