We are deploying Spark on a Kubernetes (k8s) cluster and have run into an issue with how a worker sees the Spark master's IP address. The Spark master is exposed as a service at 10.3.0.175:7077.
The Spark worker registers with the master, but it stores the master's pod IP instead of the service IP. The relevant logs follow:

17/03/18 03:33:15 SPARK_WORKER INFO Utils: Successfully started service 'WorkerUI' on port 8081.
17/03/18 03:33:15 SPARK_WORKER INFO WorkerWebUI: Bound WorkerWebUI to 0.0.0.0, and started at http://10.2.58.40:8081
17/03/18 03:33:15 SPARK_WORKER INFO Worker: Connecting to master 10.3.0.175:7077...
17/03/18 03:33:15 SPARK_WORKER INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1302ede4{/metrics/json,null,AVAILABLE}
17/03/18 03:33:15 SPARK_WORKER INFO TransportClientFactory: Successfully created connection to /10.3.0.175:7077 after 89 ms (0 ms spent in bootstraps)
17/03/18 03:33:15 SPARK_WORKER INFO Worker: Successfully registered with master spark://10.2.58.68:7077

As the last line shows, the address that gets registered is the pod IP, spark://10.2.58.68:7077.

The problem with this behavior is that when the Spark master pod dies or restarts, the worker does not act on the disconnect message. It checks whether the disconnected address (here the service IP, 10.3.0.175) matches the locally stored master address (here the pod IP, 10.2.58.68), and the two differ.

Is there any way to override this behavior without changing the Spark code? If it can only be achieved by changing Spark itself, please share ideas on how to go about it.
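
To make the mismatch concrete, here is a minimal Python sketch of the check as described above. It is only a toy model of the behavior we observe, not Spark's actual code: the names Address, WorkerModel, on_registered and on_disconnected are made up for illustration, and the real worker logic may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Address:
    """Host/port pair; hypothetical stand-in for an RPC address."""
    host: str
    port: int


class WorkerModel:
    """Toy model of the worker-side bookkeeping (all names are assumptions)."""

    def __init__(self, connect_to: Address) -> None:
        self.connect_to = connect_to  # address the worker dials (service IP)
        self.registered_master: Optional[Address] = None
        self.connected = False

    def on_registered(self, advertised: Address) -> None:
        # The master answers with the address it advertises (the pod IP here),
        # and the worker stores that address rather than the one it dialed.
        self.registered_master = advertised
        self.connected = True

    def on_disconnected(self, remote: Address) -> bool:
        # React only if the dropped connection matches the stored master address.
        if self.registered_master == remote:
            self.connected = False
            return True
        return False


service = Address("10.3.0.175", 7077)  # service IP from the logs
pod = Address("10.2.58.68", 7077)      # pod IP from the logs

worker = WorkerModel(connect_to=service)
worker.on_registered(pod)

# The connection that drops is the one made to the service IP.
handled = worker.on_disconnected(service)
print(handled, worker.connected)  # False True
```

In this model the worker still believes it is connected after the master goes away, which matches what we see. If the stored address were the service IP, the comparison would succeed and the worker would notice the disconnect.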