[ 
https://issues.apache.org/jira/browse/SPARK-30850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Steadman updated SPARK-30850:
---------------------------------
    Description: 
Running Spark Standalone on IPv6-first infrastructure requires that RPC 
listeners bind IPv6 sockets, e.g. {{:::7077}} instead of {{127.0.0.1:7077}}, 
whilst advertising a public DNS name using {{SPARK_PUBLIC_DNS}}. Currently, the 
various RPC listeners can only be configured with a *hostname/IPv4 address* 
(enforced via 
[{{Utils.checkHost}}|https://github.com/apache/spark/blob/3858e94ef99e3c1b5f6157db3968ab6ed417ccbb/core/src/main/scala/org/apache/spark/util/Utils.scala]), 
e.g. via {{spark.driver.bindAddress}}, {{spark.driver.host}}, 
{{SPARK_LOCAL_IP}} etc., which are eventually resolved using 
{{InetSocketAddress}} in 
[{{TransportServer}}|https://github.com/apache/spark/blob/84bd8085c74b5292cdccaa287023e39703062b07/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java#L136].
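
To illustrate the restriction, here is a minimal sketch in plain Java (not Spark 
code). {{isAcceptedHost}} is a hypothetical stand-in for the host validation 
and assumes the check rejects any value containing a colon, which is why an 
IPv6 literal such as {{::}} cannot be passed as a bind address today:

```java
import java.net.InetSocketAddress;

final class HostCheckSketch {
    // Hypothetical stand-in for the host validation described above.
    // Assumption: a host is rejected when it is null or contains ':'.
    static boolean isAcceptedHost(String host) {
        return host != null && host.indexOf(':') == -1;
    }

    public static void main(String[] args) {
        String[] candidates = {"127.0.0.1", "localhost", "::", "::1"};
        for (String host : candidates) {
            System.out.println(host + " accepted: " + isAcceptedHost(host));
        }
        // InetSocketAddress itself has no such restriction on IPv6 literals.
        InetSocketAddress v6 = new InetSocketAddress("::1", 7077);
        System.out.println("resolved: " + !v6.isUnresolved());
    }
}
```

Under that assumption the limitation sits in the validation step rather than 
in {{InetSocketAddress}}.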

Initial experimentation led to a *hack* (absolutely not a real solution) to 
force this behaviour by setting {{bindAddress = null}} when the different 
{{RpcEnv}} instances are created ([link to 
diff|https://gist.github.com/SteadBytes/b206335c8e2429486a13421b7cfcdc88]).

Since {{TransportServer}} already uses the wildcard variant of 
{{InetSocketAddress}} when {{hostToBind}} is {{null}} 
([here|https://github.com/apache/spark/blob/84bd8085c74b5292cdccaa287023e39703062b07/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java#L137]), 
this should be possible.
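
The null check can be sketched as follows. This is an illustration in plain 
Java, not the Spark source: the class and the {{bindAddress}} helper are 
hypothetical, and only the parameter names {{hostToBind}} and {{portToBind}} 
are chosen to match the description above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class WildcardBindSketch {
    // No host given: use the wildcard address; otherwise bind the given host.
    static InetSocketAddress bindAddress(String hostToBind, int portToBind) {
        return hostToBind == null
            ? new InetSocketAddress(portToBind)
            : new InetSocketAddress(hostToBind, portToBind);
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port, so the sketch runs anywhere.
        InetSocketAddress wildcard = bindAddress(null, 0);
        try (ServerSocket server = new ServerSocket()) {
            server.bind(wildcard);
            System.out.println("bound to " + server.getLocalSocketAddress());
        }
    }
}
```

Whether a wildcard socket also accepts IPv6 connections depends on the JVM and 
operating system network configuration.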

Would using the wildcard as a default instead of the local hostname be an 
option?

I'm prepared to contribute the work required and upstream any changes.

  was:
Running Spark Standalone on IPv6-first infrastructure requires that RPC 
listeners bind IPv6 sockets, e.g. {{:::7077}} instead of {{127.0.0.1:7077}}, 
whilst advertising a public DNS name using {{SPARK_PUBLIC_DNS}}. Currently, the 
various RPC listeners can be configured with a *hostname/IPv4 address* only, 
e.g. via {{spark.driver.bindAddress}}, {{spark.driver.host}}, 
{{SPARK_LOCAL_IP}} etc., which are eventually resolved using 
{{InetSocketAddress}} in 
[{{TransportServer}}|https://github.com/apache/spark/blob/84bd8085c74b5292cdccaa287023e39703062b07/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java#L136].

Initial experimentation led to a *hack* (absolutely not a real solution) to 
force this behaviour by setting {{bindAddress = null}} when the different 
{{RpcEnv}} instances are created ([link to 
diff|https://gist.github.com/SteadBytes/b206335c8e2429486a13421b7cfcdc88]).

Since {{TransportServer}} already uses the wildcard variant of 
{{InetSocketAddress}} when {{hostToBind}} is {{null}} 
([here|https://github.com/apache/spark/blob/84bd8085c74b5292cdccaa287023e39703062b07/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java#L137]), 
this should be possible.

Would using the wildcard as a default instead of the local hostname be an 
option?

I'm prepared to contribute the work required and upstream any changes.


> RPC listener wildcard bindings (IPv6)
> -------------------------------------
>
>                 Key: SPARK-30850
>                 URL: https://issues.apache.org/jira/browse/SPARK-30850
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.5
>            Reporter: Ben Steadman
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)