TakawaAkirayo commented on PR #46182: URL: https://github.com/apache/spark/pull/46182#issuecomment-2084451692
> One thing I'm wondering if it might work out of the box is the ability to specify an ephemeral port for the spark connect service and pick this up during startup.
>
> This might alleviate you from using the startup utils to find an open port and rely on the system to provide you with one. Since you don't care about the actual port, that might be good enough.

@grundprinzip I think your idea could definitely work if there were no port restrictions. However, one issue we're currently facing internally is that connections between the client side and the server side are restricted to specific ports by internal software firewalls between the Hadoop cluster and the provisioned notebooks. The server could pick up an ephemeral port that is not allowed by the firewall.

So we have to request a whitelist for a range of ports (a one-time request for a pool provisioned on k8s; we cannot request a large range like 0-65535). Therefore, retrying with monotonically increasing port numbers helps limit the port selection to the expected range.

Not sure if our case is a corner case, please share your thoughts, thanks!
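
The retry-with-increasing-ports approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual Spark startup-utils code; the port range and helper name are hypothetical, chosen only to show how a whitelisted range constrains the search:

```python
import socket

def find_open_port(start: int, end: int) -> int:
    """Hypothetical helper: try ports in monotonically increasing order
    within a firewall-whitelisted range and return the first bindable one."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
                return port  # this port is free within the allowed range
            except OSError:
                continue  # port already in use, try the next one
    raise RuntimeError(f"no open port in whitelisted range {start}-{end}")
```

Compared with binding to port 0 and letting the OS pick an ephemeral port, this keeps the chosen port inside the range the firewall team whitelisted, at the cost of a linear scan on startup.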
