Github user ash211 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1777#discussion_r15796591
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1331,4 +1331,49 @@ private[spark] object Utils extends Logging {
.map { case (k, v) => s"-D$k=$v" }
}
+ /**
+ * Attempt to start a service on the given port, or fail after a number of attempts.
+ * Each subsequent attempt uses 1 + the port used in the previous attempt.
+ *
+ * @param startPort The initial port to start the service on.
+ * @param maxRetries Maximum number of retries to attempt.
+ * A value of 3 means attempting ports n, n+1, n+2, and n+3, for example.
+ * @param startService Function to start service on a given port.
+ * This is expected to throw java.net.BindException on port collision.
+ * @throws SparkException When unable to start the service after a given number of attempts
+ */
+ def startServiceOnPort[T](
+ startPort: Int,
+ startService: Int => (T, Int),
+ serviceName: String = "",
+ maxRetries: Int = 3): (T, Int) = {
--- End diff --
I'm a little worried that the maxRetries parameter here effectively caps the
number of concurrent shells/executors/drivers that can run on one machine when
in restricted firewall mode. Because each app in Standalone mode gets its own
set of executors across the cluster, it also caps the number of concurrent
applications on a cluster.
What do you think of adding a config option, say spark.ports.maxRetries, that
can be set to change this? I would also raise the default a bit, to 16 or so.
I'd then expect network teams to set spark.ports.maxRetries to, say, 20, and
open a range of 21 ports (the starting port plus 20 retries) from each of the
relevant ports in the config list you pasted in the summary.
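To make the suggestion concrete, here is a rough sketch of what I have in mind. It is not the code in this PR: the key spark.ports.maxRetries and the default of 16 are just the values floated above, a plain Map stands in for SparkConf, the object name PortRetrySketch is made up, and an IllegalStateException stands in for SparkException so the snippet compiles on its own.

```scala
import java.net.BindException
import scala.annotation.tailrec

object PortRetrySketch {
  // Hypothetical config key and default proposed in this comment;
  // neither is an existing Spark setting as far as this PR is concerned.
  val MaxRetriesKey = "spark.ports.maxRetries"
  val DefaultMaxRetries = 16

  /** Read the retry cap from a plain settings map (a stand-in for SparkConf). */
  def maxRetries(conf: Map[String, String]): Int =
    conf.get(MaxRetriesKey).map(_.toInt).getOrElse(DefaultMaxRetries)

  /**
   * Try startPort, startPort + 1, ..., startPort + maxRetries and return the
   * first successful result. startService is expected to throw
   * java.net.BindException on a port collision, as in the Scaladoc above.
   */
  def startServiceOnPort[T](
      startPort: Int,
      startService: Int => (T, Int),
      maxRetries: Int): (T, Int) = {
    @tailrec
    def attempt(offset: Int): (T, Int) = {
      if (offset > maxRetries) {
        // Stand-in for SparkException, to keep the sketch self-contained.
        throw new IllegalStateException(
          s"Failed to bind after $maxRetries retries starting at port $startPort")
      }
      // Only a bind failure moves on to the next port; other errors propagate.
      val started: Option[(T, Int)] =
        try Some(startService(startPort + offset))
        catch { case _: BindException => None }
      started match {
        case Some(result) => result
        case None => attempt(offset + 1)
      }
    }
    attempt(0)
  }
}
```

A caller would then pass PortRetrySketch.maxRetries(conf) as the last argument instead of relying on the hard-coded default of 3. With the option set to 20, a service whose configured port is n could land anywhere in n through n + 20, which is the range the firewall would need to open.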