Hi All,

Why is it not possible to specify cluster as the deploy mode for Spark Connect?
As discussed in the thread below, the limitation appears to come from an "arbitrary decision" within spark-submit that cluster mode is "not applicable" to Spark Connect.

GitHub issue comment: https://github.com/kubeflow/spark-operator/issues/1801#issuecomment-2000494607

> This will circumvent the submission error you may have gotten if you tried to just run the SparkConnectServer directly. From my investigation, that looks to be an arbitrary decision within spark-submit that Cluster mode is "not applicable" to SparkConnect. Which is sort of true except when using this operator :)

I have reviewed the following commit and pull request, but I could not find any discussion of, or reason for, disallowing cluster mode:

Related commit: https://github.com/apache/spark/commit/11260310f65e1a30f6b00b380350e414609c5fd4
Related pull request: https://github.com/apache/spark/pull/39928

This restriction is a significant obstacle to using Spark Connect with the Spark Operator. If there is a technical reason for it, I would like to know more about it. Additionally, if this issue is being tracked in JIRA or elsewhere, I would appreciate a link.

Thank you in advance.

Best regards,
Yasukazu Nagatomi
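
P.S. For concreteness, below is a sketch of the kind of spark-submit invocation that hits the restriction. The Kubernetes master URL and jar path are placeholders I made up for illustration; the entry-point class is the one submitted by sbin/start-connect-server.sh:

    spark-submit \
      --master k8s://https://<k8s-apiserver>:<port> \
      --deploy-mode cluster \
      --class org.apache.spark.sql.connect.service.SparkConnectServer \
      local:///path/to/placeholder.jar

Unless I am misreading SparkSubmit.scala, this is rejected up front with an error along the lines of "Cluster deploy mode is not applicable to Spark Connect server." - that is, by a hard-coded check on the main class rather than by any runtime limitation - which is exactly the decision I would like to understand.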