pan3793 commented on code in PR #47402:
URL: https://github.com/apache/spark/pull/47402#discussion_r1701610965
##########
bin/spark-shell:
##########
@@ -44,8 +44,53 @@ Scala REPL options:
# through spark.driver.extraClassPath is not automatically propagated.
SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"
+# In order to start Spark Connect shell, we should identify if spark.remote
+# or --remote is set. Spark Connect does not support loading configurations
+# yet.
+connect_shell=false
+cur_arg="$0"
+for arg in "${@:1}"
+do
+ # --conf spark.remote=... or -c spark.remote=...
+ if [[ $cur_arg == "--conf" || $cur_arg == "-c" ]]; then
+ if [[ $arg == "spark.remote"* ]]; then
+ connect_shell=true
+ fi
+ fi
+
+ # --conf=spark.remote=... or -c=spark.remote=...
+ if [[ $arg == "--conf=spark.remote"* || $arg == "-c=spark.remote"* ]]; then
+ connect_shell=true
+ fi
+
+ # --remote= or --remote
+ if [[ $arg == "--remote"* ]]; then
+ connect_shell=true
+ fi
+ cur_arg=$arg
+done
+
function main() {
- if $cygwin; then
+ if $connect_shell; then
+ export SPARK_SUBMIT_OPTS
+ export SPARK_CONNECT_SHELL=1
+ if [ -d "${SPARK_HOME}/jars" ]; then
+ # Production code path
+ coordinate=$(find "${SPARK_HOME}/jars" -type f -name 'spark-connect_*')
+ coordinate=$(basename $coordinate)
+ sparkver=${${${coordinate##*_}%.jar*}#*-}
+ scalaver=${${${coordinate##*_}%.jar*}%%-*}
+ "${SPARK_HOME}"/bin/spark-submit \
+ --class org.apache.spark.sql.application.ConnectRepl \
+      --packages com.lihaoyi:ammonite_2.13.14:3.0.0-M2,org.apache.spark:spark-connect-client-jvm_$scalaver:$sparkver \
+      --name "Connect shell" "$@"
Review Comment:
   Is this a temporary workaround, or is it designed as a long-term solution?
   I know that we cannot put `spark-connect-client-jvm` into `jars` due to
   class name issues, but this requires the user to download the jars from the
   internet (or a private Maven repo) when executing `spark-shell --remote xxx`
   for the first time. This may not be a good solution for users deploying Spark
   in environments with restricted internet access.
   How about including these jars in the Spark binary tgz, in a different folder?
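
   As a side note on the version parsing in this hunk: nested expansions like
   `${${${coordinate##*_}%.jar*}#*-}` are zsh syntax and will fail in bash,
   which needs intermediate variables. A minimal bash-compatible sketch of the
   same extraction (the example jar filename is an assumption, chosen only to
   illustrate the pattern):

   ```shell
   #!/usr/bin/env bash
   # Hypothetical jar name matching the spark-connect_<scala>-<spark>.jar pattern.
   coordinate="spark-connect_2.13-4.0.0-preview1.jar"

   base=${coordinate##*_}   # strip up to the last '_'  -> "2.13-4.0.0-preview1.jar"
   base=${base%.jar*}       # strip the ".jar" suffix   -> "2.13-4.0.0-preview1"
   scalaver=${base%%-*}     # up to the first '-'       -> "2.13"
   sparkver=${base#*-}      # after the first '-'       -> "4.0.0-preview1"
   echo "$scalaver $sparkver"
   # prints: 2.13 4.0.0-preview1
   ```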
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]