This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.4 by this push:
new fc29b07a31d [SPARK-42656][CONNECT][FOLLOWUP] Fix the spark-connect script
fc29b07a31d is described below
commit fc29b07a31da4be9142b6a0a7cdff5e72ab4edb7
Author: Zhen Li <[email protected]>
AuthorDate: Thu Mar 9 10:29:47 2023 +0900
[SPARK-42656][CONNECT][FOLLOWUP] Fix the spark-connect script
### What changes were proposed in this pull request?
The spark-connect script is broken, as spark-submit needs a jar at the end of the command.
Also ensured that when Scala 2.13 is set, all commands in the scripts run with
`-Pscala-2.13`.
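For context, each script now derives that profile from the Scala version declared in the root pom.xml (Scala 2.13 is usually selected via `dev/change-scala-version.sh 2.13`, which rewrites `scala.binary.version` there). The detection is the same small pipeline in every script, as in the diff below:
```
# Read the Scala binary version (e.g. 2.12 or 2.13) from pom.xml and
# turn it into the matching sbt profile flag, e.g. -Pscala-2.13.
SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
```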
Example usage:
Start spark connect with default settings:
* `./connector/connect/bin/spark-connect-shell`
* or `./connector/connect/bin/spark-connect` (enter "q" followed by a newline to exit the program)
Start Scala client with default settings:
`./connector/connect/bin/spark-connect-scala-client`
Start spark connect with extra configs:
* `./connector/connect/bin/spark-connect-shell --conf
spark.connect.grpc.binding.port=8888`
* or `./connector/connect/bin/spark-connect --conf
spark.connect.grpc.binding.port=8888`
Start Scala client with a connection string:
```
export SPARK_REMOTE="sc://localhost:8888/"
./connector/connect/bin/spark-connect-scala-client
```
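Putting the two together, starting a server on a custom port and attaching the Scala client to it looks like this (port 8888 is just an example):
```
# Terminal 1: start the Spark Connect server on port 8888
./connector/connect/bin/spark-connect --conf spark.connect.grpc.binding.port=8888

# Terminal 2: point the Scala client at that server
export SPARK_REMOTE="sc://localhost:8888/"
./connector/connect/bin/spark-connect-scala-client
```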
### Why are the changes needed?
Bug fix
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually tested all the changed scripts with Scala 2.12 and 2.13.
Test example with expected results:
`./connector/connect/bin/spark-connect-shell` :
<img width="1050" alt="Screen Shot 2023-03-08 at 2 14 31 PM"
src="https://user-images.githubusercontent.com/4190164/223863343-d5d159d9-da7c-47c7-b55a-a2854c5f5d76.png">
Verify that the Spark Connect server is started at the correct port, e.g.
```
> telnet localhost 15002
Trying ::1...
Connected to localhost.
Escape character is '^]'.
```
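If telnet is unavailable, a plain port probe with netcat gives the same signal (assuming `nc` is installed; the output wording varies by implementation):
```
nc -vz localhost 15002
```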
`./connector/connect/bin/spark-connect`:
<img width="1680" alt="Screen Shot 2023-03-08 at 2 13 09 PM"
src="https://user-images.githubusercontent.com/4190164/223863099-41195599-c49d-4db4-a1e2-e129a649cd81.png">
The server has started successfully once the last line of output appears.
`./connector/connect/bin/spark-connect-scala-client`:
<img width="1658" alt="Screen Shot 2023-03-08 at 2 11 58 PM"
src="https://user-images.githubusercontent.com/4190164/223862992-c8a3a36a-9f69-40b8-b82e-5dab85ed14ce.png">
Verify the client can run some simple queries.
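For instance, assuming the init script binds a session to `spark`, a trivial round-trip query in the client REPL is enough to confirm the setup (the exact client API surface in 3.4 may differ slightly):
```
scala> spark.sql("select 1 as id").collect()
```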
Closes #40344 from zhenlineo/fix-scripts.
Authored-by: Zhen Li <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit b5243d7f7f9ede78a711eb168cf951f4bde7a8fa)
Signed-off-by: Hyukjin Kwon <[email protected]>
---
connector/connect/bin/spark-connect | 11 +++++++++--
connector/connect/bin/spark-connect-scala-client | 19 ++++++++++---------
connector/connect/bin/spark-connect-shell | 10 +++++++---
3 files changed, 26 insertions(+), 14 deletions(-)
diff --git a/connector/connect/bin/spark-connect b/connector/connect/bin/spark-connect
index 62d0d36b441..772a88a04f3 100755
--- a/connector/connect/bin/spark-connect
+++ b/connector/connect/bin/spark-connect
@@ -26,7 +26,14 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
cd "$FWDIR"
export SPARK_HOME=$FWDIR
+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
# Build the jars needed for spark submit and spark connect
-build/sbt -Phive -Pconnect package
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package
+
+# This jar is already in the classpath, but the submit commands wants a jar as the input.
+CONNECT_JAR=`ls "${SPARK_HOME}"/assembly/target/scala-"${SCALA_BINARY_VER}"/jars/spark-connect_*.jar | paste -sd ',' -`
-exec "${SPARK_HOME}"/bin/spark-submit --class
org.apache.spark.sql.connect.SimpleSparkConnectService "$@"
\ No newline at end of file
+exec "${SPARK_HOME}"/bin/spark-submit "$@" --class
org.apache.spark.sql.connect.SimpleSparkConnectService "$CONNECT_JAR"
diff --git a/connector/connect/bin/spark-connect-scala-client b/connector/connect/bin/spark-connect-scala-client
index 902091a74de..8c5e687ef24 100755
--- a/connector/connect/bin/spark-connect-scala-client
+++ b/connector/connect/bin/spark-connect-scala-client
@@ -34,17 +34,18 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
cd "$FWDIR"
export SPARK_HOME=$FWDIR
-# Build the jars needed for spark connect JVM client
-build/sbt "sql/package;connect-client-jvm/assembly"
-
-CONNECT_CLASSPATH="$(build/sbt -DcopyDependencies=false "export connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
-SQL_CLASSPATH="$(build/sbt -DcopyDependencies=false "export sql/fullClasspath" | grep jar | tail -n1)"
-
-INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc
-
# Determine the Scala version used in Spark
SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
SCALA_VER=`grep "scala.version" "${SPARK_HOME}/pom.xml" | grep ${SCALA_BINARY_VER} | head -n1 | awk -F '[<>]' '{print $3}'`
SCALA_BIN="${SPARK_HOME}/build/scala-${SCALA_VER}/bin/scala"
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
+# Build the jars needed for spark connect JVM client
+build/sbt "${SCALA_ARG}" "sql/package;connect-client-jvm/assembly"
+
+CONNECT_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
+SQL_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export sql/fullClasspath" | grep jar | tail -n1)"
+
+INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc
-exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
\ No newline at end of file
+exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
diff --git a/connector/connect/bin/spark-connect-shell b/connector/connect/bin/spark-connect-shell
index b31ba1bf140..0fcf831e03d 100755
--- a/connector/connect/bin/spark-connect-shell
+++ b/connector/connect/bin/spark-connect-shell
@@ -26,7 +26,11 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
cd "$FWDIR"
export SPARK_HOME=$FWDIR
-# Build the jars needed for spark shell and spark connect
-build/sbt -Phive -Pconnect package
+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
-exec "${SPARK_HOME}"/bin/spark-shell --conf
spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"
\ No newline at end of file
+# Build the jars needed for spark submit and spark connect
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package
+
+exec "${SPARK_HOME}"/bin/spark-shell --conf
spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]