This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new fc29b07a31d [SPARK-42656][CONNECT][FOLLOWUP] Fix the spark-connect script
fc29b07a31d is described below

commit fc29b07a31da4be9142b6a0a7cdff5e72ab4edb7
Author: Zhen Li <[email protected]>
AuthorDate: Thu Mar 9 10:29:47 2023 +0900

    [SPARK-42656][CONNECT][FOLLOWUP] Fix the spark-connect script
    
    ### What changes were proposed in this pull request?
    The spark-connect script is broken because spark-submit needs a jar at the end of the command.
    Also ensured that when Scala 2.13 is set, all commands in the scripts run with `-Pscala-2.13`.
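    
    For context, the gist of the fix (condensed from the diff below, not the full script) is to derive the active Scala profile from the root `pom.xml` and pass the connect jar explicitly to spark-submit:
    ```
    # Derive the Scala profile that matches the checked-out build
    SCALA_BINARY_VER=$(grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}')
    SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
    build/sbt "${SCALA_ARG}" -Phive -Pconnect package

    # spark-submit requires a jar argument, so point it at the built connect jar
    CONNECT_JAR=$(ls "${SPARK_HOME}"/assembly/target/scala-"${SCALA_BINARY_VER}"/jars/spark-connect_*.jar | paste -sd ',' -)
    exec "${SPARK_HOME}"/bin/spark-submit "$@" --class org.apache.spark.sql.connect.SimpleSparkConnectService "$CONNECT_JAR"
    ```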
    
    Example usage:
    Start spark connect with default settings:
    * `./connector/connect/bin/spark-connect-shell`
    * or `./connector/connect/bin/spark-connect` (enter "q" followed by a new line to exit the program)
    
    Start Scala client with default settings: `./connector/connect/bin/spark-connect-scala-client`
    
    Start spark connect with extra configs:
    * `./connector/connect/bin/spark-connect-shell --conf spark.connect.grpc.binding.port=8888`
    * or `./connector/connect/bin/spark-connect --conf spark.connect.grpc.binding.port=8888`
    
    Start Scala client with a connection string:
    ```
    export SPARK_REMOTE="sc://localhost:8888/"
    ./connector/connect/bin/spark-connect-scala-client
    ```
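    
    Equivalently, the environment variable can be scoped to a single invocation (plain shell semantics, not a new script option):
    ```
    SPARK_REMOTE="sc://localhost:8888/" ./connector/connect/bin/spark-connect-scala-client
    ```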
    
    ### Why are the changes needed?
    Bug fix
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Manually tested all the changed scripts with Scala 2.12 and 2.13.
    
    Test examples with expected results:
    `./connector/connect/bin/spark-connect-shell` :
    <img width="1050" alt="Screen Shot 2023-03-08 at 2 14 31 PM" 
src="https://user-images.githubusercontent.com/4190164/223863343-d5d159d9-da7c-47c7-b55a-a2854c5f5d76.png";>
    
    Verify that the spark connect server has started on the correct port, e.g.
    ```
    > telnet localhost 15002
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
    ```
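    
    An equivalent quick check, assuming a netcat variant that supports `-z` is installed (not part of this patch):
    ```
    nc -z localhost 15002 && echo "spark connect server is listening"
    ```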
    
    `./connector/connect/bin/spark-connect`:
    <img width="1680" alt="Screen Shot 2023-03-08 at 2 13 09 PM" 
src="https://user-images.githubusercontent.com/4190164/223863099-41195599-c49d-4db4-a1e2-e129a649cd81.png";>
    Server started successfully when seeing the last line output.
    
    `./connector/connect/bin/spark-connect-scala-client`:
    <img width="1658" alt="Screen Shot 2023-03-08 at 2 11 58 PM" 
src="https://user-images.githubusercontent.com/4190164/223862992-c8a3a36a-9f69-40b8-b82e-5dab85ed14ce.png";>
    Verify the client can run some simple quries.
    
    Closes #40344 from zhenlineo/fix-scripts.
    
    Authored-by: Zhen Li <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
    (cherry picked from commit b5243d7f7f9ede78a711eb168cf951f4bde7a8fa)
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 connector/connect/bin/spark-connect              | 11 +++++++++--
 connector/connect/bin/spark-connect-scala-client | 19 ++++++++++---------
 connector/connect/bin/spark-connect-shell        | 10 +++++++---
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/connector/connect/bin/spark-connect b/connector/connect/bin/spark-connect
index 62d0d36b441..772a88a04f3 100755
--- a/connector/connect/bin/spark-connect
+++ b/connector/connect/bin/spark-connect
@@ -26,7 +26,14 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR
 
+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
 # Build the jars needed for spark submit and spark connect
-build/sbt -Phive -Pconnect package
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package
+
+# This jar is already in the classpath, but the submit command wants a jar as the input.
+CONNECT_JAR=`ls "${SPARK_HOME}"/assembly/target/scala-"${SCALA_BINARY_VER}"/jars/spark-connect_*.jar | paste -sd ',' -`
 
-exec "${SPARK_HOME}"/bin/spark-submit --class 
org.apache.spark.sql.connect.SimpleSparkConnectService "$@"
\ No newline at end of file
+exec "${SPARK_HOME}"/bin/spark-submit "$@" --class 
org.apache.spark.sql.connect.SimpleSparkConnectService "$CONNECT_JAR"
diff --git a/connector/connect/bin/spark-connect-scala-client b/connector/connect/bin/spark-connect-scala-client
index 902091a74de..8c5e687ef24 100755
--- a/connector/connect/bin/spark-connect-scala-client
+++ b/connector/connect/bin/spark-connect-scala-client
@@ -34,17 +34,18 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR
 
-# Build the jars needed for spark connect JVM client
-build/sbt "sql/package;connect-client-jvm/assembly"
-
-CONNECT_CLASSPATH="$(build/sbt -DcopyDependencies=false "export 
connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
-SQL_CLASSPATH="$(build/sbt -DcopyDependencies=false "export sql/fullClasspath" 
| grep jar | tail -n1)"
-
-INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc
-
 # Determine the Scala version used in Spark
 SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
 SCALA_VER=`grep "scala.version" "${SPARK_HOME}/pom.xml" | grep ${SCALA_BINARY_VER} | head -n1 | awk -F '[<>]' '{print $3}'`
 SCALA_BIN="${SPARK_HOME}/build/scala-${SCALA_VER}/bin/scala"
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
+# Build the jars needed for spark connect JVM client
+build/sbt "${SCALA_ARG}" "sql/package;connect-client-jvm/assembly"
+
+CONNECT_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export 
connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
+SQL_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export 
sql/fullClasspath" | grep jar | tail -n1)"
+
+INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc
 
-exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
\ No newline at end of file
+exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
diff --git a/connector/connect/bin/spark-connect-shell b/connector/connect/bin/spark-connect-shell
index b31ba1bf140..0fcf831e03d 100755
--- a/connector/connect/bin/spark-connect-shell
+++ b/connector/connect/bin/spark-connect-shell
@@ -26,7 +26,11 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR
 
-# Build the jars needed for spark shell and spark connect
-build/sbt -Phive -Pconnect package
+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
 
-exec "${SPARK_HOME}"/bin/spark-shell --conf 
spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"
\ No newline at end of file
+# Build the jars needed for spark shell and spark connect
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package
+
+exec "${SPARK_HOME}"/bin/spark-shell --conf 
spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
