This is an automated email from the ASF dual-hosted git repository.
philo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git
The following commit(s) were added to refs/heads/main by this push:
new eda660b572 [VL] Fix link issues found in release process (#9851)
eda660b572 is described below
commit eda660b572c78a8aaf5ea0f9d217e5d0ca6340c7
Author: PHILO-HE <[email protected]>
AuthorDate: Wed Jun 4 21:33:53 2025 +0800
[VL] Fix link issues found in release process (#9851)
---
tools/gluten-it/README.md | 10 +++++-----
tools/gluten-it/sbin/gluten-it.sh | 2 ++
tools/gluten-it/spark-home/jars | 1 -
tools/workload/tpch/README.md | 23 +----------------------
4 files changed, 8 insertions(+), 28 deletions(-)
diff --git a/tools/gluten-it/README.md b/tools/gluten-it/README.md
index 37ed7e82b4..a1f7ccc891 100644
--- a/tools/gluten-it/README.md
+++ b/tools/gluten-it/README.md
@@ -2,7 +2,7 @@
The project makes it easy to test Gluten build locally.
-## Gluten ?
+## Gluten
Gluten is a native Spark SQL implementation as a standard Spark plug-in.
@@ -10,11 +10,11 @@ https://github.com/apache/incubator-gluten
## Getting Started
-### 1. Install Gluten in your local machine
+### 1. Build Gluten
-See official Gluten build guidance https://github.com/apache/incubator-gluten#how-to-use-gluten
+See official Gluten build guidance https://github.com/apache/incubator-gluten#build-from-source.
-### 2. Install and run gluten-it with Spark version
+### 2. Build and run gluten-it
```sh
cd gluten/tools/gluten-it
@@ -22,7 +22,7 @@ mvn clean package -P{Spark-Version}
sbin/gluten-it.sh
```
-> Note: *Spark-Version* support *spark-3.2* and *spark-3.3* only
+Note: **Spark-Version** can only be **spark-3.2**, **spark-3.3**,
**spark-3.4** or **spark-3.5**.
## Usage
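The README hunk above narrows the `-P{Spark-Version}` Maven profile to four accepted values. A minimal shell sketch of guarding the build against a typo'd profile name; the `check_profile` helper is hypothetical, not part of the repo:

```shell
#!/bin/sh
# Hypothetical pre-check before `mvn clean package -P<profile>`:
# the profiles the gluten-it README says are supported.
SUPPORTED_PROFILES="spark-3.2 spark-3.3 spark-3.4 spark-3.5"

check_profile() {
  # Surround both the list and the candidate with spaces so the
  # pattern matches whole tokens only (spark-3.3 won't match spark-3.30).
  case " $SUPPORTED_PROFILES " in
    *" $1 "*) echo "ok" ;;
    *)        echo "unsupported" ;;
  esac
}

check_profile spark-3.5   # prints: ok
check_profile spark-2.4   # prints: unsupported
```

A wrapper script could call this and abort before invoking Maven, turning a late build failure into an immediate, readable error.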
diff --git a/tools/gluten-it/sbin/gluten-it.sh b/tools/gluten-it/sbin/gluten-it.sh
index 8c1a6413b5..23c3512470 100755
--- a/tools/gluten-it/sbin/gluten-it.sh
+++ b/tools/gluten-it/sbin/gluten-it.sh
@@ -30,6 +30,8 @@ SPARK_JVM_OPTIONS=$($JAVA_HOME/bin/java -cp $JAR_PATH org.apache.gluten.integrat
EMBEDDED_SPARK_HOME=$BASEDIR/../spark-home
+mkdir $EMBEDDED_SPARK_HOME && ln -snf $BASEDIR/../package/target/lib $EMBEDDED_SPARK_HOME/jars
+
# We temporarily disallow setting these two variables by caller.
SPARK_HOME=""
SPARK_SCALA_VERSION=""
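The line added above creates at runtime the symlink that was previously checked into the repo (and is deleted below). A self-contained sketch, using temporary paths rather than the real repo layout, of why `ln -snf` makes that link creation safe to repeat:

```shell
#!/bin/sh
# Illustrative stand-ins for gluten-it's package/target/lib and
# spark-home directories, created in a throwaway temp dir.
base=$(mktemp -d)
mkdir -p "$base/package/target/lib" "$base/spark-home"

# -s: make a symbolic link; -n: treat an existing symlink-to-directory
# as a file instead of descending into it; -f: replace it.
# Running the command twice therefore leaves one correct link rather
# than creating a nested spark-home/jars/lib link on the second run.
ln -snf "$base/package/target/lib" "$base/spark-home/jars"
ln -snf "$base/package/target/lib" "$base/spark-home/jars"

readlink "$base/spark-home/jars"   # prints .../package/target/lib
```

Without `-n`, the second invocation would follow the existing `jars` symlink into the target directory and create a stray `lib` link inside it.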
diff --git a/tools/gluten-it/spark-home/jars b/tools/gluten-it/spark-home/jars
deleted file mode 120000
index 2939305caa..0000000000
--- a/tools/gluten-it/spark-home/jars
+++ /dev/null
@@ -1 +0,0 @@
-../package/target/lib
\ No newline at end of file
diff --git a/tools/workload/tpch/README.md b/tools/workload/tpch/README.md
index 4180df60f8..10a8930583 100644
--- a/tools/workload/tpch/README.md
+++ b/tools/workload/tpch/README.md
@@ -1,7 +1,7 @@
# Test on Velox backend with TPC-H workload
## Test datasets
-Parquet and DWRF(a fork of the ORC file format) format files are both supported. Here are the steps to generate the testing datasets:
+Parquet and DWRF (a fork of the ORC file format) format files are both supported. Here are the steps to generate the testing datasets:
### Generate the Parquet dataset
Please refer to the scripts in [parquet_dataset](./gen_data/parquet_dataset/) directory to generate parquet dataset. Note this script relies on the [spark-sql-perf](https://github.com/databricks/spark-sql-perf) and [tpch-dbgen](https://github.com/databricks/tpch-dbgen) package from Databricks. Note in the tpch-dbgen kits, we need to do a slight modification to allow Spark to convert the csv based content to parquet, please make sure to use this commit: [0469309147b42abac8857fa61b4cf69a6d [...]
@@ -26,27 +26,6 @@ val rootDir = "/PATH/TO/TPCH_PARQUET_PATH" // root directory of location to crea
val dbgenDir = "/PATH/TO/TPCH_DBGEN" // location of dbgen
```
-Currently, Gluten with Velox can support both Parquet and DWRF file format and three compression codec including snappy, gzip, zstd.
-Below step, to convert Parquet to DWRF, is optional if you are using Parquet format to run the testing.
-
-### Convert the Parquet dataset to DWRF dataset(OPTIONAL)
-And then please refer to the scripts in [dwrf_dataset](./gen_data/dwrf_dataset/) directory to convert the Parquet dataset to DWRF dataset.
-
-In tpch_convert_parquet_dwrf.sh, spark configures should be set according to the system.
-
-```
-export GLUTEN_HOME=/PATH/TO/gluten
-...
---executor-cores 8 \
---num-executors 14 \
-```
-
-In tpch_convert_parquet_dwrf.scala, the table path should be configured.
-```
-val parquet_file_path = "/PATH/TO/TPCH_PARQUET_PATH"
-val dwrf_file_path = "/PATH/TO/TPCH_DWRF_PATH"
-```
-
## Test Queries
We provide the test queries in [TPC-H queries](../../../tools/gluten-it/common/src/main/resources/tpch-queries).
We also provide a scala script in [Run TPC-H](./run_tpch/) directory about how to run TPC-H queries.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]