This is an automated email from the ASF dual-hosted git repository.
philo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git
The following commit(s) were added to refs/heads/main by this push:
new 65046f066a [MINOR] Fix script name typos for TPC-H/DS data generation (#11305)
65046f066a is described below
commit 65046f066af44f7fc6c6c7504e3694d68885be12
Author: PHILO-HE <[email protected]>
AuthorDate: Wed Dec 17 18:22:32 2025 +0800
[MINOR] Fix script name typos for TPC-H/DS data generation (#11305)
---
docs/get-started/Velox.md | 4 ++--
tools/workload/tpcds-delta/README.md | 4 ++--
.../gen_data/{tpcds-dategen-delta.sh => tpcds-datagen-delta.sh} | 0
tools/workload/tpcds/README.md | 2 +-
.../{tpcds-dategen-parquet.sh => tpcds-datagen-parquet.sh} | 0
tools/workload/tpch/README.md | 2 +-
.../{tpch-dategen-parquet.sh => tpch-datagen-parquet.sh} | 0
7 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/docs/get-started/Velox.md b/docs/get-started/Velox.md
index 2439db55cd..f722b596c4 100644
--- a/docs/get-started/Velox.md
+++ b/docs/get-started/Velox.md
@@ -473,8 +473,8 @@ All TPC-H and TPC-DS queries are supported in Gluten Velox backend. You may refe
## Data preparation
-The data generation scripts are [TPC-H dategen script](../../tools/workload/tpch/gen_data/parquet_dataset/tpch-dategen-parquet.sh) and
-[TPC-DS dategen script](../../tools/workload/tpcds/gen_data/parquet_dataset/tpcds-dategen-parquet.sh).
+The data generation scripts are [TPC-H dategen script](../../tools/workload/tpch/gen_data/parquet_dataset/tpch-datagen-parquet.sh) and
+[TPC-DS dategen script](../../tools/workload/tpcds/gen_data/parquet_dataset/tpcds-datagen-parquet.sh).
The used TPC-H and TPC-DS queries are the original ones, and can be accessed from [TPC-DS queries](../../tools/gluten-it/common/src/main/resources/tpcds-queries)
and [TPC-H queries](../../tools/gluten-it/common/src/main/resources/tpch-queries).
diff --git a/tools/workload/tpcds-delta/README.md b/tools/workload/tpcds-delta/README.md
index 11e9ea01dc..6193d561a0 100644
--- a/tools/workload/tpcds-delta/README.md
+++ b/tools/workload/tpcds-delta/README.md
@@ -4,7 +4,7 @@ This workload example is verified with JDK 8, Spark 3.4.4 and Delta 2.4.0.
## Test dataset
-Use bash script `tpcds-dategen-delta.sh` to generate the data. The script relies on a already-built gluten-it
+Use bash script `tpcds-datagen-delta.sh` to generate the data. The script relies on a already-built gluten-it
executable. To build it, following the steps:
```bash
@@ -16,7 +16,7 @@ Then call the data generator script:
```bash
cd ${GLUTEN_HOME}/tools/workload/tpcds-delta/gen_data
-./tpcds-dategen-delta.sh
+./tpcds-datagen-delta.sh
```
Meanings of the commands that are used in the script are explained as follows:
diff --git a/tools/workload/tpcds-delta/gen_data/tpcds-dategen-delta.sh b/tools/workload/tpcds-delta/gen_data/tpcds-datagen-delta.sh
similarity index 100%
rename from tools/workload/tpcds-delta/gen_data/tpcds-dategen-delta.sh
rename to tools/workload/tpcds-delta/gen_data/tpcds-datagen-delta.sh
diff --git a/tools/workload/tpcds/README.md b/tools/workload/tpcds/README.md
index a64701089e..2fc5e20055 100644
--- a/tools/workload/tpcds/README.md
+++ b/tools/workload/tpcds/README.md
@@ -7,7 +7,7 @@ Parquet format is supported. Here are the steps to generate the testing datasets
Please refer to the scripts in [parquet_dataset](./gen_data/parquet_dataset/) directory to generate parquet dataset.
Note this script relies on the [spark-sql-perf](https://github.com/databricks/spark-sql-perf) and the [tpcds-kit](https://github.com/databricks/tpcds-kit) package from Databricks.
-In tpcds-dategen-parquet.sh, several parameters should be configured according to the system.
+In tpcds-datagen-parquet.sh, several parameters should be configured according to the system.
```
spark_sql_perf_jar=/PATH/TO/spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar
...
diff --git a/tools/workload/tpcds/gen_data/parquet_dataset/tpcds-dategen-parquet.sh b/tools/workload/tpcds/gen_data/parquet_dataset/tpcds-datagen-parquet.sh
similarity index 100%
rename from tools/workload/tpcds/gen_data/parquet_dataset/tpcds-dategen-parquet.sh
rename to tools/workload/tpcds/gen_data/parquet_dataset/tpcds-datagen-parquet.sh
diff --git a/tools/workload/tpch/README.md b/tools/workload/tpch/README.md
index 5d288229f9..d672c84d30 100644
--- a/tools/workload/tpch/README.md
+++ b/tools/workload/tpch/README.md
@@ -7,7 +7,7 @@ Parquet and DWRF (a fork of the ORC file format) format files are both supported
Please refer to the scripts in [parquet_dataset](./gen_data/parquet_dataset/)
directory to generate parquet dataset. Note this script relies on the
[spark-sql-perf](https://github.com/databricks/spark-sql-perf) and
[tpch-dbgen](https://github.com/databricks/tpch-dbgen) package from Databricks.
Note in the tpch-dbgen kits, we need to do a slight modification to allow Spark
to convert the csv based content to parquet, please make sure to use this
commit: [0469309147b42abac8857fa61b4cf69a6d [...]
-In tpch-dategen-parquet.sh, several parameters should be configured according to the system.
+In tpch-datagen-parquet.sh, several parameters should be configured according to the system.
```
spark_sql_perf_jar=/PATH/TO/spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar
...
diff --git a/tools/workload/tpch/gen_data/parquet_dataset/tpch-dategen-parquet.sh b/tools/workload/tpch/gen_data/parquet_dataset/tpch-datagen-parquet.sh
similarity index 100%
rename from tools/workload/tpch/gen_data/parquet_dataset/tpch-dategen-parquet.sh
rename to tools/workload/tpch/gen_data/parquet_dataset/tpch-datagen-parquet.sh