This is an automated email from the ASF dual-hosted git repository.
kazuyukitanimura pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/main by this push:
new c819bc0b Small changes in docs (#512)
c819bc0b is described below
commit c819bc0b0d3d1c98e6b36fcafcf184f5bb4b2c2c
Author: Semyon <[email protected]>
AuthorDate: Thu Jun 6 00:13:04 2024 +0200
Small changes in docs (#512)
## Which issue does this PR close?
Closes #503
Closes #191
## Rationale for this change
1. Provide a way to build Comet from source in an isolated environment
without access to github.com
2. Update the documentation on the compatibility of Spark AQE and
Comet Shuffle
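
As a rough illustration of the second point, the two settings named in the tuning note could be passed on the command line as below. This is only a sketch: the helper name `comet_shuffle_confs` is hypothetical, and the property names are taken from the tuning doc changed in this commit.

```shell
# Hypothetical helper: prints the --conf flags that keep Comet Shuffle enabled
# while turning off Spark AQE partition coalescing (one argument per line).
comet_shuffle_confs() {
  printf '%s\n' \
    "--conf" "spark.comet.exec.shuffle.enabled=true" \
    "--conf" "spark.sql.adaptive.coalescePartitions.enabled=false"
}
```

It might be used as `"$SPARK_HOME"/bin/spark-shell $(comet_shuffle_confs)`, relying on word splitting to turn each printed line into one argument.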
## What changes are included in this PR?
- Update the tuning section on the compatibility of Comet Shuffle and Spark AQE
- Add a `release-nogit` target for building in isolated environments
- Update the installation section of the docs
Changes to be committed:
modified: Makefile
modified: docs/source/user-guide/installation.md
modified: docs/source/user-guide/tuning.md
## How are these changes tested?
I ran both `make release` and `make release-nogit`. The first one created a
properties file in `common/target/classes`, but the second did not. The flag
`-Dmaven.gitcommitid.skip=true` is described in [this
comment](https://github.com/git-commit-id/git-commit-id-maven-plugin/issues/392#issuecomment-432309487).
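
The manual check above could be scripted roughly as follows. This is a minimal sketch, not part of the commit: the function name `has_properties_file` is hypothetical, and since the exact name of the generated properties file is not stated here, the file pattern is left as a parameter.

```shell
# Hypothetical check: succeeds if the directory contains at least one file
# matching the given glob pattern (default: *.properties).
has_properties_file() {
  dir="$1"
  pattern="${2:-*.properties}"
  [ -d "$dir" ] || return 1
  # $pattern is intentionally unquoted so the shell expands the glob;
  # ls fails when the pattern matches nothing.
  ls "$dir"/$pattern >/dev/null 2>&1
}
```

After `make release`, `has_properties_file common/target/classes` would be expected to succeed, and after `make clean` followed by `make release-nogit` it would be expected to fail. If other properties files live in that directory, a narrower pattern is needed.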
---
Makefile | 3 +++
docs/source/user-guide/installation.md | 6 ++++++
docs/source/user-guide/tuning.md | 2 ++
3 files changed, 11 insertions(+)
diff --git a/Makefile b/Makefile
index b9b9707b..573a7f95 100644
--- a/Makefile
+++ b/Makefile
@@ -77,6 +77,9 @@ release-linux: clean
release:
cd core && RUSTFLAGS="-Ctarget-cpu=native" cargo build --release
./mvnw install -Prelease -DskipTests $(PROFILES)
+release-nogit:
+ cd core && RUSTFLAGS="-Ctarget-cpu=native" cargo build --features nightly --release
+ ./mvnw install -Prelease -DskipTests $(PROFILES) -Dmaven.gitcommitid.skip=true
benchmark-%: clean release
cd spark && COMET_CONF_DIR=$(shell pwd)/conf MAVEN_OPTS='-Xmx20g' ../mvnw exec:java -Dexec.mainClass="$*" -Dexec.classpathScope="test" -Dexec.cleanupDaemonThreads="false" -Dexec.args="$(filter-out $@,$(MAKECMDGOALS))" $(PROFILES)
.DEFAULT:
diff --git a/docs/source/user-guide/installation.md b/docs/source/user-guide/installation.md
index 03ecc53e..7335a488 100644
--- a/docs/source/user-guide/installation.md
+++ b/docs/source/user-guide/installation.md
@@ -57,6 +57,12 @@ Note that the project builds for Scala 2.12 by default but can be built for Scal
make release PROFILES="-Pspark-3.4 -Pscala-2.13"
```
+To build Comet from the source distribution on an isolated environment without an access to `github.com` it is necessary to disable `git-commit-id-maven-plugin`, otherwise you will face errors that there is no access to the git during the build process. In that case you may use:
+
+```console
+make release-nogit PROFILES="-Pspark-3.4"
+```
+
## Run Spark Shell with Comet enabled
Make sure `SPARK_HOME` points to the same Spark version as Comet was built for.
diff --git a/docs/source/user-guide/tuning.md b/docs/source/user-guide/tuning.md
index 5a3100bd..f46ab9e0 100644
--- a/docs/source/user-guide/tuning.md
+++ b/docs/source/user-guide/tuning.md
@@ -39,6 +39,8 @@ It must be set before the Spark context is created. You can enable or disable Co
at runtime by setting `spark.comet.exec.shuffle.enabled` to `true` or `false`.
Once it is disabled, Comet will fallback to the default Spark shuffle manager.
+> **_NOTE:_** At the moment Comet Shuffle is not compatible with Spark AQE partition coalesce. To disable set `spark.sql.adaptive.coalescePartitions.enabled` to `false`.
+
### Shuffle Mode
Comet provides three shuffle modes: Columnar Shuffle, Native Shuffle and Auto Mode.