This is an automated email from the ASF dual-hosted git repository.
agrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/main by this push:
new 365df8ed7 feat: Add CI check to ensure generated docs are in sync with
code (#2779)
365df8ed7 is described below
commit 365df8ed7cec0ed271c8248c9ab7d575cd0c7d94
Author: Andy Grove <[email protected]>
AuthorDate: Fri Nov 14 15:33:12 2025 -0700
feat: Add CI check to ensure generated docs are in sync with code (#2779)
---
.github/actions/java-test/action.yaml | 27 ++++++++++++++++++++-
dev/ci/check-working-tree-clean.sh | 41 ++++++++++++++++++++++++++++++++
docs/source/user-guide/latest/configs.md | 3 ++-
3 files changed, 69 insertions(+), 2 deletions(-)
diff --git a/.github/actions/java-test/action.yaml
b/.github/actions/java-test/action.yaml
index 5afb52033..ab2bcd74f 100644
--- a/.github/actions/java-test/action.yaml
+++ b/.github/actions/java-test/action.yaml
@@ -62,7 +62,32 @@ runs:
- name: Run Maven compile
shell: bash
run: |
- ./mvnw -B compile test-compile scalafix:scalafix -Dscalafix.mode=CHECK
-Psemanticdb ${{ inputs.maven_opts }}
+ ./mvnw -B package -DskipTests scalafix:scalafix -Dscalafix.mode=CHECK
-Psemanticdb ${{ inputs.maven_opts }}
+
+ - name: Setup Node.js
+ uses: actions/setup-node@v6
+ with:
+ node-version: '24'
+
+ - name: Install prettier
+ shell: bash
+ run: |
+ npm install -g prettier
+
+ - name: Run prettier
+ shell: bash
+ run: |
+ npx prettier "**/*.md" --write
+
+ - name: Mark workspace as safe for git
+ shell: bash
+ run: |
+ git config --global --add safe.directory "$GITHUB_WORKSPACE"
+
+ - name: Check for any local git changes (such as generated docs)
+ shell: bash
+ run: |
+ ./dev/ci/check-working-tree-clean.sh
- name: Run all tests
shell: bash
diff --git a/dev/ci/check-working-tree-clean.sh
b/dev/ci/check-working-tree-clean.sh
new file mode 100755
index 000000000..15e79bb27
--- /dev/null
+++ b/dev/ci/check-working-tree-clean.sh
@@ -0,0 +1,41 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+set -euo pipefail # Exit on errors, undefined vars, pipe failures
+
+# Check if we're in a git repository
+if ! git rev-parse --git-dir > /dev/null 2>&1; then
+ echo "Error: Not in a git repository"
+ exit 1
+fi
+
+# Fail if there are any local changes (staged, unstaged, or untracked)
+if [ -n "$(git status --porcelain)" ]; then
+ echo "Working tree is not clean:"
+ git status --short
+ echo "Full diff:"
+ git diff
+ echo ""
+ echo "Please commit, stash, or clean these changes before proceeding."
+ exit 1
+else
+ echo "Working tree is clean"
+fi
+
diff --git a/docs/source/user-guide/latest/configs.md
b/docs/source/user-guide/latest/configs.md
index 50e40cc90..ea8589e94 100644
--- a/docs/source/user-guide/latest/configs.md
+++ b/docs/source/user-guide/latest/configs.md
@@ -67,7 +67,7 @@ Comet provides the following configuration settings.
| `spark.comet.exceptionOnDatetimeRebase` | Whether to throw exception when
seeing dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar.
Since Spark 3, dates/timestamps were written according to the Proleptic
Gregorian calendar. When this is true, Comet will throw exceptions when seeing
these dates/timestamps that were written by Spark version before 3.0. If this
is false, these dates/timestamps will be read as if they were written to the
Proleptic Gregorian calendar [...]
| `spark.comet.exec.enabled` | Whether to enable Comet native
vectorized execution for Spark. This controls whether Spark should convert
operators into their Comet counterparts and execute them in native space. Note:
each operator is associated with a separate config in the format of
`spark.comet.exec.<operator_name>.enabled` at the moment, and both the config
and this need to be turned on, in order for the operator to be executed in
native. [...]
| `spark.comet.exec.replaceSortMergeJoin` | Experimental feature to force
Spark to replace SortMergeJoin with ShuffledHashJoin for improved performance.
This feature is not stable yet. For more information, refer to the [Comet
Tuning Guide](https://datafusion.apache.org/comet/user-guide/tuning.html).
[...]
-| `spark.comet.exec.strictFloatingPoint` | When enabled, fall back to
Spark for floating-point operations that differ from Spark, such as when
comparing or sorting -0.0 and 0.0. For more information, refer to the [Comet
Compatibility
Guide](https://datafusion.apache.org/comet/user-guide/compatibility.html).
[...]
+| `spark.comet.exec.strictFloatingPoint` | When enabled, fall back to
Spark for floating-point operations that may differ from Spark, such as when
comparing or sorting -0.0 and 0.0. For more information, refer to the [Comet
Compatibility
Guide](https://datafusion.apache.org/comet/user-guide/compatibility.html).
[...]
| `spark.comet.expression.allowIncompatible` | Comet is not currently fully
compatible with Spark for all expressions. Set this config to true to allow
them anyway. For more information, refer to the [Comet Compatibility
Guide](https://datafusion.apache.org/comet/user-guide/compatibility.html).
[...]
| `spark.comet.maxTempDirectorySize` | The maximum amount of data (in
bytes) stored inside the temporary directories.
[...]
| `spark.comet.metrics.updateInterval` | The interval in milliseconds to
update metrics. If interval is negative, metrics will be updated upon task
completion.
[...]
@@ -225,6 +225,7 @@ These settings can be used to determine which parts of the
plan are accelerated
| `spark.comet.expression.ConcatWs.enabled` | Enable Comet
acceleration for `ConcatWs` | true |
| `spark.comet.expression.Contains.enabled` | Enable Comet
acceleration for `Contains` | true |
| `spark.comet.expression.Cos.enabled` | Enable Comet
acceleration for `Cos` | true |
+| `spark.comet.expression.Cot.enabled` | Enable Comet
acceleration for `Cot` | true |
| `spark.comet.expression.CreateArray.enabled` | Enable Comet
acceleration for `CreateArray` | true |
| `spark.comet.expression.CreateNamedStruct.enabled` | Enable Comet
acceleration for `CreateNamedStruct` | true |
| `spark.comet.expression.DateAdd.enabled` | Enable Comet
acceleration for `DateAdd` | true |
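
For readers who want to try the same check outside CI, here is a minimal sketch of the idea behind dev/ci/check-working-tree-clean.sh. It is not the committed script: the function name `working_tree_is_clean`, the throwaway repository, and the empty `configs.md` file are illustrative assumptions. It relies only on `git rev-parse --git-dir` and `git status --porcelain`, both of which appear in the diff above.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not part of the commit): succeeds only when the given
# repository has no staged, unstaged, or untracked changes.
working_tree_is_clean() {
  local repo_dir="$1"
  # Return 2 if the directory is not inside a git repository.
  git -C "$repo_dir" rev-parse --git-dir > /dev/null 2>&1 || return 2
  # --porcelain prints one line per changed or untracked path,
  # so empty output means the tree is clean.
  [ -z "$(git -C "$repo_dir" status --porcelain)" ]
}

# Demonstration in a throwaway repository.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
git init -q "$demo_dir"

if working_tree_is_clean "$demo_dir"; then
  echo "clean"
fi

# Simulate a regenerated doc that was never committed.
touch "$demo_dir/configs.md"
if ! working_tree_is_clean "$demo_dir"; then
  echo "dirty"
fi
```

Because untracked files count as changes here, a doc generator that writes a brand-new file would be caught as well as one that edits an existing file. Note that plain `git diff`, as used in the committed script for its "Full diff" output, only shows modifications to tracked files.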