This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 67b93184 Publish built docs triggered by 77c9a6cc03b98e60d6c1b3d2805293826b5d3c2f
67b93184 is described below

commit 67b9318469ecb430cdbff37a740d42908bdd681e
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Jun 3 23:03:31 2024 +0000

    Publish built docs triggered by 77c9a6cc03b98e60d6c1b3d2805293826b5d3c2f
---
 _sources/contributor-guide/development.md.txt |  4 ++
 _sources/user-guide/configs.md.txt            |  1 -
 contributor-guide/development.html            |  4 ++
 searchindex.js                                |  2 +-
 user-guide/configs.html                       | 54 +++++++++++++--------------
 5 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/_sources/contributor-guide/development.md.txt b/_sources/contributor-guide/development.md.txt
index 0121d9f4..913eea40 100644
--- a/_sources/contributor-guide/development.md.txt
+++ b/_sources/contributor-guide/development.md.txt
@@ -92,11 +92,13 @@ The plan stability testing framework is located in the `spark` module and can be
 
 ```sh
 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```
 
 and
 ```sh
 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```
 
 If your pull request changes the query plans generated by Comet, you should 
regenerate the golden files.
@@ -104,11 +106,13 @@ To regenerate the golden files, you can run the following command:
 
 ```sh
 SPARK_GENERATE_GOLDEN_FILES=1 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" test
+SPARK_GENERATE_GOLDEN_FILES=1 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```
 
 and
 ```sh
 SPARK_GENERATE_GOLDEN_FILES=1 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" test
+SPARK_GENERATE_GOLDEN_FILES=1 ./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```
 
 ## Benchmark
diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index eb349b34..104f29ce 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -23,7 +23,6 @@ Comet provides the following configuration settings.
 
 | Config | Description | Default Value |
 |--------|-------------|---------------|
-| spark.comet.ansi.enabled | Comet does not respect ANSI mode in most cases and by default will not accelerate queries when ansi mode is enabled. Enable this setting to test Comet's experimental support for ANSI mode. This should not be used in production. | false |
 | spark.comet.batchSize | The columnar batch size, i.e., the maximum number of rows that a batch can contain. | 8192 |
 | spark.comet.cast.allowIncompatible | Comet is not currently fully compatible with Spark for all cast operations. Set this config to true to allow them anyway. See compatibility guide for more information. | false |
 | spark.comet.columnar.shuffle.async.enabled | Whether to enable asynchronous shuffle for Arrow-based shuffle. By default, this config is false. | false |
diff --git a/contributor-guide/development.html b/contributor-guide/development.html
index c2cd5d22..4a2e604f 100644
--- a/contributor-guide/development.html
+++ b/contributor-guide/development.html
@@ -442,19 +442,23 @@ in their name in <code class="docutils literal notranslate"><span class="pre">or
 <p>Comet has a plan stability testing framework that can be used to test the stability of the query plans generated by Comet.
 The plan stability testing framework is located in the <code class="docutils literal notranslate"><span class="pre">spark</span></code> module and can be run using the following command:</p>
 <div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite&quot;</span><span class="w"> </span><span class="nb">test</span>
+./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite&quot;</span><span class="w"> </span>-Pspark-4.0<span class="w"> </span>-nsu<span class="w"> </span><span class="nb">test</span>
 </pre></div>
 </div>
 <p>and</p>
 <div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite&quot;</span><span class="w"> </span><span class="nb">test</span>
+./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite&quot;</span><span class="w"> </span>-Pspark-4.0<span class="w"> </span>-nsu<span class="w"> </span><span class="nb">test</span>
 </pre></div>
 </div>
 <p>If your pull request changes the query plans generated by Comet, you should regenerate the golden files.
 To regenerate the golden files, you can run the following command:</p>
 <div class="highlight-sh notranslate"><div class="highlight"><pre><span></span><span class="nv">SPARK_GENERATE_GOLDEN_FILES</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite&quot;</span><span class="w"> </span><span class="nb">test</span>
+<span class="nv">SPARK_GENERATE_GOLDEN_FILES</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite&quot;</span><span class="w"> </span>-Pspark-4.0<span class="w"> </span>-nsu<span class="w"> </span><span class="nb">test</span>
 </pre></div>
 </div>
 <p>and</p>
 <div class="highlight-sh notranslate"><div class="highlight"><pre><span></span><span class="nv">SPARK_GENERATE_GOLDEN_FILES</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite&quot;</span><span class="w"> </span><span class="nb">test</span>
+<span class="nv">SPARK_GENERATE_GOLDEN_FILES</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>./mvnw<span class="w"> </span>-pl<span class="w"> </span>spark<span class="w"> </span>-Dsuites<span class="o">=</span><span class="s2">&quot;org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite&quot;</span><span class="w"> </span>-Pspark-4.0<span class="w"> </span>-nsu<span class="w"> </span><span class="nb">test</span>
 </pre></div>
 </div>
 </section>
diff --git a/searchindex.js b/searchindex.js
index a7028338..c95c0488 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"ANSI mode": [[8, "ansi-mode"]], "API Differences Between Spark Versions": [[0, "api-differences-between-spark-versions"]], "ASF Links": [[7, null]], "Adding Spark-side Tests for the New Expression": [[0, "adding-spark-side-tests-for-the-new-expression"]], "Adding a New Expression": [[0, "adding-a-new-expression"]], "Adding a New Scalar Function Expression": [[0, "adding-a-new-scalar-function-expression"]], "Adding the Expression To the Protobuf Definition" [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"ANSI mode": [[8, "ansi-mode"]], "API Differences Between Spark Versions": [[0, "api-differences-between-spark-versions"]], "ASF Links": [[7, null]], "Adding Spark-side Tests for the New Expression": [[0, "adding-spark-side-tests-for-the-new-expression"]], "Adding a New Expression": [[0, "adding-a-new-expression"]], "Adding a New Scalar Function Expression": [[0, "adding-a-new-scalar-function-expression"]], "Adding the Expression To the Protobuf Definition" [...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index f248982d..4730e524 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -313,107 +313,103 @@ under the License.
 </tr>
 </thead>
 <tbody>
-<tr class="row-even"><td><p>spark.comet.ansi.enabled</p></td>
-<td><p>Comet does not respect ANSI mode in most cases and by default will not accelerate queries when ansi mode is enabled. Enable this setting to test Comet’s experimental support for ANSI mode. This should not be used in production.</p></td>
-<td><p>false</p></td>
-</tr>
-<tr class="row-odd"><td><p>spark.comet.batchSize</p></td>
+<tr class="row-even"><td><p>spark.comet.batchSize</p></td>
 <td><p>The columnar batch size, i.e., the maximum number of rows that a batch can contain.</p></td>
 <td><p>8192</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.cast.allowIncompatible</p></td>
+<tr class="row-odd"><td><p>spark.comet.cast.allowIncompatible</p></td>
 <td><p>Comet is not currently fully compatible with Spark for all cast operations. Set this config to true to allow them anyway. See compatibility guide for more information.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.columnar.shuffle.async.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.columnar.shuffle.async.enabled</p></td>
 <td><p>Whether to enable asynchronous shuffle for Arrow-based shuffle. By default, this config is false.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.columnar.shuffle.async.max.thread.num</p></td>
+<tr class="row-odd"><td><p>spark.comet.columnar.shuffle.async.max.thread.num</p></td>
 <td><p>Maximum number of threads on an executor used for Comet async columnar shuffle. By default, this config is 100. This is the upper bound of total number of shuffle threads per executor. In other words, if the number of cores * the number of shuffle threads per task <code class="docutils literal notranslate"><span class="pre">spark.comet.columnar.shuffle.async.thread.num</span></code> is larger than this config. Comet will use this config as the number of shuffle threads per executo [...]
 <td><p>100</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.columnar.shuffle.async.thread.num</p></td>
+<tr class="row-even"><td><p>spark.comet.columnar.shuffle.async.thread.num</p></td>
 <td><p>Number of threads used for Comet async columnar shuffle per shuffle task. By default, this config is 3. Note that more threads means more memory requirement to buffer shuffle data before flushing to disk. Also, more threads may not always improve performance, and should be set based on the number of cores available.</p></td>
 <td><p>3</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.columnar.shuffle.memory.factor</p></td>
+<tr class="row-odd"><td><p>spark.comet.columnar.shuffle.memory.factor</p></td>
 <td><p>Fraction of Comet memory to be allocated per executor process for Comet shuffle. Comet memory size is specified by <code class="docutils literal notranslate"><span class="pre">spark.comet.memoryOverhead</span></code> or calculated by <code class="docutils literal notranslate"><span class="pre">spark.comet.memory.overhead.factor</span></code> * <code class="docutils literal notranslate"><span class="pre">spark.executor.memory</span></code>. By default, this config is 1.0.</p></td>
 <td><p>1.0</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.debug.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.debug.enabled</p></td>
 <td><p>Whether to enable debug mode for Comet. By default, this config is false. When enabled, Comet will do additional checks for debugging purpose. For example, validating array when importing arrays from JVM at native side. Note that these checks may be expensive in performance and should only be enabled for debugging purpose.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.enabled</p></td>
 <td><p>Whether to enable Comet extension for Spark. When this is turned on, Spark will use Comet to read Parquet data source. Note that to enable native vectorized execution, both this config and ‘spark.comet.exec.enabled’ need to be enabled. By default, this config is the value of the env var <code class="docutils literal notranslate"><span class="pre">ENABLE_COMET</span></code> if set, or true otherwise.</p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.exceptionOnDatetimeRebase</p></td>
+<tr class="row-even"><td><p>spark.comet.exceptionOnDatetimeRebase</p></td>
 <td><p>Whether to throw exception when seeing dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar. Since Spark 3, dates/timestamps were written according to the Proleptic Gregorian calendar. When this is true, Comet will throw exceptions when seeing these dates/timestamps that were written by Spark version before 3.0. If this is false, these dates/timestamps will be read as if they were written to the Proleptic Gregorian calendar and will not be rebased.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.exec.all.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.all.enabled</p></td>
 <td><p>Whether to enable all Comet operators. By default, this config is false. Note that this config precedes all separate config ‘spark.comet.exec.&lt;operator_name&gt;.enabled’. That being said, if this config is enabled, separate configs are ignored.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.exec.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.exec.enabled</p></td>
 <td><p>Whether to enable Comet native vectorized execution for Spark. This controls whether Spark should convert operators into their Comet counterparts and execute them in native space. Note: each operator is associated with a separate config in the format of ‘spark.comet.exec.&lt;operator_name&gt;.enabled’ at the moment, and both the config and this need to be turned on, in order for the operator to be executed in native. By default, this config is false.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.exec.memoryFraction</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.memoryFraction</p></td>
 <td><p>The fraction of memory from Comet memory overhead that the native memory manager can use for execution. The purpose of this config is to set aside memory for untracked data structures, as well as imprecise size estimation during memory acquisition. Default value is 0.7.</p></td>
 <td><p>0.7</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.exec.shuffle.codec</p></td>
+<tr class="row-even"><td><p>spark.comet.exec.shuffle.codec</p></td>
 <td><p>The codec of Comet native shuffle used to compress shuffle data. Only zstd is supported.</p></td>
 <td><p>zstd</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.exec.shuffle.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.exec.shuffle.enabled</p></td>
 <td><p>Whether to enable Comet native shuffle. By default, this config is false. Note that this requires setting ‘spark.shuffle.manager’ to ‘org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager’. ‘spark.shuffle.manager’ must be set before starting the Spark application and cannot be changed during the application.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.exec.shuffle.mode</p></td>
+<tr class="row-even"><td><p>spark.comet.exec.shuffle.mode</p></td>
 <td><p>The mode of Comet shuffle. This config is only effective if Comet shuffle is enabled. Available modes are ‘native’, ‘jvm’, and ‘auto’. ‘native’ is for native shuffle which has best performance in general. ‘jvm’ is for jvm-based columnar shuffle which has higher coverage than native shuffle. ‘auto’ is for Comet to choose the best shuffle mode based on the query plan. By default, this config is ‘jvm’.</p></td>
 <td><p>jvm</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.explainFallback.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.explainFallback.enabled</p></td>
 <td><p>When this setting is enabled, Comet will provide logging explaining the reason(s) why a query stage cannot be executed natively.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.memory.overhead.factor</p></td>
+<tr class="row-even"><td><p>spark.comet.memory.overhead.factor</p></td>
 <td><p>Fraction of executor memory to be allocated as additional non-heap memory per executor process for Comet. Default value is 0.2.</p></td>
 <td><p>0.2</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.memory.overhead.min</p></td>
+<tr class="row-odd"><td><p>spark.comet.memory.overhead.min</p></td>
 <td><p>Minimum amount of additional memory to be allocated per executor process for Comet, in MiB.</p></td>
 <td><p>402653184b</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.nativeLoadRequired</p></td>
+<tr class="row-even"><td><p>spark.comet.nativeLoadRequired</p></td>
 <td><p>Whether to require Comet native library to load successfully when Comet is enabled. If not, Comet will silently fallback to Spark when it fails to load the native lib. Otherwise, an error will be thrown and the Spark job will be aborted.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.parquet.enable.directBuffer</p></td>
+<tr class="row-odd"><td><p>spark.comet.parquet.enable.directBuffer</p></td>
 <td><p>Whether to use Java direct byte buffer when reading Parquet. By default, this is false</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.rowToColumnar.supportedOperatorList</p></td>
+<tr class="row-even"><td><p>spark.comet.rowToColumnar.supportedOperatorList</p></td>
 <td><p>A comma-separated list of row-based operators that will be converted to columnar format when ‘spark.comet.rowToColumnar.enabled’ is true</p></td>
 <td><p>Range,InMemoryTableScan</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.scan.enabled</p></td>
+<tr class="row-odd"><td><p>spark.comet.scan.enabled</p></td>
 <td><p>Whether to enable Comet scan. When this is turned on, Spark will use Comet to read Parquet data source. Note that to enable native vectorized execution, both this config and ‘spark.comet.exec.enabled’ need to be enabled. By default, this config is true.</p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.scan.preFetch.enabled</p></td>
+<tr class="row-even"><td><p>spark.comet.scan.preFetch.enabled</p></td>
 <td><p>Whether to enable pre-fetching feature of CometScan. By default is disabled.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>spark.comet.scan.preFetch.threadNum</p></td>
+<tr class="row-odd"><td><p>spark.comet.scan.preFetch.threadNum</p></td>
 <td><p>The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. By default it is 2. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups.</p></td>
 <td><p>2</p></td>
 </tr>
-<tr class="row-odd"><td><p>spark.comet.shuffle.preferDictionary.ratio</p></td>
+<tr class="row-even"><td><p>spark.comet.shuffle.preferDictionary.ratio</p></td>
 <td><p>The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when ‘spark.comet.columnar.shuffle.enabled’ is true.</p></td>
 <td><p>10.0</p></td>
 </tr>


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
