This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 9843597fd Publish built docs triggered by a6b340e4bc988094aae90767eb9f8dc85f441598
9843597fd is described below
commit 9843597fd84f0a98947a35588be9cb87bb609cb9
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Mar 2 22:13:57 2026 +0000
Publish built docs triggered by a6b340e4bc988094aae90767eb9f8dc85f441598
---
_sources/contributor-guide/parquet_scans.md.txt | 8 ++++----
_sources/user-guide/latest/configs.md.txt | 2 --
contributor-guide/parquet_scans.html | 8 ++++----
searchindex.js | 2 +-
user-guide/latest/configs.html | 8 --------
5 files changed, 9 insertions(+), 19 deletions(-)
diff --git a/_sources/contributor-guide/parquet_scans.md.txt b/_sources/contributor-guide/parquet_scans.md.txt
index 7df939488..c8e960a15 100644
--- a/_sources/contributor-guide/parquet_scans.md.txt
+++ b/_sources/contributor-guide/parquet_scans.md.txt
@@ -49,10 +49,10 @@ The following features are not supported by either scan implementation, and Come
 The following shared limitation may produce incorrect results without falling back to Spark:
-- No support for datetime rebasing detection or the `spark.comet.exceptionOnDatetimeRebase` configuration. When
-  reading Parquet files containing dates or timestamps written before Spark 3.0 (which used a hybrid
-  Julian/Gregorian calendar), dates/timestamps will be read as if they were written using the Proleptic Gregorian
-  calendar. This may produce incorrect results for dates before October 15, 1582.
+- No support for datetime rebasing. When reading Parquet files containing dates or timestamps written before
+  Spark 3.0 (which used a hybrid Julian/Gregorian calendar), dates/timestamps will be read as if they were
+  written using the Proleptic Gregorian calendar. This may produce incorrect results for dates before
+  October 15, 1582.
 The `native_datafusion` scan has some additional limitations, mostly related to Parquet metadata. All of these
 cause Comet to fall back to Spark.
diff --git a/_sources/user-guide/latest/configs.md.txt b/_sources/user-guide/latest/configs.md.txt
index 9a3accc0c..1a4f7e7cd 100644
--- a/_sources/user-guide/latest/configs.md.txt
+++ b/_sources/user-guide/latest/configs.md.txt
@@ -30,8 +30,6 @@ Comet provides the following configuration settings.
 | `spark.comet.scan.enabled` | Whether to enable native scans. When this is turned on, Spark will use Comet to read supported data sources (currently only Parquet is supported natively). Note that to enable native vectorized execution, both this config and `spark.comet.exec.enabled` need to be enabled. | true |
 | `spark.comet.scan.icebergNative.dataFileConcurrencyLimit` | The number of Iceberg data files to read concurrently within a single task. Higher values improve throughput for tables with many small files by overlapping I/O latency, but increase memory usage. Values between 2 and 8 are suggested. | 1 |
 | `spark.comet.scan.icebergNative.enabled` | Whether to enable native Iceberg table scan using iceberg-rust. When enabled, Iceberg tables are read directly through native execution, bypassing Spark's DataSource V2 API for better performance. | false |
-| `spark.comet.scan.preFetch.enabled` | Whether to enable pre-fetching feature of CometScan. | false |
-| `spark.comet.scan.preFetch.threadNum` | The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups. | 2 |
 | `spark.comet.scan.unsignedSmallIntSafetyCheck` | Parquet files may contain unsigned 8-bit integers (UINT_8) which Spark maps to ShortType. When this config is true (default), Comet falls back to Spark for ShortType columns because we cannot distinguish signed INT16 (safe) from unsigned UINT_8 (may produce different results). Set to false to allow native execution of ShortType columns if you know your data does not contain unsigned UINT_8 columns from improperly encoded Parquet files. F [...]
 | `spark.hadoop.fs.comet.libhdfs.schemes` | Defines filesystem schemes (e.g., hdfs, webhdfs) that the native side accesses via libhdfs, separated by commas. Valid only when built with hdfs feature enabled. | |
 <!-- prettier-ignore-end -->
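[Editor's note] The configuration keys in the table above are ordinary Spark confs. A minimal sketch of enabling Comet native scan and execution on a job, using only keys that appear in this table plus the documented `org.apache.spark.CometPlugin` plugin class; the jar path and script name are illustrative placeholders, not values from this commit:

```shell
# Hedged sketch: enable Comet native scans and vectorized execution.
# Both spark.comet.scan.enabled and spark.comet.exec.enabled must be on
# for native vectorized execution, per the table above.
spark-submit \
  --jars /path/to/comet-spark.jar \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.comet.scan.enabled=true \
  --conf spark.comet.exec.enabled=true \
  --conf spark.comet.scan.icebergNative.dataFileConcurrencyLimit=4 \
  your_job.py
```

The concurrency limit of 4 is simply a value within the 2–8 range the table suggests for tables with many small files.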
diff --git a/contributor-guide/parquet_scans.html b/contributor-guide/parquet_scans.html
index 52544b2ea..5e4364e51 100644
--- a/contributor-guide/parquet_scans.html
+++ b/contributor-guide/parquet_scans.html
@@ -496,10 +496,10 @@ V2 API for Parquet scans. The DataFusion-based implementations only support the
 </ul>
 <p>The following shared limitation may produce incorrect results without falling back to Spark:</p>
 <ul class="simple">
-<li><p>No support for datetime rebasing detection or the <code class="docutils literal notranslate"><span class="pre">spark.comet.exceptionOnDatetimeRebase</span></code> configuration. When
-reading Parquet files containing dates or timestamps written before Spark 3.0 (which used a hybrid
-Julian/Gregorian calendar), dates/timestamps will be read as if they were written using the Proleptic Gregorian
-calendar. This may produce incorrect results for dates before October 15, 1582.</p></li>
+<li><p>No support for datetime rebasing. When reading Parquet files containing dates or timestamps written before
+Spark 3.0 (which used a hybrid Julian/Gregorian calendar), dates/timestamps will be read as if they were
+written using the Proleptic Gregorian calendar. This may produce incorrect results for dates before
+October 15, 1582.</p></li>
 </ul>
 <p>The <code class="docutils literal notranslate"><span class="pre">native_datafusion</span></code> scan has some additional limitations, mostly related to Parquet metadata. All of these
 cause Comet to fall back to Spark.</p>
diff --git a/searchindex.js b/searchindex.js
index dd309a87a..5b9e77fb9 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Format Your Code": [[12, "format-your-code"]], "1. Install Comet": [[22, "install-comet"]], "1. Native Operators (nativeExecs map)": [[4, "native-operators-nativeexecs-map"]], "2. Build and Verify": [[12, "build-and-verify"]], "2. Clone Spark and Apply Diff": [[22, "clone-spark-and-apply-diff"]], "2. Sink Operators (sinks map)": [[4, "sink-operators-sinks-map"]], "3. Comet JVM Operators": [[4, "comet-jvm-operators"]], "3. Run Clippy (Recommended)": [[12 [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Format Your Code": [[12, "format-your-code"]], "1. Install Comet": [[22, "install-comet"]], "1. Native Operators (nativeExecs map)": [[4, "native-operators-nativeexecs-map"]], "2. Build and Verify": [[12, "build-and-verify"]], "2. Clone Spark and Apply Diff": [[22, "clone-spark-and-apply-diff"]], "2. Sink Operators (sinks map)": [[4, "sink-operators-sinks-map"]], "3. Comet JVM Operators": [[4, "comet-jvm-operators"]], "3. Run Clippy (Recommended)": [[12 [...]
\ No newline at end of file
diff --git a/user-guide/latest/configs.html b/user-guide/latest/configs.html
index eaaca6158..a83ba5f1b 100644
--- a/user-guide/latest/configs.html
+++ b/user-guide/latest/configs.html
@@ -485,14 +485,6 @@ under the License.
 <td><p>Whether to enable native Iceberg table scan using iceberg-rust. When enabled, Iceberg tables are read directly through native execution, bypassing Spark’s DataSource V2 API for better performance.</p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">spark.comet.scan.preFetch.enabled</span></code></p></td>
-<td><p>Whether to enable pre-fetching feature of CometScan.</p></td>
-<td><p>false</p></td>
-</tr>
-<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">spark.comet.scan.preFetch.threadNum</span></code></p></td>
-<td><p>The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups.</p></td>
-<td><p>2</p></td>
-</tr>
 <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">spark.comet.scan.unsignedSmallIntSafetyCheck</span></code></p></td>
 <td><p>Parquet files may contain unsigned 8-bit integers (UINT_8) which Spark maps to ShortType. When this config is true (default), Comet falls back to Spark for ShortType columns because we cannot distinguish signed INT16 (safe) from unsigned UINT_8 (may produce different results). Set to false to allow native execution of ShortType columns if you know your data does not contain unsigned UINT_8 columns from improperly encoded Parquet files. For more information, refer to the <a class=" [...]
 <td><p>true</p></td>
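[Editor's note] The datetime rebasing limitation documented in the parquet_scans change above can be illustrated without any Spark or Comet code. A minimal sketch in plain Python using the standard Julian Day Number (JDN) formulas: a date written as days-since-epoch under the hybrid Julian/Gregorian calendar (Spark < 3.0 writers) and read back as proleptic Gregorian lands on a different day. The helper names are this example's own, not Comet APIs:

```python
# Illustrate the rebasing problem: same day count, two calendars,
# two different dates for anything written before October 15, 1582.
from datetime import date, timedelta

EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01 (proleptic Gregorian)

def jdn_gregorian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a proleptic Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045

def jdn_julian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a Julian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083

# A legacy writer records Julian-calendar 1500-02-28 as days since epoch.
days_since_epoch = jdn_julian(1500, 2, 28) - EPOCH_JDN
# A reader with no rebasing support interprets that integer as proleptic
# Gregorian (Python's date arithmetic is proleptic Gregorian).
as_read = date(1970, 1, 1) + timedelta(days=days_since_epoch)
print(as_read)  # 1500-03-09: a 9-day shift from the intended 1500-02-28
```

As a sanity check, Julian 1582-10-04 and Gregorian 1582-10-15 are consecutive days (JDN 2299160 and 2299161), which is exactly the calendar cutover the limitation refers to.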
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]