(datafusion) branch main updated: Fix documentation warnings and error if anymore occur (#14952)

alamb Tue, 04 Mar 2025 03:46:53 -0800

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git



The following commit(s) were added to refs/heads/main by this push:
     new ec222096b9 Fix documentation warnings and error if anymore occur (#14952)
ec222096b9 is described below

commit ec222096b9d750614a2e5a27950129b172ea11f4
Author: Amos Aidoo <[email protected]>
AuthorDate: Tue Mar 4 12:38:16 2025 +0100

    Fix documentation warnings and error if anymore occur (#14952)
    
    * feat: treat sphinx-build warnings as errors
    
    this should fail during build until warnings are fixed
    
    * fix: ./gen is not valid link in current context
    
    replaces the link with a code backticks
    
    * fix: replace with valid docs.rs link
    
    * fix: respect hierarchy from H2 -> H3
    
    * apply prettier fixes
    
    * Add build.sh to the CI check
    
    * tweak
    
    * fix
    
    * Update spans elsewhere
    
    ---------
    
    Co-authored-by: Andrew Lamb <[email protected]>
---
 .github/workflows/docs.yaml                        | 17 ++++++++++++
 .github/workflows/docs_pr.yaml                     | 30 +++++++++++++++++++++-
 datafusion/common/src/config.rs                    |  2 +-
 .../sqllogictest/test_files/information_schema.slt |  2 +-
 docs/build.sh                                      |  2 +-
 docs/source/contributor-guide/howtos.md            |  4 +--
 docs/source/library-user-guide/query-optimizer.md  |  4 +--
 docs/source/user-guide/configs.md                  |  2 +-
 8 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
index 0b43339f57..5f1b2c1395 100644
--- a/.github/workflows/docs.yaml
+++ b/.github/workflows/docs.yaml
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 on:
   push:
     branches:
diff --git a/.github/workflows/docs_pr.yaml b/.github/workflows/docs_pr.yaml
index 3fad08643a..d3c901c5b7 100644
--- a/.github/workflows/docs_pr.yaml
+++ b/.github/workflows/docs_pr.yaml
@@ -15,6 +15,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
+# Tests for Docs that runs on PRs
 name: Docs
 
 concurrency:
@@ -48,7 +49,34 @@ jobs:
         uses: ./.github/actions/setup-builder
         with:
           rust-version: stable
-      - name: Run doctests
+      - name: Run doctests (embedded rust examples)
         run: cargo test --doc --features avro,json
       - name: Verify Working Directory Clean
         run: git diff --exit-code
+
+  # Test doc build
+  linux-test-doc-build:
+    name: Test doc build
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: true
+          fetch-depth: 1
+      - name: Setup Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - name: Install doc dependencies
+        run: |
+          set -x
+          python3 -m venv venv
+          source venv/bin/activate
+          pip install -r docs/requirements.txt
+      - name: Build docs html and check for warnings
+        run: |
+          set -x
+          source venv/bin/activate
+          cd docs
+          ./build.sh # fails on errors
+
diff --git a/datafusion/common/src/config.rs b/datafusion/common/src/config.rs
index 65ecd40327..e2f61cfc08 100644
--- a/datafusion/common/src/config.rs
+++ b/datafusion/common/src/config.rs
@@ -253,7 +253,7 @@ config_namespace! {
         pub support_varchar_with_length: bool, default = true
 
         /// When set to true, the source locations relative to the original SQL
-        /// query (i.e. [`Span`](sqlparser::tokenizer::Span)) will be collected
+        /// query (i.e. [`Span`](https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html)) will be collected
         /// and recorded in the logical plan nodes.
         pub collect_spans: bool, default = false
 
diff --git a/datafusion/sqllogictest/test_files/information_schema.slt b/datafusion/sqllogictest/test_files/information_schema.slt
index b0538b5e65..5c2f730a35 100644
--- a/datafusion/sqllogictest/test_files/information_schema.slt
+++ b/datafusion/sqllogictest/test_files/information_schema.slt
@@ -355,7 +355,7 @@ datafusion.optimizer.repartition_sorts true Should DataFusion execute sorts in a
 datafusion.optimizer.repartition_windows true Should DataFusion repartition data using the partitions keys to execute window functions in parallel using the provided `target_partitions` level
 datafusion.optimizer.skip_failed_rules false When set to true, the logical plan optimizer will produce warning messages if any optimization rules produce errors and then proceed to the next rule. When set to false, any rules that produce errors will cause the query to fail
 datafusion.optimizer.top_down_join_key_reordering true When set to true, the physical plan optimizer will run a top down process to reorder the join keys
-datafusion.sql_parser.collect_spans false When set to true, the source locations relative to the original SQL query (i.e. [`Span`](sqlparser::tokenizer::Span)) will be collected and recorded in the logical plan nodes.
+datafusion.sql_parser.collect_spans false When set to true, the source locations relative to the original SQL query (i.e. [`Span`](https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html)) will be collected and recorded in the logical plan nodes.
 datafusion.sql_parser.dialect generic Configure the SQL dialect used by DataFusion's parser; supported values include: Generic, MySQL, PostgreSQL, Hive, SQLite, Snowflake, Redshift, MsSQL, ClickHouse, BigQuery, Ansi, DuckDB and Databricks.
 datafusion.sql_parser.enable_ident_normalization true When set to true, SQL parser will normalize ident (convert ident to lowercase when not quoted)
 datafusion.sql_parser.enable_options_value_normalization false When set to true, SQL parser will normalize options value (convert value to lowercase). Note that this option is ignored and will be removed in the future. All case-insensitive values are normalized automatically.
diff --git a/docs/build.sh b/docs/build.sh
index 14464fab40..73516e8e9c 100755
--- a/docs/build.sh
+++ b/docs/build.sh
@@ -28,4 +28,4 @@ sed -i -e 's/\.\.\/\.\.\/\.\.\//https:\/\/github.com\/apache\/arrow-datafusion\/
 
 python rustdoc_trim.py
 
-make SOURCEDIR=`pwd`/temp html
+make SOURCEDIR=`pwd`/temp SPHINXOPTS=-W html
diff --git a/docs/source/contributor-guide/howtos.md b/docs/source/contributor-guide/howtos.md
index 556242751f..89a1bc7360 100644
--- a/docs/source/contributor-guide/howtos.md
+++ b/docs/source/contributor-guide/howtos.md
@@ -141,9 +141,9 @@ taplo fmt
 
 ## How to update protobuf/gen dependencies
 
-The prost/tonic code can be generated by running `./regen.sh`, which in turn invokes the Rust binary located in [gen](./gen)
+The prost/tonic code can be generated by running `./regen.sh`, which in turn invokes the Rust binary located in `./gen`
 
-This is necessary after modifying the protobuf definitions or altering the dependencies of [gen](./gen), and requires a
+This is necessary after modifying the protobuf definitions or altering the dependencies of `./gen`, and requires a
 valid installation of [protoc] (see [installation instructions] for details).
 
 ```bash
diff --git a/docs/source/library-user-guide/query-optimizer.md b/docs/source/library-user-guide/query-optimizer.md
index af27bb7505..03cd7b5bbb 100644
--- a/docs/source/library-user-guide/query-optimizer.md
+++ b/docs/source/library-user-guide/query-optimizer.md
@@ -401,7 +401,7 @@ interval arithmetic to take an expression such as `a > 2500 AND a <= 5000` and
 build an accurate selectivity estimate that can then be used to find more efficient
 plans.
 
-#### `AnalysisContext` API
+### `AnalysisContext` API
 
 The `AnalysisContext` serves as a shared knowledge base during expression evaluation
 and boundary analysis. Think of it as a dynamic repository that maintains information about:
@@ -414,7 +414,7 @@ What makes `AnalysisContext` particularly powerful is its ability to propagate i
 through the expression tree. As each node in the expression tree is analyzed, it can both
 read from and write to this shared context, allowing for sophisticated boundary analysis and inference.
 
-#### `ColumnStatistics` for Cardinality Estimation
+### `ColumnStatistics` for Cardinality Estimation
 
 Column statistics form the foundation of optimization decisions. Rather than just tracking
 simple metrics, DataFusion's `ColumnStatistics` provides a rich set of information including:
diff --git a/docs/source/user-guide/configs.md b/docs/source/user-guide/configs.md
index 8c4aad5107..f29fbb6745 100644
--- a/docs/source/user-guide/configs.md
+++ b/docs/source/user-guide/configs.md
@@ -127,5 +127,5 @@ Environment variables are read during `SessionConfig` initialisation so they mus
 | datafusion.sql_parser.enable_options_value_normalization                | false                     | When set to true, SQL parser will normalize options value (convert value to lowercase). Note that this option is ignored and will be removed in the future. All case-insensitive values are normalized automatically. [...]
 | datafusion.sql_parser.dialect                                           | generic                   | Configure the SQL dialect used by DataFusion's parser; supported values include: Generic, MySQL, PostgreSQL, Hive, SQLite, Snowflake, Redshift, MsSQL, ClickHouse, BigQuery, Ansi, DuckDB and Databricks. [...]
 | datafusion.sql_parser.support_varchar_with_length                       | true                      | If true, permit lengths for `VARCHAR` such as `VARCHAR(20)`, but ignore the length. If false, error if a `VARCHAR` with a length is specified. The Arrow type system does not have a notion of maximum string length and thus DataFusion can not enforce such limits. [...]
-| datafusion.sql_parser.collect_spans                                     | false                     | When set to true, the source locations relative to the original SQL query (i.e. [`Span`](sqlparser::tokenizer::Span)) will be collected and recorded in the logical plan nodes. [...]
+| datafusion.sql_parser.collect_spans                                     | false                     | When set to true, the source locations relative to the original SQL query (i.e. [`Span`](https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html)) will be collected and recorded in the logical plan nodes. [...]
 | datafusion.sql_parser.recursion_limit                                   | 50                        | Specifies the recursion depth limit when parsing complex SQL Queries [...]
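
Note on the `docs/build.sh` change above: `SPHINXOPTS=-W` is passed on the `make` command line, and `-W` is the `sphinx-build` option that turns warnings into errors. The sketch below shows only how a command-line variable reaches the recipe. It assumes the docs Makefile follows the usual Sphinx layout (`SPHINXOPTS ?=` with the variable appended to the `sphinx-build` call); the Makefile here is a hypothetical stand-in written to a temp directory, and `demo_sphinxopts` is an illustrative name, not part of the repository. The stub echoes the command instead of running Sphinx.

```shell
#!/bin/sh
# Hypothetical stand-in for docs/Makefile (assumed layout, not the real file).
# The recipe only echoes the command line it would run; Sphinx is not needed.
demo_sphinxopts() {
  tmp=$(mktemp -d) || return 1
  # SPHINXOPTS defaults to empty unless overridden by the caller.
  printf '.PHONY: html\nSPHINXOPTS ?=\nhtml:\n\t@echo sphinx-build -M html src out $(SPHINXOPTS)\n' \
    > "$tmp/Makefile"
  # Extra arguments (e.g. SPHINXOPTS=-W) are forwarded to make as overrides.
  make -s -f "$tmp/Makefile" "$@" html
  status=$?
  rm -rf "$tmp"
  return $status
}

# Mirrors the new build.sh invocation: the override ends up on the command line.
demo_sphinxopts SPHINXOPTS=-W
```

With the override in place, a real build would stop at the first Sphinx warning, which is what lets the new `linux-test-doc-build` CI job fail on documentation warnings.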

