This is an automated email from the ASF dual-hosted git repository. alamb pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
     new 0e840415af Document Table Constraint Enforcement Behavior in Custom Table Providers Guide (#16340)
0e840415af is described below

commit 0e840415af4bbce6918daeb3dba19a84886b3ae1
Author: kosiew <kos...@gmail.com>
AuthorDate: Thu Jun 12 12:30:12 2025 +0800

    Document Table Constraint Enforcement Behavior in Custom Table Providers Guide (#16340)

    * Add documentation for table constraint enforcement in DataFusion

    * Add link to table constraints documentation in index.rst

    * fix: correct markdown link references for table constraints documentation

    * Update docs/source/library-user-guide/table-constraints.md

    Co-authored-by: Andrew Lamb <and...@nerdnetworks.org>

    * fix: remove incorrect information about optimizer constraints in table constraints documentation

    * prettier fix

    ---------

    Co-authored-by: Andrew Lamb <and...@nerdnetworks.org>
---
 docs/source/index.rst                              |  1 +
 .../library-user-guide/custom-table-providers.md   |  3 ++
 .../source/library-user-guide/table-constraints.md | 42 ++++++++++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/docs/source/index.rst b/docs/source/index.rst
index e920a0f036..4b407e4e49 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -135,6 +135,7 @@ To get started, see
    library-user-guide/catalogs
    library-user-guide/adding-udfs
    library-user-guide/custom-table-providers
+   library-user-guide/table-constraints
    library-user-guide/extending-operators
    library-user-guide/profiling
    library-user-guide/query-optimizer
diff --git a/docs/source/library-user-guide/custom-table-providers.md b/docs/source/library-user-guide/custom-table-providers.md
index 54f79a4218..695cb16ac8 100644
--- a/docs/source/library-user-guide/custom-table-providers.md
+++ b/docs/source/library-user-guide/custom-table-providers.md
@@ -23,6 +23,9 @@ Like other areas of DataFusion, you extend DataFusion's functionality by impleme
 This section describes how to create a [`TableProvider`] and how
 to configure DataFusion to use it for reading.

+For details on how table constraints such as primary keys or unique
+constraints are handled, see [Table Constraint Enforcement](table-constraints.md).
+
 ## Table Provider and Scan

 The [`TableProvider::scan`] method reads data from the table and is likely the most important. It returns an [`ExecutionPlan`] that DataFusion will use to read the actual data during execution of the query. The [`TableProvider::insert_into`] method is used to `INSERT` data into the table.
diff --git a/docs/source/library-user-guide/table-constraints.md b/docs/source/library-user-guide/table-constraints.md
new file mode 100644
index 0000000000..dea746463d
--- /dev/null
+++ b/docs/source/library-user-guide/table-constraints.md
@@ -0,0 +1,42 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Table Constraint Enforcement
+
+Table providers can describe table constraints using the
+[`TableConstraint`] and [`Constraints`] APIs. These constraints include
+primary keys, unique keys, foreign keys and check constraints.
+
+DataFusion does **not** currently enforce these constraints at runtime.
+They are provided for informational purposes and can be used by custom
+`TableProvider` implementations or other parts of the system.
+
+- **Nullability**: The only property enforced by DataFusion is the
+  nullability of each [`Field`] in a schema. Returning data with null values
+  for Columns marked as not nullable will result in runtime errors during execution. DataFusion
+  does not check or enforce nullability when data is ingested.
+- **Primary and unique keys**: DataFusion does not verify that the data
+  satisfies primary or unique key constraints. Table providers that
+  require this behaviour must implement their own checks.
+- **Foreign keys and check constraints**: These constraints are parsed
+  but are not validated or used during query planning.
+
+[`tableconstraint`]: https://docs.rs/datafusion/latest/datafusion/sql/planner/enum.TableConstraint.html
+[`constraints`]: https://docs.rs/datafusion/latest/datafusion/common/functional_dependencies/struct.Constraints.html
+[`field`]: https://docs.rs/arrow/latest/arrow/datatype/struct.Field.html
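
The new page states that DataFusion does not check nullability on ingest and that table providers needing primary or unique key guarantees must implement their own checks. A minimal sketch of what such provider-side checks might look like, written in plain Rust with no DataFusion or Arrow types: `Row`, `check_not_null`, and `check_unique_key` are hypothetical names introduced here for illustration, and a real provider would presumably apply equivalent logic to Arrow arrays in its own write path.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory row: one optional integer per column.
/// (An assumption for illustration; not a DataFusion or Arrow type.)
type Row = Vec<Option<i64>>;

/// Reject a batch if any row holds a null (or has no value at all)
/// in a column the provider has declared non-nullable.
fn check_not_null(rows: &[Row], col: usize) -> Result<(), String> {
    for (i, row) in rows.iter().enumerate() {
        match row.get(col) {
            Some(Some(_)) => {}
            _ => return Err(format!("row {i}: null in non-nullable column {col}")),
        }
    }
    Ok(())
}

/// Reject a batch if two rows share the same values in the key columns.
/// Simplification: nulls compare equal to each other here, which differs
/// from SQL UNIQUE semantics, where nulls are usually treated as distinct.
fn check_unique_key(rows: &[Row], key_cols: &[usize]) -> Result<(), String> {
    let mut seen: HashSet<Vec<Option<i64>>> = HashSet::new();
    for (i, row) in rows.iter().enumerate() {
        // Project the row onto the key columns; a missing column reads as null.
        let key: Vec<Option<i64>> = key_cols
            .iter()
            .map(|&c| row.get(c).copied().flatten())
            .collect();
        // `insert` returns false when the key was already present.
        if !seen.insert(key) {
            return Err(format!("row {i}: duplicate key in columns {key_cols:?}"));
        }
    }
    Ok(())
}
```

A provider could call both functions on each incoming batch before accepting it and return the error to the caller on failure. This sketch only checks uniqueness within a single batch; checking against rows already stored would need an index or a lookup into existing data.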