kaxil commented on code in PR #64998:
URL: https://github.com/apache/airflow/pull/64998#discussion_r3316978092


##########
airflow-core/docs/extra-packages-ref.rst:
##########
@@ -192,6 +192,8 @@ custom bash/python providers).
 +---------------------+-----------------------------------------------------+------------------------------------------------+
 | apache-beam         | ``pip install 'apache-airflow[apache-beam]'``       | Apache Beam operators & hooks                  |
 +---------------------+-----------------------------------------------------+------------------------------------------------+
+| apache-datafusion   | ``pip install 'apache-airflow[apache-datafusion]'`` | Apache DataFusion provider package             |

Review Comment:
   Alphabetical order: `apache-cassandra` should come before 
`apache-datafusion` here. The entry was inserted between `apache-beam` and 
`apache-cassandra`, which breaks the alphabetical sort the rest of the table 
follows. Move it below the `apache-cassandra` row so the order reads beam, 
cassandra, datafusion, drill.



##########
.github/boring-cyborg.yml:
##########
@@ -30,6 +30,9 @@ labelPRBasedOnFilePath:
   provider:apache-beam:
     - providers/apache/beam/**
 
+  provider:apache-datafusion:

Review Comment:
   Same alphabetical-order issue as the docs table. Current order is 
`apache-beam` -> `apache-datafusion` -> `apache-cassandra` -> `apache-drill`. 
The labeler entries are alphabetical elsewhere in this file. Move the 
`apache-datafusion` block below `apache-cassandra`.
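
A quick way to catch this kind of slip locally is a small ordering check. The sketch below is illustrative only: `first_out_of_order` is a hypothetical helper, not something that exists in the Airflow repo, and the label list is simply the order quoted above.

```python
from __future__ import annotations


def first_out_of_order(keys: list[str]) -> tuple[str, str] | None:
    """Return the first adjacent pair (previous, current) that breaks ascending order."""
    for previous, current in zip(keys, keys[1:]):
        if current < previous:
            return previous, current
    return None


# Order as it currently stands in this PR (taken from the comment above).
labels = [
    "provider:apache-beam",
    "provider:apache-datafusion",
    "provider:apache-cassandra",
    "provider:apache-drill",
]

print(first_out_of_order(labels))
# ('provider:apache-datafusion', 'provider:apache-cassandra')
print(first_out_of_order(sorted(labels)))
# None
```

The same helper would work for the rows of the docs table if the extra names are extracted first.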



##########
providers/apache/datafusion/README.rst:
##########
@@ -0,0 +1,65 @@
+
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!
+
+.. IF YOU WANT TO MODIFY TEMPLATE FOR THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
+   ``PROVIDER_README_TEMPLATE.rst.jinja2`` IN the ``dev/breeze/src/airflow_breeze/templates`` DIRECTORY
+
+Package ``apache-airflow-providers-apache-datafusion``
+
+Release: ``0.1.0``
+
+
+`Apache DataFusion <https://datafusion.apache.org/>`__
+
+
+Provider package
+----------------
+
+This is a provider package for ``apache.datafusion`` provider. All classes for this provider package
+are in ``airflow.providers.apache.datafusion`` python package.
+
+You can find package information and changelog for the provider
+in the `documentation <https://airflow.apache.org/docs/apache-airflow-providers-apache-datafusion/0.1.0/>`_.
+
+Installation
+------------
+
+You can install this package on top of an existing Airflow installation (see ``Requirements`` below
+for the minimum Airflow version supported) via
+``pip install apache-airflow-providers-apache-datafusion``
+
+The package supports the following python versions: 3.10,3.11,3.12,3.13,3.14
+
+Requirements
+------------
+
+===================  ==================
+PIP package          Version required
+===================  ==================
+``apache-airflow``   ``>=2.11.0``
+===================  ==================
+
+.. note::
+
+   This provider is currently not ready and only contains the initial package skeleton.

Review Comment:
   This file is marked "AUTOMATICALLY GENERATED" at the top, and the "currently 
not ready / package skeleton" sentence isn't in 
`PROVIDER_README_TEMPLATE.rst.jinja2`. It will be clobbered the next time the 
README is regenerated (at release time). If you want a durable disclaimer, the 
right place is either the template or the `description:` field in 
`provider.yaml`.



##########
providers/apache/datafusion/pyproject.toml:
##########
@@ -0,0 +1,111 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!
+
+# IF YOU WANT TO MODIFY THIS FILE EXCEPT DEPENDENCIES, YOU SHOULD MODIFY THE TEMPLATE
+# `pyproject_TEMPLATE.toml.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY
+[build-system]
+requires = ["flit_core==3.12.0"]
+build-backend = "flit_core.buildapi"
+
+[project]
+name = "apache-airflow-providers-apache-datafusion"
+version = "0.1.0"
+description = "Provider package apache-airflow-providers-apache-datafusion for Apache Airflow"
+readme = "README.rst"
+license = "Apache-2.0"
+license-files = ['LICENSE', 'NOTICE']
+authors = [
+    {name="Apache Software Foundation", email="[email protected]"},
+]
+maintainers = [
+    {name="Apache Software Foundation", email="[email protected]"},
+]
+keywords = [ "airflow-provider", "apache.datafusion", "airflow", "integration" ]
+classifiers = [
+    "Development Status :: 5 - Production/Stable",
+    "Environment :: Console",
+    "Environment :: Web Environment",
+    "Intended Audience :: Developers",
+    "Intended Audience :: System Administrators",
+    "Framework :: Apache Airflow",
+    "Framework :: Apache Airflow :: Provider",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
+    "Programming Language :: Python :: 3.14",
+    "Topic :: System :: Monitoring",
+]
+requires-python = ">=3.10"
+
+# The dependencies should be modified in place in the generated file.
+# Any change in the dependencies is preserved when the file is regenerated
+# Make sure to run ``prek update-providers-dependencies --all-files``
+# After you modify the dependencies, and rebuild your Breeze CI image with ``breeze ci-image build``
+dependencies = [
+    "apache-airflow>=2.11.0",
+]
+
+[dependency-groups]
+dev = [
+    "apache-airflow",
+    "apache-airflow-task-sdk",
+    "apache-airflow-devel-common",
+    # Additional devel dependencies (do not remove this line and add extra development dependencies)
+]
+
+# To build docs:
+#
+#    uv run --group docs build-docs
+#
+# To enable auto-refreshing build with server:
+#
+#    uv run --group docs build-docs --autobuild
+#
+# To see more options:
+#
+#    uv run --group docs build-docs --help
+#
+docs = [
+    "apache-airflow-devel-common[docs]"
+]
+
+[tool.uv.sources]
+# These names must match the names as defined in the pyproject.toml of the workspace items,
+# *not* the workspace folder paths
+apache-airflow = {workspace = true}
+apache-airflow-devel-common = {workspace = true}
+apache-airflow-task-sdk = {workspace = true}
+apache-airflow-providers-common-sql = {workspace = true}
+apache-airflow-providers-standard = {workspace = true}
+
+[project.urls]
+"Documentation" = "https://airflow.apache.org/docs/apache-airflow-providers-apache-datafusion/0.1.0"
+"Changelog" = "https://airflow.apache.org/docs/apache-airflow-providers-apache-datafusion/0.1.0/changelog.html"
+"Bug Tracker" = "https://github.com/apache/airflow/issues"
+"Source Code" = "https://github.com/apache/airflow"
+"Slack Chat" = "https://s.apache.org/airflow-slack"
+"Mastodon" = "https://fosstodon.org/@airflow"
+"YouTube" = "https://www.youtube.com/channel/UCSXwxpWZQ7XZ1WL3wqevChA/"
+
+[project.entry-points."apache_airflow_provider"]
+provider_info = "airflow.providers.apache.datafusion.get_provider_info:get_provider_info"
+
+[tool.flit.module]
+name = "airflow.providers.apache.datafusion"

Review Comment:
   Other in-tree flit providers (cassandra, vespa, akeyless, ...) carry an 
explicit `[tool.flit.sdist]` block directly after `[tool.flit.module]`, with 
the comment "Explicit sdist contents so the build does not rely on VCS 
information (flit 4.0 makes --no-use-vcs the default -- see pypa/flit#782)." 
This file is missing that block, so the sdist contents will depend on VCS 
state. Looks like the pyproject was generated from an older template -- 
regenerating (or copying the block from a recent provider like 
`providers/apache/cassandra/pyproject.toml`) should add it.



##########
providers/apache/datafusion/provider.yaml:
##########
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-datafusion
+name: Apache DataFusion
+description: |
+    `Apache DataFusion <https://datafusion.apache.org/>`__
+
+state: not-ready

Review Comment:
   Other in-tree skeleton providers (vespa, akeyless, common-ai, informatica) 
use `state: ready` with `lifecycle: incubation` even when the package only 
contains the skeleton. `state: not-ready` here will exclude the provider from 
regular builds and releases (see `valid_states` handling in 
`dev/breeze/src/airflow_breeze/utils/packages.py`). Is the intent to defer the 
first release until hooks/operators land? If so, fine. If you wanted "release 
at 0.1.0 as an incubating provider," switch to `state: ready` to match the 
others.



##########
providers/apache/datafusion/tests/unit/apache/datafusion/test_example.py:
##########
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Example test
+from __future__ import annotations
+
+
+def test_example():
+    assert True

Review Comment:
   `assert True` will pass even if the provider metadata is wrong, the entry 
point is broken, or the package fails to import. Since this PR adds a new 
provider package, it would be worth asserting something real -- e.g. that 
`airflow.providers.apache.datafusion.get_provider_info.get_provider_info()` 
returns the expected `package-name` and `name`. That catches both the import 
path and the provider registration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
