alexbegg opened a new issue, #33596:
URL: https://github.com/apache/airflow/issues/33596

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   When starting Airflow (the problem seems to be in both 2.6.3 and in 2.7.0, 
see my deployment details below) I am getting the following warning:
   ```
   {providers_manager.py:253} WARNING - Exception when importing 'airflow.providers.apache.pinot.hooks.pinot.PinotHook' from 'apache-airflow-providers-apache-pinot' package
   Traceback (most recent call last):
     File "/opt/bitnami/airflow/venv/lib/python3.9/site-packages/airflow/utils/module_loading.py", line 39, in import_string
       return getattr(module, class_name)
   AttributeError: module 'airflow.providers.apache.pinot.hooks.pinot' has no attribute 'PinotHook'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/opt/bitnami/airflow/venv/lib/python3.9/site-packages/airflow/providers_manager.py", line 285, in _sanity_check
       imported_class = import_string(class_name)
     File "/opt/bitnami/airflow/venv/lib/python3.9/site-packages/airflow/utils/module_loading.py", line 41, in import_string
       raise ImportError(f'Module "{module_path}" does not define a "{class_name}" attribute/class')
   ImportError: Module "airflow.providers.apache.pinot.hooks.pinot" does not define a "PinotHook" attribute/class
   ```
   
   I looked into the issue and the problem appears to be in the Apache Pinot 
provider. The `airflow/providers/apache/pinot/provider.yaml` (which is loaded 
by the `_sanity_check` in `providers_manager`) references a `PinotHook` 
class that does not exist: 
https://github.com/apache/airflow/blob/487b174073c01e03ae64760405a8d88f6a488ca6/airflow/providers/apache/pinot/provider.yaml#L57-L59
   
   The module `airflow.providers.apache.pinot.hooks.pinot` contains 
`PinotAdminHook` and `PinotDbApiHook`, but not `PinotHook` (and the classes 
have been separate since before the Apache classes were split into the Apache 
provider).
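
The failure mode in the traceback can be reproduced outside Airflow. The following is a minimal sketch, not Airflow's actual code: `import_string` here is a simplified stand-in modelled on the two lines visible in the traceback, and `find_missing_classes` is a hypothetical helper. Standard-library names are used in place of the Pinot hooks so the example runs without Airflow installed.

```python
from importlib import import_module


def import_string(dotted_path):
    """Import a dotted path and return the attribute named by its last part.

    Simplified stand-in for the function shown in the traceback (assumption:
    the real implementation may differ in details).
    """
    module_path, class_name = dotted_path.rsplit(".", 1)
    module = import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(
            f'Module "{module_path}" does not define a "{class_name}" attribute/class'
        )


def find_missing_classes(class_names):
    """Return the dotted paths that cannot be imported (hypothetical helper)."""
    missing = []
    for name in class_names:
        try:
            import_string(name)
        except ImportError:
            missing.append(name)
    return missing


# Stand-ins: one class that exists and one that does not, mirroring a
# provider.yaml entry that names a class the module never defined.
candidates = ["collections.OrderedDict", "collections.NoSuchHook"]
missing = find_missing_classes(candidates)
```

With Airflow and the Pinot provider installed, passing the three hook paths mentioned above to the same helper would presumably report only the `PinotHook` path as missing.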
   
   I am willing to fix this, but I am not sure which is a better fix:
   
   1. I could list both classes in `connection-types` of `provider.yaml` and 
keep both as `connection-type: pinot`, but then there would be two connection 
types with the same name (which may not be possible?):
       ```yaml
       connection-types:
         - hook-class-name: airflow.providers.apache.pinot.hooks.pinot.PinotAdminHook
           connection-type: pinot
         - hook-class-name: airflow.providers.apache.pinot.hooks.pinot.PinotDbApiHook
           connection-type: pinot
       ```
       - Note: `create_default_connections` in `airflow/utils/db.py` 
currently creates both connections with the same `conn_type="pinot"`: 
https://github.com/apache/airflow/blob/487b174073c01e03ae64760405a8d88f6a488ca6/airflow/utils/db.py#L474-L492
   2. Or we could change one (or both) of the connection types to a different 
name. `PinotAdminHook` already uses a default connection name of 
`pinot_admin_default`, and `PinotDbApiHook` already uses a default connection 
name of `pinot_broker_default`, so it might make sense to name these connection 
types `pinot_admin` and `pinot_broker`:
       ```yaml
       connection-types:
         - hook-class-name: airflow.providers.apache.pinot.hooks.pinot.PinotAdminHook
           connection-type: pinot_admin
         - hook-class-name: airflow.providers.apache.pinot.hooks.pinot.PinotDbApiHook
           connection-type: pinot_broker
       ```
       - I know we will need to change `create_default_connections` in 
`airflow/utils/db.py` (as linked above) if we end up changing the connection 
types. There may be other places, but I have not seen any other references to 
`conn_type="pinot"` besides the default connections.
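
To make the difference between the two options concrete, here is a small sketch that flags connection types claimed by more than one hook. It works on plain Python data shaped like the `connection-types` entries above; `find_duplicate_connection_types` is a hypothetical helper, and whether Airflow itself accepts or rejects duplicate names is not established here.

```python
from collections import defaultdict


def find_duplicate_connection_types(connection_types):
    """Map each connection-type used more than once to its hook classes.

    Hypothetical helper for illustration; not part of Airflow.
    """
    hooks_by_type = defaultdict(list)
    for entry in connection_types:
        hooks_by_type[entry["connection-type"]].append(entry["hook-class-name"])
    return {
        conn_type: hooks
        for conn_type, hooks in hooks_by_type.items()
        if len(hooks) > 1
    }


ADMIN_HOOK = "airflow.providers.apache.pinot.hooks.pinot.PinotAdminHook"
DBAPI_HOOK = "airflow.providers.apache.pinot.hooks.pinot.PinotDbApiHook"

# Option 1: both hooks share the name "pinot".
option_1 = [
    {"hook-class-name": ADMIN_HOOK, "connection-type": "pinot"},
    {"hook-class-name": DBAPI_HOOK, "connection-type": "pinot"},
]

# Option 2: each hook gets its own connection type.
option_2 = [
    {"hook-class-name": ADMIN_HOOK, "connection-type": "pinot_admin"},
    {"hook-class-name": DBAPI_HOOK, "connection-type": "pinot_broker"},
]

duplicates_1 = find_duplicate_connection_types(option_1)
duplicates_2 = find_duplicate_connection_types(option_2)
```

Under option 1 the helper reports `pinot` as shared by both hooks; under option 2 it reports nothing, which is the property that any lookup keyed by connection type would need.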
   
   Thoughts on which approach is better / less disruptive to users?
   
   ### What you think should happen instead
   
   When starting Airflow the `_sanity_check` in `providers_manager` should not 
trigger a warning. Both connection types should be useable.
   
   ### How to reproduce
   
   I am seeing this every time I start Airflow with the Docker image 
`bitnami/airflow:2.6.3`. I will also test with Breeze and other setups to 
confirm that the same warning appears in 2.7.0, but since the offending line is 
unchanged in the `main` branch, I am sure this is still an issue.
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-apache-pinot==4.1.1
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Docker image `bitnami/airflow:2.6.3` in Docker Compose (this is the latest 
Airflow version for Bitnami's image, they have not yet pushed up a 2.7.0 
version)
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
