dabla commented on PR #41327: URL: https://github.com/apache/airflow/pull/41327#issuecomment-2326411953
> We actually already use entrypoints - the `provider.yaml` "subset" is already exposed in providers and various providers' capabilities are available this way. They are even automatically extracted from the `provider.yaml` files and exported to documentation https://airflow.apache.org/docs/apache-airflow-providers/core-extensions/index.html
>
> You can also see a description of it in https://airflow.apache.org/docs/apache-airflow-providers/howto/create-custom-providers.html#custom-provider-packages
>
> Generally speaking you will have to do a few things:
>
> * Update `provider.yaml` schema https://github.com/apache/airflow/blob/main/airflow/provider.yaml.schema.json to add dialects
> * Update `provider_info` schema https://github.com/apache/airflow/blob/main/airflow/provider_info.schema.json - this is the subset of provider.yaml that gets exposed via the provider's entrypoint as a dictionary
> * Update ProvidersManager https://github.com/apache/airflow/blob/main/airflow/providers_manager.py to discover dialects automatically and expose them via Python API - but take care about efficiency - there is some caching and lazy loading implemented there, so you should follow what other components are doing there.
> * Add documentation generation to include dialect information in documentation: https://github.com/apache/airflow/blob/main/docs/exts/providers_packages_ref.py + https://github.com/apache/airflow/blob/main/docs/exts/providers_extensions.py
> * Update `airflow providers` cli to expose that information as well https://github.com/apache/airflow/blob/main/airflow/cli/commands/provider_command.py
> * Add provider.yaml entries for all the providers that need to expose them
>
> Once you do it all, the `breeze release-management prepare-provider-packages` command will automatically build and expose the dictionaries retrieved from provider.yaml into entrypoints.
> This is happening dynamically - we are building `pyproject.toml` for providers while preparing the packages from these Jinja templates:
>
> * [The entrypoint template](https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/templates/get_provider_info_TEMPLATE.py.jinja2)
> * [The pyproject template](https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/templates/pyproject_TEMPLATE.toml.jinja2)
>
> Another benefit of doing it this way is that "ProvidersManager" will see if a provider is available directly in airflow sources. When you run `breeze`, where providers are just available in `airflow/providers` sources and not installed as separate packages, ProvidersManager will read the dialect information directly from provider.yaml rather than from the entrypoint, so inside breeze it will work as if it was installed as a package.

Thank you @potiuk for the explanation, I will have a look at it and see how to implement it for the common sql provider.

Another question: at the moment the MsSqlDialect class is also located within the common sql provider just so that it works, but where would you put it?

1. in another new (mssql) dialect provider (a bit overkill for just a dialect I would say, unless we would create a dialects provider, but that still seems odd to me)?
2. or in the mssql provider (that one is actually only needed for pymssql and in fact not necessary for odbc, but it seems the most logical place)?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
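
For illustration, the discovery step described in the quoted comment could be sketched roughly as follows. This is a minimal sketch and not the actual ProvidersManager code: it assumes Python 3.10+ (for `entry_points(group=...)`), and the `dialects` key with its `dialect-type` and `dialect-class-name` fields is a hypothetical schema that does not exist in `provider.yaml` at the time of this comment. The function names are likewise made up for the example. Only the `apache_airflow_provider` entry point group, under which provider packages expose their provider-info dictionary, is taken from how providers are packaged today.

```python
from importlib.metadata import entry_points
from typing import Any, Dict, Iterable, Iterator

# Entry point group under which Airflow provider packages expose
# a callable returning their provider-info dictionary.
PROVIDER_ENTRY_POINT_GROUP = "apache_airflow_provider"


def iter_installed_provider_infos() -> Iterator[Dict[str, Any]]:
    """Yield the provider-info dict of every installed provider package."""
    # Selecting by group as a keyword argument requires Python 3.10+.
    for entry_point in entry_points(group=PROVIDER_ENTRY_POINT_GROUP):
        get_provider_info = entry_point.load()
        yield get_provider_info()


def collect_dialects(provider_infos: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Map dialect type to dialect class path.

    Reads a HYPOTHETICAL "dialects" key; providers without it are skipped.
    """
    dialects: Dict[str, str] = {}
    for info in provider_infos:
        for entry in info.get("dialects", []):
            dialects[entry["dialect-type"]] = entry["dialect-class-name"]
    return dialects
```

Calling `collect_dialects(iter_installed_provider_infos())` would then return the dialects declared by whatever providers are installed. Only class paths are collected here and nothing is imported, which fits the lazy loading the quoted comment asks for; a real implementation would also need the caching and the read-from-`provider.yaml`-in-sources path that ProvidersManager already has.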
