kaxil commented on code in PR #62261: URL: https://github.com/apache/airflow/pull/62261#discussion_r2836972872
########## dev/registry/extract_metadata.py: ##########

Review Comment:

Good question, and I get the concern for sure.

On Sphinx overlap: the registry and the Sphinx docs serve different purposes. Sphinx gives us per-provider API reference as HTML. The registry needs structured JSON for a searchable cross-provider catalog: constructor parameters with types/defaults, PyPI download stats, connection form metadata (from get_connection_form_widgets() at runtime), module-to-category mappings, etc. None of that exists in Sphinx output. We do use Sphinx `objects.inv` files for docs URL resolution, though.

Embedding this into Sphinx extensions would tie the registry build to the full docs pipeline, which is heavier and slower than what we need.

I am going to explore some prek hook integration next week and will look at this with fresh eyes. The other thing I need to add is a backfill script, which is currently not checked in.
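To make the "constructor parameters with types/defaults" point concrete, here is a minimal sketch of the kind of structured record meant above, using only `inspect` and `json` from the standard library. `ExampleOperator` and `constructor_params` are hypothetical names made up for illustration; this is not taken from `extract_metadata.py`, which may well extract the data differently.

```python
import inspect
import json
from typing import Any


class ExampleOperator:
    """Hypothetical stand-in for a provider operator class."""

    def __init__(self, task_id: str, retries: int = 3, conn_id: str = "example_default"):
        self.task_id = task_id
        self.retries = retries
        self.conn_id = conn_id


def constructor_params(cls: type) -> list[dict[str, Any]]:
    """Describe the __init__ parameters of cls as JSON-serializable dicts."""
    params = []
    for name, param in inspect.signature(cls.__init__).parameters.items():
        # Skip `self` and catch-all *args / **kwargs entries.
        if name == "self" or param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        has_default = param.default is not inspect.Parameter.empty
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = None
        elif isinstance(annotation, str):
            # Annotations stay as strings under postponed evaluation.
            type_name = annotation
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        params.append(
            {
                "name": name,
                "type": type_name,
                "required": not has_default,
                # repr() keeps arbitrary default objects serializable.
                "default": repr(param.default) if has_default else None,
            }
        )
    return params


if __name__ == "__main__":
    print(json.dumps(constructor_params(ExampleOperator), indent=2))
```

A record like this is per-parameter and machine-readable, which is the part the rendered Sphinx HTML does not provide.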
