potiuk commented on PR #33742: URL: https://github.com/apache/airflow/pull/33742#issuecomment-1694719364
Needs conflict resolving due to changed commands in main.

> This adds an option to spin up `breeze` using `--integration openlineage`, which will spin up a simple implementation of [Marquez](https://marquezproject.ai) and send metrics to namespace `airflow`.
>
> Notes before merge -
>
> 1. Marquez in their docs require PostgreSQL 12.1, but it seems to be working with Postgres 13 and 14 as well (but not 15). I can add this to the docs, but a better option may be to spin up a separate Postgres container (though this does significantly affect resourcing). Any thoughts on appropriately handling this?

We can print an error and exit for 11 and 15 if you try to use the openlineage integration. We already do similar checks for the mysql/ARM combination. Look for: `def enter_shell(**kwargs) -> RunCommandResult`.

> 2. I was only able to get metrics to send by setting Breeze to install `2.7.0`. It did not work in the `main`. I'm not sure whether that indicates a bug in my configuration or a bug that has been introduced in the `main`.

That would be surprising. The one difference I can think of: maybe this is connected with the way the openlineage provider (and any other) is used in "main". There, the openlineage code is loaded directly from sources, not from a package, so some code might not work the same way. For example, querying package metadata will not work if you run it from main: `INSTALL_PROVIDERS_FROM_SOURCES="true"` is set in breeze, and ProvidersManager will look in sources for provider.yaml files rather than querying the provider_info endpoint to get all the provider information. @mobuchowski - maybe that rings a bell?

> 3. Marquez webserver conflicts with Grafana in that they both spin up on Port `3000`, so the standard mapping pattern of `2<port>` does not work. I opted for `23100` so that this PR did not break `--integration all`, but if there's another preferred pattern, it's an easy fix.

It's fine. We do it for other tools.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
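
For illustration, the two suggestions in the comment above (fail fast on unsupported Postgres versions, and allow an exception to the `2<port>` mapping) could be sketched roughly as follows. This is not the actual Breeze code: the function names `openlineage_postgres_error` and `forwarded_port` and the constant are hypothetical, the set of accepted versions is taken only from what was reported in this thread, and the sketch assumes the Postgres backend is the one in use.

```python
import sys

# Assumption: based only on this thread (Marquez documents 12.1; 13 and 14
# were reported to work; 11 and 15 should be rejected).
OPENLINEAGE_POSTGRES_VERSIONS = {"12", "13", "14"}


def openlineage_postgres_error(integrations, postgres_version):
    """Return an error message for an unsupported combination, else None.

    Hypothetical helper; assumes the Postgres backend is selected.
    """
    if "openlineage" not in integrations:
        return None
    # Compare on the major version only, so "12.1" is treated as "12".
    major = str(postgres_version).split(".")[0]
    if major in OPENLINEAGE_POSTGRES_VERSIONS:
        return None
    supported = ", ".join(sorted(OPENLINEAGE_POSTGRES_VERSIONS))
    return (
        f"The openlineage integration does not work with Postgres {major}. "
        f"Use one of: {supported}."
    )


def exit_if_unsupported(integrations, postgres_version):
    """Print the error and exit, mirroring the 'print error and exit' idea."""
    error = openlineage_postgres_error(integrations, postgres_version)
    if error is not None:
        print(error, file=sys.stderr)
        sys.exit(1)


def forwarded_port(container_port, override=None):
    """Apply the '2<port>' convention unless an explicit override is given."""
    if override is not None:
        return override
    return int(f"2{container_port}")
```

With these definitions, `forwarded_port(3000)` follows the usual pattern, while `forwarded_port(3000, override=23100)` expresses the exception chosen for the Marquez webserver. A real check would presumably live near the existing mysql/ARM check in `enter_shell`.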
