Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
Lee-W merged PR #62823: URL: https://github.com/apache/airflow/pull/62823 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
Lee-W commented on code in PR #62823: URL: https://github.com/apache/airflow/pull/62823#discussion_r2887506992 ## devel-common/src/tests_common/test_utils/db.py: ## @@ -62,6 +65,8 @@ ) from tests_common.test_utils.version_compat import AIRFLOW_V_3_0_PLUS, AIRFLOW_V_3_1_PLUS, AIRFLOW_V_3_2_PLUS +_log = logging.getLogger(__name__) Review Comment: ```suggestion log = logging.getLogger(__name__) ``` We don't need to make it private -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
jason810496 commented on code in PR #62823: URL: https://github.com/apache/airflow/pull/62823#discussion_r2887405557 ## devel-common/src/tests_common/test_utils/db.py: ## @@ -74,6 +79,18 @@ from airflow.models.dag_favorite import DagFavorite +def _retry_db(func): +"""Retry on transient DB errors.""" + [email protected](func) +def wrapper(*args, **kwargs): +for attempt in run_with_db_retries(logger=_log): +with attempt: +return func(*args, **kwargs) + +return wrapper Review Comment: Could you please add some comments about the reason of retrying DB here. e.g The flaky error we encounter. In case the further developers don't know about the context and remove the retry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
haseebmalik18 commented on PR #62823: URL: https://github.com/apache/airflow/pull/62823#issuecomment-4000262234 > To be honest, I don't like this approach personally. Can we find a way to make sure the supervisor can get a new connection from the connection Pool? Pool_pre_ping is already enabled but it only checks connections at checkout. Error 2013 means the connection dies mid-query so pre-ping can't catch it. SQLAlchemy does invalidate the dead connection in the pool so the next one is fresh, but the current operation still fails. If I'm not mistaken, I don't think there is a pool config that can recover a failed mid-query operation, it has to be re-executed which is what the retry does here. If you have a approach in mind though I'd love to hear it, happy to collaborate on this together. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
henry3260 commented on PR #62823: URL: https://github.com/apache/airflow/pull/62823#issuecomment-3997447094 To be honest, I don't like this approach personally. Can we find a way to make sure the supervisor can get a new connection from the connection Pool? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
haseebmalik18 commented on PR #62823: URL: https://github.com/apache/airflow/pull/62823#issuecomment-3995194290 @jason810496 If you can review please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
[PR] Fix flaky MySQL test failures by retrying test DB setup on transient errors #62768 [airflow]
haseebmalik18 opened a new pull request, #62823: URL: https://github.com/apache/airflow/pull/62823 Tests intermittently fail with `Lost connection to server during query` during test setup on the MySQL backend. The `clear_db_*` test utility functions had no retry logic for transient database errors. This adds a `_retry_db` decorator to all `clear_db_*` functions that retries on `DBAPIError` using the existing `run_with_db_retries` utility, following the same pattern already used in `pytest_plugin.py`. closes: #62768 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
