potiuk commented on PR #32612: URL: https://github.com/apache/airflow/pull/32612#issuecomment-1636128933
Yeah. I think it's a good idea, and we already have some of that in a different form - not necessarily timing, but definitely targeted at the ballooning time of certain operations.

We have a dedicated "perf_kit" util package for that, and one part of it is, for example, counting the number of queries to the DB, which is another metric of "performance" (I believe those tests helped with, or at least kept in check, some of the initial problems we had when adding some core features caused accidental changes in the number of queries to the DB).

* utils here: https://github.com/apache/airflow/blob/main/tests/test_utils/perf/perf_kit/sqlalchemy.py
* usage here: https://github.com/apache/airflow/blob/main/tests/models/test_dag.py (look for assert_queries_count)

Those are better, because they do not depend on the environment. But we have a few purely time-based ones: https://github.com/apache/airflow/blob/main/tests/core/test_core.py#L82

And the problem there is that we have to put in high numbers to account for a "slow" environment - a public runner with other tests running in parallel. Running in parallel speeds up the overall test time a lot, but introduces high variability in the timing of individual tests. This is likely why you have 3 seconds there (answering @uranusjr's comment).

The tricky part is that our tests are executed in various "environments" - i.e. they are often run in parallel with other tests in order to speed up the execution - and that makes it very tricky to figure out whether the slowness is just environmental or a real regression.

But that discussion actually gave me an interesting idea. Maybe we could extract all "timing" tests into a separate test type and always execute it "alone", only with sqlite - less CPU utilization but more stability of the environment. We could add a separate job for that kind of test and easily exclude them from the regular tests - I think we should do it this way.
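To illustrate why counting queries is less environment-sensitive than timing, here is a minimal sketch of the idea. It is not Airflow's `assert_queries_count` and does not use SQLAlchemy; it assumes a plain stdlib `sqlite3` connection and its trace callback as a stand-in, and the helper name `assert_max_statements` and the `dag` table are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def assert_max_statements(conn, limit):
    """Fail if more than `limit` SQL statements run on `conn` inside the block.

    Hypothetical helper for illustration only; not part of Airflow's perf_kit.
    """
    executed = []
    # sqlite3 invokes the trace callback with the SQL text of each statement it runs.
    conn.set_trace_callback(executed.append)
    try:
        yield executed
    finally:
        conn.set_trace_callback(None)  # always detach the callback
    assert len(executed) <= limit, (
        f"expected at most {limit} statements, got {len(executed)}: {executed}"
    )


# isolation_level=None selects autocommit mode, so no implicit BEGIN is traced.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE dag (dag_id TEXT)")
conn.execute("INSERT INTO dag VALUES ('example')")

# The limit is a count, not a duration, so a slow or busy runner does not affect it.
with assert_max_statements(conn, limit=1) as executed:
    rows = conn.execute("SELECT dag_id FROM dag").fetchall()
```

A check like this fails only when the code under test starts issuing more statements, whereas a wall-clock threshold has to be padded for the slowest runner it may land on.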
