potiuk commented on code in PR #35160:
URL: https://github.com/apache/airflow/pull/35160#discussion_r1376495329
##########
tests/operators/test_python.py:
##########
@@ -844,17 +848,30 @@ def f(exit_code):
assert ti.state == expected_state
+venv_cache_path = tempfile.mkdtemp(prefix="venv_cache_path")
Review Comment:
Yeah. Isolation is key.
And we cannot get isolation when many of our tests (most of the DB ones) do
something like this:
```python
@pytest.fixture(autouse=True)
def base_tests_setup(self, request, create_task_instance_of_operator, dag_maker):
    self.dag_id = f"dag_{slugify(request.cls.__name__)}"
    self.task_id = f"task_{slugify(request.node.name, max_length=40)}"
    self.run_id = f"run_{slugify(request.node.name, max_length=40)}"
    self.ds_templated = self.default_date.date().isoformat()
    self.ti_maker = create_task_instance_of_operator
    self.dag_maker = dag_maker
    self.dag = self.dag_maker(self.dag_id, template_searchpath=TEMPLATE_SEARCHPATH).dag
    clear_db_runs()
    yield
    clear_db_runs()
```
This is actual code from `BasePythonTest`, which is the base for all the Python
tests. If you look at the last three lines, you will see that all dag_runs are
always cleared before and after every test. This is the implementation:
```python
def clear_db_runs():
    with create_session() as session:
        session.query(Job).delete()
        session.query(Trigger).delete()
        session.query(DagRun).delete()
        session.query(TaskInstance).delete()
```
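To illustrate why wipe-everything cleanup is unsafe under parallelism, here is a minimal, self-contained sketch of the alternative: scoping the deletion to the test's own `dag_id` so cleanup cannot clobber rows belonging to tests running in parallel. This uses plain `sqlite3` with a hypothetical table and function name, not the real Airflow models:

```python
import sqlite3

# Hypothetical in-memory stand-in for the shared Airflow metadata DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, run_id TEXT)")

# Two "parallel" tests each insert their own runs.
conn.execute("INSERT INTO dag_run VALUES ('dag_test_a', 'run_1')")
conn.execute("INSERT INTO dag_run VALUES ('dag_test_b', 'run_1')")

def clear_db_runs_scoped(conn, dag_id):
    # Delete only this test's rows instead of wiping the whole table,
    # so cleanup cannot race with other tests sharing the database.
    conn.execute("DELETE FROM dag_run WHERE dag_id = ?", (dag_id,))

# Test A cleans up after itself; test B's rows survive.
clear_db_runs_scoped(conn, "dag_test_a")
remaining = [row[0] for row in conn.execute("SELECT dag_id FROM dag_run")]
print(remaining)  # only dag_test_b remains
```

Even a scoped cleanup like this only helps if every test adopts it consistently, which is exactly the enforcement problem described below.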
This is of course key for the test to have proper setup/teardown, but it also
means that if it is run via `pytest-xdist` it will randomly delete all the
dag_runs while other tests are using the same database... (and likely most of
the 7000 other DB tests are doing something very similar to achieve
reproducibility).
Not good. Not good at all.
And even if we could fix it in some tests, there is no way to prevent
someone from sneaking in a fixture or setup_method with
`session.query(DagRun).delete()` in order to "clean up" before and after the
test (which is a good idea as long as it does not impact other tests running in
parallel). Not when we merge 30 PRs a day. And one faulty test can start
randomly breaking any other test; chasing such a "bad apple" would be a nightmare.
That's why I think there are three reasonable approaches:
1) Split the DB tests into serialized test groups, where tests are executed
sequentially within a group and each group has its own DB. This is what we have
now. On our self-hosted runners in CI we have up to 8 (this is how many CPUs we
have) complete docker compose instances running - each of them having a
separate DB and each of them running a subset of tests (TEST_TYPE)
sequentially. On public runners we have up to 2 of those (because we only have
2 CPUs). This has powered our tests for years and saves countless hours of
build time by utilising the parallelism of those instances - I implemented it a
long time ago :)
2) Separate out the tests that do not use the DB (this is what this PR aims
for), run them with xdist, and add a mechanism to keep it working in the
future - detecting DB tests running in an `xdist` environment. The good thing
is that we can "graduate" some tests from DB to non-DB easily after this PR
gets merged (just make them not use the DB and remove the marker). This is what
I mean by going from 7000 -> 1500 DB tests. That should be our goal IMHO
(roughly - I made that 1500 number up of course; I just think about 10% of our
tests should be DB tests).
3) Give each DB test its own DB. That is probably far too costly as an
overhead (especially since with DB tests we likely want to test all supported
DBs). I once attempted to expand the approach from 1) and have a separate DB
"per test module" rather than "per test group" - so instead of the 12 or so
TEST_TYPEs we would have had 200 or so separately started docker-compose
instances with DB and CI containers (up to 8 at a time, of course) - but the
overhead of starting/stopping images and DBs was so huge that the tests took
far longer than now to complete (at least 5 times longer, I think).
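The detection mechanism mentioned in 2) could look roughly like this. A hedged sketch only: the helper name, marker name, and the simulated collection result are all hypothetical, though pytest-xdist really does set `PYTEST_XDIST_WORKER` in its worker processes:

```python
import os

def db_tests_to_flag(collected, db_marker="db_test"):
    # Hypothetical helper: when running under pytest-xdist (workers have
    # PYTEST_XDIST_WORKER set), return the names of collected tests that
    # carry the DB marker, so a conftest hook could fail or deselect them
    # instead of letting them race on the shared database.
    if "PYTEST_XDIST_WORKER" not in os.environ:
        return []
    return [name for name, markers in collected if db_marker in markers]

# Simulated collection result: (test name, marker set) pairs.
collected = [("test_python_venv", set()), ("test_dagrun_cleanup", {"db_test"})]
os.environ["PYTEST_XDIST_WORKER"] = "gw0"  # pretend we are an xdist worker
flagged = db_tests_to_flag(collected)
print(flagged)  # the DB test is flagged
```

In a real conftest this logic would live in a `pytest_collection_modifyitems` hook reading `item.get_closest_marker(...)`.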
So I believe 1) is a good middle ground between parallelism and isolation,
and once we get a far smaller number of DB tests by following route 2) and
moving more tests to be non-DB, we will have an "optimal" solution. I think
chasing "let's get the DB tests running with xdist" might not achieve the goal
due to overhead/complexity/flakiness.
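The parallelism shape of approach 1) - groups running in parallel, tests within a group running sequentially against the group's own DB - can be sketched as follows, with a thread pool standing in for the docker-compose instances (group and test names are made up):

```python
from concurrent.futures import ThreadPoolExecutor

GROUPS = ["Core", "Providers", "WWW"]  # hypothetical TEST_TYPE groups

def run_group(group):
    # In CI each group would talk to its own database (its own
    # docker-compose project); tests inside a group run sequentially,
    # so intra-group DB cleanup cannot race.
    return [f"{group}:{test}" for test in ("test_a", "test_b")]

# Up to 2 groups at a time - mimicking public runners with 2 CPUs.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_group, GROUPS))
print(results)
```

The isolation boundary is the group, not the test, which is what keeps the DB startup/teardown overhead amortised.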
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]