dstandish commented on pull request #13421:
URL: https://github.com/apache/airflow/pull/13421#issuecomment-753694100
hehe no worries :)
i think the example_bash_operator does a good job of illustrating why
example dags should not be used in tests.
in general, tests should have only the elements that are absolutely
necessary to test the behavior they are examining.
let me give an example.
consider this class:
```python
class MyObj:
def __init__(self, param1=None, param2=None):
self.param1 = param1
self.param2 = param2
def get_param1(self):
return self.param1
```
consider these tests
```python
def test_get_param1():
m = MyObj('a')
assert m.get_param1() == 'a'
```
```python
def test_get_param1():
m = MyObj('a', 'b')
assert m.get_param1() == 'a'
```
The first test is better because the value for `param2` is totally
irrelevant to the test. Including 'b' is just extra noise, and it is in fact a
costly distraction because if there is something wrong with the test, it's
another thing we have to chase down to see if it is the cause. And if we need
to make a change, then it's another thing we need to understand.
Similarly, when we use example_bash_operator as a test dag for many
different tests, inevitably we end up with a lot of extra noise in tests. An
example of this is in test `test_backfill_multi_dates` in
`test_backfill_job.py`. It uses example bash operator.
There is a component in there that spells out expected execution order:
```python
expected_execution_order = [
("runme_0", DEFAULT_DATE),
("runme_1", DEFAULT_DATE),
("runme_2", DEFAULT_DATE),
("runme_0", end_date),
("runme_1", end_date),
("runme_2", end_date),
("also_run_this", DEFAULT_DATE),
("also_run_this", end_date),
("run_after_loop", DEFAULT_DATE),
("run_after_loop", end_date),
("run_this_last", DEFAULT_DATE),
("run_this_last", end_date),
]
```
Almost certainly this test could be accomplished with far fewer tasks. But
it uses all of them because that's what is in example_bash_operator. And
probably as people have added to that example, they add to this and other tests
test. But by doing so, we are spending effort, and not getting any return on
our effort. Adding more tasks to this test is not good for the community
because it makes this test _worse_, not better. It increases technical debt.
Keeping test and example combined is not good for users either. In
handcuffs us with regard to writing example dags because we are constrained by
the tests that reference it, tests which we don't want to modify without good
reason.
So in sum the problems break down into two distinct categories.
1. test and example dags are the same
* constrained from designing example dags to best exemplify features
because they have mixed purposes
* can't freely alter an example without dealing with entirely unrelated
tests
2. having too many tests share the same test dag
* makes tests unnecessarily complicated and noisy
* can reach a point where the structure required by one test is
incompatible, or at least very inconvenient, when trying to meet the needs of
another test
* developer inconvenience -- need to understand, and alter, many
different tests, entirely unrelated to your change
- non-trivial risk that "fixing" a test actually breaks it, when did
not really need to change it at all for your PR
what do you think?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]