potiuk commented on code in PR #35160:
URL: https://github.com/apache/airflow/pull/35160#discussion_r1376539399
##########
tests/system/providers/amazon/aws/utils/__init__.py:
##########
@@ -106,6 +106,8 @@ def _fetch_from_ssm(key: str, test_name: str | None = None) -> str:
# Since a default value after the SSM check is allowed, these exceptions should not stop execution.
except NoCredentialsError as e:
log.info("No boto credentials found: %s", e)
+ except ClientError as e:
Review Comment:
Likely because previously, when this test was run, some other tests had already initialised credentials. This is one of those typical problems where we have implicit and unwanted side effects between tests that run in the same, predictable sequence.
This is the test that had the problem - it's a DB test:
```python
@pytest.mark.db_test
@pytest.mark.parametrize("example", example_not_suspended_dags(), ids=relative_path)
def test_should_be_importable(example):
    dagbag = DagBag(
        dag_folder=example,
        include_examples=False,
    )
    assert len(dagbag.import_errors) == 0, f"import_errors={str(dagbag.import_errors)}"
    assert len(dagbag.dag_ids) >= 1
```
This test (the error appears in the non-DB `Always` test that imports all system tests) was previously run in a suite mixing DB and non-DB tests, and likely one of the earlier non-DB tests left a partial authentication setup behind: not enough to authenticate (because it had no auth information), but enough to change the error raised when the "import all system tests" check ran.
Now this test runs in the "DB" suite only, so the test that predictably ran before it and left the side effect behind is probably now in the "Non-DB" suite. Or maybe some of the other changes I made in other tests stopped leaving the side effect.
This is one example of how some tests can impact the behaviour of others, even when they do not share the DB. A locally cached authentication configuration (even from a failed attempt) is enough. It's the usual situation when you heavily change the sequence and "neighbours" of the tests.
BTW, after this one gets merged, I am planning to experiment with randomization of test execution. Running tests in random order is a recommended practice precisely because it detects side effects like this one. It does, however, require quite a lot of discipline and multiple runs to reach a stable state if it was not used by everyone from the very beginning. I am very interested to see the results when I complete that PR. It could be that we will be able to enable randomisation for some of the test types but not for others, but I will definitely attempt it.
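As a rough sketch of what that could look like (an assumption on my side about how we might do it by hand; the pytest-randomly plugin implements the same idea with more polish), a `conftest.py` hook can shuffle the collected tests with a logged seed so any order-dependent failure stays reproducible:

```python
# conftest.py sketch: shuffle collected tests with a reproducible seed.
import os
import random


def shuffle_items(items, seed):
    # Deterministic shuffle: the same seed always yields the same order,
    # which lets us replay an order-dependent failure exactly.
    random.Random(seed).shuffle(items)
    return items


def pytest_collection_modifyitems(session, config, items):
    # Honour an explicit seed from the environment, otherwise pick one
    # at random, and log it so the run can be reproduced later.
    seed = int(os.environ.get("TEST_ORDER_SEED", str(random.randrange(2**32))))
    print(f"test order seed: {seed}")
    shuffle_items(items, seed)
```

Replaying a failing order is then just `TEST_ORDER_SEED=<logged seed> pytest ...`, which is what makes random ordering practical for debugging rather than just a source of noise.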
BTW, there was a very nice podcast about taming flaky tests on "Talk Python To Me" (https://talkpython.fm/episodes/show/429/taming-flaky-tests) which I listened to. I think it reinvigorated me to try some of these things, knowing that what I have in mind is something others also think makes sense, and hearing how they deal with it :). I knew about test randomization and why it is good and wanted to try it, but hearing that it **can** work even in large installations was encouraging enough to think "ok, let's try it".
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]