Agreed, testing logs should be a last resort, and we likely to over-use it, but there are at least a few instances where we do explicitly want to test that certain info is included in the logs for users.
-a > On 16 Feb 2025, at 11:16, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote: > > Besides the instabilities we saw from caplog I also hope the proposed > fix from @ash helps in stability. > > Still from point of reviewer I think proper unit tests should not rely > on log statements emmitted. Rather the logic should be tested. MEans > that in best case logic blocks in code are structured for better > test-ability. Checking for logs is for me only a way if there is no > other means of checking logic has happened. > > On 16.02.25 11:02, Buğra Öztürk wrote: >> Great, it looks promising! I agree on fixing over removing. The idea is to >> reduce the time spent on this. If we can fix it, this would eliminate the >> work we should do to remove caplog usage. >> >> On Sat, Feb 15, 2025 at 10:56 PM Jarek Potiuk <ja...@potiuk.com> wrote: >> >>> Cool. I'd love to have the caplog issue fixed without removing caplog >>> convenience. Let's see if we can get the missing 20% confidence back to >>> make sure no more time of contributors and maintainers is lost because of >>> this one >>> >>> On Sat, Feb 15, 2025 at 10:35 PM Ash Berlin-Taylor <a...@apache.org> wrote: >>> >>>> >>>> @pytest.fixture(name="caplog") >>>> def override_caplog(request): >>>> fixture = pytest.LogCaptureFixture(request.node) >>>> yield fixture >>>> fixture._finalize() >>>> >>>> if "airflow.logging_config" in sys.modules: >>>> import airflow.logging_config >>>> >>>> airflow.logging_config.configure_logging() >>>> >>>> >>>> I am 80% confident that this is going to fix the flakeyness from caplog. >>>> I’m introducing it in my Trigger re-work PR. >>>> >>>>> On 7 Feb 2025, at 13:17, Jarek Potiuk <ja...@potiuk.com> wrote: >>>>> >>>>> As clearly explained in the above example - this is not a problem with >>>>> databricsk and azure. This is aproblem in Azure BECAUSE we move >>>> Databricks. >>>>> So clearly a side effect of one group of tests on another. As such - we >>>>> have no idea how many other cases we have like that and who might be >>>>> causing it. We only know that a lot of time of people who try to keep >>> the >>>>> main green is lost because of that. >>>>> >>>>>> This is not very collaborative and accepting that people have a life >>>>> outside of their computer’s Jarek. >>>>> >>>>> No. It's collaborative. This is how our social contract works. If you >>>> want >>>>> to change the rules and have good reasoning for that - propose a >>> change. >>>>> And you still can propose to revert the decision in a separate thread. >>>> This >>>>> happened in the past and it's perfectly fine. >>>>> >>>>> We cannot completely change the social agreement we have on decision >>>> making >>>>> work in our community because one person missed a thread on a really >>>>> insignificant thing. We should just accept it and follow, or propose to >>>>> change a contract. That's how a community works. >>>>> >>>>> >>>>> J >>>>> >>>>> >>>>> >>>>> On Fri, Feb 7, 2025 at 2:05 PM Jarek Potiuk <ja...@potiuk.com> wrote: >>>>> >>>>>>> How is it flakey though? This is the part I’m asking for explaining >>>>>> about. Surely it’s messed up and _ALWAYS_ messed up? >>>>>> >>>>>> See above example posted in the thread. It explains everything. I am >>> not >>>>>> sure if you missed it, but it's the third time I am explaining it and >>>> you >>>>>> keep on asking. The example I have shows not only the copy & paste of >>>> the >>>>>> problem, but link to the exact failure. >>>>>> >>>>>> On Fri, Feb 7, 2025 at 2:00 PM Ash Berlin-Taylor <a...@apache.org> >>>> wrote: >>>>>>> How does caplog introduce flakiness though? That still isn’t >>>>>>> >>>>>>> >>>>>>>> On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote: >>>>>>>> >>>>>>>> Ash - take a close look at the example I sent. Maybe you missed it. >>>>>>>> >>>>>>>> * Caplog gets logs from loggers - If your loggers are messed 1000 >>>> tests >>>>>>>> before and not restored - you will have failing tests. Side effects >>>>>>> How is it flakey though? This is the part I’m asking for explaining >>>>>>> about. Surely it’s messed up and _ALWAYS_ messed up? >>>>>>> >>>>>>>> * Mocking logs is mocking lag method call. It always work and is >>>>>>>> side-effect resistant >>>>>>>> >>>>>>>> >>>>>>>> I would only agree to restore back Caplog usag (which as I mentioned >>>>>>>> requires new proposal, new consensus or voting as the consensus is >>>>>>> already >>>>>>>> reached). I will expect the person proposing it to track all the >>> cases >>>>>>>> where side effects have been causing problems >>>>>>> This is not very collaborative and accepting that people have a life >>>>>>> outside of their computer’s Jarek. >>>>>>> >>>>>>> >>>>>>> Here’s a proposal for you: don’t create needless code churn and >>>> wholesale >>>>>>> replace caplog, instead only swap if it’s not actually a problem. >>>> Replace >>>>>>> it only when it is noticed as being flakey - which you’ve pointed out >>>> two >>>>>>> cases, Databricks and Azure.. We have 287 instances of `caplog` in >>> our >>>>>>> tests. How many are actually flakey? Why change things that aren’t >>>> broken? >>>>>>>> >>>>>>>> This happened (and let me repeat it again - I hope it will not be >>>> again >>>>>>>> ignored this time). About 10 to 15 times in the last few weeks. >>>>>>>> >>>>>>>> Ash - if you want to propose restoring Caplog usage, please start a >>>> new >>>>>>>> thread. Ideally with explanation why. >>>>>>>> >>>>>>>> J. >>>>>>>> >>>>>>>> pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor <a...@apache.org >>>>>>>> napisał: >>>>>>>> >>>>>>>>> How does replacing caplog with mocking help anything, assuming that >>>> you >>>>>>>>> test what is logged against the mock? >>>>>>>>> >>>>>>>>> Please explain it to me like I’m 5, cos I don’t see how it makes >>> the >>>>>>>>> blindest bit of difference. Testing what is on the caplog output vs >>>>>>> what is >>>>>>>>> recorded on the mock log object is functional identical. If caplog >>>>>>> tests >>>>>>>>> are flakey then mock log tests will be equally flakey. The caplog >>>>>>> process >>>>>>>>> itself is not flakey. It’s what is producing the log that is >>> flakey, >>>>>>> no? >>>>>>>>> If you are talking about removing logging tests entirely if they >>>> aren’t >>>>>>>>> providing value: sure, but that isn’t what I understood this to be >>>>>>> about. >>>>>>>>>> On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote: >>>>>>>>>> >>>>>>>>>> BTW. Ash - good example of why we want to get rid of caplog is >>>> here: >>>>>>>>>> >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106 >>>>>>>>>> This is PR to move databricks provider to the new provider >>>> structure. >>>>>>>>>> Suddenly in that PR ***WASB*** (yes - we moved **DATABRICKS**) >>>> tests >>>>>>> are >>>>>>>>>> failing: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> =================================== FAILURES >>>>>>>>>> =================================== >>>>>>>>>> 2559 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576 >>>>>>>>>> _______ >>>>>>>>>> TestWasbHook.test_delete_container_generic_exception[ValueError] >>>>>>> _______ >>>>>>>>>> 2560 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577 >>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >>>>>>>>>> test_delete_container_generic_exception >>>>>>>>>> 2561 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578 >>>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text >>>>>>>>>> 2562 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579 >>>>>>>>>> E >>>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' in >>> '' >>>>>>>>>> 2563 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580 >>>>>>>>>> E >>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >>>>>>>>>> 0x7fa7109dcbe0>.text >>>>>>>>>> 2564 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581 >>>>>>>>>> ______ >>>>>>>>>> TestWasbHook.test_delete_container_generic_exception[RuntimeError] >>>>>>> ______ >>>>>>>>>> 2565 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582 >>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >>>>>>>>>> test_delete_container_generic_exception >>>>>>>>>> 2566 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583 >>>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text >>>>>>>>>> 2567 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584 >>>>>>>>>> E >>>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' in >>> '' >>>>>>>>>> 2568 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585 >>>>>>>>>> E >>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >>>>>>>>>> 0x7fa710a65040>.text >>>>>>>>>> 2569 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586 >>>>>>>>>> ____________ >>>>>>>>>> TestWasbHook.test_delete_container_resource_not_found >>> _____________ >>>>>>>>>> 2570 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587 >>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:548: in >>>>>>>>>> test_delete_container_resource_not_found >>>>>>>>>> 2571 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588 >>>>>>>>>> assert "Unable to delete container mycontainer (not found)" in >>>>>>>>> caplog.text >>>>>>>>>> 2572 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589 >>>>>>>>>> E >>>>>>>>>> AssertionError: assert 'Unable to delete container mycontainer >>> (not >>>>>>>>> found)' >>>>>>>>>> in '' >>>>>>>>>> 2573 >>>>>>>>>> < >>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590 >>>>>>>>>> E >>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >>>>>>>>>> 0x7fa710c37bb0>.text >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> And those are the failing tests. They are using caplog and I am >>> not >>>>>>> sure >>>>>>>>> if >>>>>>>>>> you can figure out why suddenly the expected log entries cannot be >>>>>>> found >>>>>>>>> in >>>>>>>>>> caplog. >>>>>>>>>> >>>>>>>>>> My best guess is that somewhere, some tests (databricks) are >>>>>>>>> reconfiguring >>>>>>>>>> the loggers, and it messes up with caplog. Classic side effects. >>>>>>>>>> >>>>>>>>>> Mocking log methods directly avoids side-effects involved with >>>> logger >>>>>>>>>> reconfiguraton. This is about 10th time we have this problem >>> while >>>>>>>>> moving >>>>>>>>>> providers. >>>>>>>>>> Here are the tests that are failing. >>>>>>>>>> >>>>>>>>>> @pytest.mark.parametrize("exc", [ValueError, RuntimeError]) >>>>>>>>>> def test_delete_container_generic_exception(self, exc: >>>>>>>>> type[Exception], >>>>>>>>>> caplog): >>>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>>>>>>>>> with mock.patch.object(WasbHook, "_get_container_client") as >>> m: >>>>>>>>>> m.return_value.delete_container.side_effect = >>>>>>>>>> exc("FakeException") >>>>>>>>>> caplog.clear() >>>>>>>>>> caplog.set_level("ERROR") >>>>>>>>>> with pytest.raises(exc, match="FakeException"): >>>>>>>>>> hook.delete_container("mycontainer") >>>>>>>>>> assert "Error deleting container: mycontainer" in >>>>>>> caplog.text >>>>>>>>>> def test_delete_container_resource_not_found(self, caplog): >>>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>>>>>>>>> with mock.patch.object(WasbHook, "_get_container_client") as >>> m: >>>>>>>>>> m.return_value.delete_container.side_effect = >>>>>>>>>> ResourceNotFoundError("FakeException") >>>>>>>>>> caplog.clear() >>>>>>>>>> caplog.set_level("WARNING") >>>>>>>>>> hook.delete_container("mycontainer") >>>>>>>>>> assert "Unable to delete container mycontainer (not >>> found)" >>>>>>> in >>>>>>>>>> caplog.text >>>>>>>>>> >>>>>>>>>> @mock.patch.object(WasbHook, "delete_blobs") >>>>>>>>>> def test_delete_single_blob(self, delete_blobs, >>>>>>>>>> mocked_blob_service_client): >>>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>>>>>>>>> hook.delete_file("container", "blob", is_prefix=False) >>>>>>>>>> delete_blobs.assert_called_once_with("container", "blob") >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com> >>>> wrote: >>>>>>>>>>> Just - note that the 72 hrs are specifically designed to cover >>> the >>>>>>>>>>> weekend. So when there are such - rather insignificant - issues, >>> I >>>>>>>>> think we >>>>>>>>>>> should be ok. We have had many of our votes - including the >>>>>>> absolutely >>>>>>>>> most >>>>>>>>>>> important - releases - lasting 72 hours over weekend and it has >>>> never >>>>>>>>> been >>>>>>>>>>> a problem, we only extended it including weekends where the >>> matters >>>>>>> were >>>>>>>>>>> really important and we expected them to be contentious (this one >>>> was >>>>>>>>> not). >>>>>>>>>>> Does it mean Ash that we propose extending all kinds of consensus >>>> and >>>>>>>>>>> voting in this way ? It's the first time it has been raised, so >>>> maybe >>>>>>>>>>> that's a good idea to make it longer now when we are all >>>> additionally >>>>>>>>> busy >>>>>>>>>>> with Airflow 3, but I don't see any reason why it should be a >>>> problem >>>>>>>>> "in >>>>>>>>>>> general". WDYT? >>>>>>>>>>> >>>>>>>>>>> Note that your objection is also 5 days after the initial >>> message, >>>> so >>>>>>>>> even >>>>>>>>>>> if it would be 2 days more, that would not change anything. >>>>>>>>>>> >>>>>>>>>>> IMHO the consensus is reached. and If you want to revert that you >>>>>>> should >>>>>>>>>>> start a new thread Ash. >>>>>>>>>>> >>>>>>>>>>> J. >>>>>>>>>>> >>>>>>>>>>> On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor <a...@apache.org >>>>>>>>> wrote: >>>>>>>>>>>> Please be aware of time when you are proposing this. 48 out of >>>> the >>>>>>> 72 >>>>>>>>>>>> hours your had here were over the weekend which doesn’t give >>> many >>>>>>>>> people >>>>>>>>>>>> adequate time to read and think upon this. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Because I’m -1 on this as I don’t have enough info (see my other >>>>>>> email) >>>>>>>>>>>> to understand why this change makes the first bit of difference. >>>>>>>>>>>> >>>>>>>>>>>>> On 4 Feb 2025, at 22:38, Buğra Öztürk <ozturkbugr...@gmail.com >>>>>>>>> wrote: >>>>>>>>>>>>> Hey all, >>>>>>>>>>>>> >>>>>>>>>>>>> The consensus has been reached. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk < >>>>>>> ozturkbugr...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Hello everyone, >>>>>>>>>>>>>> >>>>>>>>>>>>>> There have been recurring discussions about minimizing the >>> usage >>>>>>> of >>>>>>>>>>>> caplog >>>>>>>>>>>>>> in unit tests. If logs are necessary for a test, we should >>> mock >>>>>>> them >>>>>>>>>>>> unless >>>>>>>>>>>>>> the test strictly requires actual log output. Otherwise, the >>>>>>> caplog >>>>>>>>>>>> tests >>>>>>>>>>>>>> should be removed and logs should be mocked. >>>>>>>>>>>>>> >>>>>>>>>>>>>> During recent large migrations, changes and ongoing CI/CD >>>> efforts, >>>>>>>>>>>> caplog >>>>>>>>>>>>>> has proven to be flaky in unit tests. They frequently cause >>> red >>>>>>>>>>>> pipelines >>>>>>>>>>>>>> in PRs and scheduled CI/CD runs. There have been local >>>>>>> discussions on >>>>>>>>>>>>>> removing caplog from unit tests. To formalize this, I propose >>> a >>>>>>> lazy >>>>>>>>>>>>>> consensus to remove and prevent caplog usage in unit tests, >>>>>>> ensuring >>>>>>>>>>>> logs >>>>>>>>>>>>>> are mocked when needed and disallowing caplog without mocking >>>>>>> unless >>>>>>>>>>>>>> explicitly approved. >>>>>>>>>>>>>> >>>>>>>>>>>>>> *Why should caplog be avoided?* >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Big maintenance effort on CI/CD >>>>>>>>>>>>>> - Instability >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> *What should we do instead if we use caplog in a unit test?* >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Mocking the Log >>>>>>>>>>>>>> >>>>>>>>>>>>>> *In the Scope of this consensus* >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Remove caplog usage from unit tests if possible >>>>>>>>>>>>>> - Mock logs and remove caplog from unit tests if possible >>>>>>>>>>>>>> - Exceptional cases will be subject to maintainer approval >>>>>>>>>>>>>> - Prevent caplog to be included in unit tests without >>> explicitly >>>>>>>>>>>>>> mocking the log >>>>>>>>>>>>>> >>>>>>>>>>>>>> *Action Items related to the above Scope:* >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Scan and replace caplog tests with mocking where possible >>>>>>>>>>>>>> - Scan and remove caplog tests where possible >>>>>>>>>>>>>> - Include CI check to prevent adding additional caplog tests >>>>>>> without >>>>>>>>>>>>>> Mocking and/or without approval from a maintainer to allow >>>>>>>>>>>> flexibility in >>>>>>>>>>>>>> some exceptional cases >>>>>>>>>>>>>> - Create documentation with examples for implementing tests >>>> using >>>>>>>>>>>> logs >>>>>>>>>>>>>> without caplog (e.g., using mocking or avoiding logs entirely) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The lazy consensus will be reached on 2025-02-04, (unless >>>> someone >>>>>>>>>>>> surfaces >>>>>>>>>>>>>> an objection). >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Bugra Ozturk >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Bugra Ozturk >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>> --------------------------------------------------------------------- >>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >>>>>>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> --------------------------------------------------------------------- >>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org >>>>>>> >>>>>>> >>>> >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >>>> For additional commands, e-mail: dev-h...@airflow.apache.org >>>> >>>> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > <mailto:dev-unsubscr...@airflow.apache.org> > For additional commands, e-mail: dev-h...@airflow.apache.org > <mailto:dev-h...@airflow.apache.org>