How does caplog introduce flakiness though? That still isn’t
> On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote: > > Ash - take a close look at the example I sent. Maybe you missed it. > > * Caplog gets logs from loggers - If your loggers are messed 1000 tests > before and not restored - you will have failing tests. Side effects How is it flakey though? This is the part I’m asking for explaining about. Surely it’s messed up and _ALWAYS_ messed up? > > * Mocking logs is mocking lag method call. It always work and is > side-effect resistant > > > I would only agree to restore back Caplog usag (which as I mentioned > requires new proposal, new consensus or voting as the consensus is already > reached). I will expect the person proposing it to track all the cases > where side effects have been causing problems This is not very collaborative and accepting that people have a life outside of their computer’s Jarek. Here’s a proposal for you: don’t create needless code churn and wholesale replace caplog, instead only swap if it’s not actually a problem. Replace it only when it is noticed as being flakey - which you’ve pointed out two cases, Databricks and Azure.. We have 287 instances of `caplog` in our tests. How many are actually flakey? Why change things that aren’t broken? > > > This happened (and let me repeat it again - I hope it will not be again > ignored this time). About 10 to 15 times in the last few weeks. > > Ash - if you want to propose restoring Caplog usage, please start a new > thread. Ideally with explanation why. > > J. > > pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor <a...@apache.org> > napisał: > >> How does replacing caplog with mocking help anything, assuming that you >> test what is logged against the mock? >> >> Please explain it to me like I’m 5, cos I don’t see how it makes the >> blindest bit of difference. Testing what is on the caplog output vs what is >> recorded on the mock log object is functional identical. If caplog tests >> are flakey then mock log tests will be equally flakey. The caplog process >> itself is not flakey. It’s what is producing the log that is flakey, no? >> >> If you are talking about removing logging tests entirely if they aren’t >> providing value: sure, but that isn’t what I understood this to be about. >> >>> On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote: >>> >>> BTW. Ash - good example of why we want to get rid of caplog is here: >>> >>> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106 >>> >>> This is PR to move databricks provider to the new provider structure. >>> Suddenly in that PR ***WASB*** (yes - we moved **DATABRICKS**) tests are >>> failing: >>> >>> >>> =================================== FAILURES >>> =================================== >>> 2559 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576 >>> _______ >>> TestWasbHook.test_delete_container_generic_exception[ValueError] _______ >>> 2560 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577 >>> >>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >>> test_delete_container_generic_exception >>> 2561 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578 >>> >>> assert "Error deleting container: mycontainer" in caplog.text >>> 2562 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579 >>> E >>> AssertionError: assert 'Error deleting container: mycontainer' in '' >>> 2563 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580 >>> E >>> + where '' = <_pytest.logging.LogCaptureFixture object at >>> 0x7fa7109dcbe0>.text >>> 2564 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581 >>> ______ >>> TestWasbHook.test_delete_container_generic_exception[RuntimeError] ______ >>> 2565 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582 >>> >>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >>> test_delete_container_generic_exception >>> 2566 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583 >>> >>> assert "Error deleting container: mycontainer" in caplog.text >>> 2567 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584 >>> E >>> AssertionError: assert 'Error deleting container: mycontainer' in '' >>> 2568 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585 >>> E >>> + where '' = <_pytest.logging.LogCaptureFixture object at >>> 0x7fa710a65040>.text >>> 2569 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586 >>> ____________ >>> TestWasbHook.test_delete_container_resource_not_found _____________ >>> 2570 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587 >>> >>> providers/tests/microsoft/azure/hooks/test_wasb.py:548: in >>> test_delete_container_resource_not_found >>> 2571 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588 >>> >>> assert "Unable to delete container mycontainer (not found)" in >> caplog.text >>> 2572 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589 >>> E >>> AssertionError: assert 'Unable to delete container mycontainer (not >> found)' >>> in '' >>> 2573 >>> < >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590 >>> E >>> + where '' = <_pytest.logging.LogCaptureFixture object at >>> 0x7fa710c37bb0>.text >>> >>> >>> And those are the failing tests. They are using caplog and I am not sure >> if >>> you can figure out why suddenly the expected log entries cannot be found >> in >>> caplog. >>> >>> My best guess is that somewhere, some tests (databricks) are >> reconfiguring >>> the loggers, and it messes up with caplog. Classic side effects. >>> >>> Mocking log methods directly avoids side-effects involved with logger >>> reconfiguraton. This is about 10th time we have this problem while >> moving >>> providers. >>> Here are the tests that are failing. >>> >>> @pytest.mark.parametrize("exc", [ValueError, RuntimeError]) >>> def test_delete_container_generic_exception(self, exc: >> type[Exception], >>> caplog): >>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>> with mock.patch.object(WasbHook, "_get_container_client") as m: >>> m.return_value.delete_container.side_effect = >>> exc("FakeException") >>> caplog.clear() >>> caplog.set_level("ERROR") >>> with pytest.raises(exc, match="FakeException"): >>> hook.delete_container("mycontainer") >>> assert "Error deleting container: mycontainer" in caplog.text >>> >>> def test_delete_container_resource_not_found(self, caplog): >>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>> with mock.patch.object(WasbHook, "_get_container_client") as m: >>> m.return_value.delete_container.side_effect = >>> ResourceNotFoundError("FakeException") >>> caplog.clear() >>> caplog.set_level("WARNING") >>> hook.delete_container("mycontainer") >>> assert "Unable to delete container mycontainer (not found)" in >>> caplog.text >>> >>> @mock.patch.object(WasbHook, "delete_blobs") >>> def test_delete_single_blob(self, delete_blobs, >>> mocked_blob_service_client): >>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >>> hook.delete_file("container", "blob", is_prefix=False) >>> delete_blobs.assert_called_once_with("container", "blob") >>> >>> >>> >>> >>> On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com> wrote: >>> >>>> Just - note that the 72 hrs are specifically designed to cover the >>>> weekend. So when there are such - rather insignificant - issues, I >> think we >>>> should be ok. We have had many of our votes - including the absolutely >> most >>>> important - releases - lasting 72 hours over weekend and it has never >> been >>>> a problem, we only extended it including weekends where the matters were >>>> really important and we expected them to be contentious (this one was >> not). >>>> >>>> Does it mean Ash that we propose extending all kinds of consensus and >>>> voting in this way ? It's the first time it has been raised, so maybe >>>> that's a good idea to make it longer now when we are all additionally >> busy >>>> with Airflow 3, but I don't see any reason why it should be a problem >> "in >>>> general". WDYT? >>>> >>>> Note that your objection is also 5 days after the initial message, so >> even >>>> if it would be 2 days more, that would not change anything. >>>> >>>> IMHO the consensus is reached. and If you want to revert that you should >>>> start a new thread Ash. >>>> >>>> J. >>>> >>>> On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor <a...@apache.org> >> wrote: >>>> >>>>> Please be aware of time when you are proposing this. 48 out of the 72 >>>>> hours your had here were over the weekend which doesn’t give many >> people >>>>> adequate time to read and think upon this. >>>>> >>>>> >>>>> Because I’m -1 on this as I don’t have enough info (see my other email) >>>>> to understand why this change makes the first bit of difference. >>>>> >>>>>> On 4 Feb 2025, at 22:38, Buğra Öztürk <ozturkbugr...@gmail.com> >> wrote: >>>>>> >>>>>> Hey all, >>>>>> >>>>>> The consensus has been reached. >>>>>> >>>>>> Regards, >>>>>> >>>>>> On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk <ozturkbugr...@gmail.com> >>>>> wrote: >>>>>> >>>>>>> Hello everyone, >>>>>>> >>>>>>> There have been recurring discussions about minimizing the usage of >>>>> caplog >>>>>>> in unit tests. If logs are necessary for a test, we should mock them >>>>> unless >>>>>>> the test strictly requires actual log output. Otherwise, the caplog >>>>> tests >>>>>>> should be removed and logs should be mocked. >>>>>>> >>>>>>> During recent large migrations, changes and ongoing CI/CD efforts, >>>>> caplog >>>>>>> has proven to be flaky in unit tests. They frequently cause red >>>>> pipelines >>>>>>> in PRs and scheduled CI/CD runs. There have been local discussions on >>>>>>> removing caplog from unit tests. To formalize this, I propose a lazy >>>>>>> consensus to remove and prevent caplog usage in unit tests, ensuring >>>>> logs >>>>>>> are mocked when needed and disallowing caplog without mocking unless >>>>>>> explicitly approved. >>>>>>> >>>>>>> *Why should caplog be avoided?* >>>>>>> >>>>>>> - Big maintenance effort on CI/CD >>>>>>> - Instability >>>>>>> >>>>>>> >>>>>>> *What should we do instead if we use caplog in a unit test?* >>>>>>> >>>>>>> - Mocking the Log >>>>>>> >>>>>>> *In the Scope of this consensus* >>>>>>> >>>>>>> - Remove caplog usage from unit tests if possible >>>>>>> - Mock logs and remove caplog from unit tests if possible >>>>>>> - Exceptional cases will be subject to maintainer approval >>>>>>> - Prevent caplog to be included in unit tests without explicitly >>>>>>> mocking the log >>>>>>> >>>>>>> *Action Items related to the above Scope:* >>>>>>> >>>>>>> - Scan and replace caplog tests with mocking where possible >>>>>>> - Scan and remove caplog tests where possible >>>>>>> - Include CI check to prevent adding additional caplog tests without >>>>>>> Mocking and/or without approval from a maintainer to allow >>>>> flexibility in >>>>>>> some exceptional cases >>>>>>> - Create documentation with examples for implementing tests using >>>>> logs >>>>>>> without caplog (e.g., using mocking or avoiding logs entirely) >>>>>>> >>>>>>> >>>>>>> The lazy consensus will be reached on 2025-02-04, (unless someone >>>>> surfaces >>>>>>> an objection). >>>>>>> -- >>>>>>> Bugra Ozturk >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Bugra Ozturk >>>>> >>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >>>>> For additional commands, e-mail: dev-h...@airflow.apache.org >>>>> >>>>> >> >> --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org For additional commands, e-mail: dev-h...@airflow.apache.org