Changes the level to DEBUG I mean. On Sun, Feb 16, 2025 at 12:40 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > Checking for logs is for me only a way if there is no > other means of checking logic has happened. > > Unless we **really** want to check if a log is emitted at a specific level > and with specific content. Which we sometimes do, I believe. I would really > love if someone raises an issue that particular log is polluting the > output, changes the level to WARNING and adds unit tests for it. > > On Sun, Feb 16, 2025 at 12:17 PM Jens Scheffler <j_scheff...@gmx.de.invalid> > wrote: > >> Besides the instabilities we saw from caplog I also hope the proposed >> fix from @ash helps in stability. >> >> Still from point of reviewer I think proper unit tests should not rely >> on log statements emmitted. Rather the logic should be tested. MEans >> that in best case logic blocks in code are structured for better >> test-ability. Checking for logs is for me only a way if there is no >> other means of checking logic has happened. >> >> On 16.02.25 11:02, Buğra Öztürk wrote: >> > Great, it looks promising! I agree on fixing over removing. The idea is >> to >> > reduce the time spent on this. If we can fix it, this would eliminate >> the >> > work we should do to remove caplog usage. >> > >> > On Sat, Feb 15, 2025 at 10:56 PM Jarek Potiuk <ja...@potiuk.com> wrote: >> > >> >> Cool. I'd love to have the caplog issue fixed without removing caplog >> >> convenience. Let's see if we can get the missing 20% confidence back to >> >> make sure no more time of contributors and maintainers is lost because >> of >> >> this one >> >> >> >> On Sat, Feb 15, 2025 at 10:35 PM Ash Berlin-Taylor <a...@apache.org> >> wrote: >> >> >> >>> >> >>> @pytest.fixture(name="caplog") >> >>> def override_caplog(request): >> >>> fixture = pytest.LogCaptureFixture(request.node) >> >>> yield fixture >> >>> fixture._finalize() >> >>> >> >>> if "airflow.logging_config" in sys.modules: >> >>> import airflow.logging_config >> >>> >> >>> airflow.logging_config.configure_logging() >> >>> >> >>> >> >>> I am 80% confident that this is going to fix the flakeyness from >> caplog. >> >>> I’m introducing it in my Trigger re-work PR. >> >>> >> >>>> On 7 Feb 2025, at 13:17, Jarek Potiuk <ja...@potiuk.com> wrote: >> >>>> >> >>>> As clearly explained in the above example - this is not a problem >> with >> >>>> databricsk and azure. This is aproblem in Azure BECAUSE we move >> >>> Databricks. >> >>>> So clearly a side effect of one group of tests on another. As such - >> we >> >>>> have no idea how many other cases we have like that and who might be >> >>>> causing it. We only know that a lot of time of people who try to keep >> >> the >> >>>> main green is lost because of that. >> >>>> >> >>>>> This is not very collaborative and accepting that people have a life >> >>>> outside of their computer’s Jarek. >> >>>> >> >>>> No. It's collaborative. This is how our social contract works. If you >> >>> want >> >>>> to change the rules and have good reasoning for that - propose a >> >> change. >> >>>> And you still can propose to revert the decision in a separate >> thread. >> >>> This >> >>>> happened in the past and it's perfectly fine. >> >>>> >> >>>> We cannot completely change the social agreement we have on decision >> >>> making >> >>>> work in our community because one person missed a thread on a really >> >>>> insignificant thing. We should just accept it and follow, or propose >> to >> >>>> change a contract. That's how a community works. >> >>>> >> >>>> >> >>>> J >> >>>> >> >>>> >> >>>> >> >>>> On Fri, Feb 7, 2025 at 2:05 PM Jarek Potiuk <ja...@potiuk.com> >> wrote: >> >>>> >> >>>>>> How is it flakey though? This is the part I’m asking for explaining >> >>>>> about. Surely it’s messed up and _ALWAYS_ messed up? >> >>>>> >> >>>>> See above example posted in the thread. It explains everything. I am >> >> not >> >>>>> sure if you missed it, but it's the third time I am explaining it >> and >> >>> you >> >>>>> keep on asking. The example I have shows not only the copy & paste >> of >> >>> the >> >>>>> problem, but link to the exact failure. >> >>>>> >> >>>>> On Fri, Feb 7, 2025 at 2:00 PM Ash Berlin-Taylor <a...@apache.org> >> >>> wrote: >> >>>>>> How does caplog introduce flakiness though? That still isn’t >> >>>>>> >> >>>>>> >> >>>>>>> On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote: >> >>>>>>> >> >>>>>>> Ash - take a close look at the example I sent. Maybe you missed >> it. >> >>>>>>> >> >>>>>>> * Caplog gets logs from loggers - If your loggers are messed 1000 >> >>> tests >> >>>>>>> before and not restored - you will have failing tests. Side >> effects >> >>>>>> How is it flakey though? This is the part I’m asking for explaining >> >>>>>> about. Surely it’s messed up and _ALWAYS_ messed up? >> >>>>>> >> >>>>>>> * Mocking logs is mocking lag method call. It always work and is >> >>>>>>> side-effect resistant >> >>>>>>> >> >>>>>>> >> >>>>>>> I would only agree to restore back Caplog usag (which as I >> mentioned >> >>>>>>> requires new proposal, new consensus or voting as the consensus is >> >>>>>> already >> >>>>>>> reached). I will expect the person proposing it to track all the >> >> cases >> >>>>>>> where side effects have been causing problems >> >>>>>> This is not very collaborative and accepting that people have a >> life >> >>>>>> outside of their computer’s Jarek. >> >>>>>> >> >>>>>> >> >>>>>> Here’s a proposal for you: don’t create needless code churn and >> >>> wholesale >> >>>>>> replace caplog, instead only swap if it’s not actually a problem. >> >>> Replace >> >>>>>> it only when it is noticed as being flakey - which you’ve pointed >> out >> >>> two >> >>>>>> cases, Databricks and Azure.. We have 287 instances of `caplog` in >> >> our >> >>>>>> tests. How many are actually flakey? Why change things that aren’t >> >>> broken? >> >>>>>>> >> >>>>>>> This happened (and let me repeat it again - I hope it will not be >> >>> again >> >>>>>>> ignored this time). About 10 to 15 times in the last few weeks. >> >>>>>>> >> >>>>>>> Ash - if you want to propose restoring Caplog usage, please start >> a >> >>> new >> >>>>>>> thread. Ideally with explanation why. >> >>>>>>> >> >>>>>>> J. >> >>>>>>> >> >>>>>>> pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor < >> a...@apache.org >> >>>>>>> napisał: >> >>>>>>> >> >>>>>>>> How does replacing caplog with mocking help anything, assuming >> that >> >>> you >> >>>>>>>> test what is logged against the mock? >> >>>>>>>> >> >>>>>>>> Please explain it to me like I’m 5, cos I don’t see how it makes >> >> the >> >>>>>>>> blindest bit of difference. Testing what is on the caplog output >> vs >> >>>>>> what is >> >>>>>>>> recorded on the mock log object is functional identical. If >> caplog >> >>>>>> tests >> >>>>>>>> are flakey then mock log tests will be equally flakey. The caplog >> >>>>>> process >> >>>>>>>> itself is not flakey. It’s what is producing the log that is >> >> flakey, >> >>>>>> no? >> >>>>>>>> If you are talking about removing logging tests entirely if they >> >>> aren’t >> >>>>>>>> providing value: sure, but that isn’t what I understood this to >> be >> >>>>>> about. >> >>>>>>>>> On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote: >> >>>>>>>>> >> >>>>>>>>> BTW. Ash - good example of why we want to get rid of caplog is >> >>> here: >> >>>>>>>>> >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106 >> >>>>>>>>> This is PR to move databricks provider to the new provider >> >>> structure. >> >>>>>>>>> Suddenly in that PR ***WASB*** (yes - we moved **DATABRICKS**) >> >>> tests >> >>>>>> are >> >>>>>>>>> failing: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> =================================== FAILURES >> >>>>>>>>> =================================== >> >>>>>>>>> 2559 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576 >> >>>>>>>>> _______ >> >>>>>>>>> TestWasbHook.test_delete_container_generic_exception[ValueError] >> >>>>>> _______ >> >>>>>>>>> 2560 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577 >> >>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >> >>>>>>>>> test_delete_container_generic_exception >> >>>>>>>>> 2561 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578 >> >>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text >> >>>>>>>>> 2562 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579 >> >>>>>>>>> E >> >>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' >> in >> >> '' >> >>>>>>>>> 2563 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580 >> >>>>>>>>> E >> >>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >> >>>>>>>>> 0x7fa7109dcbe0>.text >> >>>>>>>>> 2564 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581 >> >>>>>>>>> ______ >> >>>>>>>>> >> TestWasbHook.test_delete_container_generic_exception[RuntimeError] >> >>>>>> ______ >> >>>>>>>>> 2565 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582 >> >>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in >> >>>>>>>>> test_delete_container_generic_exception >> >>>>>>>>> 2566 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583 >> >>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text >> >>>>>>>>> 2567 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584 >> >>>>>>>>> E >> >>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' >> in >> >> '' >> >>>>>>>>> 2568 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585 >> >>>>>>>>> E >> >>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >> >>>>>>>>> 0x7fa710a65040>.text >> >>>>>>>>> 2569 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586 >> >>>>>>>>> ____________ >> >>>>>>>>> TestWasbHook.test_delete_container_resource_not_found >> >> _____________ >> >>>>>>>>> 2570 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587 >> >>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:548: in >> >>>>>>>>> test_delete_container_resource_not_found >> >>>>>>>>> 2571 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588 >> >>>>>>>>> assert "Unable to delete container mycontainer (not found)" in >> >>>>>>>> caplog.text >> >>>>>>>>> 2572 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589 >> >>>>>>>>> E >> >>>>>>>>> AssertionError: assert 'Unable to delete container mycontainer >> >> (not >> >>>>>>>> found)' >> >>>>>>>>> in '' >> >>>>>>>>> 2573 >> >>>>>>>>> < >> >> >> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590 >> >>>>>>>>> E >> >>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at >> >>>>>>>>> 0x7fa710c37bb0>.text >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> And those are the failing tests. They are using caplog and I am >> >> not >> >>>>>> sure >> >>>>>>>> if >> >>>>>>>>> you can figure out why suddenly the expected log entries cannot >> be >> >>>>>> found >> >>>>>>>> in >> >>>>>>>>> caplog. >> >>>>>>>>> >> >>>>>>>>> My best guess is that somewhere, some tests (databricks) are >> >>>>>>>> reconfiguring >> >>>>>>>>> the loggers, and it messes up with caplog. Classic side effects. >> >>>>>>>>> >> >>>>>>>>> Mocking log methods directly avoids side-effects involved with >> >>> logger >> >>>>>>>>> reconfiguraton. This is about 10th time we have this problem >> >> while >> >>>>>>>> moving >> >>>>>>>>> providers. >> >>>>>>>>> Here are the tests that are failing. >> >>>>>>>>> >> >>>>>>>>> @pytest.mark.parametrize("exc", [ValueError, RuntimeError]) >> >>>>>>>>> def test_delete_container_generic_exception(self, exc: >> >>>>>>>> type[Exception], >> >>>>>>>>> caplog): >> >>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >> >>>>>>>>> with mock.patch.object(WasbHook, "_get_container_client") >> as >> >> m: >> >>>>>>>>> m.return_value.delete_container.side_effect = >> >>>>>>>>> exc("FakeException") >> >>>>>>>>> caplog.clear() >> >>>>>>>>> caplog.set_level("ERROR") >> >>>>>>>>> with pytest.raises(exc, match="FakeException"): >> >>>>>>>>> hook.delete_container("mycontainer") >> >>>>>>>>> assert "Error deleting container: mycontainer" in >> >>>>>> caplog.text >> >>>>>>>>> def test_delete_container_resource_not_found(self, caplog): >> >>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >> >>>>>>>>> with mock.patch.object(WasbHook, "_get_container_client") >> as >> >> m: >> >>>>>>>>> m.return_value.delete_container.side_effect = >> >>>>>>>>> ResourceNotFoundError("FakeException") >> >>>>>>>>> caplog.clear() >> >>>>>>>>> caplog.set_level("WARNING") >> >>>>>>>>> hook.delete_container("mycontainer") >> >>>>>>>>> assert "Unable to delete container mycontainer (not >> >> found)" >> >>>>>> in >> >>>>>>>>> caplog.text >> >>>>>>>>> >> >>>>>>>>> @mock.patch.object(WasbHook, "delete_blobs") >> >>>>>>>>> def test_delete_single_blob(self, delete_blobs, >> >>>>>>>>> mocked_blob_service_client): >> >>>>>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) >> >>>>>>>>> hook.delete_file("container", "blob", is_prefix=False) >> >>>>>>>>> delete_blobs.assert_called_once_with("container", "blob") >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com> >> >>> wrote: >> >>>>>>>>>> Just - note that the 72 hrs are specifically designed to cover >> >> the >> >>>>>>>>>> weekend. So when there are such - rather insignificant - >> issues, >> >> I >> >>>>>>>> think we >> >>>>>>>>>> should be ok. We have had many of our votes - including the >> >>>>>> absolutely >> >>>>>>>> most >> >>>>>>>>>> important - releases - lasting 72 hours over weekend and it has >> >>> never >> >>>>>>>> been >> >>>>>>>>>> a problem, we only extended it including weekends where the >> >> matters >> >>>>>> were >> >>>>>>>>>> really important and we expected them to be contentious (this >> one >> >>> was >> >>>>>>>> not). >> >>>>>>>>>> Does it mean Ash that we propose extending all kinds of >> consensus >> >>> and >> >>>>>>>>>> voting in this way ? It's the first time it has been raised, so >> >>> maybe >> >>>>>>>>>> that's a good idea to make it longer now when we are all >> >>> additionally >> >>>>>>>> busy >> >>>>>>>>>> with Airflow 3, but I don't see any reason why it should be a >> >>> problem >> >>>>>>>> "in >> >>>>>>>>>> general". WDYT? >> >>>>>>>>>> >> >>>>>>>>>> Note that your objection is also 5 days after the initial >> >> message, >> >>> so >> >>>>>>>> even >> >>>>>>>>>> if it would be 2 days more, that would not change anything. >> >>>>>>>>>> >> >>>>>>>>>> IMHO the consensus is reached. and If you want to revert that >> you >> >>>>>> should >> >>>>>>>>>> start a new thread Ash. >> >>>>>>>>>> >> >>>>>>>>>> J. >> >>>>>>>>>> >> >>>>>>>>>> On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor < >> a...@apache.org >> >>>>>>>> wrote: >> >>>>>>>>>>> Please be aware of time when you are proposing this. 48 out >> of >> >>> the >> >>>>>> 72 >> >>>>>>>>>>> hours your had here were over the weekend which doesn’t give >> >> many >> >>>>>>>> people >> >>>>>>>>>>> adequate time to read and think upon this. >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> Because I’m -1 on this as I don’t have enough info (see my >> other >> >>>>>> email) >> >>>>>>>>>>> to understand why this change makes the first bit of >> difference. >> >>>>>>>>>>> >> >>>>>>>>>>>> On 4 Feb 2025, at 22:38, Buğra Öztürk < >> ozturkbugr...@gmail.com >> >>>>>>>> wrote: >> >>>>>>>>>>>> Hey all, >> >>>>>>>>>>>> >> >>>>>>>>>>>> The consensus has been reached. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Regards, >> >>>>>>>>>>>> >> >>>>>>>>>>>> On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk < >> >>>>>> ozturkbugr...@gmail.com> >> >>>>>>>>>>> wrote: >> >>>>>>>>>>>>> Hello everyone, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> There have been recurring discussions about minimizing the >> >> usage >> >>>>>> of >> >>>>>>>>>>> caplog >> >>>>>>>>>>>>> in unit tests. If logs are necessary for a test, we should >> >> mock >> >>>>>> them >> >>>>>>>>>>> unless >> >>>>>>>>>>>>> the test strictly requires actual log output. Otherwise, the >> >>>>>> caplog >> >>>>>>>>>>> tests >> >>>>>>>>>>>>> should be removed and logs should be mocked. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> During recent large migrations, changes and ongoing CI/CD >> >>> efforts, >> >>>>>>>>>>> caplog >> >>>>>>>>>>>>> has proven to be flaky in unit tests. They frequently cause >> >> red >> >>>>>>>>>>> pipelines >> >>>>>>>>>>>>> in PRs and scheduled CI/CD runs. There have been local >> >>>>>> discussions on >> >>>>>>>>>>>>> removing caplog from unit tests. To formalize this, I >> propose >> >> a >> >>>>>> lazy >> >>>>>>>>>>>>> consensus to remove and prevent caplog usage in unit tests, >> >>>>>> ensuring >> >>>>>>>>>>> logs >> >>>>>>>>>>>>> are mocked when needed and disallowing caplog without >> mocking >> >>>>>> unless >> >>>>>>>>>>>>> explicitly approved. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> *Why should caplog be avoided?* >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> - Big maintenance effort on CI/CD >> >>>>>>>>>>>>> - Instability >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> *What should we do instead if we use caplog in a unit test?* >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> - Mocking the Log >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> *In the Scope of this consensus* >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> - Remove caplog usage from unit tests if possible >> >>>>>>>>>>>>> - Mock logs and remove caplog from unit tests if possible >> >>>>>>>>>>>>> - Exceptional cases will be subject to maintainer approval >> >>>>>>>>>>>>> - Prevent caplog to be included in unit tests without >> >> explicitly >> >>>>>>>>>>>>> mocking the log >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> *Action Items related to the above Scope:* >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> - Scan and replace caplog tests with mocking where possible >> >>>>>>>>>>>>> - Scan and remove caplog tests where possible >> >>>>>>>>>>>>> - Include CI check to prevent adding additional caplog tests >> >>>>>> without >> >>>>>>>>>>>>> Mocking and/or without approval from a maintainer to allow >> >>>>>>>>>>> flexibility in >> >>>>>>>>>>>>> some exceptional cases >> >>>>>>>>>>>>> - Create documentation with examples for implementing tests >> >>> using >> >>>>>>>>>>> logs >> >>>>>>>>>>>>> without caplog (e.g., using mocking or avoiding logs >> entirely) >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> The lazy consensus will be reached on 2025-02-04, (unless >> >>> someone >> >>>>>>>>>>> surfaces >> >>>>>>>>>>>>> an objection). >> >>>>>>>>>>>>> -- >> >>>>>>>>>>>>> Bugra Ozturk >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> -- >> >>>>>>>>>>>> Bugra Ozturk >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>> >> --------------------------------------------------------------------- >> >>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> >>>>>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>> >> >>>>>> >> >>>>>> >> --------------------------------------------------------------------- >> >>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> >>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org >> >>>>>> >> >>>>>> >> >>> >> >>> --------------------------------------------------------------------- >> >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> >>> For additional commands, e-mail: dev-h...@airflow.apache.org >> >>> >> >>> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> For additional commands, e-mail: dev-h...@airflow.apache.org >> >>