Cool. I'd love to have the caplog issue fixed without removing caplog convenience. Let's see if we can get the missing 20% confidence back to make sure no more time of contributors and maintainers is lost because of this one
On Sat, Feb 15, 2025 at 10:35 PM Ash Berlin-Taylor <a...@apache.org> wrote: > > > @pytest.fixture(name="caplog") > def override_caplog(request): > fixture = pytest.LogCaptureFixture(request.node) > yield fixture > fixture._finalize() > > if "airflow.logging_config" in sys.modules: > import airflow.logging_config > > airflow.logging_config.configure_logging() > > > I am 80% confident that this is going to fix the flakeyness from caplog. > I’m introducing it in my Trigger re-work PR. > > > On 7 Feb 2025, at 13:17, Jarek Potiuk <ja...@potiuk.com> wrote: > > > > As clearly explained in the above example - this is not a problem with > > databricsk and azure. This is aproblem in Azure BECAUSE we move > Databricks. > > So clearly a side effect of one group of tests on another. As such - we > > have no idea how many other cases we have like that and who might be > > causing it. We only know that a lot of time of people who try to keep the > > main green is lost because of that. > > > >> This is not very collaborative and accepting that people have a life > > outside of their computer’s Jarek. > > > > No. It's collaborative. This is how our social contract works. If you > want > > to change the rules and have good reasoning for that - propose a change. > > And you still can propose to revert the decision in a separate thread. > This > > happened in the past and it's perfectly fine. > > > > We cannot completely change the social agreement we have on decision > making > > work in our community because one person missed a thread on a really > > insignificant thing. We should just accept it and follow, or propose to > > change a contract. That's how a community works. > > > > > > J > > > > > > > > On Fri, Feb 7, 2025 at 2:05 PM Jarek Potiuk <ja...@potiuk.com> wrote: > > > >>> How is it flakey though? This is the part I’m asking for explaining > >> about. Surely it’s messed up and _ALWAYS_ messed up? > >> > >> See above example posted in the thread. It explains everything. I am not > >> sure if you missed it, but it's the third time I am explaining it and > you > >> keep on asking. The example I have shows not only the copy & paste of > the > >> problem, but link to the exact failure. > >> > >> On Fri, Feb 7, 2025 at 2:00 PM Ash Berlin-Taylor <a...@apache.org> > wrote: > >> > >>> How does caplog introduce flakiness though? That still isn’t > >>> > >>> > >>>> On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote: > >>>> > >>>> Ash - take a close look at the example I sent. Maybe you missed it. > >>>> > >>>> * Caplog gets logs from loggers - If your loggers are messed 1000 > tests > >>>> before and not restored - you will have failing tests. Side effects > >>> > >>> How is it flakey though? This is the part I’m asking for explaining > >>> about. Surely it’s messed up and _ALWAYS_ messed up? > >>> > >>>> > >>>> * Mocking logs is mocking lag method call. It always work and is > >>>> side-effect resistant > >>>> > >>>> > >>>> I would only agree to restore back Caplog usag (which as I mentioned > >>>> requires new proposal, new consensus or voting as the consensus is > >>> already > >>>> reached). I will expect the person proposing it to track all the cases > >>>> where side effects have been causing problems > >>> > >>> This is not very collaborative and accepting that people have a life > >>> outside of their computer’s Jarek. > >>> > >>> > >>> Here’s a proposal for you: don’t create needless code churn and > wholesale > >>> replace caplog, instead only swap if it’s not actually a problem. > Replace > >>> it only when it is noticed as being flakey - which you’ve pointed out > two > >>> cases, Databricks and Azure.. We have 287 instances of `caplog` in our > >>> tests. How many are actually flakey? Why change things that aren’t > broken? > >>> > >>>> > >>>> > >>>> This happened (and let me repeat it again - I hope it will not be > again > >>>> ignored this time). About 10 to 15 times in the last few weeks. > >>>> > >>>> Ash - if you want to propose restoring Caplog usage, please start a > new > >>>> thread. Ideally with explanation why. > >>>> > >>>> J. > >>>> > >>>> pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor <a...@apache.org> > >>>> napisał: > >>>> > >>>>> How does replacing caplog with mocking help anything, assuming that > you > >>>>> test what is logged against the mock? > >>>>> > >>>>> Please explain it to me like I’m 5, cos I don’t see how it makes the > >>>>> blindest bit of difference. Testing what is on the caplog output vs > >>> what is > >>>>> recorded on the mock log object is functional identical. If caplog > >>> tests > >>>>> are flakey then mock log tests will be equally flakey. The caplog > >>> process > >>>>> itself is not flakey. It’s what is producing the log that is flakey, > >>> no? > >>>>> > >>>>> If you are talking about removing logging tests entirely if they > aren’t > >>>>> providing value: sure, but that isn’t what I understood this to be > >>> about. > >>>>> > >>>>>> On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote: > >>>>>> > >>>>>> BTW. Ash - good example of why we want to get rid of caplog is > here: > >>>>>> > >>>>>> > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106 > >>>>>> > >>>>>> This is PR to move databricks provider to the new provider > structure. > >>>>>> Suddenly in that PR ***WASB*** (yes - we moved **DATABRICKS**) > tests > >>> are > >>>>>> failing: > >>>>>> > >>>>>> > >>>>>> =================================== FAILURES > >>>>>> =================================== > >>>>>> 2559 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576 > >>>>>> _______ > >>>>>> TestWasbHook.test_delete_container_generic_exception[ValueError] > >>> _______ > >>>>>> 2560 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577 > >>>>>> > >>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in > >>>>>> test_delete_container_generic_exception > >>>>>> 2561 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578 > >>>>>> > >>>>>> assert "Error deleting container: mycontainer" in caplog.text > >>>>>> 2562 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579 > >>>>>> E > >>>>>> AssertionError: assert 'Error deleting container: mycontainer' in '' > >>>>>> 2563 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580 > >>>>>> E > >>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at > >>>>>> 0x7fa7109dcbe0>.text > >>>>>> 2564 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581 > >>>>>> ______ > >>>>>> TestWasbHook.test_delete_container_generic_exception[RuntimeError] > >>> ______ > >>>>>> 2565 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582 > >>>>>> > >>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in > >>>>>> test_delete_container_generic_exception > >>>>>> 2566 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583 > >>>>>> > >>>>>> assert "Error deleting container: mycontainer" in caplog.text > >>>>>> 2567 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584 > >>>>>> E > >>>>>> AssertionError: assert 'Error deleting container: mycontainer' in '' > >>>>>> 2568 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585 > >>>>>> E > >>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at > >>>>>> 0x7fa710a65040>.text > >>>>>> 2569 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586 > >>>>>> ____________ > >>>>>> TestWasbHook.test_delete_container_resource_not_found _____________ > >>>>>> 2570 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587 > >>>>>> > >>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:548: in > >>>>>> test_delete_container_resource_not_found > >>>>>> 2571 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588 > >>>>>> > >>>>>> assert "Unable to delete container mycontainer (not found)" in > >>>>> caplog.text > >>>>>> 2572 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589 > >>>>>> E > >>>>>> AssertionError: assert 'Unable to delete container mycontainer (not > >>>>> found)' > >>>>>> in '' > >>>>>> 2573 > >>>>>> < > >>>>> > >>> > https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590 > >>>>>> E > >>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at > >>>>>> 0x7fa710c37bb0>.text > >>>>>> > >>>>>> > >>>>>> And those are the failing tests. They are using caplog and I am not > >>> sure > >>>>> if > >>>>>> you can figure out why suddenly the expected log entries cannot be > >>> found > >>>>> in > >>>>>> caplog. > >>>>>> > >>>>>> My best guess is that somewhere, some tests (databricks) are > >>>>> reconfiguring > >>>>>> the loggers, and it messes up with caplog. Classic side effects. > >>>>>> > >>>>>> Mocking log methods directly avoids side-effects involved with > logger > >>>>>> reconfiguraton. This is about 10th time we have this problem while > >>>>> moving > >>>>>> providers. > >>>>>> Here are the tests that are failing. > >>>>>> > >>>>>> @pytest.mark.parametrize("exc", [ValueError, RuntimeError]) > >>>>>> def test_delete_container_generic_exception(self, exc: > >>>>> type[Exception], > >>>>>> caplog): > >>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) > >>>>>> with mock.patch.object(WasbHook, "_get_container_client") as m: > >>>>>> m.return_value.delete_container.side_effect = > >>>>>> exc("FakeException") > >>>>>> caplog.clear() > >>>>>> caplog.set_level("ERROR") > >>>>>> with pytest.raises(exc, match="FakeException"): > >>>>>> hook.delete_container("mycontainer") > >>>>>> assert "Error deleting container: mycontainer" in > >>> caplog.text > >>>>>> > >>>>>> def test_delete_container_resource_not_found(self, caplog): > >>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) > >>>>>> with mock.patch.object(WasbHook, "_get_container_client") as m: > >>>>>> m.return_value.delete_container.side_effect = > >>>>>> ResourceNotFoundError("FakeException") > >>>>>> caplog.clear() > >>>>>> caplog.set_level("WARNING") > >>>>>> hook.delete_container("mycontainer") > >>>>>> assert "Unable to delete container mycontainer (not found)" > >>> in > >>>>>> caplog.text > >>>>>> > >>>>>> @mock.patch.object(WasbHook, "delete_blobs") > >>>>>> def test_delete_single_blob(self, delete_blobs, > >>>>>> mocked_blob_service_client): > >>>>>> hook = WasbHook(wasb_conn_id=self.azure_shared_key_test) > >>>>>> hook.delete_file("container", "blob", is_prefix=False) > >>>>>> delete_blobs.assert_called_once_with("container", "blob") > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com> > wrote: > >>>>>> > >>>>>>> Just - note that the 72 hrs are specifically designed to cover the > >>>>>>> weekend. So when there are such - rather insignificant - issues, I > >>>>> think we > >>>>>>> should be ok. We have had many of our votes - including the > >>> absolutely > >>>>> most > >>>>>>> important - releases - lasting 72 hours over weekend and it has > never > >>>>> been > >>>>>>> a problem, we only extended it including weekends where the matters > >>> were > >>>>>>> really important and we expected them to be contentious (this one > was > >>>>> not). > >>>>>>> > >>>>>>> Does it mean Ash that we propose extending all kinds of consensus > and > >>>>>>> voting in this way ? It's the first time it has been raised, so > maybe > >>>>>>> that's a good idea to make it longer now when we are all > additionally > >>>>> busy > >>>>>>> with Airflow 3, but I don't see any reason why it should be a > problem > >>>>> "in > >>>>>>> general". WDYT? > >>>>>>> > >>>>>>> Note that your objection is also 5 days after the initial message, > so > >>>>> even > >>>>>>> if it would be 2 days more, that would not change anything. > >>>>>>> > >>>>>>> IMHO the consensus is reached. and If you want to revert that you > >>> should > >>>>>>> start a new thread Ash. > >>>>>>> > >>>>>>> J. > >>>>>>> > >>>>>>> On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor <a...@apache.org> > >>>>> wrote: > >>>>>>> > >>>>>>>> Please be aware of time when you are proposing this. 48 out of > the > >>> 72 > >>>>>>>> hours your had here were over the weekend which doesn’t give many > >>>>> people > >>>>>>>> adequate time to read and think upon this. > >>>>>>>> > >>>>>>>> > >>>>>>>> Because I’m -1 on this as I don’t have enough info (see my other > >>> email) > >>>>>>>> to understand why this change makes the first bit of difference. > >>>>>>>> > >>>>>>>>> On 4 Feb 2025, at 22:38, Buğra Öztürk <ozturkbugr...@gmail.com> > >>>>> wrote: > >>>>>>>>> > >>>>>>>>> Hey all, > >>>>>>>>> > >>>>>>>>> The consensus has been reached. > >>>>>>>>> > >>>>>>>>> Regards, > >>>>>>>>> > >>>>>>>>> On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk < > >>> ozturkbugr...@gmail.com> > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hello everyone, > >>>>>>>>>> > >>>>>>>>>> There have been recurring discussions about minimizing the usage > >>> of > >>>>>>>> caplog > >>>>>>>>>> in unit tests. If logs are necessary for a test, we should mock > >>> them > >>>>>>>> unless > >>>>>>>>>> the test strictly requires actual log output. Otherwise, the > >>> caplog > >>>>>>>> tests > >>>>>>>>>> should be removed and logs should be mocked. > >>>>>>>>>> > >>>>>>>>>> During recent large migrations, changes and ongoing CI/CD > efforts, > >>>>>>>> caplog > >>>>>>>>>> has proven to be flaky in unit tests. They frequently cause red > >>>>>>>> pipelines > >>>>>>>>>> in PRs and scheduled CI/CD runs. There have been local > >>> discussions on > >>>>>>>>>> removing caplog from unit tests. To formalize this, I propose a > >>> lazy > >>>>>>>>>> consensus to remove and prevent caplog usage in unit tests, > >>> ensuring > >>>>>>>> logs > >>>>>>>>>> are mocked when needed and disallowing caplog without mocking > >>> unless > >>>>>>>>>> explicitly approved. > >>>>>>>>>> > >>>>>>>>>> *Why should caplog be avoided?* > >>>>>>>>>> > >>>>>>>>>> - Big maintenance effort on CI/CD > >>>>>>>>>> - Instability > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> *What should we do instead if we use caplog in a unit test?* > >>>>>>>>>> > >>>>>>>>>> - Mocking the Log > >>>>>>>>>> > >>>>>>>>>> *In the Scope of this consensus* > >>>>>>>>>> > >>>>>>>>>> - Remove caplog usage from unit tests if possible > >>>>>>>>>> - Mock logs and remove caplog from unit tests if possible > >>>>>>>>>> - Exceptional cases will be subject to maintainer approval > >>>>>>>>>> - Prevent caplog to be included in unit tests without explicitly > >>>>>>>>>> mocking the log > >>>>>>>>>> > >>>>>>>>>> *Action Items related to the above Scope:* > >>>>>>>>>> > >>>>>>>>>> - Scan and replace caplog tests with mocking where possible > >>>>>>>>>> - Scan and remove caplog tests where possible > >>>>>>>>>> - Include CI check to prevent adding additional caplog tests > >>> without > >>>>>>>>>> Mocking and/or without approval from a maintainer to allow > >>>>>>>> flexibility in > >>>>>>>>>> some exceptional cases > >>>>>>>>>> - Create documentation with examples for implementing tests > using > >>>>>>>> logs > >>>>>>>>>> without caplog (e.g., using mocking or avoiding logs entirely) > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> The lazy consensus will be reached on 2025-02-04, (unless > someone > >>>>>>>> surfaces > >>>>>>>>>> an objection). > >>>>>>>>>> -- > >>>>>>>>>> Bugra Ozturk > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Bugra Ozturk > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>> --------------------------------------------------------------------- > >>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > >>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org > >>>>>>>> > >>>>>>>> > >>>>> > >>>>> > >>> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > >>> For additional commands, e-mail: dev-h...@airflow.apache.org > >>> > >>> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > For additional commands, e-mail: dev-h...@airflow.apache.org > >