Agreed, testing logs should be a last resort, and we likely to over-use it, but 
there are at least a few instances where we do explicitly want to test that 
certain info is included in the logs for users.

-a 

> On 16 Feb 2025, at 11:16, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> 
> Besides the instabilities we saw from caplog I also hope the proposed
> fix from @ash helps in stability.
> 
> Still from point of reviewer I think proper unit tests should not rely
> on log statements emmitted. Rather the logic should be tested. MEans
> that in best case logic blocks in code are structured for better
> test-ability. Checking for logs is for me only a way if there is no
> other means of checking logic has happened.
> 
> On 16.02.25 11:02, Buğra Öztürk wrote:
>> Great, it looks promising! I agree on fixing over removing. The idea is to
>> reduce the time spent on this. If we can fix it, this would eliminate the
>> work we should do to remove caplog usage.
>> 
>> On Sat, Feb 15, 2025 at 10:56 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> 
>>> Cool. I'd love to have the caplog issue fixed without removing caplog
>>> convenience. Let's see if we can get the missing 20% confidence back to
>>> make sure no more time of contributors and maintainers is lost because of
>>> this one
>>> 
>>> On Sat, Feb 15, 2025 at 10:35 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>> 
>>>> 
>>>> @pytest.fixture(name="caplog")
>>>> def override_caplog(request):
>>>>     fixture = pytest.LogCaptureFixture(request.node)
>>>>     yield fixture
>>>>     fixture._finalize()
>>>> 
>>>>     if "airflow.logging_config" in sys.modules:
>>>>         import airflow.logging_config
>>>> 
>>>>         airflow.logging_config.configure_logging()
>>>> 
>>>> 
>>>> I am 80% confident that this is going to fix the flakeyness from caplog.
>>>> I’m introducing it in my Trigger re-work PR.
>>>> 
>>>>> On 7 Feb 2025, at 13:17, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>> 
>>>>> As clearly explained in the above example - this is not a problem with
>>>>> databricsk and azure. This is aproblem in Azure BECAUSE we move
>>>> Databricks.
>>>>> So clearly a side effect of one group of tests on another. As such - we
>>>>> have no idea how many other cases we have like that and who might be
>>>>> causing it. We only know that a lot of time of people who try to keep
>>> the
>>>>> main green is lost because of that.
>>>>> 
>>>>>> This is not very collaborative and accepting that people have a life
>>>>> outside of their computer’s Jarek.
>>>>> 
>>>>> No. It's collaborative. This is how our social contract works. If you
>>>> want
>>>>> to change the rules and have good reasoning for that - propose a
>>> change.
>>>>> And you still can propose to revert the decision in a separate thread.
>>>> This
>>>>> happened in the past and it's perfectly fine.
>>>>> 
>>>>> We cannot completely change the social agreement we have on decision
>>>> making
>>>>> work in our community because one person missed a thread on a really
>>>>> insignificant thing. We should just accept it and follow, or propose to
>>>>> change a contract. That's how a community works.
>>>>> 
>>>>> 
>>>>> J
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Feb 7, 2025 at 2:05 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>> 
>>>>>>> How is it flakey though? This is the part I’m asking for explaining
>>>>>> about. Surely it’s messed up and _ALWAYS_ messed up?
>>>>>> 
>>>>>> See above example posted in the thread. It explains everything. I am
>>> not
>>>>>> sure if you missed it, but it's the third time I am explaining it and
>>>> you
>>>>>> keep on asking. The example I have shows not only the copy & paste of
>>>> the
>>>>>> problem, but link to the exact failure.
>>>>>> 
>>>>>> On Fri, Feb 7, 2025 at 2:00 PM Ash Berlin-Taylor <a...@apache.org>
>>>> wrote:
>>>>>>> How does caplog introduce flakiness though? That still isn’t
>>>>>>> 
>>>>>>> 
>>>>>>>> On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>>>>> 
>>>>>>>> Ash - take a close look at the example I sent. Maybe you missed it.
>>>>>>>> 
>>>>>>>> * Caplog gets logs from loggers - If your loggers are messed 1000
>>>> tests
>>>>>>>> before and not restored - you will have failing tests. Side effects
>>>>>>> How is it flakey though? This is the part I’m asking for explaining
>>>>>>> about. Surely it’s messed up and _ALWAYS_ messed up?
>>>>>>> 
>>>>>>>> * Mocking logs is mocking lag method call. It always work and is
>>>>>>>> side-effect resistant
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I would only agree to restore back Caplog usag (which as I mentioned
>>>>>>>> requires new proposal, new consensus or voting as the consensus is
>>>>>>> already
>>>>>>>> reached). I will expect the person proposing it to track all the
>>> cases
>>>>>>>> where side effects have been causing problems
>>>>>>> This is not very collaborative and accepting that people have a life
>>>>>>> outside of their computer’s Jarek.
>>>>>>> 
>>>>>>> 
>>>>>>> Here’s a proposal for you: don’t create needless code churn and
>>>> wholesale
>>>>>>> replace caplog, instead only swap if it’s not actually a problem.
>>>> Replace
>>>>>>> it only when it is noticed as being flakey - which you’ve pointed out
>>>> two
>>>>>>> cases, Databricks and Azure.. We have 287 instances of `caplog` in
>>> our
>>>>>>> tests. How many are actually flakey? Why change things that aren’t
>>>> broken?
>>>>>>>> 
>>>>>>>> This happened (and let me repeat it again - I hope it will not be
>>>> again
>>>>>>>> ignored this time). About 10 to 15 times in the last few weeks.
>>>>>>>> 
>>>>>>>> Ash - if you want to propose restoring Caplog usage, please start a
>>>> new
>>>>>>>> thread. Ideally with explanation why.
>>>>>>>> 
>>>>>>>> J.
>>>>>>>> 
>>>>>>>> pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor <a...@apache.org
>>>>>>>> napisał:
>>>>>>>> 
>>>>>>>>> How does replacing caplog with mocking help anything, assuming that
>>>> you
>>>>>>>>> test what is logged against the mock?
>>>>>>>>> 
>>>>>>>>> Please explain it to me like I’m 5, cos I don’t see how it makes
>>> the
>>>>>>>>> blindest bit of difference. Testing what is on the caplog output vs
>>>>>>> what is
>>>>>>>>> recorded on the mock log object is functional identical. If caplog
>>>>>>> tests
>>>>>>>>> are flakey then mock log tests will be equally flakey. The caplog
>>>>>>> process
>>>>>>>>> itself is not flakey. It’s what is producing the log that is
>>> flakey,
>>>>>>> no?
>>>>>>>>> If you are talking about removing logging tests entirely if they
>>>> aren’t
>>>>>>>>> providing value: sure, but that isn’t what I understood this to be
>>>>>>> about.
>>>>>>>>>> On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> BTW. Ash -  good example of why we want to get rid of caplog is
>>>> here:
>>>>>>>>>> 
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106
>>>>>>>>>> This is PR to move databricks provider to the new provider
>>>> structure.
>>>>>>>>>> Suddenly in that PR  ***WASB*** (yes - we moved **DATABRICKS**)
>>>> tests
>>>>>>> are
>>>>>>>>>> failing:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> =================================== FAILURES
>>>>>>>>>> ===================================
>>>>>>>>>> 2559
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576
>>>>>>>>>> _______
>>>>>>>>>> TestWasbHook.test_delete_container_generic_exception[ValueError]
>>>>>>> _______
>>>>>>>>>> 2560
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577
>>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in
>>>>>>>>>> test_delete_container_generic_exception
>>>>>>>>>> 2561
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578
>>>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text
>>>>>>>>>> 2562
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579
>>>>>>>>>> E
>>>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' in
>>> ''
>>>>>>>>>> 2563
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580
>>>>>>>>>> E
>>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at
>>>>>>>>>> 0x7fa7109dcbe0>.text
>>>>>>>>>> 2564
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581
>>>>>>>>>> ______
>>>>>>>>>> TestWasbHook.test_delete_container_generic_exception[RuntimeError]
>>>>>>> ______
>>>>>>>>>> 2565
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582
>>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:539: in
>>>>>>>>>> test_delete_container_generic_exception
>>>>>>>>>> 2566
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583
>>>>>>>>>> assert "Error deleting container: mycontainer" in caplog.text
>>>>>>>>>> 2567
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584
>>>>>>>>>> E
>>>>>>>>>> AssertionError: assert 'Error deleting container: mycontainer' in
>>> ''
>>>>>>>>>> 2568
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585
>>>>>>>>>> E
>>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at
>>>>>>>>>> 0x7fa710a65040>.text
>>>>>>>>>> 2569
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586
>>>>>>>>>> ____________
>>>>>>>>>> TestWasbHook.test_delete_container_resource_not_found
>>> _____________
>>>>>>>>>> 2570
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587
>>>>>>>>>> providers/tests/microsoft/azure/hooks/test_wasb.py:548: in
>>>>>>>>>> test_delete_container_resource_not_found
>>>>>>>>>> 2571
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588
>>>>>>>>>> assert "Unable to delete container mycontainer (not found)" in
>>>>>>>>> caplog.text
>>>>>>>>>> 2572
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589
>>>>>>>>>> E
>>>>>>>>>> AssertionError: assert 'Unable to delete container mycontainer
>>> (not
>>>>>>>>> found)'
>>>>>>>>>> in ''
>>>>>>>>>> 2573
>>>>>>>>>> <
>>> https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590
>>>>>>>>>> E
>>>>>>>>>> + where '' = <_pytest.logging.LogCaptureFixture object at
>>>>>>>>>> 0x7fa710c37bb0>.text
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> And those are the failing tests. They are using caplog and I am
>>> not
>>>>>>> sure
>>>>>>>>> if
>>>>>>>>>> you can figure out why suddenly the expected log entries cannot be
>>>>>>> found
>>>>>>>>> in
>>>>>>>>>> caplog.
>>>>>>>>>> 
>>>>>>>>>> My best guess is that somewhere, some tests (databricks) are
>>>>>>>>> reconfiguring
>>>>>>>>>> the loggers, and it messes up with caplog. Classic side effects.
>>>>>>>>>> 
>>>>>>>>>> Mocking log methods directly avoids side-effects involved with
>>>> logger
>>>>>>>>>> reconfiguraton.  This is about 10th time we have this problem
>>> while
>>>>>>>>> moving
>>>>>>>>>> providers.
>>>>>>>>>> Here are the tests that are failing.
>>>>>>>>>> 
>>>>>>>>>>  @pytest.mark.parametrize("exc", [ValueError, RuntimeError])
>>>>>>>>>>  def test_delete_container_generic_exception(self, exc:
>>>>>>>>> type[Exception],
>>>>>>>>>> caplog):
>>>>>>>>>>      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
>>>>>>>>>>      with mock.patch.object(WasbHook, "_get_container_client") as
>>> m:
>>>>>>>>>>          m.return_value.delete_container.side_effect =
>>>>>>>>>> exc("FakeException")
>>>>>>>>>>          caplog.clear()
>>>>>>>>>>          caplog.set_level("ERROR")
>>>>>>>>>>          with pytest.raises(exc, match="FakeException"):
>>>>>>>>>>              hook.delete_container("mycontainer")
>>>>>>>>>>          assert "Error deleting container: mycontainer" in
>>>>>>> caplog.text
>>>>>>>>>>  def test_delete_container_resource_not_found(self, caplog):
>>>>>>>>>>      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
>>>>>>>>>>      with mock.patch.object(WasbHook, "_get_container_client") as
>>> m:
>>>>>>>>>>          m.return_value.delete_container.side_effect =
>>>>>>>>>> ResourceNotFoundError("FakeException")
>>>>>>>>>>          caplog.clear()
>>>>>>>>>>          caplog.set_level("WARNING")
>>>>>>>>>>          hook.delete_container("mycontainer")
>>>>>>>>>>          assert "Unable to delete container mycontainer (not
>>> found)"
>>>>>>> in
>>>>>>>>>> caplog.text
>>>>>>>>>> 
>>>>>>>>>>  @mock.patch.object(WasbHook, "delete_blobs")
>>>>>>>>>>  def test_delete_single_blob(self, delete_blobs,
>>>>>>>>>> mocked_blob_service_client):
>>>>>>>>>>      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
>>>>>>>>>>      hook.delete_file("container", "blob", is_prefix=False)
>>>>>>>>>>      delete_blobs.assert_called_once_with("container", "blob")
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com>
>>>> wrote:
>>>>>>>>>>> Just - note that the 72 hrs are specifically designed to cover
>>> the
>>>>>>>>>>> weekend. So when there are such - rather insignificant - issues,
>>> I
>>>>>>>>> think we
>>>>>>>>>>> should be ok. We have had many of our votes - including the
>>>>>>> absolutely
>>>>>>>>> most
>>>>>>>>>>> important - releases - lasting 72 hours over weekend and it has
>>>> never
>>>>>>>>> been
>>>>>>>>>>> a problem, we only extended it including weekends where the
>>> matters
>>>>>>> were
>>>>>>>>>>> really important and we expected them to be contentious (this one
>>>> was
>>>>>>>>> not).
>>>>>>>>>>> Does it mean Ash that we propose extending all kinds of consensus
>>>> and
>>>>>>>>>>> voting in this way ? It's the first time it has been raised, so
>>>> maybe
>>>>>>>>>>> that's a good idea to make it longer now when we are all
>>>> additionally
>>>>>>>>> busy
>>>>>>>>>>> with Airflow 3, but I don't see any reason why it should be a
>>>> problem
>>>>>>>>> "in
>>>>>>>>>>> general". WDYT?
>>>>>>>>>>> 
>>>>>>>>>>> Note that your objection is also 5 days after the initial
>>> message,
>>>> so
>>>>>>>>> even
>>>>>>>>>>> if it would be 2 days more, that would not change anything.
>>>>>>>>>>> 
>>>>>>>>>>> IMHO the consensus is reached. and If you want to revert that you
>>>>>>> should
>>>>>>>>>>> start a new thread Ash.
>>>>>>>>>>> 
>>>>>>>>>>> J.
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor <a...@apache.org
>>>>>>>>> wrote:
>>>>>>>>>>>> Please be aware of time when you are proposing this.  48 out of
>>>> the
>>>>>>> 72
>>>>>>>>>>>> hours your had here were over the weekend which doesn’t give
>>> many
>>>>>>>>> people
>>>>>>>>>>>> adequate time to read and think upon this.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Because I’m -1 on this as I don’t have enough info (see my other
>>>>>>> email)
>>>>>>>>>>>> to understand why this change makes the first bit of difference.
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 4 Feb 2025, at 22:38, Buğra Öztürk <ozturkbugr...@gmail.com
>>>>>>>>> wrote:
>>>>>>>>>>>>> Hey all,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The consensus has been reached.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk <
>>>>>>> ozturkbugr...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There have been recurring discussions about minimizing the
>>> usage
>>>>>>> of
>>>>>>>>>>>> caplog
>>>>>>>>>>>>>> in unit tests. If logs are necessary for a test, we should
>>> mock
>>>>>>> them
>>>>>>>>>>>> unless
>>>>>>>>>>>>>> the test strictly requires actual log output. Otherwise, the
>>>>>>> caplog
>>>>>>>>>>>> tests
>>>>>>>>>>>>>> should be removed and logs should be mocked.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> During recent large migrations, changes and ongoing CI/CD
>>>> efforts,
>>>>>>>>>>>> caplog
>>>>>>>>>>>>>> has proven to be flaky in unit tests. They frequently cause
>>> red
>>>>>>>>>>>> pipelines
>>>>>>>>>>>>>> in PRs and scheduled CI/CD runs. There have been local
>>>>>>> discussions on
>>>>>>>>>>>>>> removing caplog from unit tests. To formalize this, I propose
>>> a
>>>>>>> lazy
>>>>>>>>>>>>>> consensus to remove and prevent caplog usage in unit tests,
>>>>>>> ensuring
>>>>>>>>>>>> logs
>>>>>>>>>>>>>> are mocked when needed and disallowing caplog without mocking
>>>>>>> unless
>>>>>>>>>>>>>> explicitly approved.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> *Why should caplog be avoided?*
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Big maintenance effort on CI/CD
>>>>>>>>>>>>>> - Instability
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> *What should we do instead if we use caplog in a unit test?*
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Mocking the Log
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> *In the Scope of this consensus*
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Remove caplog usage from unit tests if possible
>>>>>>>>>>>>>> - Mock logs and remove caplog from unit tests if possible
>>>>>>>>>>>>>> - Exceptional cases will be subject to maintainer approval
>>>>>>>>>>>>>> - Prevent caplog to be included in unit tests without
>>> explicitly
>>>>>>>>>>>>>> mocking the log
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> *Action Items related to the above Scope:*
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Scan and replace caplog tests with mocking where possible
>>>>>>>>>>>>>> - Scan and remove caplog tests where possible
>>>>>>>>>>>>>> - Include CI check to prevent adding additional caplog tests
>>>>>>> without
>>>>>>>>>>>>>> Mocking and/or without approval from a maintainer to allow
>>>>>>>>>>>> flexibility in
>>>>>>>>>>>>>> some exceptional cases
>>>>>>>>>>>>>> - Create documentation with examples for implementing tests
>>>> using
>>>>>>>>>>>> logs
>>>>>>>>>>>>>> without caplog (e.g., using mocking or avoiding logs entirely)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The lazy consensus will be reached on 2025-02-04,  (unless
>>>> someone
>>>>>>>>>>>> surfaces
>>>>>>>>>>>>>> an objection).
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Bugra Ozturk
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Bugra Ozturk
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>>>>>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>>>>>> 
>>>>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>>> 
>>>> 
>> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org 
> <mailto:dev-unsubscr...@airflow.apache.org>
> For additional commands, e-mail: dev-h...@airflow.apache.org 
> <mailto:dev-h...@airflow.apache.org>

Reply via email to