Besides the instabilities we saw from caplog I also hope the proposed
fix from @ash helps in stability.

Still from point of reviewer I think proper unit tests should not rely
on log statements emmitted. Rather the logic should be tested. MEans
that in best case logic blocks in code are structured for better
test-ability. Checking for logs is for me only a way if there is no
other means of checking logic has happened.

On 16.02.25 11:02, Buğra Öztürk wrote:
Great, it looks promising! I agree on fixing over removing. The idea is to
reduce the time spent on this. If we can fix it, this would eliminate the
work we should do to remove caplog usage.

On Sat, Feb 15, 2025 at 10:56 PM Jarek Potiuk <ja...@potiuk.com> wrote:

Cool. I'd love to have the caplog issue fixed without removing caplog
convenience. Let's see if we can get the missing 20% confidence back to
make sure no more time of contributors and maintainers is lost because of
this one

On Sat, Feb 15, 2025 at 10:35 PM Ash Berlin-Taylor <a...@apache.org> wrote:


@pytest.fixture(name="caplog")
def override_caplog(request):
     fixture = pytest.LogCaptureFixture(request.node)
     yield fixture
     fixture._finalize()

     if "airflow.logging_config" in sys.modules:
         import airflow.logging_config

         airflow.logging_config.configure_logging()


I am 80% confident that this is going to fix the flakeyness from caplog.
I’m introducing it in my Trigger re-work PR.

On 7 Feb 2025, at 13:17, Jarek Potiuk <ja...@potiuk.com> wrote:

As clearly explained in the above example - this is not a problem with
databricsk and azure. This is aproblem in Azure BECAUSE we move
Databricks.
So clearly a side effect of one group of tests on another. As such - we
have no idea how many other cases we have like that and who might be
causing it. We only know that a lot of time of people who try to keep
the
main green is lost because of that.

This is not very collaborative and accepting that people have a life
outside of their computer’s Jarek.

No. It's collaborative. This is how our social contract works. If you
want
to change the rules and have good reasoning for that - propose a
change.
And you still can propose to revert the decision in a separate thread.
This
happened in the past and it's perfectly fine.

We cannot completely change the social agreement we have on decision
making
work in our community because one person missed a thread on a really
insignificant thing. We should just accept it and follow, or propose to
change a contract. That's how a community works.


J



On Fri, Feb 7, 2025 at 2:05 PM Jarek Potiuk <ja...@potiuk.com> wrote:

How is it flakey though? This is the part I’m asking for explaining
about. Surely it’s messed up and _ALWAYS_ messed up?

See above example posted in the thread. It explains everything. I am
not
sure if you missed it, but it's the third time I am explaining it and
you
keep on asking. The example I have shows not only the copy & paste of
the
problem, but link to the exact failure.

On Fri, Feb 7, 2025 at 2:00 PM Ash Berlin-Taylor <a...@apache.org>
wrote:
How does caplog introduce flakiness though? That still isn’t


On 7 Feb 2025, at 11:07, Jarek Potiuk <ja...@potiuk.com> wrote:

Ash - take a close look at the example I sent. Maybe you missed it.

* Caplog gets logs from loggers - If your loggers are messed 1000
tests
before and not restored - you will have failing tests. Side effects
How is it flakey though? This is the part I’m asking for explaining
about. Surely it’s messed up and _ALWAYS_ messed up?

* Mocking logs is mocking lag method call. It always work and is
side-effect resistant


I would only agree to restore back Caplog usag (which as I mentioned
requires new proposal, new consensus or voting as the consensus is
already
reached). I will expect the person proposing it to track all the
cases
where side effects have been causing problems
This is not very collaborative and accepting that people have a life
outside of their computer’s Jarek.


Here’s a proposal for you: don’t create needless code churn and
wholesale
replace caplog, instead only swap if it’s not actually a problem.
Replace
it only when it is noticed as being flakey - which you’ve pointed out
two
cases, Databricks and Azure.. We have 287 instances of `caplog` in
our
tests. How many are actually flakey? Why change things that aren’t
broken?

This happened (and let me repeat it again - I hope it will not be
again
ignored this time). About 10 to 15 times in the last few weeks.

Ash - if you want to propose restoring Caplog usage, please start a
new
thread. Ideally with explanation why.

J.

pt., 7 lut 2025, 10:15 użytkownik Ash Berlin-Taylor <a...@apache.org
napisał:

How does replacing caplog with mocking help anything, assuming that
you
test what is logged against the mock?

Please explain it to me like I’m 5, cos I don’t see how it makes
the
blindest bit of difference. Testing what is on the caplog output vs
what is
recorded on the mock log object is functional identical. If caplog
tests
are flakey then mock log tests will be equally flakey. The caplog
process
itself is not flakey. It’s what is producing the log that is
flakey,
no?
If you are talking about removing logging tests entirely if they
aren’t
providing value: sure, but that isn’t what I understood this to be
about.
On 6 Feb 2025, at 19:41, Jarek Potiuk <ja...@potiuk.com> wrote:

BTW. Ash -  good example of why we want to get rid of caplog is
here:

https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106
This is PR to move databricks provider to the new provider
structure.
Suddenly in that PR  ***WASB*** (yes - we moved **DATABRICKS**)
tests
are
failing:


=================================== FAILURES
===================================
2559
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2576
_______
TestWasbHook.test_delete_container_generic_exception[ValueError]
_______
2560
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2577
providers/tests/microsoft/azure/hooks/test_wasb.py:539: in
test_delete_container_generic_exception
2561
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2578
assert "Error deleting container: mycontainer" in caplog.text
2562
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2579
E
AssertionError: assert 'Error deleting container: mycontainer' in
''
2563
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2580
E
+ where '' = <_pytest.logging.LogCaptureFixture object at
0x7fa7109dcbe0>.text
2564
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2581
______
TestWasbHook.test_delete_container_generic_exception[RuntimeError]
______
2565
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2582
providers/tests/microsoft/azure/hooks/test_wasb.py:539: in
test_delete_container_generic_exception
2566
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2583
assert "Error deleting container: mycontainer" in caplog.text
2567
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2584
E
AssertionError: assert 'Error deleting container: mycontainer' in
''
2568
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2585
E
+ where '' = <_pytest.logging.LogCaptureFixture object at
0x7fa710a65040>.text
2569
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2586
____________
TestWasbHook.test_delete_container_resource_not_found
_____________
2570
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2587
providers/tests/microsoft/azure/hooks/test_wasb.py:548: in
test_delete_container_resource_not_found
2571
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2588
assert "Unable to delete container mycontainer (not found)" in
caplog.text
2572
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2589
E
AssertionError: assert 'Unable to delete container mycontainer
(not
found)'
in ''
2573
<
https://github.com/apache/airflow/actions/runs/13185566641/job/36807274106#step:6:2590
E
+ where '' = <_pytest.logging.LogCaptureFixture object at
0x7fa710c37bb0>.text


And those are the failing tests. They are using caplog and I am
not
sure
if
you can figure out why suddenly the expected log entries cannot be
found
in
caplog.

My best guess is that somewhere, some tests (databricks) are
reconfiguring
the loggers, and it messes up with caplog. Classic side effects.

Mocking log methods directly avoids side-effects involved with
logger
reconfiguraton.  This is about 10th time we have this problem
while
moving
providers.
Here are the tests that are failing.

  @pytest.mark.parametrize("exc", [ValueError, RuntimeError])
  def test_delete_container_generic_exception(self, exc:
type[Exception],
caplog):
      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
      with mock.patch.object(WasbHook, "_get_container_client") as
m:
          m.return_value.delete_container.side_effect =
exc("FakeException")
          caplog.clear()
          caplog.set_level("ERROR")
          with pytest.raises(exc, match="FakeException"):
              hook.delete_container("mycontainer")
          assert "Error deleting container: mycontainer" in
caplog.text
  def test_delete_container_resource_not_found(self, caplog):
      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
      with mock.patch.object(WasbHook, "_get_container_client") as
m:
          m.return_value.delete_container.side_effect =
ResourceNotFoundError("FakeException")
          caplog.clear()
          caplog.set_level("WARNING")
          hook.delete_container("mycontainer")
          assert "Unable to delete container mycontainer (not
found)"
in
caplog.text

  @mock.patch.object(WasbHook, "delete_blobs")
  def test_delete_single_blob(self, delete_blobs,
mocked_blob_service_client):
      hook = WasbHook(wasb_conn_id=self.azure_shared_key_test)
      hook.delete_file("container", "blob", is_prefix=False)
      delete_blobs.assert_called_once_with("container", "blob")




On Thu, Feb 6, 2025 at 3:56 PM Jarek Potiuk <ja...@potiuk.com>
wrote:
Just - note that the 72 hrs are specifically designed to cover
the
weekend. So when there are such - rather insignificant - issues,
I
think we
should be ok. We have had many of our votes - including the
absolutely
most
important - releases - lasting 72 hours over weekend and it has
never
been
a problem, we only extended it including weekends where the
matters
were
really important and we expected them to be contentious (this one
was
not).
Does it mean Ash that we propose extending all kinds of consensus
and
voting in this way ? It's the first time it has been raised, so
maybe
that's a good idea to make it longer now when we are all
additionally
busy
with Airflow 3, but I don't see any reason why it should be a
problem
"in
general". WDYT?

Note that your objection is also 5 days after the initial
message,
so
even
if it would be 2 days more, that would not change anything.

IMHO the consensus is reached. and If you want to revert that you
should
start a new thread Ash.

J.

On Thu, Feb 6, 2025 at 3:35 PM Ash Berlin-Taylor <a...@apache.org
wrote:
Please be aware of time when you are proposing this.  48 out of
the
72
hours your had here were over the weekend which doesn’t give
many
people
adequate time to read and think upon this.


Because I’m -1 on this as I don’t have enough info (see my other
email)
to understand why this change makes the first bit of difference.

On 4 Feb 2025, at 22:38, Buğra Öztürk <ozturkbugr...@gmail.com
wrote:
Hey all,

The consensus has been reached.

Regards,

On Sat, Feb 1, 2025 at 5:38 PM Buğra Öztürk <
ozturkbugr...@gmail.com>
wrote:
Hello everyone,

There have been recurring discussions about minimizing the
usage
of
caplog
in unit tests. If logs are necessary for a test, we should
mock
them
unless
the test strictly requires actual log output. Otherwise, the
caplog
tests
should be removed and logs should be mocked.

During recent large migrations, changes and ongoing CI/CD
efforts,
caplog
has proven to be flaky in unit tests. They frequently cause
red
pipelines
in PRs and scheduled CI/CD runs. There have been local
discussions on
removing caplog from unit tests. To formalize this, I propose
a
lazy
consensus to remove and prevent caplog usage in unit tests,
ensuring
logs
are mocked when needed and disallowing caplog without mocking
unless
explicitly approved.

*Why should caplog be avoided?*

- Big maintenance effort on CI/CD
- Instability


*What should we do instead if we use caplog in a unit test?*

- Mocking the Log

*In the Scope of this consensus*

- Remove caplog usage from unit tests if possible
- Mock logs and remove caplog from unit tests if possible
- Exceptional cases will be subject to maintainer approval
- Prevent caplog to be included in unit tests without
explicitly
mocking the log

*Action Items related to the above Scope:*

- Scan and replace caplog tests with mocking where possible
- Scan and remove caplog tests where possible
- Include CI check to prevent adding additional caplog tests
without
Mocking and/or without approval from a maintainer to allow
flexibility in
some exceptional cases
- Create documentation with examples for implementing tests
using
logs
without caplog (e.g., using mocking or avoiding logs entirely)


The lazy consensus will be reached on 2025-02-04,  (unless
someone
surfaces
an objection).
--
Bugra Ozturk


--
Bugra Ozturk


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org

Reply via email to