On Tue, 25 Apr 2017 at 17:00 Terry Reedy <tjre...@udel.edu> wrote:

> On 4/25/2017 11:00 AM, Barry Warsaw wrote:
> > On Apr 24, 2017, at 09:32 PM, Ethan Furman wrote:
> >>On 04/21/2017 03:29 PM, Victor Stinner wrote:
>
> (In the context of having a patch blocked by the blind Codecov robot ...)
>
> >>> I dislike code coverage because there is a temptation to write
> >>> artificial tests whereas the code is tested indirectly or the code is
> >>> not important enough to *require* tests.
>
> While I use code coverage to improve automated unittesting, I am opposed
> to turning a usable but limited and sometimes faulty tool into a blind
> robotic master that blocks improvements.  The prospect of this being
> done has discouraged me from learning the new system.  (More on 'faulty
> tool' later.)
>

It should be stated that code coverage is not a blocking status check for
merging from our perspective (the **only** required check is that Travis
passes its test run). The only reason Victor ran into it as a blocker is
that he tried to do the merge from his phone, where GitHub apparently
prevents merging if any failing check exists, required or not (I assume
this is because reviewing code on a small screen raises the chances of
overlooking a failing check that one actually cares about).

I'm on vacation until tomorrow and have been off GitHub since last
Thursday, so it's possible there's already a PR out there to turn off the
status checks. If there is no such PR yet, I'm still fine with turning them
off for Codecov.


>
> The temptation to write artificial tests to satisfy an artificial goal
> is real.  Doing so can eat valuable time better used for something else.
>   For instance:
>
>      def meth(self, arg):
>          mod.inst.meth(arg, True, ob=self, kw='cut')
>
> Mocking mod.class.meth, calling meth, and checking that the mock is
> called will satisfy the robot, but does not contribute much to the goal
> of providing a language that people can use to solve problems.
>

My assumption is that there will be a test verifying that meth() does the
right thing, mock or not. If we provide an API there should be some test
for it; using a mock or something else to do the test is an implementation
detail.
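
For example, here is a minimal sketch of the kind of test Terry is
describing (the mod.inst.meth names are his placeholders, stubbed out here
so the example actually runs):

    import types
    import unittest
    from unittest import mock

    # Stand-ins for the hypothetical mod.inst.meth from Terry's snippet.
    mod = types.SimpleNamespace(
        inst=types.SimpleNamespace(meth=lambda *args, **kwargs: None))

    class Wrapper:
        def meth(self, arg):
            mod.inst.meth(arg, True, ob=self, kw='cut')

    class MethTest(unittest.TestCase):
        def test_meth_delegates(self):
            # The "artificial" kind of test: patch the collaborator and
            # merely assert that it was called.  The line is now covered,
            # but nothing here checks that delegating to mod.inst.meth is
            # actually the right thing to do.
            with mock.patch.object(mod, 'inst') as fake_inst:
                wrapper = Wrapper()
                wrapper.meth('spam')
                fake_inst.meth.assert_called_once_with(
                    'spam', True, ob=wrapper, kw='cut')

    if __name__ == '__main__':
        unittest.main()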


>
> Victor, can you explain 'tested indirectly' and perhaps give an example?
>
> As used here, 'whereas' is incorrect English and a bit confusing.  I
> believe Victor meant something more like 'even when'.
>
> For the last clause, I believe he meant "the code change is not
> complicated enough to *require* automated unit test coverage for the
> changed line(s)".  If I change a comment or add missing spaces, I don't
> think I should be forced to write a missing test to make the improvement.
>

Agreed.


>
> A less trivial example: on IDLE's menu, Options => Configure IDLE opens
> a dialog with a font size widget that when clicked opens a list of font
> sizes.  I recently added 4 larger sizes to the tuple in
> idlelib.configdialog.ConfigDialog.LoadFontCfg, as requested, I think, by
> at least 2 people.  I tested manually by clicking until the list was
> displayed.  As I remember, I did not immediately worry about automated
> testing, let alone line coverage, and I do not think I should have had
> to in order to get the change into 3.6.0.
>
> That line may or may not be covered by the current minimal test that
> simply creates a ConfigDialog instance.  But this gets back to what I
> think is Victor's point.  This minimal test 'covers' 46% of the file,
> but it only tests that 46% of the lines run without raising.  This is
> useful, but does not test that the lines are really correct.  (For GUI
> display code, human eyeballing is required.)  This would remain true
> even if all the other code were moved to a new module, making the
> coverage of configdialog the magical ***100%***.
>
> >> If it's not important enough to require tests, it's not important
> >> enough to be in Python.  ;)
>
> Modules in the test package are mostly not tested. ;)
>

:) But they are at least executed, which is what we're really measuring
here and, I think, all that Ethan and I are advocating for. E.g. I don't
expect test_importlib to be directly responsible for exercising all code in
importlib, just that Python's entire test suite exercises importlib as much
as possible as a whole.
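
A rough illustration of what "indirect" means here (a hypothetical test,
not something lifted from test_importlib):

    import importlib
    import json
    import unittest

    class IndirectCoverageExample(unittest.TestCase):
        # This test only targets json, yet running it also executes a fair
        # amount of importlib, because the import machinery is exercised as
        # a side effect.
        def test_reload_roundtrip(self):
            reloaded = importlib.reload(json)
            self.assertIs(reloaded, json)
            self.assertEqual(json.loads(json.dumps({'a': 1})), {'a': 1})

    if __name__ == '__main__':
        unittest.main()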


>
> If 'test' means 'line coverage test for new or changed lines', then as a
> practical matter, I disagree, as explained above.  So, in effect, did
> the people who committed untested lines.
>
> In the wider sense of 'test', there is no real argument.  Each statement
> written should be mentally tested both when written and when reviewed.
> Code should be manually tested, preferably by someone in addition to the
> author. Automated testing is more than nice, but not everything.  Ditto
> for unit testing.
>

The problem I have with just doing manual testing for things that can
easily be covered by a unit test -- directly or indirectly -- is that there
are technically 85 people who can change CPython today. That means there
are potentially 85 different people who can screw up the code ;). Making
sure code is exercised somehow by tests at least minimizes how badly
someone like me might mess something up by not realizing I broke the code.


>
> Some practical issues with coverage and CodeCov:
>
> 1. A Python module is composed of statements, but the coverage module
> counts physical lines.  This is good for development, but not for gating.
> The number of physical lines comprising a statement can change without
> changing, or with only trivially changing, the compiled code.  So if
> coverage is not 100%, it can vary without a real change in statement
> coverage.
>

As I said earlier, no one here is gating PRs on code coverage.
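
That said, the underlying observation is fair: a purely cosmetic reflow
changes the physical line count without changing what actually runs, e.g.
(a trivial, made-up illustration):

    # One statement spread over three physical lines ...
    total = sum([1,
                 2,
                 3])

    # ... and the same statement on one line.  The behavior is identical,
    # but a diff that merely reflows it touches a different number of
    # lines, which can nudge a line-based coverage percentage around.
    total = sum([1, 2, 3])

    assert total == 6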


>
> 2. Some statements are only intended to run on certain systems, making
> 100% coverage impossible unless one carefully puts all system-specific
> code in "if system == 'xyz'" statements and uses system-specific
> .coveragerc files to exclude code for 'other' systems.
>

True. We could have a discussion as to whether we want to use Coverage.py's
pragmas -- e.g. `# pragma: no cover` -- to flag code as known to be
untested. Depending on how far we wanted to take this, we could also set up
pragmas that flag when code is OS-specific (I found
https://bitbucket.org/ned/coveragepy/issues/563/platform-specific-configuration
while looking for a solution, and I'm sure we could discuss things with Ned
Batchelder if we needed some functionality in coverage.py for our needs).
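
For reference, a sketch of what the [report] section of a .coveragerc could
look like (example patterns only, not a vetted configuration for CPython):

    [report]
    exclude_lines =
        # Setting exclude_lines replaces the default, so the standard
        # pragma has to be re-added explicitly:
        pragma: no cover
        # Hypothetical OS-specific pragmas, written in the source as e.g.
        # "do_win32_thing()  # pragma: win32 only":
        pragma: .* only
        # Terry's IDLE patterns from point 4 below:
        .*# htest #
        if not _utest: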


>
> 3. Some tests require extended resources.  Statements that are only
> covered by such tests will be seen as uncovered when coverage is run on
> a system lacking the resources.  As far as I know, all non-Windows
> buildbots and CodeCov are run on systems lacking the 'gui' resource.  So
> patches to gui code will be seen as uncovered.
>

I view 100% coverage as aspirational, not attainable. But if we want an
attainable goal, what should we aim for? We're at 83.44% now, so we could
say that 80% is something we never want to drop below and be done with it.
We could up it to 85% or 90% in recognition that there is more testing to
do but that we will never cover all Python code (all of this is
configurable in Codecov, which is why I'm asking).
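
On the 'gui' resource point: GUI-dependent tests already guard themselves
with test.support.requires(), so they are skipped rather than failing on
machines without a display.  Roughly like this (a simplified, hypothetical
test, not one taken from idlelib.idle_test):

    import unittest
    from test import support

    class FontListTest(unittest.TestCase):

        @classmethod
        def setUpClass(cls):
            # Skips the test (via ResourceDenied) when no display is
            # available or the 'gui' resource is not enabled, e.g. with
            # "python -m test -u gui".
            support.requires('gui')
            from tkinter import Tk
            cls.root = Tk()
            cls.root.withdraw()

        @classmethod
        def tearDownClass(cls):
            cls.root.destroy()
            del cls.root

        def test_font_sizes_listed(self):
            # Real assertions about the dialog would go here.
            self.assertIsNotNone(self.root)

    if __name__ == '__main__':
        unittest.main()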


>
> 4. As I explained in a post on the core-workflow list, IDLE needs the
> following added to the 'exclude_lines' item of .coveragerc.
>      .*# htest #
>      if not _utest:
> The mechanism behind these would also be useful for testing any other
> modules, scripts, or demos that use tkinter GUIs.
>
> There seem to be other issues too.
>
> > "Untested code is broken code" :)
>
> Most of CPython, including IDLE, has been pretty thoroughly tested.  And
> we have heaps of bug reports to show for it.  What's more important is
> that even code that is tested, by whatever means, may still have bugs.
> Obscure bugs are still found, and even correct code can be corrupted
> (regressed) by attempts to fix and improve it.
>
> --
> Terry Jan Reedy
