Long doctests can accumulate large amounts of state. Object, test data
and variable reuse causes headaches for developers and reviewers. This
bothers me, so I brought it up at a reviewer's meeting a long long
time ago:
"Because of the accumulation of state in doctests I think it would
be good to limit new doctests to 200 lines or less. gmb wrote a new
doctest last week of just under 200 lines and it was still
perfectly comprehensible. That's where the number comes from."
The consensus was that a 200 line limit wasn't the answer, but there
were lots of other ideas, so let's discuss it.
(I tried to make notes from the logs, but it was a content packed and
concise discussion in itself. I've attached a formatted log for
reference.)
Gavin.
<bac> [topic] Limit doctests to 200 lines [allenap]
<bac> gavin, take it away
<allenap> Right,
<allenap> Because of the accumulation of state in doctests
<allenap> I think it would be good to limit new doctests to 200 lines or
less.
<allenap> gmb wrote a new doctest last week of just under 200 lines and
it was still perfectly comprehensible.
<allenap> So that's where the number comes from.
* gmb notes that it got changed into a unittest anyway during the review
process...
<danilos> I don't think that'd be right for large classes and
corresponding doctests, 200 sounds as artificially low limit
<allenap> Although manuel (?) adds a reset-globs directive whihc might
make this moot.
<flacoste> and there is an alternative to the state problem
<danilos> perhaps large classes are the problem, but we do have them
<flacoste> exactly, reset-globs in manuel
<gary_poster> agree with danilos. Also, agree with allenap's
observation of manuel
<sinzui> allenap: I have thought about breaking many registry doctests
into smaller themes
<flacoste> manuel also adds the option of running specific section of a
doctest
<sinzui> allenap: but my motivation it to control the layer the test is
run on.
<intellectronica> i think limiting the length is not the best way to
handle this. instead i suggest dividing tests thematically
<mars> I agree, I don't see the problem as much with larger narratives
as it is with naughty test code state.
<danilos> sinzui, I think the problem with existing doctests is that we
don't have a good structure of tests in general (oh, I'd like to
clean most of translations ones as well)
<intellectronica> that is, stop and start a new file when you're really
testing something new (with new state and so on)
<bac> intellectronica: i agree
<allenap> The main problem I have is trying to understand the object and
database state by the end of the test. That can't really be fixed by
reset-globs or manuel as I understand it.
<danilos> intellectronica, +1, basic doctests (those which are usually
large) should be basic class documentation; specific tests should be
unit test
<allenap> intellectronica: The limit is an incentive to split :)
<allenap> intellectronica: But I agree.
<danilos> allenap, I see the point, but artificial limit would have to
be broken in so many valid cases
<intellectronica> allenap: yes, i understand that, but somehow it feels
a bit too artificial to me. are you concerned that without a hard
limit developers/reviewers won't bother?
<bac> allenap: it is a good point but i'd rather see us have smaller,
more themed tests as a guide rather than adopting a limit
<gary_poster> We do have a "800 line limit for reviews" that is a
guideline but often breached
<intellectronica> also, i'll play sinzui and ask: who's going to split
all the existing test? ;)
<danilos> how about instead spreading a rule "enforce corner-case
testing through unit tests" among reviewers instead? those tend to
make doctests harder to read
<mars> instead of a hard limit, can we call it a "refactor point"? As
in, over X lines, the reviewer should look for a possible
refactoring
<allenap> Okay, sounds good. Is there a way to measure or have more
concrete guidelines, or is it at developer/reviewer discretion?
<mars> it's ok if they do not find such a point
<mars> kind of like review lint
<intellectronica> allenap: there is a measure. variables shouldn't be
recycled
<danilos> mars, I still believe that those complex doctests should
really be turned into unittests, and that's exactly where you can't
reuse variables :)
<allenap> intellectronica: I don't think we should split the existing
tests... until it is no longer possible to not split them.
<allenap> intellectronica: I like that :)
<allenap> intellectronica: There's still a problem of db state.
<mars> danilos, well, if they are over 800 lines, then you would
recommend the refactoring, wouldn't you? :)
<intellectronica> allenap: if we adopt this as a policy we will
eventually have to split the tests, if only so that they don't serve
as a negative example for new contributors
<allenap> danilos: +1
<mars> danilos, or 400, or whatever
<sinzui> intellectronica: One line that creates a file or email message
requires that the test be run on a different layer. In many cases
the test is actually moving to a new theme and I think it is easier
to read the that aspect of the test separately.
<allenap> intellectronica: Good point.
<intellectronica> allenap: i guess it's the same for db state. don't
rely on db state unless it's either set in the sample data or in the
prologue to your test
<danilos> mars, well, reviewers should always look for points of
refactoring, so I just feel such "rule" wouldn't really make any
difference, but sure :)
<allenap> intellectronica: I agree, and I don't know of a way to make it
less vague than that. I suppose that's what I was looking for.
<gary_poster> I find the stories that doctests tell valuable. The
stories are often long--and in fact, I often wish they were *better*
connected, not less, from a narrative standpoint. That viewpoint
would make me want to see manuel a preferred option, or perhaps
Sphinx involved for tying together story chunks, but that is maybe
herder.
<intellectronica> sinzui: so what you're saying is that we should split
tests if we're required to run the test in a lower layer because of
a change in the middle of the test?
<danilos> mars, even when the diff you are looking at is 10 lines, as a
reviewer, you should wonder if the code is sitting in the right
place
<gary_poster> and means that it doesn't help for reading text files,
only processed files
<mars> danilos, well, I don't think of it as a rule, just a
recommendation for reviewers. Kind of a warning, or code advice.
<gary_poster> s/herder/harder/
<mars> danilos, right
<danilos> anyway, it seems this is a hot topic
<danilos> allenap, bac, shall we move it to the mailing list or try to
come up with a conclusion here?
<gary_poster> I think a lot of people talking about dividing things up
are coming from the perspective of (1) expertise and (2) testing.
<sinzui> intellectronica: yes. In the exampled I am thinking about,
registiry object deletion, email notification, these themes are
different from the core uses of the object.
<allenap> danilos: Mailing list I think. There are many threads of
conversation and this isn't going to end here anyway ;)
<gary_poster> they are valuable, but these are narratives.
<intellectronica> sinzui: yes, i think that's a good guideline
<danilos> allenap, right, I think the general consensus is that we don't
want 200 as the limit, so it's best to discuss other solutions to
the problem you observe on list
<bac> danilos: yes, let's move the discussion to the mailing list.
allenap will you start the discussion there?
<allenap> bac: Sure.
<mars> gary_poster, good point
<gary_poster> :-) thank you mars
<bac> [action] allenap to start discussion on the ML about doctest size,
refactoring, moving corner cases to unittests, etc
_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help : https://help.launchpad.net/ListHelp