RE: [INSPIRE-DEV] Why I do not use regression tests as often as I should

Piotr Praczyk Thu, 19 Jul 2012 07:48:13 -0700

Hello !

>From: Samuele Kaplun
>
> Hi Piotr,
>
>In data mercoledì, 18 luglio 2012 11.10:23, Piotr Praczyk ha scritto:
>> I think we are missing a consistent testing policy.
>>
>> As a developer, I don't feel very motivated to write regression/unit tests
>> if every time I try to run regression tests on the master branch  (which is
>> supposed to be clean), I can see number of errors suggesting that half of >
>> Invenio is broken.
> That's way a strong statement, isn't it? Not taking into account web tests so
> far we have 616 successful unit tests out of 616 (!) and 479 successful
> regression tests out of 490(!), so I don't see Invenio that broken in that
> sense.


Indeed, the number is smaller, but I was  talking about a general impression 
after 
seeing several pages of error messages rather than exact numbers.

>> The most obvious problem with current situation  is that potential errors
>> which were supposed to be tested by a broken test can not be detected.
>> This is particularly annoying when one modifies an important feature and
>> sees errors related to the test output.
>
>Good point.
>
>> There are also some a bit less obvious problems.
>> The difference between having no errors in the output and having one (or
>> more errors) is huge and alarms the developer that her code is broken. If 11
>> tests fail instead of 10, the difference is small and it is much easier to
>> ignore the warning.
>> Another problem is that the productivity of developers decreases if they
>> have to go over and over the same error list (usually being verbose and
>> requiring knowledge about the test source code to understand) searching for
>> differences between previous error messages and new ones.
>
>That is made less problematic by Tibor's development script which will display
>the diff of new kwalitee problems and new regressions.
>
> <https://github.com/tiborsimko/invenio-devscripts>

I did not know about this feature of kwalitee .... I was only using it as a 
wrapper around pylint

>> It is less related to the failing tests themselves, but if the mechanisms of
>> testing as quality measure of the software does not work, it does not
>> motivate to write new regression/unit tests.
>
>On the other hand some people prefer to follow the test-driven-development, by
>first implementing tests for functionalities that not yet exists so that they
>will be eventually implemented as originally designed. YMMV.

Up to now, my understanding of test-driven development was slightly different 
and I did not even think about such approach.
I always though about writing tests for currently implemented feature and 
satisfying them before a commit.
I think, the weakness of this approach lies in the size of a team that is 
collaborating on a project.
If You have small number of frequently communicating developers, this can work. 
If everyone starts uploading failing tests, all other developers have to deal 
with them and see results (which results in issues described in previous 
e-mail).
It is very difficult to distinguish tests that should fail because things are 
not implemented from these that fail because something is broken.
Moreover, commiting failing tests carries a risk of commiting tests which will 
never succeed because they are simply not well written (This is the case with 
at least one currently commited test).
This also leads to trouble.
Maybe it is better to use task tracking system for new features or at least 
mark tests as not-satisfied, so that testing framefowk could distinguish them 
automatically ? 
I guess having a separate repository for tests of future features (one branch 
related with one ticket) is a bit of an overkill on the side of infrastructure, 
but we could think about something better than now.

>> I realised I stopped running regression tests at all knowing that every time
>> I install a brand new Invenio, I have a lot of errors.
>
>In general we strive for having unit/regression tests being all successful for
>stable releases. The current situation is that unit test are all working
>nicely, while some regression tests are failing namely in the search syntax
>extension developed within INSPIRE, precisely due to the above TDD. (For
>BibClassify I instead don't know the reason for the failure).

Is't it against the principle of having commits describing complete features ? 
Brokebn tests = broken/missing features.
If I understand correctly, this principle lies in the basis of restrictive 
commit policy and having commit rights by only very limited number of people ?

>Also some time regression tests have been introduced to underline the
>existence of a bug, so that eventually someone would feel responsible for
>addressing it and fix it.

As Above, it should be more explicit, which tests are they.... bug report 
should also somehow point to these tests.

>> I think this is a really important problem and we should address it. At the
>> moment there about 10 errors in regression tests. What I would postulate is
>> to:
>>
>> - Fix them by people having knowledge of modules that is affected
>
>+1

>> - Make failure very prominent in the build server (currently tests fail and
>> the server reports success)

>+1 we should investigate how to finally implement this.

>> - We should refuse to commit (the repository could communicate with the
>> build server and do this automatically) patches causing at least one test to
>> fail. (I think, I proposed it several years ago)

>This last one, I don't think is yet feasible: this is what the system
>integrators strive to do all the time. Sometimes, however, due to reasons that
>transcend the pure technical aspects, things need to be merged in a urgent way
>because they are already running on production on some important installation,
>and we need to be sure that they got committed and maintained. This is not
>ideal of course, but in few restricted cases one need to reach compromises.

I can not agree that this is a right approach.
If urgent commit breaks tests, it breaks functionalities which can be important 
for the production system, so should never be pushed into master.
Allowing a commit that creates other problems is not a solution.
In Inspire there are those branches (to which Jan has access), where urgent 
changes go before being integrated in main branch.

>> Having much stricter testing policy would make the process of commiting to
>> the main repository easier as less manual attention would be needed from the
>> side of Tibor and Sam !

> :-) Well, although not enforced in an automatic way, this is the policy warmly
> advertised at:
> <https://twiki.cern.ch/twiki/bin/view/CDS/GitWorkflow#R3_Remarks_on_the_coding>

I will have another look ;) maybe it has been too long since last time... is 
the description of kwalitee's capabilities of testing against last commit 
described ? 
Our developer resources for someone starting with the project seem to be a bit 
distributed ;)

>> What do you think?
>So the current situation is not ideal, as you underline. So everybody should
>indeed feel more active in improving Invenio's kwalitee :-)

Situation is never ideal :-) ... It is good to have a big goal in front of 
one's eyes to see what is the correct direction :-) (I am not saying, my 
proposition is the most optimal one)

>Cheers!
>        Sam


cheers 
Piotr

>P.s. you tried to send your email to: project-cdsware-devel but you actually
>sent it to this strange address:
>"project-cdsware-developers (CDS Invenio developers)" <IMCEAEX-
>_O=CERN_OU=FIRST+20ADMINISTRATIVE+20GROUP_CN=RECIPIENTS_CN=PROJECT-CDSWARE-
>[email protected]>
online outlook inserted it ;) I hope it is fxed now.

RE: [INSPIRE-DEV] Why I do not use regression tests as often as I should

Reply via email to