Re: Pushing XFAILed test cases

2021-07-21 Thread Tobias Burnus

Hi all, hi Thomas (2x), hi Sandra,

On 16.07.21 09:52, Thomas Koenig via Fortran wrote:

The part of the patch to add tests for this goes on top of my base
TS29113 testsuite patch, which hasn't been reviewed or committed yet.


It is my understanding that it is not gcc policy to add xfailed test
cases for things that do not yet work. Rather, xfail is for tests that
later turn out not to work, especially on certain architectures.


...

On 17.07.21 09:25, Thomas Koenig via Fortran wrote:

Is it or is it not gcc policy to push a large number of test cases
that currently do not work and XFAIL them?


In my opinion, it is bad to add testcases which _only_ consist of
xfails for 'target *-*-*'; however, for an extensive set of test
cases, I think it is better to xfail missing parts than to comment
them out or to omit them entirely. That permits better test
coverage once the features have been implemented.
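For illustration, such an xfailed directive might look as follows
(hypothetical test, with placeholder diagnostic text and PR number;
not taken from Sandra's patch):

```fortran
! { dg-do compile }
!
! Sketch only: the expected diagnostic is not yet emitted by
! gfortran, so the dg-error is xfailed for all targets instead of
! being commented out.  Once the diagnostic is implemented, the
! xfail turns into an XPASS and can simply be removed.
subroutine s (x) bind(c)
  character(len=2) :: x  ! { dg-error "not interoperable" "PR fortran/NNNNN" { xfail *-*-* } }
end subroutine s
```

The directive stays visible to the test harness, so the missing
diagnostic keeps showing up in the XFAIL counts rather than being
forgotten in a comment.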

For the TS29113 patch, which Sandra has posted on July 7, I count:

* 77 'dg-do run' tests - of which 27 are xfailed (35%)
* 28 compile-time tests
* 291 dg-error - of which 59 are xfailed (20%)
* 29 dg-bogus - of which 25 are xfailed (86%)
(And of course, lines which are valid have no dg-error -
and usually no dg-bogus either.)

And in total:
* 1 '.exp' file
* 105 '.f90' files (with 8232 lines in total including comment lines)
* 53 '.c' files (5281 lines)
* 1 '.h' file (12 lines)

Hence, to me this sounds like a rather reasonable amount of xfail,
especially given that several pending patches (which have been
posted but so far not reviewed) do or will reduce the number of
xfails by fixing issues exposed by the testsuite.

Of course, in an ideal world, xfail would not exist :-)

On 07.07.21 05:40, Sandra Loosemore wrote:

There was a question in one of the issues about why this testsuite
references TS29113 instead of the 2018 standard.  Well, that is what
our customer is interested in: finding out what parts of the TS29113
functionality remain unimplemented or broken, and fixing them, so that
gfortran can say that it implements that specification.


I believe the only real difference between TS29113 and
Fortran 2018's interoperability support is that
'select rank' was added in Fortran 2018.

The testsuite also tests 'select rank'; in that sense,
it is also for Fortran 2018. Thus, ts29113 + ts29113.exp
or 'f2018-c-interop' + 'f2018-c-interop.exp' are both
fine by me. 'ts29113' is shorter, while the other is
clearer to those who did not follow the Fortran standards
and missed that there was a technical specification (TS)
between F2008 and F2018, incorporated (with tiny modifications)
in F2018.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: Pushing XFAILed test cases

2021-07-17 Thread Thomas Koenig via Gcc-patches

On 16.07.21 20:22, Sandra Loosemore wrote:
So it seems to me rather surprising to take the position that we should 
not be committing any new test cases that need to be XFAILed


It is what I was told in no uncertain terms some years ago, which
is where my current state of knowledge comes from.

So, I have added the gcc mailing list to this discussion, with a
general question.

Is it or is it not gcc policy to push a large number of test cases
that currently do not work and XFAIL them?

Regards

Thomas


Re: Pushing XFAILed test cases

2021-07-16 Thread Sandra Loosemore

On 7/16/21 9:32 AM, Thomas Schwinge wrote:


[much snipped]

Of course, we shall assume a certain level of quality in the XFAILed test
cases: I'm certainly not suggesting we put any random junk into the
testsuite, coarsely XFAILed.  (I have not reviewed Sandra's test cases to
that effect, but knowing her, I'd be surprised if that were the problem
here.)


FWIW, Tobias already did an extensive review of an early version of the 
testsuite patches in question and pointed out several cases where 
failures were due to my misunderstanding of the language standard or 
general confusion about what the expected behavior was supposed to be 
when gfortran wasn't implementing it or was tripping over other bugs. 
:-S  I hope I incorporated all his suggestions and rewrote the 
previously-bogus tests to be more useful for the version I posted for 
review on the Fortran list, but shouldn't the normal patch review 
process be adequate to take care of any additional concerns about quality?


My previous understanding of the development process and testsuite 
conventions is that adding tests that FAIL is bad, but XFAILing them 
with reference to a PR is OK, and certainly much better than simply not 
having test coverage of those things at all.  Especially in the case of 
something like the TS29113 testsuite where the explicit goal is to track 
standards compliance and/or the completeness of the existing 
implementation.  :-S  So it seems to me rather surprising to take the 
position that we should not be committing any new test cases that need 
to be XFAILed.  :-S


-Sandra


Re: Pushing XFAILed test cases

2021-07-16 Thread Martin Sebor via Gcc-patches

On 7/16/21 9:32 AM, Thomas Schwinge wrote:

[Also including  for guidance.]


Hi!

(I'm not involved in or familiar with Sandra's Fortran TS29113 work, just
commenting generally here.)


On 2021-07-16T09:52:28+0200, Thomas Koenig via Gcc-patches 
 wrote:

It is my understanding that it is not gcc policy to add xfailed test
cases for things that do not yet work. Rather, xfail is for tests that
later turn out not to work, especially on certain architectures.


That's not current practice, as far as I can tell.  I'm certainly
"guilty" of pushing lots of XFAILed test cases (or, most often,
individual XFAILed DejaGnu directives), and I see a good number of other
GCC folks do that, too.  Ideally with, but occasionally also without,
corresponding GCC PRs filed.  If without, the test case file should of
course contain suitable commentary.  The time span for addressing the
XFAILs ranges between days and years.

In my opinion, if a test case has been written and analyzed, why
shouldn't you push it, even if (parts of) it don't quite work yet?  (If
someone -- at another time, possibly -- then implements the missing
functionality/fixes the bugs, the XFAILs turn into XPASSes, thus serving
to demonstrate the effect of code changes.)

Otherwise -- and I've run into that just yesterday... -- effort spent on
such test cases simply gets lost "in the noise of the mailing list
archives", until re-discovered, or -- in my case -- re-implemented and
then re-discovered by chance.

We nowadays even have a way to mark up ICEing test cases ('dg-ice'),
which has been used to push test cases that ICE for '{ target *-*-* }'.


Of course, we shall assume a certain level of quality in the XFAILed test
cases: I'm certainly not suggesting we put any random junk into the
testsuite, coarsely XFAILed.  (I have not reviewed Sandra's test cases to
that effect, but knowing her, I'd be surprised if that were the problem
here.)


Not trying to overrule you, just sharing my opinion -- now happy to hear
others.  :-)


I've also been xfailing individual directives in new tests, with
or without PRs tracking the corresponding limitations (not so much
outright bugs as future enhancements).  The practice has been
discussed in the past and (IIRC) there was general agreement with
it.  Marek even formalized some of it for the C++ front end by
adding support for one or more dg- directives (I think dg-ice was
one of them). The discussion I recall is here:

https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550913.html

Martin


Pushing XFAILed test cases (was: [PATCH, Fortran] Bind(c): CFI_signed_char is not a Fortran character type)

2021-07-16 Thread Thomas Schwinge
[Also including  for guidance.]


Hi!

(I'm not involved in or familiar with Sandra's Fortran TS29113 work, just
commenting generally here.)


On 2021-07-16T09:52:28+0200, Thomas Koenig via Gcc-patches 
 wrote:
> It is my understanding that it is not gcc policy to add xfailed test
> cases for things that do not yet work. Rather, xfail is for tests that
> later turn out not to work, especially on certain architectures.

That's not current practice, as far as I can tell.  I'm certainly
"guilty" of pushing lots of XFAILed test cases (or, most often,
individual XFAILed DejaGnu directives), and I see a good number of other
GCC folks do that, too.  Ideally with, but occasionally also without,
corresponding GCC PRs filed.  If without, the test case file should of
course contain suitable commentary.  The time span for addressing the
XFAILs ranges between days and years.

In my opinion, if a test case has been written and analyzed, why
shouldn't you push it, even if (parts of) it don't quite work yet?  (If
someone -- at another time, possibly -- then implements the missing
functionality/fixes the bugs, the XFAILs turn into XPASSes, thus serving
to demonstrate the effect of code changes.)

Otherwise -- and I've run into that just yesterday... -- effort spent on
such test cases simply gets lost "in the noise of the mailing list
archives", until re-discovered, or -- in my case -- re-implemented and
then re-discovered by chance.

We nowadays even have a way to mark up ICEing test cases ('dg-ice'),
which has been used to push test cases that ICE for '{ target *-*-* }'.
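For illustration, the 'dg-ice' markup might be used like this
(hypothetical test with a placeholder PR number):

```fortran
! { dg-do compile }
!
! Sketch only: the compiler currently ICEs on this input on all
! targets; 'dg-ice' records that as an expected ICE tied to a PR,
! and the test will XPASS once the ICE is fixed.
! { dg-ice "PR fortran/NNNNN" { target *-*-* } }
subroutine s
end subroutine s
```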


Of course, we shall assume a certain level of quality in the XFAILed test
cases: I'm certainly not suggesting we put any random junk into the
testsuite, coarsely XFAILed.  (I have not reviewed Sandra's test cases to
that effect, but knowing here, I'd be surprised if that were the problem
here.)


Not trying to overrule you, just sharing my opinion -- now happy to hear
others.  :-)


Grüße
 Thomas