Re: Pushing XFAILed test cases
Hi all, hi Thomas (2x), hi Sandra,

On 16.07.21 09:52, Thomas Koenig via Fortran wrote:
>> The part of the patch to add tests for this goes on top of my base
>> TS29113 testsuite patch, which hasn't been reviewed or committed yet.
> It is my understanding that it is not gcc policy to add xfailed test
> cases for things that do not yet work. Rather, xfail is for tests that
> later turn out not to work, especially on certain architectures.

...

On 17.07.21 09:25, Thomas Koenig via Fortran wrote:
> Is it or is it not gcc policy to push a large number of test cases
> that currently do not work and XFAIL them?

In my opinion, it is bad to add testcases which _only_ consist of xfails
for 'target *-*-*'; however, for an extensive set of test cases, I think
it is better to xfail the missing parts than to comment them out, or not
to have them at all. That permits better test coverage once the features
have been implemented.

For the TS29113 patch, which Sandra posted on July 7, I count:

* 77 'dg-do run' tests, of which 27 are xfailed (35%)
* 28 compile-time tests
* 291 dg-error directives, of which 59 are xfailed (20%)
* 29 dg-bogus directives, of which 25 are xfailed (86%)

(And of course, those lines which are valid do not have a dg-error, and
usually also no dg-bogus.)

And in total:

* 1 '.exp' file
* 105 '.f90' files (8232 lines in total, including comment lines)
* 53 '.c' files (5281 lines)
* 1 '.h' file (12 lines)

Hence, this sounds like a rather reasonable amount of xfail to me,
especially given that several pending patches do or will reduce the
number of xfails by fixing issues exposed by the testsuite (those
patches have been posted but not yet reviewed). Of course, in an ideal
world, xfail would not exist :-)

On 07.07.21 05:40, Sandra Loosemore wrote:
> There was a question in one of the issues about why this testsuite
> references TS29113 instead of the 2018 standard.

Well, that is what our customer is interested in: finding out what parts
of the TS29113 functionality remain unimplemented or broken, and fixing
them, so that gfortran can say that it implements that specification.

I believe the only real difference between TS29113 and Fortran 2018's
interoperability support is that 'select rank' was added in Fortran
2018. The testsuite also tests 'select rank'; in that sense, it is a
testsuite for Fortran 2018 as well.

Thus, 'ts29113' + 'ts29113.exp' and 'f2018-c-interop' +
'f2018-c-interop.exp' are both fine by me. 'ts29113' is shorter, while
the other is clearer to those who did not follow the Fortran standards
and missed that there was a Technical Specification (TS) between F2008
and F2018, incorporated (with tiny modifications) into F2018.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
Registergericht München, HRB 106955
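For readers less familiar with the DejaGnu markup counted above: a
'dg-do run' test, a 'dg-error' directive, or a 'dg-bogus' directive can
each carry an xfail selector. A minimal sketch of an xfailed execution
test (purely illustrative; the feature check is a stand-in):

```
! { dg-do run { xfail *-*-* } }
!
! Illustrative sketch only: the runtime check below is a stand-in for
! a TS29113 feature test that is known to fail on all targets, so the
! whole 'dg-do run' is marked xfail.  Once the feature is implemented,
! the test reports XPASS and the xfail can be dropped.

program main
  implicit none
  integer :: r
  r = 0                ! stand-in for the not-yet-working feature
  if (r /= 1) stop 1   ! currently fails, hence the xfail above
end program main
```

Individual diagnostic directives take the same selector clause, e.g.
`! { dg-error "some message" "comment" { xfail *-*-* } }` for an error
that should be emitted but currently is not, and likewise for dg-bogus
on a spurious diagnostic that should go away.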
Re: Pushing XFAILed test cases
On 16.07.21 20:22, Sandra Loosemore wrote:
> So it seems to me rather surprising to take the position that we
> should not be committing any new test cases that need to be XFAILed

It is what I was told in no uncertain terms some years ago, which is
where my current state of knowledge comes from.

So, I have added the gcc mailing list to this discussion, with a general
question: is it or is it not gcc policy to push a large number of test
cases that currently do not work and XFAIL them?

Regards

Thomas
Re: Pushing XFAILed test cases
On 7/16/21 9:32 AM, Thomas Schwinge wrote:
[much snipped]
> Of course, we shall assume a certain level of quality in the XFAILed
> test cases: I'm certainly not suggesting we put any random junk into
> the testsuite, coarsely XFAILed. (I have not reviewed Sandra's test
> cases to that effect, but knowing her, I'd be surprised if that were
> the problem here.)

FWIW, Tobias already did an extensive review of an early version of the
testsuite patches in question and pointed out several cases where
failures were due to my misunderstanding of the language standard, or to
general confusion about what the expected behavior was supposed to be
when gfortran wasn't implementing it or was tripping over other bugs.
:-S

I hope I incorporated all his suggestions and rewrote the previously
bogus tests to be more useful in the version I posted for review on the
Fortran list; but shouldn't the normal patch review process be adequate
to take care of any additional concerns about quality?

My previous understanding of the development process and testsuite
conventions is that adding tests that FAIL is bad, but XFAILing them
with reference to a PR is OK, and certainly much better than simply not
having test coverage of those things at all -- especially in the case
of something like the TS29113 testsuite, where the explicit goal is to
track standards compliance and/or the completeness of the existing
implementation. :-S

So it seems to me rather surprising to take the position that we should
not be committing any new test cases that need to be XFAILed.

-Sandra
Re: Pushing XFAILed test cases
On 7/16/21 9:32 AM, Thomas Schwinge wrote:
> [Also including for guidance.]
>
> Hi!
>
> (I'm not involved in or familiar with Sandra's Fortran TS29113 work,
> just commenting generally here.)
>
> On 2021-07-16T09:52:28+0200, Thomas Koenig via Gcc-patches wrote:
>> It is my understanding that it is not gcc policy to add xfailed test
>> cases for things that do not yet work. Rather, xfail is for tests
>> that later turn out not to work, especially on certain architectures.
>
> That's not current practice, as far as I can tell. I'm certainly
> "guilty" of pushing lots of XFAILed test cases (or, most often,
> individual XFAILed DejaGnu directives), and I see a good number of
> other GCC folks do that, too. Ideally with, but occasionally also
> without, corresponding GCC PRs filed; if without, the test case file
> should of course contain suitable commentary. The time span for
> addressing the XFAILs ranges between days and years.
>
> In my opinion, if a test case has been written and analyzed, why
> shouldn't you push it, even if (parts of) it don't quite work yet?
> (If someone -- at another time, possibly -- then implements the
> missing functionality/fixes the bugs, the XFAILs turn into XPASSes,
> thus serving to demonstrate the effect of code changes. Otherwise --
> and I've run into that just yesterday... -- effort spent on such test
> cases simply gets lost "in the noise of the mailing list archives",
> until re-discovered, or -- in my case -- re-implemented and then
> re-discovered by chance.)
>
> We nowadays even have a way to mark up ICEing test cases ('dg-ice'),
> which has been used to push test cases that ICE for '{ target *-*-* }'.
>
> Of course, we shall assume a certain level of quality in the XFAILed
> test cases: I'm certainly not suggesting we put any random junk into
> the testsuite, coarsely XFAILed. (I have not reviewed Sandra's test
> cases to that effect, but knowing her, I'd be surprised if that were
> the problem here.)
>
> Not trying to overrule you, just sharing my opinion -- now happy to
> hear others. :-)

I've also been xfailing individual directives in new tests, with or
without PRs tracking the corresponding limitations (not so much outright
bugs as future enhancements). The practice has been discussed in the
past and (IIRC) there was general agreement with it. Marek even
formalized some of it for the C++ front end by adding support for one or
more dg- directives (I think dg-ice was one of them). The discussion I
recall is here:

https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550913.html

Martin
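As a concrete illustration of the 'dg-ice' markup mentioned above (a
sketch only; the PR number is invented, and the triggering code is
elided): a test case that currently makes the compiler ICE can be
committed like this, and the harness reports it as XFAIL rather than
FAIL until the ICE is fixed, at which point it flips to XPASS.

```
! Sketch, not a real test: 'dg-ice' tells the harness that compiling
! this file is expected to produce an internal compiler error, tracked
! by the (invented) PR reference.
! { dg-do compile }
! { dg-ice "PR00000" }

program p
  ! ... code that triggers the internal compiler error ...
end program p
```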
Pushing XFAILed test cases (was: [PATCH, Fortran] Bind(c): CFI_signed_char is not a Fortran character type)
[Also including for guidance.]

Hi!

(I'm not involved in or familiar with Sandra's Fortran TS29113 work,
just commenting generally here.)

On 2021-07-16T09:52:28+0200, Thomas Koenig via Gcc-patches wrote:
> It is my understanding that it is not gcc policy to add xfailed test
> cases for things that do not yet work. Rather, xfail is for tests that
> later turn out not to work, especially on certain architectures.

That's not current practice, as far as I can tell. I'm certainly
"guilty" of pushing lots of XFAILed test cases (or, most often,
individual XFAILed DejaGnu directives), and I see a good number of other
GCC folks do that, too. Ideally with, but occasionally also without,
corresponding GCC PRs filed; if without, the test case file should of
course contain suitable commentary. The time span for addressing the
XFAILs ranges between days and years.

In my opinion, if a test case has been written and analyzed, why
shouldn't you push it, even if (parts of) it don't quite work yet? (If
someone -- at another time, possibly -- then implements the missing
functionality/fixes the bugs, the XFAILs turn into XPASSes, thus serving
to demonstrate the effect of code changes. Otherwise -- and I've run
into that just yesterday... -- effort spent on such test cases simply
gets lost "in the noise of the mailing list archives", until
re-discovered, or -- in my case -- re-implemented and then re-discovered
by chance.)

We nowadays even have a way to mark up ICEing test cases ('dg-ice'),
which has been used to push test cases that ICE for '{ target *-*-* }'.

Of course, we shall assume a certain level of quality in the XFAILed
test cases: I'm certainly not suggesting we put any random junk into the
testsuite, coarsely XFAILed. (I have not reviewed Sandra's test cases to
that effect, but knowing her, I'd be surprised if that were the problem
here.)

Not trying to overrule you, just sharing my opinion -- now happy to hear
others.
:-)

Greetings
Thomas