[Bug target/85519] [nvptx, openacc, openmp, testsuite] Recursive tests may fail due to thread stack limit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

Tom de Vries changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Tom de Vries ---
(In reply to Tom de Vries from comment #4)
> Committed to trunk.
>
> Approved for 8.2. [ 8.1 release is targeted for Wednesday, May 2nd. ]

Backported to gcc-8-branch after 8.1 release.

Marking resolved-fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

--- Comment #6 from Tom de Vries ---
Author: vries
Date: Wed May 2 10:55:07 2018
New Revision: 259834

URL: https://gcc.gnu.org/viewcvs?rev=259834&root=gcc&view=rev
Log:
backport "[nvptx, libgomp, testsuite] Reduce recursion depth in declare_target-{1,2}.f90"

2018-05-02  Tom de Vries

        backport from trunk:
        2018-04-26  Tom de Vries

        PR target/85519
        * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce
        recursion depth from 25 to 23.
        * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

Modified:
    branches/gcc-8-branch/libgomp/ChangeLog
    branches/gcc-8-branch/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
    branches/gcc-8-branch/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

Tom de Vries changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cesar at gcc dot gnu.org

--- Comment #5 from Tom de Vries ---
*** Bug 84871 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

--- Comment #4 from Tom de Vries ---
Committed to trunk.

Approved for 8.2. [ 8.1 release is targeted for Wednesday, May 2nd. ]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

--- Comment #3 from Tom de Vries ---
Author: vries
Date: Thu Apr 26 13:26:09 2018
New Revision: 259674

URL: https://gcc.gnu.org/viewcvs?rev=259674&root=gcc&view=rev
Log:
[nvptx, libgomp, testsuite] Reduce recursion depth in declare_target-{1,2}.f90

2018-04-26  Tom de Vries

        PR target/85519
        * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce
        recursion depth from 25 to 23.
        * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

Modified:
    trunk/libgomp/ChangeLog
    trunk/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
    trunk/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

Tom de Vries changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openacc, openmp, patch
           Severity|normal                      |trivial

--- Comment #2 from Tom de Vries ---
https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01122.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85519

--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> All these solutions work until the next failure shows up. It would be nice
> to fix this more definitely in some way, but I'm not sure how.

We could try to figure out the frame size of the recursive function.

Using GOMP_DEBUG=1 we see the JIT compile/link log:
...
Link log warning : Stack size for entry function 'main$_omp_fn$0' cannot be statically determined
info: 0 bytes gmem
info: Function properties for 'main$_omp_fn$0':
info: used 8 registers, 0 stack, 0 bytes smem, 328 bytes cmem[0], 0 bytes lmem
...
but the stack size is only shown for the offloading region, not for individual
functions.

Using GOMP_NVPTX_SAVE_TEMPS=1 we could get the cubin, and dump the resource
usage:
...
$ cuobjdump -res-usage gomp-nvptx.*.cubin

Resource usage:
 Common:
  GLOBAL:0
 Function rec:
  REG:8 STACK:0 SHARED:0 LOCAL:0 TEXTURE:0 SURFACE:0 SAMPLER:0
 Function main$_omp_fn$0:
  REG:8 STACK:UNKNOWN SHARED:0 LOCAL:0 CONSTANT[0]:328 TEXTURE:0 SURFACE:0 SAMPLER:0
...
but the STACK entry for rec shows up as 0.

Finally, using nvdisasm (or GOMP_NVPTX_DISASM=1) we find the info:
...
$ nvdisasm gomp-nvptx.*.cubin
        //- nvinfo : EIATTR_FRAME_SIZE
        .align 4
        /**/.byte 0x04, 0x11
        /*0002*/.short (.L_6 - .L_5)
        .align 4
.L_5:
        /*0004*/.word index@(rec)
        /*0008*/.word 0x0010

        //- nvinfo : EIATTR_FRAME_SIZE
        .align 4
.L_6:
        /*000c*/.byte 0x04, 0x11
        /*000e*/.short (.L_8 - .L_7)
        .align 4
.L_7:
        /*0010*/.word index@(main$_omp_fn$0)
        /*0014*/.word 0x

        //- nvinfo : EIATTR_MIN_STACK_SIZE
        .align 4
.L_8:
        /*0018*/.byte 0x04, 0x12
        /*001a*/.short (.L_10 - .L_9)
        .align 4
.L_9:
        /*001c*/.word index@(main$_omp_fn$0)
        /*0020*/.word 0x
.L_10:
...

So, we could write some tcl function to get the frame size for a function, and
xfail or skip the test if the frame size is bigger than a given constant x, but
AFAIK dejagnu is not set up for this.

The best we could do is to add a dg-final check and emit a:
...
PASS: rec.c dg-nvptx-frame-size-check main$_omp_fn$0 0
FAIL: rec.c dg-nvptx-frame-size-check rec 8
...

Or, going for a more precise check:
...
FAIL: rec.c dg-nvptx-stack-size-check main$_omp_fn$0,rec=65 (peak stack size 1048 is larger than stack size limit 1024)
...
where you then check that
frame-size (main$_omp_fn$0) + 65 * frame-size (rec) is smaller than the stack
size limit returned by cudaThreadGetLimit (&size, cudaLimitStackSize).

Presumably formulating the peak stack composition gets more involved with
openmp test cases which have a more complicated call stack.
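
For illustration only, here is a rough sketch of the stack-size check described above, written in Python rather than tcl. It is not part of any patch. The helper names (parse_frame_sizes, peak_stack_size, check_stack) are hypothetical, and the sample text is hand-written to resemble nvdisasm output. In particular the frame size 0x00000008 for main$_omp_fn$0 is an assumed value, picked only so that the total matches the 1048 in the example message; it is not taken from a real cubin.

```python
import re

# Hand-written sample resembling the nvdisasm output quoted above.
# ASSUMPTION: the .word values are illustrative, not from a real cubin.
SAMPLE = """
//- nvinfo : EIATTR_FRAME_SIZE
.align 4
/*0000*/ .byte 0x04, 0x11
/*0002*/ .short (.L_6 - .L_5)
.L_5:
/*0004*/ .word index@(rec)
/*0008*/ .word 0x00000010
//- nvinfo : EIATTR_FRAME_SIZE
.align 4
.L_6:
/*000c*/ .byte 0x04, 0x11
/*000e*/ .short (.L_8 - .L_7)
.L_7:
/*0010*/ .word index@(main$_omp_fn$0)
/*0014*/ .word 0x00000008
//- nvinfo : EIATTR_MIN_STACK_SIZE
.align 4
.L_8:
/*0018*/ .byte 0x04, 0x12
/*001a*/ .short (.L_10 - .L_9)
.L_9:
/*001c*/ .word index@(main$_omp_fn$0)
/*0020*/ .word 0x00000000
.L_10:
"""

# Function name in index@(...), followed by the .word holding its frame size.
ENTRY_RE = re.compile(
    r"index@\(([^)]+)\)\s*(?:/\*[0-9a-fA-F]*\*/)?\s*\.word\s+(0x[0-9a-fA-F]+)"
)


def parse_frame_sizes(disasm_text):
    """Map function name -> frame size in bytes, from EIATTR_FRAME_SIZE records."""
    sizes = {}
    for section in disasm_text.split("//- nvinfo :")[1:]:
        # Skip other attributes such as EIATTR_MIN_STACK_SIZE.
        if not section.lstrip().startswith("EIATTR_FRAME_SIZE"):
            continue
        match = ENTRY_RE.search(section)
        if match:
            sizes[match.group(1)] = int(match.group(2), 16)
    return sizes


def peak_stack_size(frame_sizes, call_counts):
    """Sum of frame size times number of simultaneously live frames."""
    return sum(frame_sizes[name] * count for name, count in call_counts.items())


def check_stack(frame_sizes, call_counts, limit):
    """Return (ok, peak); ok is True if the peak stays below the limit."""
    peak = peak_stack_size(frame_sizes, call_counts)
    return peak < limit, peak


if __name__ == "__main__":
    sizes = parse_frame_sizes(SAMPLE)
    ok, peak = check_stack(sizes, {"main$_omp_fn$0": 1, "rec": 65}, 1024)
    print("PASS" if ok else "FAIL", "peak stack size", peak, "limit", 1024)
```

With the assumed sizes, one frame of main$_omp_fn$0 plus 65 frames of rec gives 8 + 65 * 16 = 1048 bytes, which exceeds a 1024-byte limit, so the check reports FAIL. A real implementation would still have to query the actual limit from the device and work out the call counts per test case.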