[Bug analyzer/115203] [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115203 --- Comment #2 from Tobias Burnus --- Indeed, the suggestion was not to disable the translations in general. A similar issue shows up when running the testsuite. There it is solved by "setenv LANG C" and "setenv LANG C.ASCII" – while various scripts also use LANG=C. Thus, one way would be to have LANG=C set somewhere (e.g. via the Makefile - assuming it can be done portably). An alternative would be your suggestion to disable it in simple_diagnostic_path. Looking at gcc/intl.cc's gcc_init_libintl, you also need to watch out for open_quote/close_quote and other fun changes as they might come before you switch to LANG=C.
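As a rough illustration of the "switch to the C locale before comparing strings" idea, here is a minimal sketch in plain C. It is not GCC's code: `force_c_locale` and `restore_locale` are hypothetical helper names, and the sketch only uses the standard `setlocale` call. Whether the self tests should do this, or whether LANG=C should be set from the Makefile instead, is exactly the open question in the comment above.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch the process to the "C" locale and return
   a heap copy of the previous locale name so it can be restored later.
   Returns NULL if the name could not be saved or the switch failed.  */
char *
force_c_locale (void)
{
  const char *cur = setlocale (LC_ALL, NULL);
  char *saved = NULL;

  if (cur)
    {
      saved = malloc (strlen (cur) + 1);
      if (saved)
        strcpy (saved, cur);
    }
  if (!saved || !setlocale (LC_ALL, "C"))
    {
      free (saved);
      return NULL;
    }
  return saved;
}

/* Hypothetical helper: restore a locale returned by force_c_locale.  */
void
restore_locale (char *saved)
{
  if (saved)
    {
      setlocale (LC_ALL, saved);
      free (saved);
    }
}
```

Any string that was translated before the switch (such as the quote characters mentioned above) would of course stay translated; the sketch does not address that.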
[Bug analyzer/115203] New: [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115203 Bug ID: 115203 Summary: [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: build, ice-on-valid-code Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- For some testing, I happened to build with LANG=de_DE.UTF-8 and that was also set when building GCC itself. That works fine until the analyzer self test - as the strings don't match: .../build-gcc-trunk-fast/./gcc/xgcc -B/home/tob/projects/build-gcc-trunk-fast/./gcc/ -xc++ -nostdinc /dev/null -S -o /dev/null -fself-test=.../gcc/testsuite/selftests .../gcc/tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ (" events 1-5\n" "FILENAME:1:6:\n" "1 | if ((arr = (struct foo **)malloc [...] 5 ||if ((arr[i] = (struct foo *)malloc(sizeof(struct foo))) == NULL) { ||~ ~~ ||| | |+--->(4) ...to here (5) wurde hier deklariert cc1plus: interner Compiler-Fehler: in fail_formatted, bei selftest.cc:63 0x22af256 selftest::fail_formatted(selftest::location const&, char const*, ...) ../../../repos/gcc/gcc/selftest.cc:63 0x22af301 selftest::assert_streq(selftest::location const&, char const*, char const*, char const*, char const*) ../../../repos/gcc/gcc/selftest.cc:92 0x25b6cd6 selftest::fail_formatted(selftest::location const&, char const*, ...) 
../../../repos/gcc/gcc/selftest.cc:63 0x25b6d81 selftest::assert_streq(selftest::location const&, char const*, char const*, char const*, char const*) ../../../repos/gcc/gcc/selftest.cc:92 0x10a7b42 test_control_flow_5 ../../../repos/gcc/gcc/tree-diagnostic-path.cc:2158 0x10aabe6 control_flow_tests ../../../repos/gcc/gcc/tree-diagnostic-path.cc:2292 0x13a5512 test_control_flow_5 ../../../repos/gcc/gcc/tree-diagnostic-path.cc:2158 0x13a85b6 control_flow_tests ../../../repos/gcc/gcc/tree-diagnostic-path.cc:2292
[Bug fortran/115151] New: procedure(acos) [,pointer] :: p - is wrongly rejected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115151 Bug ID: 115151 Summary: procedure(acos) [,pointer] :: p - is wrongly rejected Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: rejects-valid Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- It looks as if the following is wrongly rejected: procedure(acos) [,pointer] :: p While the examples below are with 'pointer', I don't see why it shouldn't be also valid without 'pointer', but I have not verified it. But I have to admit that I also have not fully understood C1218 – as I don't see how one could ever specify something else - it either is by construction an external procedure or it clashes name-wise. Only: procedure(acos)[,pointer] :: acos does not make sense as the former pulls in an 'intrinsic :: acos'. * * * Found at https://github.com/klausler/fortran-wringer-tests/ which has several corner case testcases. Testcase: acos-iface.f90 - ! Intel, NVF, NAG, f18: works ! GNU, XLF: errors about elemental or non-external procedure as interface intrinsic :: acos procedure(acos), pointer :: p p => acos print *, p(1.) end - gfortran: Error: Procedure pointer 'p' at (1) shall not be elemental GCC's source code refers for this to F2008's ("15.4.3.6 Procedure declaration statement"): "C1218 (R1211) If a proc-interface describes an elemental procedure, each procedure-entity-name shall specify an external procedure." where R1211 procedure-declaration-stmt is PROCEDURE ( [ proc-interface ] ) [ [ , proc-attr-spec ] ... :: ] proc-decl-list R1214 proc-decl is procedure-entity-name [ => proc-pointer-init ] * * * Furthermore, for procedure pointers, there is: C1220 (R1217) The procedure-name shall be the name of a nonelemental external or module procedure, or a specific intrinsic function listed in 13.6 and not marked with a bullet (•). 
where: "R1216 proc-pointer-init is null-init or initial-proc-target R1217 initial-proc-target is procedure-name" 'acos' has no bullet – and 13.6 has: "Note that a specific function that is marked with a bullet (•) is not permitted to be used as an actual argument (12.5.1, C1220), as a target in a procedure pointer assignment statement (7.2.2.2, C729), or as the interface in a procedure declaration statement (12.4.3.6, C1216)." The latter implies that 'acos' is permitted in 'p => acos'. * * * Thus, the question is still whether C1218 makes sense for 'p', i.e. does it apply to 'p' and (if so) is this valid for proc pointers or not? "C855 A named procedure with the POINTER attribute shall have the EXTERNAL attribute." is in any case required for proc-pointers as well. However, that's given by: "A procedure declaration statement declares procedure pointers, dummy procedures, and external procedures. It specifies the EXTERNAL attribute (8.5.9) for all entities in the proc-decl-list." Which means - unsurprisingly - that the following is invalid: module m contains subroutine msub; end; end use m intrinsic :: acos pointer :: msub, intsub, acos ! INVALID contains subroutine intsub; end end
[Bug fortran/115150] [12/13/14/15 Regression] SHAPE of zero-sized array yields a negative value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115150 Tobias Burnus changed: What|Removed |Added Target Milestone|--- |12.5 CC||sandra at gcc dot gnu.org --- Comment #1 from Tobias Burnus --- Looking at the dump, GCC 11 has: _gfortran_shape_4 (, D.3966); while GCC 12 has: x[S.3] = (integer(kind=4)) (((unsigned int) a.dim[S.3].ubound - (unsigned int) a.dim[S.3].lbound) + 1); * * * * Probably caused by: r12-4591-g1af78e731feb93 Author: Sandra Loosemore Date: Tue Oct 19 21:11:15 2021 -0700 Fortran: Fixes and additional tests for shape/ubound/size [PR94070] This patch reimplements the SHAPE intrinsic to be inlined similarly to LBOUND and UBOUND, instead of as a library call, to avoid an unnecessary array copy. Various bugs are also fixed. gcc/fortran/ PR fortran/94070 * * * SHAPE has: "Result Value. The result has a value whose i-th element is equal to the extent of dimension i of SOURCE, except that if SOURCE is assumed-rank, and associated with an assumed-size array, the last element is equal to −1." Thus, an example for the latter: GCC 11.4 - good: 0 2 -1 GCC 12 (to trunk) - wrong: -3 2 -1 integer :: x(10) call f(x, -3) contains subroutine f(y,n) integer :: y(1:n,2,*) call g(y) end subroutine g(z) integer :: z(..) print *, shape(z) end end
[Bug fortran/115150] New: [12/13/14/15 Regression] SHAPE of zero-sized array yields a negative value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115150 Bug ID: 115150 Summary: [12/13/14/15 Regression] SHAPE of zero-sized array yields a negative value Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- GCC 11.4 has: Shape: 0 0 Shape: 0 3 0 But since GCC 12: Shape: -2 0 Shape: -3 3 0 Testcase: implicit none real,allocatable :: A(:),B(:,:) allocate(a(3:0), b(5:1, 3)) print *, 'Shape:', shape(a), size(a) print *, 'Shape:', shape(b), size(b) end
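The numbers in this report can be reproduced with a little arithmetic. The sketch below is plain C, not the code gfortran generates: `extent_unclamped` mirrors the inlined `ubound - lbound + 1` expression from the GCC 12 dump quoted in comment #1, while `extent_clamped` shows the extent a zero-sized dimension should have. Both function names are made up for this illustration.

```c
/* Mirrors the unclamped expression from the dump:
   (int) (((unsigned) ubound - (unsigned) lbound) + 1).
   The conversion back to int is implementation-defined in ISO C for
   out-of-range values; GCC wraps modulo 2^N.  */
int
extent_unclamped (int lbound, int ubound)
{
  return (int) (((unsigned int) ubound - (unsigned int) lbound) + 1);
}

/* The extent of a dimension is never negative: a dimension whose
   upper bound is below its lower bound has extent zero.  */
int
extent_clamped (int lbound, int ubound)
{
  int n = ubound - lbound + 1;
  return n > 0 ? n : 0;
}
```

For `a(3:0)` the unclamped form gives -2 and for the first dimension of `b(5:1, 3)` it gives -3, which are the values printed since GCC 12; the clamped form gives 0 in both cases, matching GCC 11.4.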
[Bug fortran/44744] Missing -fcheck=bounds diagnostic for function assignment with tmp array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44744 --- Comment #5 from Tobias Burnus --- (In reply to Tobias Burnus from comment #4) > Another variant from lsdalton – or rather the BTW: I have not verified that the cause is the same (temporary variable), but it seems to be likely. When replacing the 'A(i,:,:)' on the LHS with 'B(:,:)', gfortran does diagnose the too large RHS.
[Bug fortran/44744] Missing -fcheck=bounds diagnostic for function assignment with tmp array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44744 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org --- Comment #4 from Tobias Burnus --- Another variant from lsdalton – or rather the https://github.com/openrsp/openrsp/archive/v1.0.0.tar.gz it downloads during the build. FLANG diagnoses the LSDalton issue as: error(/home/jehammond/DALTON/lsdalton/build/_deps/openrsp_sources-src/src/ao_dens/rsp_property_caching.f90:2164): Assign: mismatching element counts in array assignment (to 6, from 3) * * * GCC fails to do so. Testcase: – the problem is that the RHS is 3 and the LHS is 6: implicit none real,allocatable :: A(:,:,:) integer :: n, n2, i n = 6 n2 = 3 allocate(A(3,n,3)) do i = 1, 3 print *, shape(a), '|', shape(f(n2)) a(i,:,:) = f(n2) end do deallocate(A) contains function f(n) integer :: n real :: f(n,3) f = 99.0 end end
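The missing diagnostic amounts to a shape-conformance check between the two sides of the assignment. As a sketch only (plain C, with the hypothetical name `first_shape_mismatch`; this is not how -fcheck=bounds is implemented in gfortran), the check compares the extents dimension by dimension:

```c
#include <stddef.h>

/* Return -1 if both shapes agree in every dimension, otherwise the
   0-based index of the first dimension whose extents differ.  */
int
first_shape_mismatch (const long *lhs, const long *rhs, size_t rank)
{
  for (size_t i = 0; i < rank; i++)
    if (lhs[i] != rhs[i])
      return (int) i;
  return -1;
}
```

In the testcase, the section `a(i,:,:)` has shape (6, 3) and `f(n2)` has shape (3, 3), so the first dimension mismatches, which corresponds to the "(to 6, from 3)" in the Flang message.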
[Bug c/113905] [OpenMP] Declare variant rejects variant-function re-usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113905 --- Comment #1 from Tobias Burnus --- Ups, testcase was lost. Re-written from scratch: --- int var1() { return 1; } int var2() { return 2; } #pragma omp declare variant (var1) match(construct={target}) #pragma omp declare variant (var2) match(construct={parallel}) int foo() { return 42; } #pragma omp declare variant (var2) match(construct={parallel}) #pragma omp declare variant (var2) match(construct={target}) int bar() { return 99; } int main() { __builtin_printf("foo: %d (expected: 42)\n", foo()); __builtin_printf("bar: %d (expected: 99)\n", bar()); #pragma omp parallel if(0) { __builtin_printf("foo: %d (expected: 2)\n", foo()); __builtin_printf("bar: %d (expected: 1)\n", bar()); } #pragma omp target //device(-1 /*omp_initial_device*/) { __builtin_printf("foo: %d (expected: 1)\n", foo()); __builtin_printf("bar: %d (expected: 2)\n", bar()); } }
[Bug fortran/104621] [OpenMP] Issues with 'declare variant'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104621 --- Comment #1 from Tobias Burnus --- This got fixed for OpenMP 6 via OpenMP spec pull request #3383, adding: * A declarative directive must be specified in the specification part after all 'USE', 'IMPORT' and 'IMPLICIT' statements. * If a declarative directive applies to a function declaration or definition and it is specified with one or more C++ attribute specifiers, the specified attributes must be applied to the function as permitted by the base language. Plus revising the wording for 'declare variant' for Fortran (semantic + restrictions). See TR12/OpenMP 6.
[Bug fortran/114825] [11/12/13/14 Regression] Compiler error using gfortran and OpenMP since r5-1190
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114825 --- Comment #2 from Tobias Burnus --- The difference between the failing program and a working program (pointer-assignment in 'sub' comment out) is: failing: 'type' in gfc_omp_clause_default_ctor is '
[Bug middle-end/114754] New: [OpenMP] Missing 'uses_allocators' diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114754 Bug ID: 114754 Summary: [OpenMP] Missing 'uses_allocators' diagnostic Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: accepts-invalid, diagnostic, openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- Cf. https://github.com/SOLLVE/sollve_vv/pull/802 "LLVM error: error: allocator must be specified in the 'uses_allocators' clause adding uses_allocators(omp_default_mem_alloc) fixes the problem" OpenMP spec has under 'Restrictions to the *target* construct are as follows:' "Memory allocators that do not appear in a *uses_allocators* clause cannot appear as an allocator in an *allocate* clause or be used in the *target* region unless a *requires* directive with the *dynamic_allocators* clause is present in the same compilation unit." Example snippets, based on the sollve_vv testcase. The OG13 patch only diagnoses the issue in the last/4th directive; clang diagnoses both the 'allocate' clause variants (2nd + 4th) but neither diagnoses the 1st/3rd one. * * * omp_allocator_handle_t al = omp_init_allocator(omp_default_mem_space, 0, NULL); #pragma omp target { int *y = omp_alloc(omp_default_mem_alloc, sizeof(1)); } #pragma omp target allocate(omp_default_mem_alloc:x) firstprivate(x) map(from: device_result) { for (int i = 0; i < N; i++) x += i; device_result = x; } #pragma omp target firstprivate(al) { int *y = omp_alloc(al, sizeof(1)); } #pragma omp target allocate(al:x) firstprivate(x) map(from: device_result) { for (int i = 0; i < N; i++) x += i; device_result = x; }
[Bug fortran/103496] [F2018][TS29113] C_SIZEOF – rejects now valid args with 'must be an interoperable data entity'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103496 --- Comment #4 from Tobias Burnus --- (In reply to anlauf from comment #3) > The code in comment#0 compiles at r14-9893-gded646c91d2c0f > and gives the indicated results. which is the commit: Fortran: fix argument checking of intrinsics C_SIZEOF, C_F_POINTER [PR106500] It looks as if the issue is fixed, but gfortran misses a testcase to check that the obtained value is correct. c_sizeof_7.f90 contains tests, but I think there should be a run time test that the obtained values are correct; I think some are constants such that a tree-dump scan test would work as well, but for the dynamic ones, a run time test seems to be easier than trying to capture the generated code...
[Bug middle-end/113006] OpenMP 5 - lvalue parsing support for map/to/from clause
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113006 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org --- Comment #1 from Tobias Burnus --- [WIP] The Pull Request 3831 for Issue 2618 adds the restriction to *target* only: "* A list-item in a *map* clause that is specified on a *target* construct must have a base-variable or base-pointer." However, function calls in lvalue expressions can still be used in 'target {,enter,exit} data'. NOTE: This also applies to Fortran, as 'f()', if 'f' returns a data pointer, is a variable in Fortran, but it isn't a base-variable; i.e. it is permitted in 'target ... data' but not in 'target'.
[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #30 from Tobias Burnus --- (In reply to rguent...@suse.de from comment #29) > Might be for \r\n line endings? New lines are handled slightly differently – and \f and \v don't seem to be handled at all. Comparing the result with ifort/ifx/flang, they handle a bare '\r' (in contrast to \r\n) at fewer places than gfortran – albeit from the code it looks as if a \r not followed by \n is not handled consistently either. > I'd keep it for the sake of preserving > previous behavior. isspace(3) tests for \f, \n, \r, \t, \v and space > (but of course all depends on the locale, not sure whether libgfortran > needs to care for locales) I have added one example to the testcase, but that seems to be already handled by the code further below which handles '\r' and '\n' - thus, the patch does not handle it explicitly. The Fortran standard does not seem to permit \f, \t, \v at all – at least I only found those in the C interop section. The standard does not really define what new line actually is, but: "A newline character is a nonblank character returned by the intrinsic function NEW_LINE." – This handles different character kinds, but always returns a single character (e.g. \r vs. \n would be possible, but not \r\n). * * * Patch – which handles '\t': https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648950.html
[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #28 from Tobias Burnus --- Created attachment 57896 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57896&action=edit Testcase It seems as if 'tabs' cause problems, e.g. for: profile_single_file= .true. where there are two tabs before '='. * * * The problem seems to be that the new code uses: - eat_spaces (dtp); dtp->u.p.comma_flag = 0; + c = next_char (dtp); + if (c == ' ') +{ + eat_spaces (dtp); Thus, it explicitly checks for ' ' while eat_spaces handles: while (c != EOF && (c == ' ' || c == '\r' || c == '\t')); Testcase attached. I think we need at least an "|| c == '\t'"; I guess '\r' isn't really required here, or is it?
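To make the mismatch concrete: the new check accepts only ' ', whereas the loop quoted from eat_spaces also skips '\t' and '\r'. The following stand-alone C sketch uses made-up names (`is_skippable_blank`, `count_leading_blanks`) and does not touch libgfortran's `dtp` state; it only shows a predicate covering the same three characters as the quoted loop condition.

```c
#include <stddef.h>

/* The characters skipped by the loop condition quoted above:
   blank, tab and carriage return.  */
int
is_skippable_blank (int c)
{
  return c == ' ' || c == '\t' || c == '\r';
}

/* Hypothetical helper: count how many blank-like characters precede
   the first other character in S.  */
size_t
count_leading_blanks (const char *s)
{
  size_t n = 0;
  while (is_skippable_blank ((unsigned char) s[n]))
    n++;
  return n;
}
```

With such a predicate, the two tabs before '=' in the namelist line would be treated the same way as blanks instead of falling through the `c == ' '` test.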
[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596 --- Comment #6 from Tobias Burnus --- (In reply to sandra from comment #5) > Tobias, it looks to me like you missed the connection ... No, I didn't - I did link it (cf. top of comment 3) - but I just cannot read :-/ Hence, for (1) *teams* *parallel* *for* parallel for (2) *teams* *parallel* for parallel *for* (3) *teams* parallel for *parallel* *for* we now have p = {1,2,3,4,5} → score = {1,2,4,8,16} Thus: (1) 1 + 2 + 4 = 7 (2) 1 + 2 + 16 = 19 (3) 1 + 8 + 16 = 25 such that (3) wins → and, hence, either of (f15) match (construct={parallel},user={condition(score(19):1)}) /* 8+19 */ (f16) match (implementation={atomic_default_mem_order(score(27):seq_cst)}) as 27 > 25 - and OpenMP states that it is implementation defined which of the score = 27 variants is used. * * * Thus, I concur that there is an ordering and, hence, scoring bug. * * * > I don't know if anybody wants to tackle a bug fix for this code for GCC 14. I think we don't - we are too close to the release. This has been implemented since 2019, 'score()' is not implemented in Clang 18, and code like that is unlikely to be often used. And the code is non-trivial. - All of this speaks against rushing to do an implementation for GCC 14.
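The sums for the three candidate matchings follow directly from the 2^(p-1) rule. The helper below is a small C sketch for checking such sums by hand; `construct_score` is a hypothetical name and is unrelated to GCC's omp_context_compute_score, which additionally starts from a base value of 1 (see comment #4).

```c
/* Sum 2^(p-1) over the 1-based positions P[0..N-1] of the matched
   traits within the construct trait set.  */
unsigned int
construct_score (const int *p, int n)
{
  unsigned int score = 0;
  for (int i = 0; i < n; i++)
    score += 1u << (p[i] - 1);
  return score;
}
```

With the outermost construct at position 1, the matchings use positions {1,2,3}, {1,2,5} and {1,4,5}, giving 7, 19 and 25; the highest-valued subset, 25, is the one that counts.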
[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596 --- Comment #4 from Tobias Burnus --- > For selector construct = {teams, parallel, for} > score1 = 29 score2 = 29 > --- > > The constructs[i]'s score look fine. > > But I wonder why score == 29 and not 28. Answer: omp_context_compute_score starts with: *score = 1; which is set (only) in the error case to "-1" and otherwise, scores only get added. — I think a constant offset to all scores does not harm (as it is only used internally) but I still find it confusing.
[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596 --- Comment #3 from Tobias Burnus --- The scoring is according to TR12: > Each trait selector for which the corresponding trait appears > in the construct trait set in the OpenMP context is given the > value 2^(p−1) where p is the position of the corresponding trait, > c_p, in the construct trait set And: > Specifically, the ordering of the set of constructs is > c1, . . . , cN , where c1 is the construct at the outermost > nesting level and cN is the construct at the innermost nesting level. and > construct trait set — The trait set that consists of all enclosing > constructs at a given point in an OpenMP program up to a target construct. > > enclosing context — For C/C++, the innermost scope enclosing a directive. At the call site: {teams, parallel, for, parallel, for} Selector in declare variant: {teams, parallel, for} Looking at 'context traits that contains all selectors in the same order are used', I see: (1) *teams* *parallel* *for* parallel for (2) *teams* *parallel* for parallel *for* (3) *teams* parallel for *parallel* *for* Sandra wrote: > By my reading, the OpenMP context at the point of call to f17 is > {teams, parallel, for, parallel, for} with associated scores > {1, 2, 4, 8, 16} In my reading (see second quote above), the order of enclosing contexts is inside out and the scores are: p = {5,4,3,2,1} → score = {16,8,4,2,1} Thus: (1) 16 + 8 + 4 = 28 (2) 16 + 8 + 1 = 25 (3) 16 + 2 + 1 = 19 where (1) wins because of: > if the traits that correspond to the construct selector set appear > multiple times in the OpenMP context, the highest valued subset of > context traits that contains all trait selectors in the same order > are used; * * * This matches what the testcase has: > match (construct={teams,parallel,for}) /* 16+8+4 */ (= 28) * * * --- $ install/bin/x86_64-linux-gnu-gcc declare-variant-12.c -fopenmp -foffload=disable -S ... 
constructs[0] = omp_for, scores[0] = 4, score = 16 constructs[1] = omp_parallel, scores[1] = 3, score = 8 constructs[2] = omp_teams, scores[2] = 2, score = 4 For selector construct = {teams, parallel, for} score1 = 29 score2 = 29 --- The constructs[i]'s score look fine. But I wonder why score == 29 and not 28.
[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596 --- Comment #2 from Tobias Burnus --- Crossref to the OpenMP spec discussions [not publicly available] related to the scoring: * Jakub asked about this testcase on omp-lang in the email 2019/016447.html (w/o getting a reply). * He then opened Issue #2028 for three dozen issues/questions (issue lifetime: Oct 2019 to Aug 2020; it has 52 comments) → The scoring issue of this PR seems to run under '30' – but while I do see the question, I have trouble finding/understanding Alex' answer in that thread → 'Nov 12, 2019' seems to be the entry that has all quotes and seems to finally settle the issue.
[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966 Tobias Burnus changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Tobias Burnus --- FIXED on mainline (= GCC 14). Namely, the following was fixed. All of those issues involve compiling with '-g' such that 'mkoffload' generates also a GCN .o file for which the ELF flag has to match the other .o files. (A) The issue of comment 0: ELF Flag mismatch if GCC was configured with a --with-arch=... that does not match the default setting. → Fix: See comment 6 Earlier fixes, only vaguely related to comment 0: (B) * Compiler default was changed to gfx900 but mkoffload still had Fiji as default * Race in handling the debug files → Fix: See comment 1 (C) * Fixed issues related to xnack/sram-ecc, which also lead to ELF flag mismatches → Fix: See comment 4
[Bug libgomp/114445] New: [OpenMP] indirect - potential race when creating reverse-offload hash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114445 Bug ID: 114445 Summary: [OpenMP] indirect - potential race when creating reverse-offload hash Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- It looks as if when starting two kernels very quickly after another as in #pragma omp target nowait ... #pragma omp target ... Two device threads might concurrently create the hash in libgomp/config/accel/target-indirect.c's build_indirect_map if (!indirect_htab) { /* Count the number of entries in the NULL-terminated address map. */ for (map_entry = GOMP_INDIRECT_ADDR_MAP; *map_entry; map_entry += 2, num_ind_funcs++); indirect_htab = htab_create (num_ind_funcs); The issue only occurs if the assignment of the allocated memory done in the first thread occurs after the second kernel has checked for indirect_htab == NULL, which seems to be rather unlikely to happen in the real world, but IMHO cannot be ruled out. Thus, I was wondering (see email) whether a tmp variable + CAS should be used for indirect_htab; if the swap failed, the memory could just be freed as another process was faster - and has also completed by then the creation of the htab. See also r14-9629-g637e76b90e8b045c5e25206a41e3be55deace8d5 openmp: Change to using a hashtab to lookup offload target addresses for indirect function calls and review email at https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647755.html
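The "tmp variable + CAS" idea could look roughly like the sketch below. It is written with C11 `<stdatomic.h>` for a host compiler, whereas the real code runs on the device and would more likely use GCC's `__atomic` builtins; `struct table`, `shared_table` and `get_table` are stand-ins invented for this illustration, not names from libgomp.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hash table type.  */
struct table
{
  size_t size;
};

/* Shared pointer, NULL until some thread publishes a table.  */
static _Atomic (struct table *) shared_table;

/* Return the shared table, creating it on first use.  The table is
   fully initialized in a temporary before it is published; if another
   thread publishes first, the loser frees its copy and uses the
   winner's table.  */
struct table *
get_table (size_t size)
{
  struct table *cur = atomic_load (&shared_table);
  if (cur)
    return cur;

  struct table *fresh = malloc (sizeof *fresh);
  if (!fresh)
    return NULL;
  fresh->size = size;

  struct table *expected = NULL;
  if (atomic_compare_exchange_strong (&shared_table, &expected, fresh))
    return fresh;

  /* Lost the race: the failed CAS stored the winner's pointer in
     EXPECTED.  */
  free (fresh);
  return expected;
}
```

Because the table is filled in before the compare-and-swap, a thread that sees a non-NULL pointer also sees a complete table, which is the property the comment above asks for.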
[Bug target/114419] [GCC < 14] amdgcn offload compiler fails to build with amdgcn tools based on LLVM 18
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114419 Tobias Burnus changed: What|Removed |Added CC||ams at gcc dot gnu.org, ||burnus at gcc dot gnu.org Summary|amdgcn offload compiler |[GCC < 14] amdgcn offload |fails to build with amdgcn |compiler fails to build |tools based on LLVM 18 |with amdgcn tools based on ||LLVM 18 --- Comment #2 from Tobias Burnus --- This has been fixed in GCC 14 - and the documentation has been updated accordingly. See the AMD GCN entry in the GCC 14 release notes and the link to the install document there: https://gcc.gnu.org/gcc-14/changes.html * * * I think there are two problems with LLVM 18 - both fixed with GCC 14/mainline. Namely: (A) Support for 'Fiji' devices was stopped, i.e. those use AMDHSA Code Object Version 3 → Solution: Configure with --with-arch=gfx900 --with-multilib-list=gfx900,gfx906,gfx908,gfx90a i.e. leave out 'fiji' and switch to gfx900 for the default Cf. r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f (B) The default version used if no version has been specified changed to Code Object 5 in LLVM's linker – but with '-g' also mkoffload produces an object file - but with version 4. => With debugging enabled, there will be an error with LLVM 18, solved by the commit: r14-8449-g4b5650acb3107239867830dc1214b31bdbe3cacd - namely: Since LLVM commit 082f87c9d418 (Pull Req. #79038; will become LLVM 18) "[AMDGPU] Change default AMDHSA Code Object version to 5" the default - when no --amdhsa-code-object-version= is used - was bumped.
[Bug libgomp/114335] OpenACC: use of accelerator constant/read-only memory for "readonly" modifier mappings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114335 Tobias Burnus changed: What|Removed |Added Keywords||missed-optimization, ||openacc CC||burnus at gcc dot gnu.org --- Comment #1 from Tobias Burnus --- It likewise applies for OpenMP for code such as: omp target firstprivate(array) allocate(omp_const_mem_alloc : array)
[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #19 from Tobias Burnus --- Regarding the LAPACK issue: Actually, I am inclined to * regard it as a LAPACK bug * that was also fixed upstream, see comment 6, to make g95 happy, as ';' is not a value separator - while ' ;' is fine, where the blank is a value separator. My testcase of comment 4 therefore always used a space before the ',' / ';'. * * * I have now created an extended testcase, attached to PR105473 as attachment 57695. (Only testing integer/real parsing, not reading the char afterward as in comment 4.) The same testcase can also be found at https://godbolt.org/z/14h48167W and shows the result with gfortran, ifort, ifx and flang. I used this result to add comments to the testcases. * * * For some F2023 wording, see comment 14 above. And I have to admit that I am rather confused by the results as there does not seem to be any consistent pattern; there are cases where I agree with gfortran's error even though neither ifort nor flang show one, while for others, I think gfortran gets it wrong. In particular, I think gfortran gets it wrong for the following cases: call t('point', ';') ! gfortran: no error, others: error → IMHO invalid: not a value separator and not an integer. call t('point', '5;') ! gfortran: no error shown, others: error → This is the LAPACK example but for integers. I think ';' is invalid as it is not part of the integer but also not a value separator. call t('comma', '7 ,') ! gfortran: error; others: no error → IMHO valid - I think the ' ' as value separator is sufficient. call t('point', '3.3,', .true.) ! gfortran/flang: error shown; ifort: no error → What's wrong with a comma as value separator? call t('comma', '3,3;', .true.) ! gfortran: error shown; others: no error → Same, except that ';' is now the value separator But in the following cases, I think gfortran is *right*: call t('point', '5.') ! gfortran/flang: Error shown, ifort: no error → '.' 
is not part of an integer nor a value separator call t('comma', '5,') ! gfortran: error; others: no error → Likewise for ',' - the ',' is not part of an integer nor a value separator Disclaimer: I might have easily overlooked some fine print.
[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473 Tobias Burnus changed: What|Removed |Added Attachment #57693|0 |1 is obsolete|| --- Comment #35 from Tobias Burnus --- Created attachment 57695 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57695&action=edit Extended testcase; comment shows results for gfortran(trunk),ifort,ifx,flang Fixed testcase – before: too many lines commented, bugs with the floating-point test string for decimal=comma and with the I/O code in general for the floating-point case. See also: https://godbolt.org/z/14h48167W See also some comments at bug 114304 comment 19.
[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473 --- Comment #34 from Tobias Burnus --- Created attachment 57693 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57693&action=edit Extended testcase; comment shows results for gfortran(trunk),ifort,ifx,flang See PR114304 for a related issue where too much is rejected, while this one is about too much getting accepted. I now tried an extended version of comment 0, see below and see https://godbolt.org/z/GKzc4sveK for the result with gfortran, ifort, ifx and flang. I have to admit that I do not really see a pattern here, comparing the compilers. One quote from F2023 can be found at bug 114304 comment 14.
[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #14 from Tobias Burnus --- Created attachment 57680 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57680&action=edit Testcase with decimal=COMMA, passes with ifort/ifx/flang - fails with gfortran > commit r14-9432-g0c179654c3170749f3fb3232f2442fcbc99bffbb > commit r13-8417-g824a71f609b37a8121793075b175e2bbe14fdb82 Thanks for the fix. We are now back to the GCC 13 result → comment 4 Namely, attachment 57668 now gives: 1.23434997 1243.23999 13.238 a 1.23434997 1243.23999 13.238 a 1.23434997 1243.23999 13.238 1.23434997 1243.23999 13.238 At line 33 of file foo.f90 (unit = 99, file = 'foo.inp') * * * The question is whether the following should give an error as shown above: real :: x(3) character(len=1) :: s ... write(99, '(a)') '1.23435 1243.24 13.24 ;' read(99, *) x, s Or whether reading this line should work, i.e. reading ';' as character – as it does with ifort and flang. Or in other words: * Does ';' count as character, readable by list-directed formatted I/O? (ifort, ifx, flang) * Or doesn't it? (gfortran since at least 4.9) * * * In F2023 (23-007r1), "13.10.2 Values and value separators": "A value separator is • a comma optionally preceded by one or more contiguous blanks and optionally followed by one or more contiguous blanks, unless the decimal edit mode is COMMA, in which case a semicolon is used in place of the comma, • a slash optionally preceded by one or more contiguous blanks and optionally followed by one or more contiguous blanks, or • one or more contiguous blanks between two nonblank values or following the last nonblank value, where a nonblank value is a constant, an r*c form, or an r* form." (where 'r' is a positive integer and 'c' is a literal constant [with ...].) To me it reads as if the semicolon should be read just fine. 
* * * I now have tried another testcase with decimal=COMMA, which works just fine with ifort / ifx /flang as shown at https://godbolt.org/z/ajeTjzEfY But with GCC it fails with: Fortran runtime error: Comma not allowed as separator with DECIMAL='comma' See godbolt link above for gfortran vs. ifort vs. ifx. vs. flang or the attached testcase.
[Bug libfortran/114304] [14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #6 from Tobias Burnus --- [For completeness: The LAPACK testsuite change Richard mentioned in comment 2 is https://github.com/Reference-LAPACK/lapack/commit/64e8a7500d817869e5fcde35afd39af8bc7a8086 - That's for g95 and was applied 2020.]
[Bug libfortran/114304] [14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 Tobias Burnus changed: What|Removed |Added Keywords||wrong-code Summary|[14 Regression] Rejects |[14 Regression] libgfortran |lapack test |I/O – bogus "Semicolon not ||allowed as separator with ||DECIMAL='point'" --- Comment #5 from Tobias Burnus --- I just noticed that the change libgfortran: [PR105473] Fix checks for decimal='comma'. got also backported to r13-8411-g7ecea49245bc6aeb6c889a4914961f94417f16e5 on Thu Mar 7, 2024. Thus, GCC 13 is now affected as well! [Disclaimer: I have not checked the spec, but it seems very much like a wrong-code bug.]
[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org --- Comment #32 from Tobias Burnus --- See PR114304 for an issue that was caused by the fix in comment 27.
[Bug libfortran/114304] [14 Regression] Rejects lapack test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #4 from Tobias Burnus --- Created attachment 57668 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57668=edit Testcase Testresults of the attached testcase: See also https://godbolt.org/z/q4rG61EvW The attached testcase shows with ifort and flang: 1.234350 1243.240 13.24000 a 1.234350 1243.240 13.24000 a 1.234350 1243.240 13.24000 1.234350 1243.240 13.24000 1.234350 1243.240 13.24000 ; 1.234350 1243.240 13.24000 ; With GCC mainline: 1.23434997 1243.23999 13.238 a 1.23434997 1243.23999 13.238 a Fortran runtime error: Semicolon not allowed as separator with DECIMAL='point' With GCC 13 (and 4.9): 1.23434997 1243.23999 13.238 a 1.23434997 1243.23999 13.238 a 1.23434997 1243.23999 13.238 1.23434997 1243.23999 13.238 Fortran runtime error: End of file
[Bug libfortran/114304] [14 Regression] Rejects lapack test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 Tobias Burnus changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID |--- CC||burnus at gcc dot gnu.org, ||jvdelisle at gcc dot gnu.org --- Comment #3 from Tobias Burnus --- I think the semicolon is not permitted as item separator but if I have as input '1.234 134.23 abc' or likewise: '1.234, 134.23 abc' and read (..., * ) x(1:2) it works – i.e. only reading two floats and then stopping before reading 'abc'. But if I do the very same but replace ' abc' by ' ;', I get the error, which seems to be rather inconsistent — what's the difference between 'abc' and ';' in this case? * * * The message is new since r14-9050-ga71d87431d0c4e libgfortran: [PR105473] Fix checks for decimal='comma'. Jerry, can you check?
[Bug fortran/114283] New: [OpenMP] Dummy procedures/proc pointers and 'defaultmap', 'default', 'firstprivate' etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114283 Bug ID: 114283 Summary: [OpenMP] Dummy procedures/proc pointers and 'defaultmap', 'default', 'firstprivate' etc. Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- See also OpenMP specification Issue #3823 [and slightly related PR 114282]. There are two cases: (A) Dummy procedures IMHO those aren't variables and gfortran also rejects them when used in firstprivate, map, shared etc. ("Object 'f1' is not a variable"). However, gfortran does complain with 'defaultmap(none)': Error: 'f1' not specified in enclosing 'target' Note: 'default(none)' does not diagnose it. EXPECTED: There is no diagnosis for 'defaultmap(none)'. (B) Procedure pointers Here it is unclear whether it should be regarded as variable or not; gfortran treats those as variables. Depends on OpenMP specification Issue #3823. It seems as if handling it as variable, but using 'firstprivate' as default for implicit mapping makes most sense. – But also treating it as non-variable would make sense.
[Bug middle-end/114282] New: [OpenMP] Implicit mapping of function/procedure pointers should use 'firstprivate'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114282 Bug ID: 114282 Summary: [OpenMP] Implicit mapping of function/procedure pointers should use 'firstprivate' Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization, openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- For C/C++ function pointers and Fortran dummy procedures / procedure pointers, the assumption is that pointer is either pointing to the desired function or it has the address of a function on the initial device with 'declare target indirect'. GCC produces for implicit mapping: map(alloc:MEM[(char *)g] [len: 0]) map(firstprivate:g [pointer assign, bias: 0]) But a simple map(firstprivate:g) would do here. Likewise for an explicit map, where there is also no need for a pointer assignment. NOTE: For Fortran, it might be that explicit 'map' clauses will get disallowed per pending OpenMP specification Issue #3823, which is tracked elsewhere. Trivial testcases (for more complex, see 'indirect' testcases inside GCC or just add an explicit 'map' clause yourself): void f(){ void (*g)(); #pragma omp target g(); } subroutine f(g) procedure(), pointer :: g !$omp target map(g) call g() !$omp end target end
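As a hedged sketch of what the report asks for, the following C fragment spells out the 'firstprivate' explicitly for a function pointer used inside a target region. The function names answer and call_in_target are invented for this example; whether GCC then emits exactly 'map(firstprivate:g)' in the dump has not been checked here. The 'begin declare target indirect' form assumes a compiler with OpenMP 5.1 'indirect' support (GCC 14 or later); without -fopenmp, or without an offload device, the region simply runs on the host.

```c
/* Functions reached through a pointer inside a target region need
   'declare target indirect' so that the host address can be translated
   on the device (assumption: OpenMP 5.1 support, e.g. GCC 14).  */
#pragma omp begin declare target indirect
int
answer (void)
{
  return 42;
}
#pragma omp end declare target

/* Call 'g' inside a target region.  The pointer value itself is all
   that is needed, hence the explicit firstprivate instead of relying
   on the implicit zero-length map plus pointer assignment.  */
int
call_in_target (int (*g) (void))
{
  int result = -1;
  #pragma omp target firstprivate(g) map(from: result)
  result = g ();
  return result;
}
```

Calling call_in_target (answer) returns the value produced by answer(), whether the region runs on the host or on a device.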
[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347 Tobias Burnus changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Tobias Burnus --- FIXED on mainline (GCC 14).
[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436 --- Comment #2 from Tobias Burnus --- As mentioned in comment 0, PR110347's testcase (r14-9257-g4f82d5a95a244d) contains '#if 0' code which has to be enabled once this bug is fixed. Please remember to take care of: * libgomp/testsuite/libgomp.c++/firstprivate-1.C's and * libgomp/testsuite/libgomp.c++/private-1.C's #if 0 /* FIXME: The following is disabled because of PR middle-end/113436. */
[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 --- Comment #5 from Tobias Burnus --- For: int *q; #pragma omp target device(y5) map(q, q[:5]) GCC currently generates: map(tofrom:q [len: 8]) map(tofrom:*q.4_1 [len: 20]) map(attach:q [bias: 0]) Expected: 'alloc:' instead of 'attach:' or even: map(tofrom:*q [len: 20]) map(firstprivate:q [pointer assign, bias: 0]) In any case, the first 'tofrom' is pointless! NOTE: GCC 13 shows: error: 'q' appears both in data and map clauses * * * For #pragma omp target map(s.p[:5]) GCC should do: map(tofrom:s [len: 24][implicit]) map(tofrom:*_5 [len: 16]) map(attach:s.p [bias: 0]) But (regression!) it does: map(struct:s [len: 1]) map(alloc:s.p [len: 8]) map(tofrom:*_5 [len: 16]) map(attach:s.p [bias: 0]) Solution: --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -12381,3 +12381,4 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH - || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH) break; However, unless I messed up, this will cause tons of ICE(segfault).
[Bug fortran/114002] New: [OpenACC][OpenACC 3.3] Add 'acc_attach'/'acc_detach' routine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114002 Bug ID: 114002 Summary: [OpenACC][OpenACC 3.3] Add 'acc_attach'/'acc_detach' routine Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openacc Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- Created attachment 57466 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57466=edit OpenACC run-time testcase The problem with acc_attach is that it does not like any temporary variable. For call acc_attach(var%v) We need at the end: acc_attach() But we will easily get: parm.v.data = var.v.data; ... acc_attach() which won't work => This requires a builtin in the GCC front end to handle this as no Fortran semantic will handle this. Note that: subroutine acc_attach (ptr_addr) bind(C) type(*), dimension(*), target, optional :: ptr_addr end subroutine comes close but gets 'acc_attach (var.v.data)' and not '' as argument.
[Bug fortran/113997] Bogus 'Warning: Interface mismatch in global procedure' with C binding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113997 --- Comment #3 from Tobias Burnus --- > Anyway, renaming the binding label, like >subroutine acc_attach_c(x) bind(C, name="acc_attach_renamed") > makes the code compile. Well, the code *does* compile as it is only a warning. * * * I think the problem here is a bit that on the Fortran-user side ('acc_attach' vs. 'acc_attach_c') and on the assembler-level side ('acc_attach_' vs. 'acc_attach') everything is fine (except with -fno-underscore) but, admittedly, not from the Fortran language side. (On the other hand, Fortran itself is perfectly happy with: 'subroutine foo' and 'subroutine bar() Bind(C, name='foo_')' but that will break with most Fortran compilers.) Thus, the question is whether we (gfortran) want to do something here - or are happy with issuing the semi-correct/semi-bogus warning here. * * * And renaming "acc_attach_c" does not really help as 'acc_attach' with C binding does exist. In this case it exists as: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgomp/oacc-mem.c;hb=refs/heads/master#l944 and renaming would just add another wrapper around it. However, an alternative is the following - which is (nearly) identical, except that GCC does some GFC-CFC and back conversions – independent of whether it is implemented in C or in Fortran: subroutine acc_attach(x) bind(C, name="acc_attach_") use iso_c_binding, only : c_loc implicit none (external, type) type(*), dimension(..), target :: x interface subroutine acc_attach_c(x) bind(C, name="acc_attach") use iso_c_binding type(c_ptr) :: x end subroutine end interface call acc_attach_c(c_loc(x)) end
[Bug fortran/113997] New: Bogus 'Warning: Interface mismatch in global procedure' with C binding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113997 Bug ID: 113997 Summary: Bogus 'Warning: Interface mismatch in global procedure' with C binding Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- The following warning is bogus, unless -fno-leading-underscore is used: 8 | subroutine foo_c(x) bind(C, name="foo") |1 Warning: Interface mismatch in global procedure 'foo_c' at (1): Type mismatch in argument 'x' (TYPE(c_ptr)/TYPE(*)) * * * Because for 'subroutine acc_attach()' 'subroutine acc_attach_c(x) bind(C, name="acc_attach")' (A) the global Fortran name 'acc_attach' differs from the local name 'acc_attach_c' (B) the actual name (DECL_ASSEMBLER_NAME) differs: 'acc_attach_c' is 'acc_attach' but 'acc_attach' is 'acc_attach_'. * * * ! The C and Fortran interfaces are part of OpenACC 3.3 ! An alternative implementation would be a C implementation using ! ISO_Fortran_binding.h. subroutine acc_attach(x) use iso_c_binding, only : c_loc implicit none (external, type) type(*), dimension(..), target :: x interface subroutine acc_attach_c(x) bind(C, name="acc_attach") use iso_c_binding type(c_ptr) :: x end subroutine end interface call acc_attach_c(c_loc(x)) end
[Bug target/113331] AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113331 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org --- Comment #2 from Tobias Burnus --- All of the following is in except.cc. The problem is that the count in the label is relative to 'call_site_base'. In convert_to_eh_region_ranges, those get bumped - but the function resets it at the end. They do get accumulated via, e.g., dw2_output_call_site_table but, in GCN, output_function_exception_table exits early because of: 3229 if (!crtl->uses_eh_lsda 3230 || targetm_common.except_unwind_info (&global_options) == UI_NONE) 3231 return; Thus, the next time convert_to_eh_region_ranges is called, it starts with the very same numbers. The reason that this gets produced is that there is an ERT_MUST_NOT_THROW ("MUST_NOT_THROW regions prevent all exceptions from propagating. This region type is used in C++ to surround destructors being run inside a CLEANUP region.") As there are both "-1" (implies no action) and "-2" (MUST_NOT_THROW), GCN produces this output. For whatever reason, nvptx has no "-1" actions in the function; thus, after the change to "-2", there is no flip and, hence, no output is produced - avoiding the issue. → I bet that both gcn and nvptx are affected (unless by luck or when compiled with -fno-exceptions).
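The counter behaviour described above can be modelled with a few lines of C. This is a toy model under stated assumptions, not the except.cc code: convert_ranges and output_table are hypothetical stand-ins for convert_to_eh_region_ranges and output_function_exception_table, and the only state kept is a counter named after 'call_site_base'.

```c
/* Toy stand-in for the counter that label numbers are relative to.  */
static int call_site_base;

/* Model of the range conversion: number 'n_sites' call sites starting
   at call_site_base, bumping the counter while working, then restore
   the counter at the end (as the comment above describes).  */
void
convert_ranges (int n_sites, int *labels)
{
  int saved = call_site_base;
  for (int i = 0; i < n_sites; i++)
    labels[i] = call_site_base++;
  call_site_base = saved;
}

/* Model of the table output: the counter is only accumulated when a
   table is really emitted; the early return leaves it untouched.  */
void
output_table (int n_sites, int emits_table)
{
  if (!emits_table)
    return;
  call_site_base += n_sites;
}
```

If two functions are processed in a row with emits_table == 0, both calls to convert_ranges hand out the same label numbers, which is the duplicate-symbol situation from the report; with emits_table != 0 the second function starts after the first one's labels.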
[Bug middle-end/113904] [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904 --- Comment #3 from Tobias Burnus --- See comment 1 for remaining to-do items. I also note that the Fortran resolution comes too early - during parsing - as the following shows: module m implicit none contains subroutine test !$omp declare variant (foo) match(user={condition(myTrue)}) !$omp declare variant (bar) match(user={condition(myCond(1).and.myCond(2))}) logical, parameter :: myTrue = .true. end subroutine foo; end subroutine bar; end logical function myCond(i) integer :: i myCond = i < 3 end end module m This fails with the completely bogus: 5 | !$omp declare variant (foo) match(user={condition(myTrue)}) | 1 Error: property must be a constant logical expression at (1) As 'myTrue' is a scalar logical PARAMETER. The problem is just that this is not known when parsing '!$omp' - for that reason, Fortran separates parsing and resolution, which the current code does not handle as it comes way too early. * * * Otherwise: It looks as if - except for simple variable names (and probably also for function calls w/o arguments) - we want to introduce an internal aux function like: logical function __m_MOD_test_DV_cond1() result(res) res = myCond(1).and.myCond(2) end which is then called when evaluating the run-time expression. With header files and, possibly, also C++ modules, we might be able to always inline the condition - with Fortran modules probably not, such that an aux function would be needed for the generic case.
[Bug middle-end/113904] [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904 --- Comment #1 from Tobias Burnus --- Patch for rejecting non-const arguments in Fortran (wrong-code bit) to bring it in line with C/C++: https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645488.html * * * TODO as follow up: * Permit non-constant values for 'condition' and also for 'device_num' -> Middle end changes + update all front ends accordingly * For C/C++, consider rejecting nonconforming device numbers, if known at compile time, i.e. only permit positive numbers and omp_initial_device_number (= -1) and omp_invalid_device_number (GCC: -4). Cf. OpenMP Issue 3832 for the 'conforming' bit. [Current spec wording only permits 0 ... < omp_get_num_devices(), i.e. neither the host (= omp_initial_device and == omp_get_num_devices()) nor omp_invalid_device_number is permitted as explicit value; however, if absent, it is as if the trait appeared with the default-device-var ICV, which permits the discussed values.] -> If device_num(-4) (= omp_invalid_device_number), the selector can be folded to not matching. * Possible testcases for some of the features discussed here: - https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645472.html - the OpenMP 6.0 Examples' program_control/sources/dispatch.1.{c,f90}
[Bug middle-end/113906] New: [OpenMP][5.2] 'construct' context selectors lack many constructs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113906 Bug ID: 113906 Summary: [OpenMP][5.2] 'construct' context selectors lack many constructs Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, rejects-valid Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: sandra at gcc dot gnu.org Target Milestone: --- GCC only accepts those constructs that are permitted for 5.1 for the 'construct' selector. Expected: Those of OpenMP 5.2 are supported as well. OpenMP 5.1 has: The 'construct' selector set defines the _construct_ traits that should be active in the OpenMP context. The following selectors can be defined in the construct set: 'target'; 'teams'; 'parallel'; 'for' (in C/C++); 'do' (in Fortran); 'simd' and 'dispatch'. Each trait-property of the simd selector is a _trait-property-clause._ The syntax is the same as for a valid clause of the 'declare simd' directive and the restrictions on the clauses from that directive apply. The construct selector is an ordered list c1, . . . , cN. OpenMP 5.2 has [and TR12 has]: The 'construct' selector set defines the construct traits [TR12: construct trait set] that should be active in the OpenMP context. Each [trait] selector that can be defined in the 'construct' [selector] set is the directive-name of a context-matching construct. Each trait-property of the 'simd' selector is a trait-property-clause. The syntax is the same as for a valid clause of the declare simd directive and the restrictions on the clauses from that directive apply. The construct selector is an ordered list c1, . . . , cN. OpenMP TR12 also adds a helpful glossary entry: 'construct trait set' - The trait set that consists of all enclosing constructs at a given point in an OpenMP program up to a target construct.
[Bug c/113905] New: [OpenMP] Declare variant rejects variant-function re-usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113905 Bug ID: 113905 Summary: [OpenMP] Declare variant rejects variant-function re-usage Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, rejects-valid Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org, parras at gcc dot gnu.org, sandra at gcc dot gnu.org Target Milestone: --- The attached testcase works with Clang 17 and prints: Got 42 (OK) Got 99 (OK) Got 1 (OK) Got 2 (OK) Got 2 (OK) Got 1 (OK) Where foo() and bar() share the variant functions 'var1' and 'var2', which seems to be perfectly valid. In GCC it fails to compile: test.c: In function 'bar': test.c:8:36: error: 'var1' used as a variant with incompatible 'construct' selector sets 8 | #pragma omp declare variant (var1) match(construct={target}) |^ test.c:9:36: error: 'var2' used as a variant with incompatible 'construct' selector sets 9 | #pragma omp declare variant (var2) match(construct={parallel}) |^ If I only keep the 'declare variant' for 'foo', it compiles. The gimple dump shows: __attribute__((omp declare target, omp declare variant variant (parallel ))) int var1 () __attribute__((omp declare target, omp declare variant variant (target ))) int var2 () __attribute__((omp declare target, omp declare variant base (var2 construct target ), omp declare variant base (var1 construct parallel ))) int foo () I guess the problem is the 'omp declare variant variant' attribute on 'var1' and 'var2', which causes the issue I am seeing.
[Bug middle-end/113904] New: [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904 Bug ID: 113904 Summary: [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: accepts-invalid, openmp, rejects-valid, wrong-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: parras at gcc dot gnu.org, sandra at gcc dot gnu.org Target Milestone: --- There are two related problems, leading currently to either wrong-code (Fortran - alias accepts-invalid OpenMP 5.0) or rejects-valid OpenMP 5.1 (C/C++). * Fortran accepts non constant values - but the ME does not handle them. → OpenMP 5.1 feature supported for parsing but not in the ME → wrong-code * C/C++ rejects non-const values → Rejecting valid 5.1 code gfortran happily accepts non constant values - while gcc/g++ reject them test.c:22:58: error: the value of 'foo_use_var2' is not usable in a constant expression 22 | #pragma omp declare variant (var2) match(user={condition(foo_use_var2)}) While OpenMP 5.0 only permits The user selector set defines the condition selector that provides additional user-defined conditions. C: The condition(boolean-expr) selector defines a constant expression that must evaluate to true for the selector to be true. C++: The condition(boolean-expr) selector defines a constexpr expression that must evaluate to true for the selector to be true. Fortran: The condition(logical-expr) selector defines a constant expression that must evaluate to true for the selector to be true. Since OpenMP 5.1: The condition selector contains a single trait-property-expression that must evaluate to true for the selector to be true. Any non-constant expression that is evaluated to determine the suitability of a variant is evaluated according to the data state trait in the dynamic trait set of the OpenMP context. 
The user selector set is dynamic if the condition selector is present and the expression in the condition selector is not a constant expression; otherwise, it is static.
[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 --- Comment #4 from Tobias Burnus --- Also not handled: struct s { int *p; } s1; ... #pragma omp target map(s1.p[:N]) p[0] = p[N-1] = 99; Here, the pointer attachment is missing. See also PR113724 's attachment 57407 for a testcase for this and (some) other issues. TODO: - Fix the extra struct issue (→ this patch or other solution) - Fix the missing attachment issue (this comment's example) - Audit whether other changes are required.
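A self-contained variant of the snippet above, written as a sketch: the function name map_struct_member and the value of N are chosen for this example only, and the body accesses the array through 's1.p', which is presumably what the shortened snippet intends. Without -fopenmp or without an offload device the region runs on the host; per the comment above, the interesting case is real offloading, where the pointer attachment for 's1.p' is reported missing.

```c
#include <stdlib.h>

#define N 100

struct s { int *p; };

/* Map only the pointee of a struct member and modify it in a target
   region.  Returns 0 if the host sees the device's writes, 1 otherwise
   (or if the allocation fails).  */
int
map_struct_member (void)
{
  struct s s1;
  s1.p = malloc (N * sizeof (int));
  if (s1.p == NULL)
    return 1;
  for (int i = 0; i < N; i++)
    s1.p[i] = i;

  /* Only the array section is listed; the implementation is expected
     to attach the device copy of the data to s1.p.  */
  #pragma omp target map(tofrom: s1.p[:N])
  s1.p[0] = s1.p[N - 1] = 99;

  int ok = (s1.p[0] == 99 && s1.p[N - 1] == 99 && s1.p[1] == 1);
  free (s1.p);
  return ok ? 0 : 1;
}
```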
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #6 from Tobias Burnus --- Created attachment 57407 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57407&action=edit C testcase – passes with patch (except for '#if 0'ed PR113867 issues) DejaGNU-ified testcase for this PR and ('#if 0'-ed) for PR113867. Using the attached patch, it no longer gives an ICE and it works except for the '#if 0' code but it regresses for: FAIL: libgomp.c/../libgomp.c-c++-common/baseptrs-1.c (internal compiler error: Segmentation fault) FAIL: libgomp.c/../libgomp.c-c++-common/baseptrs-1.c (test for excess errors) FAIL: libgomp.c/../libgomp.c-c++-common/pr109062.c output pattern test FAIL: libgomp.c/../libgomp.c-c++-common/target-map-1.c execution test FAIL: libgomp.c/target-52.c execution test Thus, there is more work needed. TODO: - Create a fix which solves this issue without regressing - Possibly addressing the PR113867 here
[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 --- Comment #3 from Tobias Burnus --- Created attachment 57398 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57398=edit Patch - handling the libgomp issue Possible patch - lightly tested. This fixes the issue in libgomp. While always correct and possibly avoiding other corner cases (if there are any), an alternative approach is to not create those 'struct: s [len: 1]' + 'alloc:s2.p [len: 0]'. RFC: - Is there a reason why we want to have the struct in such a case? (GCC <= 13 doesn't create this struct) - Do we want to have this libgomp change even when not generating the struct as extra safety net?
[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 Tobias Burnus changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |burnus at gcc dot gnu.org Component|middle-end |libgomp --- Comment #2 from Tobias Burnus --- The problem here is in libgomp's gomp_map_vars_internal: /* Fallthrough. */ case GOMP_MAP_STRUCT: first = i + 1; last = i + sizes[i]; cur_node.host_start = (uintptr_t) hostaddrs[i]; cur_node.host_end = (uintptr_t) hostaddrs[last] + sizes[last]; if (tgt->list[first].key != NULL) continue; if (sizes[last] == 0) cur_node.host_end++; n = splay_tree_lookup (mem_map, _node); if (sizes[last] == 0) cur_node.host_end--; if (n == NULL && cur_node.host_start == cur_node.host_end) { gomp_mutex_unlock (>lock); gomp_fatal ("Struct pointer member not mapped (%p)", (void*) hostaddrs[first]); } if (n == NULL) ... field_tgt_base = (uintptr_t) hostaddrs[first]; ... field_tgt_clear = last; here: n == NULL and cur_node.host_end - cur_node.host_start = 8 [i.e. sizeof(void*)?!]: For i=1, there is no action to be taken due to the GOMP_MAP_ZERO_LEN_ARRAY_SECTION. And for i=2, if (field_tgt_clear != FIELD_TGT_EMPTY) { k->tgt_offset = k->host_start - field_tgt_base + field_tgt_offset; Here, k->tgt_offset = hostaddr of the struct but we are no longer mapping a struct here. - Clearly, resetting was forgotten ...
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #5 from Tobias Burnus --- The runtime issue is now PR113867.
[Bug middle-end/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 --- Comment #1 from Tobias Burnus --- Created attachment 57382 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57382&action=edit Fortran testcase, kind of, as pointer + pointee mapping cannot be split (working) For completeness, a Fortran testcase. (This testcase works on GCC 13 and mainline.) As in Fortran, 'map(ptr, dt%ptr)' will always attempt to map the pointer and the pointee, it is not possible to split map(s2.p[:N]) // map the pointee map(s2.p, s2.p[:0]) // map the pointer and try pointer attachment as in C/C++. And using 'map(s.p)' will prevent a later inner 'map(s.a,...)' as 's' is already partially mapped. Hence, the aux 'ptr' is used, but that kind of defeats the purpose of this testcase.
[Bug middle-end/113867] New: [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 Bug ID: 113867 Summary: [14 Regression][OpenMP] Wrong code with mapping pointers in structs Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Created attachment 57381 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57381&action=edit Testcase, compile with 'gcc -fopenmp' and run with an offload device Split off from PR113724, which is mainly about an ICE. The attached program works with GCC 13 but fails with mainline. (Requires actual offloading; tried here with nvptx.) Probably due to Julian's mapping patches. With mainline, it fails for 'g()' when executing omp target data map(tofrom: s2.p[:100]) (i.e. GOMP_target_data_ext → gomp_copy_host2dev → gomp_device_copy) with libgomp: cuMemGetAddressRange_v2 error: named symbol not found libgomp: Copying of host object [0x118c500..0x118c690) to dev object [0x7f7e721cae00..0x7f7e721caf90) failed It works using a separate target enter/exit data (i.e. for 'f()'). The mainline dump shows: map(struct:s2 [len: 1]) map(alloc:s2.p [len: 0]) map(tofrom:*_2 [len: 400]) map(attach:s2.p [bias: 0]) I somehow hadn't expected the map(struct:s2 [len: 1]) map(alloc:s2.p [len: 0]) which might or might not be the issue. As it works with 'f()' (i.e. enter/exit data), it might be a red herring (or not).
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #4 from Tobias Burnus --- Created attachment 57377 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57377&action=edit Fixes the ICE – might paper over a real issue; doesn't fix the run-time issue → TODO + 'data'-issue in PR comment 4 The patch fixes the issue #pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b) This gets split into the groups (reverse order!) 'S1.b' (i = 0), 'S1.a' (i = 1), 'S1.p' (i = 3) and 'S1.p[:N]' (i = 4; map + attach). In omp_build_struct_sibling_lists the collecting and reordering happens but for 'S1.p' there should be an 'alloc' not a 'tofrom' - thus, 'S1.p' has grp->deleted set and the attach code will add the 'alloc' code. Until i = 2, everything is fine: *(*group[2])->deleted == true *(*group[2])->grp_start == (*group[2])->grp_end and this encloses 'map(tofrom:S1.p [len: 8])' – which should be removed in favor of a later (i = 3) added 'map(alloc:S1.p [len: 8])'. In principle, everything looks fine, until i = 3 calls omp_accumulate_sibling_list, which in turn calls: continue_at = cl ? omp_siblist_move_concat_nodes_after (cl, tail_chain, grp_start_p, grp_end, sc) : omp_siblist_move_nodes_after (grp_start_p, grp_end, sc); where 'cl' != NULL_TREE. After the call, 'tail_chain' alias 'list_p' looks fine - except for the trailing 'map(tofrom:S1.p [len: 8])'. In principle, 'groups' is no longer touched - except for the 'grp->deleted' handling, which fails (deletes the wrong stuff) because grp_begin points to the wrong tree. Solution: Do the OMP_CLAUSE_DECL nullifying earlier such that messing around with groups won't cause issues. TODO: We should really find out WHY i=2's grp_begin gets updated. If it happens just for previously processed grp items, that's fine - but what will happen if it also affects a still to be processed item? - If that indeed happens, everything will be messed up again!
* * * The testcase shows another issue: target data map(to: S2.p[:N]) gets mapped as: map(struct:S2 [len: 1]) map(alloc:S2.p [len: 0]) map(tofrom:*_14 [len: 400]) map(attach:S2.p [bias: 0]) before: map(tofrom:*_14 [len: 400]) map(attach:S2.p [bias: 0]) The problem of the former is of course that 'S' is already partially mapped and that an alloc of length 0 will then fail already in 'target data' for the attach:S2.p as 0 bytes aren't sufficient for a pointer attachment. This applies both to target data and target, except that for 'target', 'S' might appear implicitly - while for 'data' it can only appear explicitly or not at all.
[Bug fortran/113840] New: [OpenACC] !$acc loop seq – bogus rejection of Fortran's EXIT/CYCLE + C/C++ break/continue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113840 Bug ID: 113840 Summary: [OpenACC] !$acc loop seq – bogus rejection of Fortran's EXIT/CYCLE + C/C++ break/continue Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openacc, rejects-valid Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: tschwinge at gcc dot gnu.org Target Milestone: --- OpenACC seems to permit EXIT and CYCLE in "!$ACC LOOP" if there is the SEQ clause. The following quote is from OpenACC 3.2 but it can be also found in 2.5 a bit less explicit and between the lines also for 1.0 and 2.0: "2.9 Loop Construct" → "Restrictions" "A loop associated with a loop construct that does not have a seq clause must be written to meet all of the following conditions: – The loop variable must be of integer, C/C++ pointer, or C++ random-access iterator type. – The loop variable must monotonically increase or decrease in the direction of its termination condition. – The loop trip count must be computable in constant time when entering the loop construct." Currently, it fails with: test.f90:4:6: 4 | EXIT | 1 Error: EXIT statement at (1) terminating !$ACC LOOP loop or test.c:5:7: error: break statement used with OpenMP for loop 5 | break; | ^ * * * Testcases: !$acc parallel !$acc loop seq do i=1, 5 EXIT end do !$acc end parallel end void f() { #pragma acc parallel #pragma acc loop seq for (int i=1; i < 5; i++) break; } * * * It seems as if the loop conditions are also relaxed, which needs to be handled / supported. (Not folding to OMP_FOR internally – or still? If not: at least PRIVATE needs to be handled and the SEQ be honored.) * * * Real-world testcase: https://gitlab.dkrz.de/icon/icon-model/-/blob/release-2024.01-public/src/diagnostics/mo_tropopause.f90?ref_type=heads#L200-L213
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #3 from Tobias Burnus --- Inside omp_build_struct_sibling_lists, the following assignment: 11654 grp->grp_start = new_next; has on the LHS the [3] array with value: (gdb) p *grp $147 = {grp_start = 0x771f9688, grp_end = 0x771f9630, mark = UNVISITED, deleted = true, reprocess_struct = false, fragile = false, sibling = 0x0, next = 0x0} while (gdb) p new_next $146 = (tree *) 0x771f96d0 which causes the alias issue we are seeing. Before the assignment: (gdb) p debug(*(tree*)0x771f9688) map(tofrom:S1.b) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) (gdb) p debug(*(tree*)0x771f96d0) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #2 from Tobias Burnus --- Inside: omp_build_struct_sibling_lists new_next = omp_accumulate_sibling_list (region_type, code, struct_map_to_clause, *grpmap, grp_start_p, grp_end, addr_tokens, , _p, grp->reprocess_struct, _tail); This processing looks okay. But: /* Delete groups marked for deletion above. At this point the order of the groups may no longer correspond to the order of the underlying list, which complicates this a little. First clear out OMP_CLAUSE_DECL for deleted nodes... */ FOR_EACH_VEC_ELT (*groups, i, grp) if (grp->deleted) for (tree d = *grp->grp_start; d != OMP_CLAUSE_CHAIN (grp->grp_end); d = OMP_CLAUSE_CHAIN (d)) OMP_CLAUSE_DECL (d) = NULL_TREE; Where we have the following 4 elements. Note that grp_start is identical for [1] and [2] – where [2] is deleted = true – which causes that the CLAUSE_DECL are NULL. Namely, 'p (*groups)[i]' for i = 0...3 gives: $86 = (omp_mapping_group &) @0x30f7a48: {grp_start = 0x76c92070, grp_end = 0x771f96c0, mark = UNVISITED, deleted = false, reprocess_struct = false, fragile = false, sibling = 0x0, next = 0x0} $91 = (omp_mapping_group &) @0x30f7a70: {grp_start = 0x771f96d0, grp_end = 0x771f9678, mark = UNVISITED, deleted = false, reprocess_struct = false, fragile = false, sibling = 0x0, next = 0x0} $92 = (omp_mapping_group &) @0x30f7a98: {grp_start = 0x771f96d0, grp_end = 0x771f9630, mark = UNVISITED, deleted = true, reprocess_struct = false, fragile = false, sibling = 0x0, next = 0x0} $93 = (omp_mapping_group &) @0x30f7ac0: {grp_start = 0x771f9640, grp_end = 0x771f9708, mark = UNVISITED, deleted = false, reprocess_struct = false, fragile = false, sibling = 0x0, next = 0x0} Where the '*grp_start' values of [0],[1]+[2], [3] are: map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) 
map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) (gdb) p debug(*(tree*)0x771f9640) And 'grp_end' for [0]...[4] is: map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) map(tofrom:S1.p) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) BEFORE that deleted loop, the result is: (gdb) p debug(*list_p) map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p) which looks fine. Obviously, after the deleted, all entries after 'alloc:S.p' have CLAUSE_DECL == NULL_TREE, causing the fail. * * * RFC: * Why are there two 'grp' with the same *grp_start value? * Why does it get set to 'deleted' while its clauses are actually used?
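The situation described in comment #2 (two groups sharing the same grp_start value, one of them marked as deleted) can be modelled outside of GCC. The following is only a minimal C sketch under stated assumptions, not GCC code: 'struct group' and 'find_deleted_alias' are hypothetical names, and grp_start is modelled as a plain address value instead of a 'tree *'. It merely illustrates a consistency check that would flag group [2] before the deletion loop clears OMP_CLAUSE_DECL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's omp_mapping_group; only the two fields
   that matter for the report are modelled.  */
struct group
{
  unsigned long grp_start;  /* address value as printed by gdb */
  bool deleted;
};

/* Return the index of the first deleted group whose grp_start is shared
   with a group that is not deleted, or -1 if there is no such group.  */
int
find_deleted_alias (const struct group *groups, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (!groups[i].deleted)
        continue;
      for (size_t j = 0; j < n; j++)
        if (j != i
            && !groups[j].deleted
            && groups[j].grp_start == groups[i].grp_start)
          return (int) i;
    }
  return -1;
}
```

With the four grp_start values from the gdb output above (0x76c92070, 0x771f96d0, 0x771f96d0, 0x771f9640) and only the third group marked as deleted, the function reports index 2, i.e. the deleted group that aliases the live group [1].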
[Bug tree-optimization/113731] [14 regression] ICE when building libbsd since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113731 --- Comment #10 from Tobias Burnus --- (In reply to Tamar Christina from comment #9) > (In reply to Matthias Klose from comment #8) > > the proposed patch doesn't fix the amdgcn-amdhsa bootstrap. > > So what is the error with the patch? The output can't be the same as the > function was removed. For what it is worth, Jakub's patch (based on Tamar's patch email ), https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645062.html , works here, i.e. I can build GCC with the current in-tree newlib.
[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 --- Comment #1 from Tobias Burnus --- Debugging shows: In gimplify_adjust_omp_clauses (line numbers are off by 1 as I have a #pragma GCC optimize("O0") on top of the file): 13717 groups = omp_gather_mapping_groups (list_p); ... 13720 if (groups) 13721 { 13722 grpmap = omp_index_mapping_groups (groups); 13723 13724 omp_resolve_clause_dependencies (code, groups, grpmap); 13725 omp_build_struct_sibling_lists (code, ctx->region_type, groups, 13726 , list_p); On the outermost side: (gdb) p debug(*list_p) num_teams(-2) thread_limit(0) (gdb) p debug(*list_p) map(tofrom:S1.b) map(tofrom:S1.a) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) The latter goes into the 'if (groups)' and list_p after line 13726 is: (gdb) p debug(*list_p) map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b) The later ICE / segfault is because 'map(struct' has len = 3 but only two map clauses follow. And the other question is: Why is 'S1.p' gone? * * * The 'struct' (GOMP_MAP_STRUCT) with initial length is created by omp_accumulate_sibling_list: 11120 OMP_CLAUSE_SET_MAP_KIND (l, str_kind); 11121 OMP_CLAUSE_DECL (l) = unshare_expr (base); 11122 OMP_CLAUSE_SIZE (l) = size_int (1); and later updated via 11462 OMP_CLAUSE_SIZE (*osc) 11463 = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc), size_one_node); 11464 11465 if (reprocessing_struct) * * * This works fine for: map(tofrom:S1.b) -> create struct with len=1 It works also for: map(tofrom:S1.a) -> update struct len to 2 + add 'S1.a' But not for: map(tofrom:S1.p) -> does update len to 3 but doesn't add 'S1.p'. I do note that at: 11382 if (attach_detach && sc == grp_start_p) (gdb) p attach_detach $139 = true (gdb) p sc == grp_start_p $140 = false (gdb) p debug(*sc) map(tofrom:S1.b) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0]) (gdb) p debug(*grp_start_p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])
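The invariant that is violated in comment #1 ('map(struct' has len = 3 but only two map clauses follow) can be written down as a small check. This is a hedged C sketch, not GCC code: 'enum map_kind', 'struct clause' and 'struct_len_is_consistent' are hypothetical names, and the 'chain' field merely models OMP_CLAUSE_CHAIN.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a map clause list.  */
enum map_kind { MAP_STRUCT, MAP_TOFROM, MAP_ALLOC, MAP_ATTACH_DETACH };

struct clause
{
  enum map_kind kind;
  size_t len;            /* only meaningful for MAP_STRUCT */
  struct clause *chain;  /* models OMP_CLAUSE_CHAIN */
};

/* Return true if HEAD is a MAP_STRUCT clause that is followed by at
   least HEAD->len further clauses.  */
bool
struct_len_is_consistent (const struct clause *head)
{
  if (head == NULL || head->kind != MAP_STRUCT)
    return false;

  size_t following = 0;
  for (const struct clause *c = head->chain; c != NULL; c = c->chain)
    following++;

  return following >= head->len;
}
```

For the list shown after line 13726, i.e. a struct clause with len 3 followed only by S1.a and S1.b, the check returns false; once a third clause for S1.p is chained in, it returns true.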
[Bug middle-end/113771] New: [14 Regression][GCN] ICE during GIMPLE pass: vect in vect_transform_loop tree-vect-loop.cc:11969
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113771 Bug ID: 113771 Summary: [14 Regression][GCN] ICE during GIMPLE pass: vect in vect_transform_loop tree-vect-loop.cc:11969 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: tnfchris at gcc dot gnu.org Target Milestone: --- Target: gcn Still to be debugged further. It did work last week, i.e. it is a very recent regression. In-tree build of Newlib in the amdgcn-amdhsa build of GCC fails with -O2 (-O1 is okay): during GIMPLE pass: vect In file included from /home/tob/repos/gcc/newlib/libc/string/memset.c:29: /home/tob/repos/gcc/newlib/libc/include/string.h: In function 'memset': /home/tob/repos/gcc/newlib/libc/include/string.h:33:10: internal compiler error: Segmentation fault 33 | void * memset (void *, int, size_t); | ^~ 0x102617f crash_signal /home/tob/repos/gcc/gcc/toplev.cc:317 0x7efe08c4123f ??? /usr/src/debug/glibc-2.39/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 0x12d28fc gsi_prev(gimple_stmt_iterator*) /home/tob/repos/gcc/gcc/gimple-iterator.h:236 0x12d28fc move_early_exit_stmts /home/tob/repos/gcc/gcc/tree-vect-loop.cc:11804 0x12d28fc vect_transform_loop(_loop_vec_info*, gimple*) /home/tob/repos/gcc/gcc/tree-vect-loop.cc:11969 0x1314321 vect_transform_loops /home/tob/repos/gcc/gcc/tree-vectorizer.cc:1006 0x131492c try_vectorize_loop_1 /home/tob/repos/gcc/gcc/tree-vectorizer.cc:1152 0x131492c try_vectorize_loop * * * In the debugger: Program received signal SIGSEGV, Segmentation fault. 
0x012d28fc in gsi_prev (i=0x7fffc1a0) at /home/tob/repos/gcc/gcc/gimple-iterator.h:236 236 gimple *prev = i->ptr->prev; (gdb) p i->ptr $1 = (gimple_seq_node) 0x0 11802 gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt); 11803 gsi_move_before (_gsi, _gsi); 11804 gsi_prev (_gsi); (gdb) p debug_gimple_stmt(stmt) # .MEM_49 = VDEF <.MEM_81> *s_72 = _1; (gdb) p debug_gimple_stmt(*dest_gsi->seq) # .MEM_49 = VDEF <.MEM_81> *s_72 = _1; $11 = void (gdb) p debug_gimple_stmt(*stmt_gsi->seq) # DEBUG BEGIN_STMT (gdb) p debug_gimple_seq(stmt_gsi->ptr) # DEBUG s => s_48 # DEBUG n => n_46 # DEBUG BEGIN_STMT s.3_2 = (long unsigned int) s_48; _3 = s.3_2 & 7; if (_3 != 0) $14 = void (gdb) p debug_gimple_seq(dest_gsi->ptr) (gdb) p debug_bb(stmt_gsi->bb) [local count: 862990464]: # DEBUG BEGIN_STMT s_48 = s_72 + 1; # DEBUG s => s_48 _1 = (char) c_22(D); # DEBUG s => s_48 # DEBUG n => n_46 # DEBUG BEGIN_STMT s.3_2 = (long unsigned int) s_48; _3 = s.3_2 & 7; if (_3 != 0) goto ; [94.50%] else goto ; [5.50%] $16 = void (gdb) p debug_bb(dest_gsi->bb) [local count: 815525989]: *s_72 = _1; goto ; [100.00%]
[Bug target/113721] [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail when calling 'free' in the target regsion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113721 Tobias Burnus changed: What|Removed |Added Resolution|--- |WORKSFORME Status|UNCONFIRMED |RESOLVED --- Comment #1 from Tobias Burnus --- Hmm, while I did see it on two systems, bisecting failed (always passing) and now it passes with both the quicker build and the full build. I did not complete the bisecting as I did not see interesting commits in between; hence, I cannot rule out that there was an in-between issue very recently. Nor do I rule out some odd build issue. Oddly, I had the very same issue on two systems. Anyway, closing as WORKSFORME.
[Bug middle-end/113724] New: [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724 Bug ID: 113724 Summary: [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code, openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- Created attachment 57295 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57295=edit compile with: gcc -fopenmp target_struct_map.4 The attached testcase from the OpenMP examples document now ICEs in the new omp_gather_mapping_groups_1 function. * * * target_struct_map.4.c: In function ‘main’: target_struct_map.4.c:46:11: internal compiler error: Segmentation fault 46 | #pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b) | ^~~ 0x1045382 crash_signal /home/tburnus/repos/gcc/gcc/toplev.cc:317 0x7fc380e4251f ??? ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 0xcf8263 tree_check(tree_node*, char const*, int, char const*, tree_code) /home/tburnus/repos/gcc/gcc/tree.h:3611 0xcf8263 omp_gather_mapping_groups_1 /home/tburnus/repos/gcc/gcc/gimplify.cc:9583 0xd0bc32 omp_gather_mapping_groups /home/tburnus/repos/gcc/gcc/gimplify.cc:9610 0xd0bc32 gimplify_adjust_omp_clauses /home/tburnus/repos/gcc/gcc/gimplify.cc:13733 0xd23d27 gimplify_omp_workshare
[Bug target/113721] New: [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail when calling 'free' in the target regsion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113721 Bug ID: 113721 Summary: [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail when calling 'free' in the target regsion Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: tschwinge at gcc dot gnu.org Target Milestone: --- Target: nvptx Created attachment 57294 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57294=edit Compile with -fopenmp + run with nvptx offloading I have not fully debugged this, but OpenMP Example Document example devices/sources/target_ptr_map.1.c fails now. It works if one comments the 'free(ptr3)' line but with it in, it fails with: libgomp: cuCtxSynchronize error: unspecified launch failure (perhaps abort was called) libgomp: cuMemFree_v2 error: unspecified launch failure * * * * Fails with today's GCC where nvptx is configured with --with-arch=sm_80. * Fails also with -foffload=nvptx-none=-march=sm_30 * Works with AMD GPU offload * Works using the GCC 13 distro compiler * Works using the GCC 13 distro compiler and LD_LIBRARY_PATH set to the mainline compile.
[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615 --- Comment #4 from Tobias Burnus --- Patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644181.html It fixes this issue, but I still see two other kinds of issues for gfx1100.
[Bug libgomp/110813] [OpenMP] omp_target_memcpy_rect (+ strided 'target update'): Improve GCN performance and contiguous subranges
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110813 --- Comment #4 from Tobias Burnus --- The GCN-specific part has been applied to GCC 14 mainline in commit: https://gcc.gnu.org/g:a17299c17afeb92a56ef716d2d6380c8538493c4 Unhandled: * Strided and optimized strided copy (incl. the generic part of the linked comment 3, which still needs to be committed); the former is "[PATCH 0/5] OpenMP: Array-shaping operator and strided/rectangular 'target update' support", https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629422.html * Also consider using a library function for *inter*-device copy if the device type or the function pointer is the same.
[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615 Tobias Burnus changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #2 from Tobias Burnus --- > I'm seeing a lot of ICEs like this when running libgomp testsuite with > offloading for gfx1030. I wonder why Andrew S didn't see them (unless he did?). However, I did get a similar/the same ICE for the testcase in PR113645. I have not checked whether anything below applies to this PR as well, but as Andrew P has marked it as a duplicate ... * * * Regarding PR113645: While I have no real idea about GCC backend handling, the following SEEMS TO FIX THE ISSUE, i.e. the ICE of the testcase with -O3 for gfx1030 and gfx1100. Hence, as a possible patch: --- a/gcc/config/gcn/gcn-valu.md +++ b/gcc/config/gcn/gcn-valu.md @@ -4273,9 +4273,10 @@ (define_expand "fold_left_plus_" [(match_operand: 0 "register_operand") (match_operand: 1 "gcn_alu_operand") (match_operand:V_FP 2 "gcn_alu_operand")] - "can_create_pseudo_p () + "!TARGET_RDNA2_PLUS + && can_create_pseudo_p () && (flag_openacc || flag_openmp || flag_associative_math)" { rtx dest = operands[0];
[Bug target/113645] New: [amdgcn][gfx1030][gfx1100] ICE in RTL pass: vregs with -O3: unrecognizable insn (vector reductions)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113645 Bug ID: 113645 Summary: [amdgcn][gfx1030][gfx1100] ICE in RTL pass: vregs with -O3: unrecognizable insn (vector reductions) Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: ams at gcc dot gnu.org Target Milestone: --- Target: amdgcn-amdhsa Created attachment 57247 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57247=edit Testcase, compile with gcc -fopenmp -O3 -foffload-options=-march=gfx1030 or gfx1100 Found for BabelStream when compiling with AMD GPU offloading for gfx1030 or gfx1100 and using -O3. # git clone https://github.com/UoB-HPC/BabelStream # cd BabelStream; mkdir build; cd build # cmake .. -DMODEL=omp -DCMAKE_CXX_COMPILER=$HOME/projects/gcc-trunk-offload/bin/g++ -DOFFLOAD=ON -DCXX_EXTRA_FLAGS=-foffload=amdgcn-amdhsa=-march=gfx1100 -fopenmp # make * * * Simplified testcase attached, compile with: gcc -fopenmp -foffload=amdgcn-amdhsa \ -foffload-options=amdgcn-amdhsa=-march=gfx1030 -O3 or, likewise, gfx1100. 
* * * foo.c:5:9: error: unrecognizable insn: 5 | #pragma omp target teams distribute parallel for simd map(tofrom: sum) reduction(+:sum) | ^ (insn 144 143 145 10 (set (reg:V16SF 926) (unspec:V16SF [ (reg:V16SF 922) repeated x2 (const_int 1 [0x1]) ] UNSPEC_PLUS_DPP_SHR)) "foo.c":5:9 -1 (nil)) during RTL pass: vregs foo.c:5:9: internal compiler error: in extract_insn, at recog.cc:2812 0x7f6b21 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /home/tob/repos/gcc/gcc/rtl-error.cc:108 0x7f6b3d _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /home/tob/repos/gcc/gcc/rtl-error.cc:116 0x7f55e4 extract_insn(rtx_insn*) /home/tob/repos/gcc/gcc/recog.cc:2812 0xb6b860 instantiate_virtual_regs_in_insn /home/tob/repos/gcc/gcc/function.cc:1611 0xb6b860 instantiate_virtual_regs /home/tob/repos/gcc/gcc/function.cc:1994 0xb6b860 execute /home/tob/repos/gcc/gcc/function.cc:2041
[Bug libgomp/113513] [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513 --- Comment #2 from Tobias Burnus --- Patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643648.html
[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Tobias Burnus --- FIXED on mainline/GCC 14.
[Bug libgomp/113513] [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513 --- Comment #1 from Tobias Burnus --- Looking at the called GOMP_OFFLOAD_* function, in the failing case, there is: ... DEBUG GOMP_OFFLOAD_run DEBUG GOMP_OFFLOAD_dev2host DEBUG GOMP_OFFLOAD_free DEBUG: nvptx_attach_host_thread_to_device - 0 and in the successful case: DEBUG GOMP_OFFLOAD_fini_device 0 <<< called before unregister DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_FINALIZED DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_FINALIZED and then - in the failing case: DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_INITIALIZED DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_INITIALIZED DEBUG: gomp_unload_image_from_device DEBUG GOMP_OFFLOAD_unload_image, 0, 196609 DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED DEBUG GOMP_OFFLOAD_fini_device 0 DEBUG: nvptx_attach_host_thread_to_device - 0 libgomp: cuCtxGetDevice error: unknown cuda error Thus, for some reason, GOMP_OFFLOAD_fini_device then GOMP_offload_unregister_ver is swapped when OMP_DISPLAY_ENV=true and OMP_TARGET_OFFLOAD="mandatory" are set - but not otherwise. 
The call to gomp_target_fini comes from: if (atexit (gomp_target_fini) != 0) gomp_fatal ("atexit failed"); While the call to GOMP_offload_unregister_ver comes from mkoffload: fprintf (out, "static __attribute__((destructor)) void fini (void)\n" "{\n" " GOMP_offload_unregister_ver (%#x, __OFFLOAD_TABLE__," " %d/*NVIDIA_PTX*/, _data);\n" "};\n", * * * Actually, the same problem occurs when compiled with: -foffload=disable With that flag + no 'mandatory': DEBUG GOMP_OFFLOAD_version DEBUG GOMP_OFFLOAD_get_caps DEBUG GOMP_OFFLOAD_get_num_devices 0 DEBUG GOMP_OFFLOAD_get_name DEBUG GOMP_OFFLOAD_get_type DEBUG GOMP_OFFLOAD_init_device 0 DEBUG: nvptx_open_device - 0 DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED DEBUG GOMP_OFFLOAD_fini_device 0 DEBUG: nvptx_attach_host_thread_to_device - 0 And with 'mandatory' + OMP_DISPLAY_ENV=verbose: DEBUG GOMP_OFFLOAD_version DEBUG GOMP_OFFLOAD_get_caps DEBUG GOMP_OFFLOAD_get_num_devices 0 DEBUG GOMP_OFFLOAD_get_name DEBUG GOMP_OFFLOAD_get_type < omp_display_env output> DEBUG GOMP_OFFLOAD_init_device 0 DEBUG: nvptx_open_device - 0 libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device cannot be used for offloading DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED DEBUG GOMP_OFFLOAD_fini_device 0 DEBUG: nvptx_attach_host_thread_to_device - libgomp: cuCtxGetDevice error: unknown cuda error libgomp: device finalization failed Thus, the error message is the same – but here no offloading code exists and just gomp_target_fini is called. - However, there is a prior call to 'gomp_fatal', which probably messes things up for the plugin handling - while in the original case, we have valid code. * * * If there is no offloading code but OMP_DISPLAY_ENV=verbose OMP_TARGET_OFFLOAD="mandatory" is used, it works: DEBUG GOMP_OFFLOAD_version DEBUG GOMP_OFFLOAD_get_caps DEBUG GOMP_OFFLOAD_get_num_devices 0 DEBUG GOMP_OFFLOAD_get_name DEBUG GOMP_OFFLOAD_get_type OPENMP DISPLAY ENVIRONMENT BEGIN ... 
OPENMP DISPLAY ENVIRONMENT END DEBUG: gomp_target_fini; dev=0, state=0 * * * If there is only one or none of the two env vars, there is no need to search for devices - and, hence, the nvptx plugin is not called at all and it, obviously, works as well.
[Bug libgomp/113513] New: [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513 Bug ID: 113513 Summary: [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org Target Milestone: --- When using both OMP_DISPLAY_ENV=true and OMP_TARGET_OFFLOAD="mandatory", the device has to be initialized early, as OMP_DEFAULT_DEVICE (either 0 or -4 = omp_invalid_device) needs to be known before printing the ICVs. On my system, this causes the error: libgomp: cuCtxGetDevice error: unknown cuda error. That's with "CUDA Version: 12.3" and "NVIDIA RTX A1000 6GB" with --with-arch=sm_80. I am somewhat sure that I have manually tested it before; our tester wasn't able to remotely set the env vars, hence, I don't know whether it did work there or not - nor whether it is a regression or depends on CUDA, sm_xx, my card or ...
[Bug middle-end/113439] New: [OpenMP] Add more collapse testcases mixing precisions, in particular (unsigned) int vs. _BigInt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113439 Bug ID: 113439 Summary: [OpenMP] Add more collapse testcases mixing precisions, in particular (unsigned) int vs. _BigInt Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- Follow up to PR113409 and its testcase testsuite/libgomp.c/bitint-1.c This is only about adding a testcase. OpenMP states: "The iterations of some number of outer associated loops can be collapsed into one larger logical iteration space that is the collapsed iteration space. The particular integer type used to compute the iteration count for the collapsed loop is implementation defined, but its bit precision must be at least that of the widest type that the implementation would use for the iteration count of each loop if it was the only associated loop." Thus, when collapsing two loops with an 'int' and 'long' loop variable, the iteration-count variable must be (at least) long. It would be good to ensure that this works fine also when mixing (signed/unsigned) int, long, long long, int128_t with _BigInt in either order (int, _BigInt and _BigInt, int).
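The requirement quoted above can be illustrated for the 'int' plus 'long' case without any bit-precise types. The following is a minimal C sketch under stated assumptions: 'collapsed_trip_count' and 'count_collapsed_iterations' are hypothetical helper names, both loops start at 0 with step 1, and 'long long' is used because it is at least as wide as 'long'. It is not the testcase requested in the report, which would additionally mix in the bit-precise type (spelled _BigInt in the report; the referenced testcase is bitint-1.c).

```c
#include <assert.h>

/* Trip count of the collapsed iteration space of
     for (int i = 0; i < ni; i++)
       for (long j = 0; j < nj; j++)
   computed in 'long long', which is at least as wide as either loop
   variable type.  The product may still overflow for very large inputs.  */
long long
collapsed_trip_count (int ni, long nj)
{
  assert (ni >= 0 && nj >= 0);
  return (long long) ni * (long long) nj;
}

/* Count the iterations by actually running the collapsed loop nest.
   Without -fopenmp the pragma is ignored and the loops run serially.  */
long long
count_collapsed_iterations (int ni, long nj)
{
  long long count = 0;
#pragma omp parallel for collapse(2) reduction(+:count)
  for (int i = 0; i < ni; i++)
    for (long j = 0; j < nj; j++)
      count++;
  return count;
}
```

For example, collapsed_trip_count (100000, 100000L) yields 10000000000, which no longer fits into a 32-bit 'int'; this is why the combined iteration count needs the wider type.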
[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436 --- Comment #1 from Tobias Burnus --- BTW: The attached testcase lacks 'firstprivate', which obviously needs to be handled as well.
[Bug middle-end/113436] New: [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436 Bug ID: 113436 Summary: [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Created attachment 57112 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57112=edit C/C++ testcase, compile with -fopenmp The following code fails with an ABORT for the alignment check in the target region as there is no 'omp_alloc' added for the privatized variables (private/firstprivate). It works in the parallel region. See testcase. * * * For dynamic allocators, it depends on the WIP patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637415.html but that should be a rather independent issue. Found while working on PR c++/110347 – the patch being created for that PR contains an '#if 0' testcase, which shall be enabled once this PR is fixed.
[Bug fortran/108382] [12/13/14 Regression] Incorrect parsing when acc and omp coexist and -fopenmp -fopenacc is used.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108382 --- Comment #2 from Tobias Burnus --- Fixed-form Fortran likewise fails for: !$acc enter !$acc& data !$omp flush !$omp& RELEASE end ! fails in this line: "Bad continuation line"
[Bug target/113288] [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288 --- Comment #4 from Tobias Burnus --- (In reply to Haochen Jiang from comment #3) > Adding them are quite straightforward. I guess so. Note: this PR is about the #define in gcc/config/i386, only. > But I am not quite sure how the whole > libgomp patch works. OpenMP has selectors which permit choosing different functions or OpenMP directives. Several can be evaluated at compile time, some only at run time. Example (syntax probably not completely right): ... match(implementation={arch(x86_64),isa(sse4)}) Here, it can be evaluated at compile time, which is done via the function TARGET_OMP_DEVICE_KIND_ARCH_ISA For some, run-time checks are more useful and I am also not sure whether something like cpuid would make more sense here (in general and especially for the run-time selector). But that's a separate issue to this PR. > Is the patch attempt to check whether it is a perfect match for each ISA > detected from a hardware? If that is the case, we need them to be added. > BTW, under this scenario, no need to add an if clause for macro __EVEX512__ > and __EVEX256__ in that patch since those two are not true ISAs. Something like that. It is also more for completeness and consistency. For OpenMP we just state: https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html which is rather generic for i386/x86_64. We could do less, but the target hook made it easy to support all of them... I don't think anyone will use avx10.1-256 as an isa context selector with OpenMP.
[Bug target/113288] New: [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288 Bug ID: 113288 Summary: [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: haochen.jiang at intel dot com, hongyuw at gcc dot gnu.org Target Milestone: --- Target: i386,x86_64 As noted in https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642025.html there is no #define for -mavx10.1-256 and -mavx10.1-512. By contrast, there is one for, e.g., __AVX10_512BIT__ and "avx10-max-512bit"; __AVX10_1__ and "avx10.1"; __AMX_FP16__ and -mamx-fp16; etc.
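What user code can currently test is thus the generic macro only. The following is a minimal C sketch; 'have_avx10_1' is a hypothetical helper name, and only the macro __AVX10_1__ mentioned in the report is used. Whether that macro is predefined depends on the compiler version and the -m flags, so no particular result is implied.

```c
#include <stdbool.h>

/* Return true if the compiler predefines __AVX10_1__ for this
   translation unit.  As described in the report, no separate macro
   exists that would distinguish -mavx10.1-256 from -mavx10.1-512.  */
bool
have_avx10_1 (void)
{
#ifdef __AVX10_1__
  return true;
#else
  return false;
#endif
}
```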
[Bug libgomp/113216] New: [OpenMP] Improve omp_target_is_accessible
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113216 Bug ID: 113216 Summary: [OpenMP] Improve omp_target_is_accessible Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- For omp_target_is_accessible + unified address: * it is not 100% correct to assume that any address is accessible on the host, it might be a device-only address (For handling NULL, see also PR 113213) Likewise for the nonhost device: * memory might be accessible even if only unified address * * * The host case is a bit trickier as no generic documentation seems to be available albeit some ranges like 0x7F.. seem to denote device addresses + a superset has to be formed. * * * For the device side, an API function might be available to check for it. * * In case of nvptx / CUDA, the following function seems to be suitable: CUresult cuMemGetAccess ( unsigned long long* flags, const CUmemLocation* location, CUdeviceptr ptr ) checking for flags == CU_MEM_ACCESS_FLAGS_PROT_READWRITE, if I understand it correctly. typedef enum CUmemAccess_flags_enum { CU_MEM_ACCESS_FLAGS_PROT_NONE = 0x0, /**< Default, make the address range not accessible */ CU_MEM_ACCESS_FLAGS_PROT_READ = 0x1, /**< Make the address range read accessible */ CU_MEM_ACCESS_FLAGS_PROT_READWRITE = 0x3, /**< Make the address range read-write accessible */ CU_MEM_ACCESS_FLAGS_PROT_MAX = 0x7FFF } CUmemAccess_flags; * * In case of HSA/ROCm, I bet there is also some function. For instance, hipPointerGetAttribute{,s} + hipDrvPointerGetAttributes permit to query some pointer data.
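The flag comparison suggested above for the nvptx case can be sketched independently of CUDA. This is a hedged C sketch, not libgomp code: the enumerators are local copies of the CUmemAccess_flags values quoted in the report (real code would take them from the CUDA header and obtain 'flags' from cuMemGetAccess), and 'flags_allow_readwrite' is a hypothetical helper name.

```c
#include <stdbool.h>

/* Local copies of the values quoted in the report; the SKETCH_ names
   are hypothetical and only avoid a dependency on the CUDA headers.  */
enum sketch_mem_access_flags
{
  SKETCH_PROT_NONE = 0x0,      /* address range not accessible */
  SKETCH_PROT_READ = 0x1,      /* address range read accessible */
  SKETCH_PROT_READWRITE = 0x3  /* address range read-write accessible */
};

/* FLAGS models the value returned through the first argument of
   cuMemGetAccess.  Return true only for read-write access.  */
bool
flags_allow_readwrite (unsigned long long flags)
{
  return flags == SKETCH_PROT_READWRITE;
}
```

Whether read-only access should also count as "accessible" for omp_target_is_accessible is left open here, as it is in the report.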
[Bug libgomp/113213] New: [OpenMP] Update omp_target_is_present / omp_target_is_accessible handling for NULL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113213 Bug ID: 113213 Summary: [OpenMP] Update omp_target_is_present / omp_target_is_accessible handling for NULL Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Update omp_target_is_present / omp_target_is_accessible handling for NULL * Including documentation The non-public issue https://github.com/OpenMP/spec/issues/3287 is about to clarify that the result is false. This has also implications for device == initial device.
[Bug middle-end/113199] New: [14 Regression][GCN] ICE (segfault) when compiling Newlib
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113199 Bug ID: 113199 Summary: [14 Regression][GCN] ICE (segfault) when compiling Newlib Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-invalid-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: ams at gcc dot gnu.org, tnfchris at gcc dot gnu.org Target Milestone: --- Created attachment 56974 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56974=edit Reduced testcase This is with mainline - with the patch for PR113163 applied, i.e. https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641555.html Compiling Newlib's libc/time/wcsftime.c fails as follows. Here for the reduced testcase, compiling it with: gcc -O2 input7.i during GIMPLE pass: vect input7.i: In function '__strftime': input7.i:14:1: internal compiler error: Segmentation fault 14 | __strftime (wchar_t *s, size_t maxsize, const wchar_t *format, | ^~ 0x11dcbff crash_signal src/gcc-mainline/gcc/toplev.cc:316 0x7f0481ea008f ??? 
/build/glibc-wuryBv/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 0x1222566 contains_struct_check(tree_node*, tree_node_structure_enum, char const*, int, char const*) src/gcc-mainline/gcc/tree.h:3757 0x1222566 verify_gimple_assign_ternary src/gcc-mainline/gcc/tree-cfg.cc:4334 0x1227c8e verify_gimple_in_cfg(function*, bool, bool) src/gcc-mainline/gcc/tree-cfg.cc:5602 0x10cde64 execute_function_todo src/gcc-mainline/gcc/passes.cc:2088 0x10ce3ab execute_todo src/gcc-mainline/gcc/passes.cc:2142 * * * verify_gimple_assign_ternary (stmt=0x77bdb060) at /net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-cfg.cc:4334 4334 tree rhs3_type = TREE_TYPE (rhs3); (gdb) p rhs3 $1 = (tree) 0x0 (gdb) up #1 0x01227c8f in verify_gimple_in_cfg (fn=0x77ba4228, verify_nothrow=verify_nothrow@entry=true, ice=ice@entry=true) at /net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-cfg.cc:5602 5602 err2 |= verify_gimple_stmt (stmt); (gdb) p stmt $2 = (gimple *) 0x77bdb060 (gdb) p debug_gimple_stmt(stmt) loop_mask_46 = VEC_PERM_EXPR ;
[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 --- Comment #6 from Tobias Burnus --- and for that condition, we have: 3375 if (!integer_onep (*step_vector)) (gdb) p debug_tree(*step_vector) constant 8>
[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 --- Comment #5 from Tobias Burnus --- While higher at the call stack: #3 0x0148714f in vect_transform_loop (loop_vinfo=loop_vinfo@entry=0x350f2a0, loop_vectorized_call=loop_vectorized_call@entry=0x0) at src/gcc-mainline/gcc/tree-vect-loop.cc:11911 11911 epilogue = vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, (gdb) p debug_tree(niters) constant 6> One level down: #2 0x01498154 in vect_do_peeling (loop_vinfo=loop_vinfo@entry=0x350f2a0, niters=, niters@entry=0x77bb2030, nitersm1=nitersm1@entry=0x77bb2c78, niters_vector=niters_vector@entry=0x7fffda60, step_vector=step_vector@entry=0x7fffda68, niters_vector_mult_vf_var=niters_vector_mult_vf_var@entry=0x7fffda70, th=, check_profitability=, niters_no_overflow=, advance=) at src/gcc-mainline/gcc/tree-vect-loop-manip.cc:3399 3399 vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf, where niters_vector_mult_vf is the ssa_name that fails in the assert. The variable seems to be generated a few lines up in the same function (line 3375 and following): if (!integer_onep (*step_vector)) { /* On exit from the loop we will have an easy way of calcalating NITERS_VECTOR / STEP * STEP. Install a dummy definition until then. */ niters_vector_mult_vf = make_ssa_name (TREE_TYPE (*niters_vector)); SSA_NAME_DEF_STMT (niters_vector_mult_vf) = gimple_build_nop (); *niters_vector_mult_vf_var = niters_vector_mult_vf; } else vect_gen_vector_loop_niters_mult_vf (loop_vinfo, *niters_vector, &niters_vector_mult_vf);
[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 --- Comment #2 from Tobias Burnus --- Created attachment 56958 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56958&action=edit Reduced testcase ( $ amdgcn-amdhsa-gcc -g -O2 inp5.i -march=gfx900 during GIMPLE pass: vect inp5.i: In function '_l64a_r': inp5.i:4:1: internal compiler error: in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420 4 | _l64a_r (struct _reent *rptr, | ^~~ 0x94cc17 vect_peel_nonlinear_iv_init(gimple**, tree_node*, tree_node*, tree_node*, vect_induction_op_type) /net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-vect-loop.cc:9420 0x148bb04 vect_update_ivs_after_vectorizer src/gcc-mainline/gcc/tree-vect-loop-manip.cc:2267 0x1498153 vect_do_peeling(_loop_vec_info*, tree_node*, tree_node*, tree_node**, tree_node**, tree_node**, int, bool, bool, tree_node**) src/gcc-mainline/gcc/tree-vect-loop-manip.cc:3399 0x148714e vect_transform_loop(_loop_vec_info*, gimple*) src/gcc-mainline/gcc/tree-vect-loop.cc:11911 0x14c9544 vect_transform_loops src/gcc-mainline/gcc/tree-vectorizer.cc:1006 0x14c9bc3 try_vectorize_loop_1 src/gcc-mainline/gcc/tree-vectorizer.cc:1152 0x14c9bc3 try_vectorize_loop src/gcc-mainline/gcc/tree-vectorizer.cc:1182 0x14ca224 execute src/gcc-mainline/gcc/tree-vectorizer.cc:1298 * * * Breakpoint 1, vect_peel_nonlinear_iv_init (stmts=0x7fffd698, init_expr=0x77a56948, skip_niters=0x77bb8798, step_expr=0x77a64ba0, induction_type=vect_step_op_shr) at src/gcc-mainline/gcc/tree-vect-loop.cc:9420 9420 gcc_assert (TREE_CODE (skip_niters) == INTEGER_CST); (gdb) p debug_tree(skip_niters) unit-size align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x77b5e3f0 precision:32 min max > def_stmt GIMPLE_NOP version:54> $2 = void
[Bug middle-end/113163] New: [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 Bug ID: 113163 Summary: [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: tamar.christina at arm dot com Target Milestone: --- Target: amdgcn-amdhsa The ICE happens when building Newlib with the GCN compiler: during GIMPLE pass: vect In file included from src/accel_newlib-mainline/newlib/libc/stdlib/l64a.c:24: src/accel_newlib-mainline/newlib/libc/include/stdlib.h: In function 'l64a': src/accel_newlib-mainline/newlib/libc/include/stdlib.h:195:9: internal compiler error: in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420 195 | char * l64a (long __input); | ^~~~
[Bug middle-end/113067] [OpenMP][5.1] Context selector - handle 'implementation={requires(...)}'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113067 --- Comment #1 from Tobias Burnus --- Created attachment 56901 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56901&action=edit Simple testcase (C and Fortran) - as same-directory .diff
[Bug middle-end/113067] New: [OpenMP][5.1] Context selector - handle 'implementation={requires(...)}'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113067 Bug ID: 113067 Summary: [OpenMP][5.1] Context selector - handle 'implementation={requires(...)}' Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- OpenMP 5.1 added: 'implementation={requires(...)}' where ... = unified_shared_memory or unified_address etc. OpenMP 5.0 only had, e.g. 'implementation={unified_shared_memory}'. The former (the 5.1 'requires(...)' syntax) is not yet handled. * * * With the about to be committed patch, https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640817.html which is actually at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639797.html the Fortran parser in principle handles it (when removing the 'sorry') and adds 'unified_shared_memory' and 'requires' according to -fdump-tree-*. For C/C++, it does ICE - which means that more work is required. And, in either case, it depends on how we want to handle it in the internal representation. => Attached parse-only testcase. * * * Independent of this, I am not sure whether we do handle this requirement correctly. Namely, for: (A) 'implementation={unified_shared_memory}' i.e. those which change depending on 'omp requires unified_shared_memory' being set or not. (B) 'implementation={dynamic_allocators}' which is currently ignored rather early as it is always true for GCC. (C) 'implementation={atomic_default_mem_order(acq_rel)}' The latter is quite interesting as - at least in Fortran - multiple values are permitted per file (to be checked) and I am not quite sure whether the value is really handled in the ME.
[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639 --- Comment #5 from Tobias Burnus --- Posted a patch for (A) https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639947.html but it seems as if I might have misunderstood some parts of the example at OpenMP spec issue #1796 (TRAC864) / OpenMP Pull Req. #912 Thus, this needs to be rechecked. - It might be that the current state of mainline is just fine, that some parts of this patch still make sense, or that more issues exist.
[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639 --- Comment #4 from Tobias Burnus --- There are *two* independent issues: (A) Predefined firstprivate does not find mappings done in the same directive, e.g. int a[100]; int *p = &a[0]; #pragma omp target teams distribute map(a) p[0] = 5; (B) The base pointer is not stored, hence, the following fails: int a[100]; int *p = &a[0]; // #pragma omp target enter data map(a[10:]) /* same result */ #pragma omp target teams distribute map(a[10:]) p[15] = 5; Here, map(a[10:]) /* or: map(a[start:n]) */ gives: map(tofrom:a[start] [len: _7]) map(firstprivate:a [pointer assign, bias: D.2943]) But then the base pointer is gone. Thus, any later lookup of an address that falls between the base pointer and the first mapped storage location is not found.
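Issue (B) can be illustrated with a toy range lookup. This is a sketch only and does not reflect libgomp's actual data structures: struct mapped_range and both lookup functions are hypothetical names invented for the illustration. It models 'int a[100]' mapped as map(a[10:]) and a lookup of p = &a[0], which lies between the base pointer and the first mapped element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mapped host range [start, end).  */
struct mapped_range {
  uintptr_t start;
  uintptr_t end;
  uintptr_t base;   /* base pointer of the array; 0 if not recorded */
};

/* Lookup that only knows the mapped storage: it misses addresses that
   lie between the base pointer and the first mapped element.  */
static const struct mapped_range *
lookup_storage_only (const struct mapped_range *r, size_t n, uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    if (addr >= r[i].start && addr < r[i].end)
      return &r[i];
  return NULL;
}

/* Lookup that also accepts addresses starting at the recorded base.  */
static const struct mapped_range *
lookup_with_base (const struct mapped_range *r, size_t n, uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    {
      uintptr_t lo = r[i].base ? r[i].base : r[i].start;
      if (addr >= lo && addr < r[i].end)
        return &r[i];
    }
  return NULL;
}

/* Model of map(a[10:]) for 'int a[100]' with p = &a[0].  */
static void
base_pointer_example (void)
{
  int a[100];
  struct mapped_range r
    = { (uintptr_t) &a[10], (uintptr_t) (a + 100), (uintptr_t) a };
  uintptr_t p = (uintptr_t) &a[0];
  assert (lookup_storage_only (&r, 1, p) == NULL);  /* the failing case */
  assert (lookup_with_base (&r, 1, p) == &r);       /* base was kept */
}
```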
[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639 --- Comment #3 from Tobias Burnus --- Created attachment 56804 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56804&action=edit gimplify.cc patch to ensure that GOVD_MAP_0LEN_ARRAY comes last (does not fix the issue) I tried the attached patch to see whether it fixes the problem. It doesn't, as the pointer-lookup-for-attachment seems to happen in an earlier 'for' loop than the 'for' loop that does the actual mapping for clauses on the same 'target' directive (→ gomp_map_vars_internal). Thus, either this patch is not required - or it is only required in addition; in any case, it seems as if libgomp/target.c's gomp_map_vars_internal needs to be modified.
[Bug middle-end/112779] New: [OpenMP] Support omp Metadirectives
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112779 Bug ID: 112779 Summary: [OpenMP] Support omp Metadirectives Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- There is rather complete support for metadirectives on the OG13 branch, i.e. devel/omp/gcc-13 branch. Several patches have already been posted. * * * Known issues with those patches: (A) OpenMP 5.2 renamed the clause 'default'; it is now (also) 'otherwise' (B) --- ICE (segfault) for "kind: nohost" in default() og12-offload/testlogs-2023-05-04 shows several 'internal compiler error: Segmentation fault' for the 'default' clause of the metadirectives → sollve_vv's test_metadirective_target_device{,_kind{,_any}}.c testcases. The problem for one test case at least is in omp-general.cc's omp_dynamic_cond : 'kind_sel' = {purpose = "kind", value = "{ purpose: "nohost", value: NULL}" } - and accessing TREE_VALUE (TREE_VALUE (kind_sel)). That's for the following code: tree kind_sel = omp_get_context_selector (ctx, "target_device", "kind"); if (kind_sel) { const char *str = (TREE_VALUE (TREE_VALUE (kind_sel)) ? TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (kind_sel))) : IDENTIFIER_POINTER (TREE_PURPOSE (TREE_VALUE (kind_sel)))); kind = build_string_literal (strlen (str) + 1, str); } I wonder why that's not already handled in gcc/omp-general.cc's omp_context_selector_matches (which has some code) – but it might indeed only be available at run time?!? (C) -- wonder whether libgomp/target.c's GOMP_evaluate_target_device lacks a check for kind == "nohost"; I only see "host" (for the host) and "gpu" (for the GPU) and the generic "any". (D) - Fortran's DO with do-end-label{} Fortran permits loops with label instead of a simple END DO, example: DO 123 i=1,5 DO 123 j=1,5 123 CONTINUE ! 
or '123 END DO' — Note the *shared* end-do-label (which invalid/deleted since F2018, before deprecated but valid) Such code is not handled with metadirectives as already indicated at gcc/fortran/parse.cc's parse_omp_metadirective_body: case_omp_do: st = parse_omp_do (clause->stmt); /* TODO: Does st == ST_IMPLIED_ENDDO need special handling? */ break; The answer seems to be yes — failing testcase is the following ("Error: END DO statement expected"): implicit none integer :: i, j, psi(5,5)!$omp metadirective & !$omp&when(user={condition(.false. )}: target teams & !$omp& distribute parallel do simd collapse(2)) & !$omp&when(user={condition(.false. )}: target teams & !$omp& distribute parallel do) & !$omp&default(target teams loop collapse(2)) DO 50 I=1,5 !$omp metadirective & !$omp& when(user={condition(.false. )}: simd) DO 51 J=1,5 PSI(j,i) = j 51 CONTINUE 50 CONTINUE end (E) --- internal compiler error: in c_parser_omp_metadirective, at c/c-parser.cc:26565 11 | #pragma omp metadirective when (user = { condition (USE_GPU == 1) } : target enter data map(alloc : number[ : SIZE])) for: #include #include int main(int argc, char ** argv){ const int SIZE = 10; int USE_GPU = 1; double number[SIZE]; double *number_d; #pragma omp metadirective when (user = { condition (USE_GPU == 1) } : target enter data map(alloc : number[ : SIZE])) if (USE_GPU) number_d = (double *)omp_get_mapped_ptr(number, omp_get_default_device()); else number_d = number; printf("number_d = %pnumber= %p\n", number_d, number); return 0; } (F) For C/C++, begin/end metadirective is not handled - it is for Fortran, where it is much more useful. Note: It is less useful that it sounds. From an internal bug tracker: This was a deliberate design decision. From the OpenMP 5.0 spec (2.3.4): "The begin metadirective directive behaves identically to the metadirective directive, except that the directive syntax for the specified directive variants must accept a paired end directive." 
so having 'target enter data' in a 'begin metadirective' is invalid. The only OpenMP directive supported in GCC that takes an end directive in C/C++ is 'declare target' (is this still true?), and we have already said that we would not support declarative constructs in metadirectives. So the 'begin/end metadirective' support was left out in C/C++. (Invalid) testcase: #pragma omp begin metadirective when (user = { condition
[Bug middle-end/112763] New: [OpenMP] ICE in gimplify_adjust_omp_clauses, at gimplify.cc:13238 – with defaultmap(firstprivate) for C++ member variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112763 Bug ID: 112763 Summary: [OpenMP] ICE in gimplify_adjust_omp_clauses, at gimplify.cc:13238 – with defaultmap(firstprivate) for C++ member variables Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: --- Created attachment 56718 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56718&action=edit Testcase; compile with 'g++ -fopenmp -DDEFAULT_FIRSTPRIVATE' for this PR or without -D for PR110347 The following code uses a C++ member variable and 'defaultmap(firstprivate)' causes an ICE. I am not quite sure what the result should be but an ICE is surely wrong. Vaguely related to OpenMP spec issue #3343, strongly related is my email today to the omp-lang@ spec email list. * * * In member function ‘void myClass::tgt()’: 28:14: internal compiler error: in gimplify_adjust_omp_clauses, at gimplify.cc:13238 28 | #pragma omp target defaultmap(firstprivate) private(d) if (0) | ^~~ 0x880609 gimplify_adjust_omp_clauses ../../repos/gcc-trunk-commit/gcc/gimplify.cc:13238 0x10094b4 gimplify_omp_workshare ../../repos/gcc-trunk-commit/gcc/gimplify.cc:15783 Compile the attached testcase with: g++ -fopenmp -DDEFAULT_FIRSTPRIVATE Note: it compiles with -DNON_MEMBER (and fails at runtime, which might be correct). See also PR 110347 for the case of unset DEFAULT_FIRSTPRIVATE, i.e. using firstprivate(member_var) instead of default{,map}(firstprivate)
[Bug middle-end/112667] New: [OpenMP] C++: Handle static local variable in target regions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112667 Bug ID: 112667 Summary: [OpenMP] C++: Handle static local variable in target regions Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: openmp, rejects-valid Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org Target Milestone: --- Cross ref, for initialization of static global C++ variables, see: [PATCH] OpenMP: Constructors and destructors for "declare target" static aggregates https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618340.html * * * Based on Thomas' email to https://mailman.openmp.org/mailman/private/omp-lang/2023/018626.html Assume 'declare target' as needed: struct S { S() { } ~S() { } }; static void f() { static S s; } int main() { #pragma omp target { f(); } } This fails in GCC as: "error: variable ‘_ZGVZL1fvE1s’ has been referenced in offloaded code but hasn’t been marked to be included in the offloaded code" where "c++filt _ZGVZL1fvE1s" prints: "guard variable for f()::s" * * * Regarding the validity, Tom replied: It is valid OpenMP offload code, we have specific language to handle the initialization as well actually. We don't explicitly state that the initialization should be protected, but it falls through from the base language rules. For cases where the static has a separate corresponding instance on the device, I would expect it to have its own guard variable on the device (probably _each_ device actually) as well. For cases with unified shared memory, or generally where the static is using the same storage as the original, I would expect the guard to use node scope rather than device scope. The challenging question is whether the offload device is allowed to actually do the initialization in this case (or language says each device initializes its own instance, but this is control flow dependent now). 
In an ideal world it would probably be something like a reverse offload that does it for the unified case, but I'm pretty sure following the requirements through would say "whichever thread on whichever device gets there first" is the one that does it. [And continued:] [...]we have some language about how initialization happens. I can try to dig it out if you like but essentially it boils down to this: * static lifetime variables at global/class scope are initialized before code runs in the same TU, by each device which has a separate instance * at function scope initialized when first encountered * * * Jakub remarked: I believe we should in the omp_discover_* sub-pass handle with a help of a langhook automatically mark the guard variables (possibly iff the guarded variable is marked?), or e.g. rtti info (_ZTS*, _ZTI*) and eventually figure out what we should do about virtual tables (_ZTV*). The last case is most complicated, as it contains function pointers, and we need to figure out if we mark all methods, or say replace some pointers in the virtual table with NULLs or something that errors or terminates if it isn't marked. And sure, __cxa_guard_* would need to be implemented in the offloading libsupc++.a or libstdc++.a. * * * Side remark regarding virtual tables: OpenMP since 5.2 has in "13.8 target Construct" [287:10-12]: "[C++] Invoking a virtual member function of an object on a device other than the device on which the object was constructed results in unspecified behavior, unless the object is accessible and was constructed on the host device." Thanks to OpenMP 5.1's 'indirect' we already have a means to lookup on the device the function pointers on the host. (Implicitly assumes unified_address, which is the case of all offload devices in GCC.)
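As background for the "guard variable for f()::s" mentioned above, the following is a rough C model of what a function-local static with a constructor amounts to. It is a single-threaded sketch under stated assumptions: the names S_construct, ctor_calls and guard are made up for the illustration, and the real C++ runtime protects the check with __cxa_guard_acquire / __cxa_guard_release rather than a plain flag. On an offload device with its own instance of 's', a separate copy of such a guard would be needed as well, which is what the error message is about.

```c
#include <assert.h>
#include <stdbool.h>

static int ctor_calls;   /* counts how often the "constructor" ran */

struct S { int initialized; };

/* Stand-in for the C++ constructor S::S().  */
static void
S_construct (struct S *s)
{
  s->initialized = 1;
  ctor_calls++;
}

/* C model of 'static S s;' inside f(): 'guard' plays the role of the
   compiler-generated guard variable.  Not thread-safe; illustration
   only.  */
static struct S *
f (void)
{
  static bool guard;
  static struct S s;
  if (!guard)
    {
      S_construct (&s);
      guard = true;
    }
  return &s;
}

/* Usage example: the constructor runs once, however often f is called.  */
static void
guard_example (void)
{
  f ();
  f ();
  assert (ctor_calls == 1);
}
```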
[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639 --- Comment #2 from Tobias Burnus --- > If 'a' is already present on the device (e.g. 'omp target enter data > map(a)'), it works. This applies both to the comment 0 example, where only a section of 'a' is mapped (start > 0), and to the comment 1 example, where the whole of 'a' is mapped. It also works fine if 'p' points inside 'a'. * * * As spec ref: TR12 states in "14.8 target Construct" [379:8-10]: "[C/C++] If a list item in a map clause has a base pointer that is predetermined firstprivate (see Section 6.1.1) and on entry to the target region the list item is mapped, the firstprivate pointer is updated via corresponding base pointer initialization." OpenMP 5.1 has the C/C++-only section "2.21.7.2 Pointer Initialization for Device Data Environments", which is too long to be quoted. [The TR12 wording 'on entry to the target region' makes it clear that effectively ordering needs to happen. The 5.1 wording is a bit unclear on whether the data can be mapped with that very target construct or whether the storage needs to be present before the target directive. However, the examples in OpenMP issue #1796 imply that 5.1 also permits mapping the data and the pointer on the same directive.] * * * The implicit handling of the 'p' in this example happens in gimplify.cc's gimplify_adjust_omp_clauses_1 for 'else if (code == OMP_CLAUSE_MAP && (flags & GOVD_MAP_0LEN_ARRAY) != 0)'.
[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639 --- Comment #1 from Tobias Burnus --- Testing shows that the offsets are correctly handled but that there is an ordering problem. Example: int main () { int a[100] = {}; int *p = &a[0]; uintptr_t iptr; #pragma omp target map(a, iptr) iptr = (uintptr_t) p; This will fail - as the implicitly added 'firstprivate' arrives too early at GOMP_target_ext - before 'a' is mapped: map(alloc:MEM[(char *)p] [len: 0]) map(firstprivate:p [pointer assign, bias: 0]) map(tofrom:iptr [len: 8]) map(tofrom:a [len: 400]) If 'a' is already present on the device (e.g. 'omp target enter data map(a)'), it works. Solution: The implicitly mapped C/C++ pointer variable 'p' must be added at the end of the clauses.
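The ordering argument in this comment can be modelled with a small toy. It is only a sketch of the reasoning, assuming clauses are handled strictly one after another; it does not model libgomp's gomp_map_vars_internal, and enum clause_kind, struct clause and process_in_order are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy clause kinds; the names are illustrative, not libgomp's.  */
enum clause_kind { MAP_DATA, ATTACH_POINTER };

struct clause {
  enum clause_kind kind;
  int var;   /* id of the mapped variable, or of the pointer's target */
};

/* Process clauses strictly in order.  A pointer can only be translated
   if its target was mapped by an earlier clause or was already present.
   Returns true if every pointer could be translated.  */
static bool
process_in_order (const struct clause *c, size_t n,
                  bool *present, size_t nvars)
{
  bool ok = true;
  for (size_t i = 0; i < n; i++)
    {
      assert (c[i].var >= 0 && (size_t) c[i].var < nvars);
      if (c[i].kind == MAP_DATA)
        present[c[i].var] = true;
      else if (!present[c[i].var])
        ok = false;   /* lookup happened before the data was mapped */
    }
  return ok;
}
```

With the pointer clause first and variable 0 ('a') not yet present, the function returns false; with the pointer clause last, or with 'a' present beforehand, it returns true.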
[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347 --- Comment #2 from Tobias Burnus --- An explicit 'firstprivate(x)' will be turned in the compiler from a FIELD_DECL to: int D.2935 [value-expr: ((struct t *) this)->x]; #pragma omp target firstprivate(D.2934) firstprivate(D.2935) { (void) (D.2935 = 5) in semantics.cc's omp_privatize_field, called by finish_omp_clauses for handle_field_decl. The gimple dump then looks like: int x [value-expr: ((struct t *) this)->x]; #pragma omp target ... firstprivate(x) map(alloc:MEM[(char *)this] [len: 0]) map(firstprivate:this [pointer assign, bias: 0]) { this->x = 5; i.e. there is already a pointless 'this' mapping. For 'private', we could do in omp_privatize_field a simple: v = build_decl (input_location, VAR_DECL, DECL_NAME (t), TREE_TYPE (t)); but for 'firstprivate' that would miss the initialization - and adding a pointless assignment is not really the best, especially not for larger objects (like structs, arrays, reference types). * * * And for 'defaultmap(firstprivate)', the current code already adds 'this' mapping in the original dump: #pragma omp target map(tofrom:*(struct t *) this [len: 44]) map(firstprivate:(struct t *) this [pointer assign, bias: 0]) defaultmap(firstprivate:all) { { (void) (((struct t *) this)->x = 5); That's due to 'finish_omp_target_clauses_r' + data->this_expr_accessed = true; it either needs to be suppressed here - or later in the ME removed again. The former has the problem that 'defaultmap(firstprivate)' is not handled here, the latter means a special case - ensuring that it is only removed if all member accesses are for firstprivatized members. While 'private' does not exist as defaultmap, compiler-internal handling or (not checked) predefined/implicit mapping might.
[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634 --- Comment #5 from Tobias Burnus --- @Kostadin: Sebastian posted a patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637451.html that should be fine as workaround, even if it is not completely correct, cf. https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637543.html
[Bug fortran/110415] (Re)allocation on assignment to allocatable polymorphic variable from allocatable polymorphic function result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110415 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org --- Comment #3 from Tobias Burnus --- Andrew Jenner's submitted patch (gcc-patches@ only): https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636671.html and (fortran@ only): https://gcc.gnu.org/pipermail/fortran/2023-November/059928.html (Replies should go to both lists ...) * * * Technically it is a regression caused by https://gcc.gnu.org/r13-6747-gd7caf313525a46f200d7f5db1ba893f853774aee but before that commit there was no finalization. Comparing the versions: GCC 7+8: ICE in build_function_decl GCC 10+11+12: memory leak in 'func' GCC 13+mainline: segfault at runtime (at 'a = func()' in the main program). * * * I had analyzed the issue elsewhere; let's copy it here for completeness and possibly to aid the patch review. (Note: The following was written before the patch was written and analyzes the status at that time.) ---_vptr; // save old value of vptr D.4328 = func (); // new value desc.0.data = (void * restrict) D.4328._data; // As scalar, there is not really a problem, but an //desc.0.dtype.elem_len = D.4328->_vptr->size; // is missing here. desc.0.span = (integer(kind=8)) desc.0.dtype.elem_len; if (__builtin_expect ((integer(kind=8)) (a->_data == 0B), 0, 42)) a->_data = (struct p *) __builtin_malloc (MAX_EXPR <(unsigned long) a->_vptr->_size, 1>); // WRONG: That should use D.4328->_vptr->size! else { if (a->_vptr != D.4349) { __builtin_realloc ((void *) a->_data, a->_vptr->_size); Likewise: a->_vptr should be D.4328->_vptr. Alternatively, a->_vptr had to be updated before the 'if' block.
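The point of the analysis, namely that the (re)allocation size has to come from the dynamic type of the function result and not from the old type of 'a', can be sketched in C. This is a toy model and not gfortran's generated code: struct vtab, struct class_obj and assign_poly are hypothetical names, and the sketch assumes that 'a' and 'rhs' are distinct objects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for a vtable and a polymorphic class container.  */
struct vtab { size_t size; };
struct class_obj { void *data; const struct vtab *vptr; };

/* Assign rhs to *a.  The allocation size is taken from the dynamic type
   of rhs, not from the old type of a.  Returns 0 on success and -1 on
   allocation failure.  */
static int
assign_poly (struct class_obj *a, const struct class_obj *rhs)
{
  assert (a && rhs && rhs->vptr);
  size_t n = rhs->vptr->size ? rhs->vptr->size : 1;
  void *p = realloc (a->data, n);   /* realloc (NULL, n) acts as malloc */
  if (!p)
    return -1;
  if (rhs->vptr->size)
    memcpy (p, rhs->data, rhs->vptr->size);
  a->data = p;
  a->vptr = rhs->vptr;   /* update the vptr together with the data */
  return 0;
}
```

Using a->vptr->size in the realloc call instead would under-allocate whenever the result's dynamic type is larger than the old type of 'a', which corresponds to the spots marked as wrong in the dump above.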
[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634 --- Comment #3 from Tobias Burnus --- Breakpoint 6, gen_assign_counter_update (gsi=0x7fffcab0, call=0x77230b48, func=0x7736cb00, result=0x77200b98, name=0x258f5f2 "PROF_time_profile") at ../../../repos/gcc/gcc/tree-profile.cc:247 (gdb) p debug_gimple_stmt(call) __atomic_add_fetch_8 (&__gcov_time_profiler_counter, 1, 0); (gdb) p debug_tree(result) BLK (gdb) p debug_tree(func) 32 + ? BUILT_IN_ATOMIC_ADD_FETCH_8: + BUILT_IN_ATOMIC_ADD_FETCH_4); + gcall *call = gimple_build_call (f, 3, addr, one, relaxed); + gen_assign_counter_update (gsi, call, f, result, name); with the new gen_assign_counter_update: + if (result) +{ + tree result_type = TREE_TYPE (TREE_TYPE (func)); + tree tmp = make_temp_ssa_name (result_type, NULL, name); + gimple_set_lhs (call, tmp); + gsi_insert_after (gsi, call, GSI_NEW_STMT); + gassign *assign = gimple_build_assign (result, tmp); + gsi_insert_after (gsi, assign, GSI_NEW_STMT); * * * Thus, it looks as if f's alias func's 'result_type' is unsigned while the rest is all signed.
[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634 --- Comment #2 from Tobias Burnus --- That's r14-5578-ga350a74d6113e3
[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634 Tobias Burnus changed: What|Removed |Added CC||sebastian.huber@embedded-br ||ains.de --- Comment #1 from Tobias Burnus --- Bisecting points to: commit a350a74d6113e3a84943266eb691275951c109d9 (HEAD) Author: Sebastian Huber Date: Sat Oct 21 15:52:15 2023 +0200 gcov: Add gen_counter_update()