[Bug analyzer/115203] [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ

2024-05-23 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115203

--- Comment #2 from Tobias Burnus  ---
Indeed, the suggestion was not to disable the translations in general. A
similar issue shows up when running the testsuite. There it is solved by
"setenv LANG C" and "setenv LANG C.ASCII" – while various scripts also use
LANG=C.

Thus, one way would be to have LANG=C set somewhere (e.g. via the Makefile -
assuming it can be done portably).

The alternative would be your suggestion to disable it in
simple_diagnostic_path. Looking at gcc/intl.cc's gcc_init_libintl, you also
need to watch out for open_quote/close_quote and other fun changes as they
might come before you switch to LANG=C.

[Bug analyzer/115203] New: [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_S

2024-05-23 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115203

Bug ID: 115203
   Summary: [15 Regression] Build fail with non LANG=C in analyzer
self test: ICE in fail_formatted at selftest.cc:63 /
tree-diagnostic-path.cc:2158: test_control_flow_5:
FAIL: ASSERT_STREQ
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: build, ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

For some testing, I happened to build with LANG=de_DE.UTF-8
and that was also set when building GCC itself.

That works fine until the analyzer self test - as the strings don't match:

.../build-gcc-trunk-fast/./gcc/xgcc
-B/home/tob/projects/build-gcc-trunk-fast/./gcc/  -xc++ -nostdinc /dev/null -S
-o /dev/null -fself-test=.../gcc/testsuite/selftests
.../gcc/tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ
("  events 1-5\n" "FILENAME:1:6:\n" "1 |   if ((arr = (struct foo **)malloc
[...]
5 ||if ((arr[i] = (struct foo *)malloc(sizeof(struct foo))) == NULL) {
  ||~   ~~
  |||   |
  |+--->(4) ...to here  (5) wurde hier deklariert



cc1plus: interner Compiler-Fehler: in fail_formatted, bei selftest.cc:63

0x22af256 selftest::fail_formatted(selftest::location const&, char const*, ...)
../../../repos/gcc/gcc/selftest.cc:63
0x22af301 selftest::assert_streq(selftest::location const&, char const*, char
const*, char const*, char const*)
../../../repos/gcc/gcc/selftest.cc:92
0x25b6cd6 selftest::fail_formatted(selftest::location const&, char const*, ...)
../../../repos/gcc/gcc/selftest.cc:63
0x25b6d81 selftest::assert_streq(selftest::location const&, char const*, char
const*, char const*, char const*)
../../../repos/gcc/gcc/selftest.cc:92
0x10a7b42 test_control_flow_5
../../../repos/gcc/gcc/tree-diagnostic-path.cc:2158
0x10aabe6 control_flow_tests
../../../repos/gcc/gcc/tree-diagnostic-path.cc:2292
0x13a5512 test_control_flow_5
../../../repos/gcc/gcc/tree-diagnostic-path.cc:2158
0x13a85b6 control_flow_tests
../../../repos/gcc/gcc/tree-diagnostic-path.cc:2292

[Bug fortran/115151] New: procedure(acos) [,pointer] :: p - is wrongly rejected

2024-05-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115151

Bug ID: 115151
   Summary: procedure(acos) [,pointer] :: p  - is wrongly rejected
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

It looks as if the following is wrongly rejected:

procedure(acos) [,pointer] :: p

While the examples below are with 'pointer', I don't see why it shouldn't be
also valid without 'pointer', but I have not verified it.

But I have to admit that I also have not fully understood C1218 – as I don't
see how one could ever specify something else - it either is by construction an
external procedure or it clashes name wise.

Only:
  procedure(acos)[,pointer] :: acos
does not make sense as the former pulls in an 'intrinsic :: acos'.

* * *

Found at https://github.com/klausler/fortran-wringer-tests/ which has several
corner case testcases.

Testcase: acos-iface.f90
-
! Intel, NVF, NAG, f18: works
! GNU, XLF: errors about elemental or non-external procedure as interface
intrinsic :: acos
procedure(acos), pointer :: p
p => acos
print *, p(1.)
end
-

gfortran:
   Error: Procedure pointer 'p' at (1) shall not be elemental

GCC's source code refers for this to F2008's ("12.4.3.6 Procedure declaration
statement"):

"C1218 (R1211) If a proc-interface describes an elemental procedure, each
procedure-entity-name shall specify an external procedure."

where

R1211 procedure-declaration-stmt is PROCEDURE ( [ proc-interface ] )
[ [ , proc-attr-spec ] ... :: ] proc-decl-list

R1214 proc-decl is procedure-entity-name [ => proc-pointer-init ]

* * *

Furthermore, for procedure pointers, there is:

C1220 (R1217) The procedure-name shall be the name of a nonelemental external
or module procedure, or a specific intrinsic function listed in 13.6 and not
marked with a bullet (•).

where:
"R1216 proc-pointer-init is null-init
 or initial-proc-target
 R1217 initial-proc-target is procedure-name"

'acos' has no bullet – and 13.6 has:

"Note that a specific function that is marked with a bullet (•) is not
permitted to be used as an actual argument (12.5.1, C1220), as a target in a
procedure pointer assignment statement (7.2.2.2, C729), or as the interface in
a procedure declaration statement (12.4.3.6, C1216)."

The latter implies that 'acos' is permitted on 'p => acos'.

* * *

Thus, the question is still whether C1218 makes sense for 'p', i.e. does it
apply to 'p' and (if so) is this valid for proc pointers or not?

"C855 A named procedure with the POINTER attribute shall have the EXTERNAL
attribute."  is in any case required for proc-pointers as well.

However, that's given by:
"A procedure declaration statement declares procedure pointers, dummy
procedures, and external procedures. It specifies the EXTERNAL attribute
(8.5.9) for all entities in the proc-decl-list."

Which means - unsurprisingly - that the following is invalid:

module m
contains
 subroutine msub; end;
end

use m
intrinsic :: acos
pointer :: msub, intsub, acos   ! INVALID
contains
subroutine intsub; end
end

[Bug fortran/115150] [12/13/14/15 Regression] SHAPE of zero-sized array yields a negative value

2024-05-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115150

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.5
                 CC|                            |sandra at gcc dot gnu.org

--- Comment #1 from Tobias Burnus  ---
Looking at the dump, GCC 11 has:
  _gfortran_shape_4 (, D.3966);
while GCC 12 has:
  x[S.3] = (integer(kind=4)) (((unsigned int) a.dim[S.3].ubound - (unsigned int) a.dim[S.3].lbound) + 1);

* * * *

Probably caused by:

r12-4591-g1af78e731feb93
Author: Sandra Loosemore 
Date:   Tue Oct 19 21:11:15 2021 -0700

Fortran: Fixes and additional tests for shape/ubound/size [PR94070]

This patch reimplements the SHAPE intrinsic to be inlined similarly to
LBOUND and UBOUND, instead of as a library call, to avoid an
unnecessary array copy.  Various bugs are also fixed.

gcc/fortran/
PR fortran/94070

* * *

SHAPE has:
"Result Value. The result has a value whose i-th element is equal to the extent
of dimension i of SOURCE, except that if SOURCE is assumed-rank, and associated
with an assumed-size array, the last element is equal to −1."

Thus, an example for the latter:

GCC 11.4 - good:
   0   2  -1
GCC 12 (to trunk) - wrong:
  -3   2  -1

integer :: x(10)
call f(x, -3)
contains
subroutine f(y,n)
  integer :: y(1:n,2,*)
  call g(y)
end
subroutine g(z)
  integer :: z(..)
  print *, shape(z)
end
end

[Bug fortran/115150] New: [12/13/14/15 Regression] SHAPE of zero-sized array yields a negative value

2024-05-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115150

Bug ID: 115150
   Summary: [12/13/14/15 Regression] SHAPE of zero-sized array
yields a negative value
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

GCC 11.4 has:
 Shape:   0   0
 Shape:   0   3   0

But since GCC 12:
 Shape:  -2   0
 Shape:  -3   3   0

Testcase:

implicit none
real,allocatable :: A(:),B(:,:)
allocate(a(3:0), b(5:1, 3))
print *, 'Shape:', shape(a), size(a)
print *, 'Shape:', shape(b), size(b)
end
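
For illustration only, the two sets of numbers can be reproduced with a small C
sketch. The function name and the 'clamp' flag are hypothetical and this is not
the code gfortran generates; it merely assumes, based on the GCC 12 dump quoted
in comment 1, that the inlined SHAPE computes ubound - lbound + 1 without
clamping the result at zero.

```c
#include <assert.h>

/* Hypothetical sketch: fill shape[] from per-dimension bounds.
   With clamp == 0 the raw value ubound - lbound + 1 is stored, which is
   negative for a zero-sized dimension such as 3:0 or 5:1.
   With clamp != 0 a negative value is replaced by 0, since an extent
   is never negative.  */
static void shape_of(int rank, const int *lbound, const int *ubound,
                     int clamp, int *shape)
{
  assert(rank >= 0);
  for (int i = 0; i < rank; i++)
    {
      int n = ubound[i] - lbound[i] + 1;
      shape[i] = (clamp && n < 0) ? 0 : n;
    }
}
```

Called with the bounds of b(5:1, 3), i.e. lower bounds {5, 1} and upper bounds
{1, 3}, the unclamped variant stores a negative first extent while the clamped
variant stores 0.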

[Bug fortran/44744] Missing -fcheck=bounds diagnostic for function assignment with tmp array

2024-05-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44744

--- Comment #5 from Tobias Burnus  ---
(In reply to Tobias Burnus from comment #4)
> Another variant from lsdalton – or rather the

BTW: I have not verified that the cause is the same (temporary variable), but
it seems to be likely.
When replacing the 'A(i,:,:)' on the LHS with 'B(:,:)', gfortran does diagnose
the too large RHS.

[Bug fortran/44744] Missing -fcheck=bounds diagnostic for function assignment with tmp array

2024-05-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44744

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #4 from Tobias Burnus  ---
Another variant from lsdalton – or rather the
https://github.com/openrsp/openrsp/archive/v1.0.0.tar.gz it downloads during
the build.

FLANG diagnoses the LSDalton issue as:

error(/home/jehammond/DALTON/lsdalton/build/_deps/openrsp_sources-src/src/ao_dens/rsp_property_caching.f90:2164):
Assign: mismatching element counts in array assignment (to 6, from 3)

* * *

GCC fails to do so.
Testcase: the problem is that the RHS has extent 3 while the LHS section has
extent 6 in the first dimension:

implicit none
real,allocatable :: A(:,:,:)
integer :: n, n2, i
n = 6
n2 = 3
allocate(A(3,n,3))
do i = 1, 3
  print *, shape(a), '|', shape(f(n2))
  a(i,:,:) = f(n2)
end do
deallocate(A)
contains
  function f(n)
    integer :: n
    real :: f(n,3)
    f = 99.0
  end
end
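
As a rough sketch of what such a check amounts to, the following C function
compares the shapes of both sides dimension by dimension. The name
first_mismatch is made up for this illustration; it is not what -fcheck=bounds
emits, only the comparison a run-time check would have to perform for the
assignment above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-time check: return the 1-based index of the first
   dimension in which the extents of LHS and RHS differ, or 0 if the two
   shapes conform.  */
static size_t first_mismatch(size_t rank, const int *lhs, const int *rhs)
{
  assert(lhs != NULL && rhs != NULL);
  for (size_t i = 0; i < rank; i++)
    if (lhs[i] != rhs[i])
      return i + 1;
  return 0;
}
```

For the testcase, the LHS section a(i,:,:) has shape {6, 3} and f(n2) has shape
{3, 3}, so the mismatch is in the first dimension.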

[Bug c/113905] [OpenMP] Declare variant rejects variant-function re-usage

2024-05-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113905

--- Comment #1 from Tobias Burnus  ---
Oops, the testcase was lost. Re-written from scratch:
---
int var1() { return 1; }
int var2() { return 2; }

#pragma omp declare variant (var1) match(construct={target})
#pragma omp declare variant (var2) match(construct={parallel})
int foo() { return 42; }

#pragma omp declare variant (var1) match(construct={parallel})
#pragma omp declare variant (var2) match(construct={target})
int bar() { return 99; }

int main() {
  __builtin_printf("foo: %d (expected: 42)\n", foo());
  __builtin_printf("bar: %d (expected: 99)\n", bar());
  #pragma omp parallel if(0)
  {
    __builtin_printf("foo: %d (expected: 2)\n", foo());
    __builtin_printf("bar: %d (expected: 1)\n", bar());
  }
  #pragma omp target //device(-1 /*omp_initial_device*/)
  {
    __builtin_printf("foo: %d (expected: 1)\n", foo());
    __builtin_printf("bar: %d (expected: 2)\n", bar());
  }
}

[Bug fortran/104621] [OpenMP] Issues with 'declare variant'

2024-05-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104621

--- Comment #1 from Tobias Burnus  ---
This got fixed for OpenMP 6 via OpenMP spec pull request #3383, adding:

* A declarative directive must be specified in the specification part after all
'USE', 'IMPORT' and 'IMPLICIT' statements.

* If a declarative directive applies to a function declaration or definition
  and it is specified with one or more C++ attribute specifiers, the
  specified attributes must be applied to the function as permitted by the
  base language.

Plus revising the wording for 'declare variant' for Fortran (semantic +
restrictions). See TR12/OpenMP 6.

[Bug fortran/114825] [11/12/13/14 Regression] Compiler error using gfortran and OpenMP since r5-1190

2024-04-23 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114825

--- Comment #2 from Tobias Burnus  ---
The difference between the failing program and a working program
(pointer-assignment in 'sub' commented out) is:

failing:
  'type' in gfc_omp_clause_default_ctor is '

[Bug middle-end/114754] New: [OpenMP] Missing 'uses_allocators' diagnostic

2024-04-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114754

Bug ID: 114754
   Summary: [OpenMP] Missing 'uses_allocators' diagnostic
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: accepts-invalid, diagnostic, openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Cf. https://github.com/SOLLVE/sollve_vv/pull/802

"LLVM error:
 error: allocator must be specified in the 'uses_allocators' clause
adding uses_allocators(omp_default_mem_alloc) fixes the problem"

OpenMP spec has under 'Restrictions to the *target* construct are as follows:'

"Memory allocators that do not appear in a *uses_allocators* clause cannot
appear as an allocator in an *allocate* clause or be used in the *target*
region unless a *requires* directive with the *dynamic_allocators* clause is
present in the same compilation unit."

Example snippets, based on the sollve_vv testcase.

The OG13 patch only diagnoses the issue in the last/4th directive;
clang diagnoses both the 'allocate' clause variants (2nd + 4th) but neither
diagnoses the 1st/3rd one.

* * *

   omp_allocator_handle_t al = omp_init_allocator(omp_default_mem_space, 0, NULL);
   #pragma omp target
   {
     /* omp_alloc takes the size first, then the allocator.  */
     int *y = omp_alloc(sizeof(1), omp_default_mem_alloc);
   }

   #pragma omp target allocate(omp_default_mem_alloc:x) firstprivate(x) map(from: device_result)
   {
     for (int i = 0; i < N; i++)
       x += i;
     device_result = x;
   }

   #pragma omp target firstprivate(al)
   {
     int *y = omp_alloc(sizeof(1), al);
   }

   #pragma omp target allocate(al:x) firstprivate(x) map(from: device_result)
   {
     for (int i = 0; i < N; i++)
       x += i;
     device_result = x;
   }

[Bug fortran/103496] [F2018][TS29113] C_SIZEOF – rejects now valid args with 'must be an interoperable data entity'

2024-04-12 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103496

--- Comment #4 from Tobias Burnus  ---
(In reply to anlauf from comment #3)
> The code in comment#0 compiles at r14-9893-gded646c91d2c0f
> and gives the indicated results.

which is the commit:
 Fortran: fix argument checking of intrinsics C_SIZEOF, C_F_POINTER [PR106500]

It looks as if the issue is fixed, but gfortran lacks a testcase to check that
the obtained value is correct.

c_sizeof_7.f90 contains tests, but I think there should be a run-time test that
the obtained values are correct; I think some are constants such that a
tree-dump scan test would work as well, but for the dynamic ones, a run-time
test seems to be easier than trying to capture the generated code...

[Bug middle-end/113006] OpenMP 5 - lvalue parsing support for map/to/from clause

2024-04-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113006

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #1 from Tobias Burnus  ---
[WIP] The Pull Request 3831 for Issue 2618 adds the restriction
to *target* only:

"* A list-item in a *map* clause that is specified on a *target* construct must
have a base-variable or base-pointer."

However, function calls in lvalue expressions can still be used in 'target
{,enter,exit} data'.

NOTE: This also applies to Fortran: if 'f' returns a data pointer, 'f()' is a
variable in Fortran, but it isn't a base-variable; i.e. it is permitted in
'target ... data' but not in 'target'.

[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-04-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #30 from Tobias Burnus  ---
(In reply to rguent...@suse.de from comment #29)
> Might be for \r\n line endings?

New lines are handled slightly differently – and \f and \v don't seem to be
handled at all.

Comparing the result with ifort/ifx/flang, they handle a bare '\r' (in contrast
to \r\n) at fewer places than gfortran – albeit from the code it looks as if a
\r not followed by \n is not handled consistently either.

> I'd keep it for the sake of preserving
> previous behavior.  isspace(3) tests for \f, \n, \r, \t, \v and space
> (but of course all depends on the locale, not sure whether libgfortran
> needs to care for locales)

I have added one example to the testcase, but that seems to be already handled
by the code further below which handles '\r' and '\n' - thus, the patch does
not handle it explicitly.

The Fortran standard does not seem to permit \f, \t, \v at all – at least I
only found those in the C interop section. The standard does not really define
what new line actually is, but: "A newline character is a nonblank character
returned by the intrinsic function NEW_LINE." – This handles different
character kinds, but always returns a single character (e.g. \r vs. \n would be
possible, but not \r\n).

* * *

Patch – which handles '\t':
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648950.html

[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-04-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #28 from Tobias Burnus  ---
Created attachment 57896
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57896&action=edit
Testcase

It seems as if 'tabs' cause problems, e.g. for:

 profile_single_file= .true.

where there are two tabs before '='.

* * *

The problem seems to be that the new code uses:

-  eat_spaces (dtp);
   dtp->u.p.comma_flag = 0;
+  c = next_char (dtp);
+  if (c == ' ')
+{
+  eat_spaces (dtp);

Thus, it explicitly checks for ' ' while eat_spaces handles:

  while (c != EOF && (c == ' ' || c == '\r' || c == '\t'));

Testcase attached.

I think we need at least an "|| c == '\t'"; I guess '\r' isn't really required
here, or is it?
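
To make the difference concrete, here is a small standalone C sketch. The
helper names are hypothetical and this is not libgfortran's code; it only
mirrors the character set of the eat_spaces loop quoted above and shows that a
check for ' ' alone stops at a tab, while the wider set skips it.

```c
#include <assert.h>
#include <stddef.h>

/* Same character set as the loop condition quoted from eat_spaces:
   blank, carriage return and tab.  */
static int is_skippable_blank(int c)
{
  return c == ' ' || c == '\r' || c == '\t';
}

/* Hypothetical helper: number of leading characters of S that
   would be skipped before the first "real" character.  */
static size_t skip_blanks(const char *s)
{
  assert(s != NULL);
  size_t i = 0;
  while (s[i] != '\0' && is_skippable_blank((unsigned char) s[i]))
    i++;
  return i;
}
```

For an input such as "\t\t= .true.", skip_blanks returns 2 and the next
character is '=', whereas a test for c == ' ' alone does not match the first
tab at all.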

[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors

2024-04-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #6 from Tobias Burnus  ---
(In reply to sandra from comment #5)
> Tobias, it looks to me like you missed the connection ...

No, I didn't - I did link it (cf. top of comment 3) — but I just cannot read
:-/

Hence, for
(1)  *teams* *parallel* *for* parallel for
(2)  *teams* *parallel* for parallel *for*
(3)  *teams* parallel for *parallel* *for*

we now have
p = {1,2,3,4,5} → score = {1,2,4,8,16}

Thus:
(1) 1 + 2 + 4 = 7
(2) 1 + 2 + 16 = 19
(3) 1 + 8 + 16 = 25

such that (3) wins with a score of 25 and, hence, either of

 (f15) match (construct={parallel},user={condition(score(19):1)}) /* 8+19 */
 (f16) match (implementation={atomic_default_mem_order(score(27):seq_cst)})

is selected as 27 > 25 - and OpenMP states that it is implementation defined
which of the score = 27 variants is used.
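
The arithmetic for the three candidate subsets can be checked with a short C
sketch. The function name is made up for this illustration and it is not GCC's
omp_context_compute_score; it only applies the 2^(p-1) rule to the 1-based
positions p of the matched traits, counted from the outermost construct.

```c
#include <assert.h>

/* Score of one matched subset: the sum of 2^(p-1) over the 1-based
   positions p of the matched traits in the construct trait set.  */
static unsigned subset_score(const unsigned *positions, unsigned count)
{
  unsigned score = 0;
  for (unsigned i = 0; i < count; i++)
    {
      assert(positions[i] >= 1 && positions[i] <= 31);
      score += 1u << (positions[i] - 1);
    }
  return score;
}
```

With the call-site context {teams, parallel, for, parallel, for}, subset (1)
uses positions {1, 2, 3}, subset (2) uses {1, 2, 5} and subset (3) uses
{1, 4, 5}; the largest of the three sums belongs to (3).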

* * *

Thus, I concur that there is an ordering and, hence, scoring bug.

* * * 

> I don't know if anybody wants to tackle a bug fix for this code for GCC 14.

I think we don't - we are too close to the release. This has been implemented
since 2019, 'score()' is not implemented in Clang 18, and code like that is
unlikely to be used often. And the code is non-trivial. All of this speaks
against rushing a fix into GCC 14.

[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors

2024-04-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #4 from Tobias Burnus  ---
> For selector construct = {teams, parallel, for}
>   score1 = 29   score2 = 29
> ---
>
> The constructs[i]'s score look fine.
>
> But I wonder why score == 29 and not 28.

Answer: omp_context_compute_score starts with:

  *score = 1;

which is set to "-1" only in the error case; otherwise, scores only get
added. I think a constant offset to all scores does no harm (as it is only
used internally), but I still find it confusing.
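
A minimal C sketch of that accumulation, assuming the exponents 4, 3 and 2
reported as scores[0..2] in the dump of comment 3. The function name is
hypothetical and the body is a simplification, not the actual
omp_context_compute_score.

```c
#include <assert.h>

/* Start at 1, as in the quoted '*score = 1;', and add 2^exponent for
   each matched construct.  */
static int total_score(const int *exponents, int count)
{
  int score = 1;
  for (int i = 0; i < count; i++)
    {
      assert(exponents[i] >= 0 && exponents[i] < 30);
      score += 1 << exponents[i];
    }
  return score;
}
```

For the exponents {4, 3, 2} this yields 1 + 16 + 8 + 4, i.e. the 29 shown in
the dump rather than 28. As the same offset is added to every candidate, the
relative order of the scores is unaffected.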

[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors

2024-04-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #3 from Tobias Burnus  ---
The scoring is according to TR12:

> Each trait selector for which the corresponding trait appears
> in the construct trait set in the OpenMP context is given the
> value 2^(p−1) where p is the position of the corresponding trait,
> c_p, in the construct trait set

And:

> Specifically, the ordering of the set of constructs is
> c1, . . . , cN , where c1 is the construct at the outermost
> nesting level and cN is the construct at the innermost nesting level.

and

> construct trait set — The trait set that consists of all enclosing
> constructs at a given point in an OpenMP program up to a target construct.
>
> enclosing context — For C/C++, the innermost scope enclosing a directive.


At the call site:
  {teams, parallel, for, parallel, for}

Selector in declare variant:
  {teams, parallel, for}

Looking at 'context traits that contains all selectors in the same order are
used', I see:

(1)  *teams* *parallel* *for* parallel for
(2)  *teams* *parallel* for parallel *for*
(3)  *teams* parallel for *parallel* *for*

Sandra wrote:
> By my reading, the OpenMP context at the point of call to f17 is
> {teams, parallel, for, parallel, for} with associated scores
> {1, 2, 4, 8, 16}

In my reading (see second quote above), the order of enclosing contexts is
inside out and the scores are:
p = {5,4,3,2,1} → score = {16,8,4,2,1}

Thus:
(1) 16 + 8 + 4 = 28
(2) 16 + 8 + 1 = 25
(3) 16 + 2 + 1 = 19

where (1) wins because of:

> if the traits that correspond to the construct selector set appear
> multiple times in the OpenMP context, the highest valued subset of
> context traits that contains all trait selectors in the same order
> are used;

* * *

This matches what the testcase has:

> match (construct={teams,parallel,for}) /* 16+8+4 */  (= 28)

* * *

---
$ install/bin/x86_64-linux-gnu-gcc declare-variant-12.c -fopenmp
-foffload=disable -S
...
constructs[0] = omp_for, scores[0] = 4, score = 16
constructs[1] = omp_parallel, scores[1] = 3, score = 8
constructs[2] = omp_teams, scores[2] = 2, score = 4

For selector construct = {teams, parallel, for}
  score1 = 29   score2 = 29
---

The constructs[i]'s score look fine.

But I wonder why score == 29 and not 28.

[Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors

2024-04-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #2 from Tobias Burnus  ---
Crossref to the OpenMP spec discussions [not publicly available]
related to the scoring:

* Jakub asked about this testcase on omp-lang in the email 2019/016447.html
(w/o getting a reply).
* He then opened Issue #2028 for three dozen issues/questions
  (issue lifetime: Oct 2019 to Aug 2020; it has 52 comments)
→ The scoring issue of this PR seems to run under '30' – but while I do see the
question, I have trouble finding/understanding Alex' answer in that thread
→ 'Nov 12, 2019' seems to be the entry that has all quotes and seems to finally
settle the issue.

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-04-03 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #7 from Tobias Burnus  ---
FIXED on mainline (= GCC 14).

Namely, the following was fixed. All of those issues involve compiling with
'-g' such that 'mkoffload' generates also a GCN .o file for which the ELF flag
has to match the other .o files.

(A) The issue of comment 0: ELF Flag mismatch if GCC was configured with a
--with-arch=... that does not match the default setting.
→ Fix: See comment 6

Earlier fixes, only vaguely related to comment 0:

(B)
* Compiler default was changed to gfx900 but mkoffload still had Fiji as
default
* Race in handling the debug files
→ Fix: See comment 1

(C)
* Fixed issues related to xnack/sram-ecc, which also led to ELF flag
mismatches
→ Fix: See comment 4

[Bug libgomp/114445] New: [OpenMP] indirect - potential race when creating reverse-offload hash

2024-03-23 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114445

Bug ID: 114445
   Summary: [OpenMP] indirect - potential race when creating
reverse-offload hash
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

It looks as if, when starting two kernels in quick succession as in

  #pragma omp target nowait
    ...
  #pragma omp target
    ...

two device threads might concurrently create the hash in
libgomp/config/accel/target-indirect.c's build_indirect_map:

  if (!indirect_htab)
    {
      /* Count the number of entries in the NULL-terminated address map.  */
      for (map_entry = GOMP_INDIRECT_ADDR_MAP; *map_entry;
           map_entry += 2, num_ind_funcs++);

      indirect_htab = htab_create (num_ind_funcs);

The issue only occurs if the assignment of the allocated memory done in the
first thread occurs after the second kernel has checked for indirect_htab ==
NULL, which seems to be rather unlikely to happen in the real world, but IMHO
cannot be ruled out.

Thus, I was wondering (see email) whether a tmp variable + CAS should be used
for indirect_htab; if the swap failed, the memory could just be freed as
another thread was faster and has, by then, also completed the creation of
the htab.
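
The tmp variable + CAS idea could look roughly like the following host-side C11
sketch. Everything in it is an assumption made for illustration: 'struct table',
'shared_table' and 'get_table' are hypothetical stand-ins, and <stdatomic.h> is
used here although the device code in libgomp may well need different
primitives (e.g. the __atomic builtins).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hash table.  */
struct table { size_t n; };

/* Hypothetical stand-in for indirect_htab.  */
static _Atomic(struct table *) shared_table = NULL;

static struct table *get_table(size_t n)
{
  struct table *cur = atomic_load(&shared_table);
  if (cur != NULL)
    return cur;

  /* Build the table completely in a temporary before publishing it.  */
  struct table *tmp = malloc(sizeof *tmp);
  assert(tmp != NULL);
  tmp->n = n;

  /* Publish only if nobody else did so in the meantime.  */
  struct table *expected = NULL;
  if (atomic_compare_exchange_strong(&shared_table, &expected, tmp))
    return tmp;

  /* Lost the race: the other thread's table is already complete,
     so drop ours and use theirs.  */
  free(tmp);
  return expected;
}
```

The point of the design is that the shared pointer only ever changes from NULL
to a fully initialized table, so a reader never sees a half-built one.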

See also

  r14-9629-g637e76b90e8b045c5e25206a41e3be55deace8d5
  openmp: Change to using a hashtab to lookup offload target addresses for
indirect function calls

and review email at

  https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647755.html

[Bug target/114419] [GCC < 14] amdgcn offload compiler fails to build with amdgcn tools based on LLVM 18

2024-03-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114419

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ams at gcc dot gnu.org,
                   |                            |burnus at gcc dot gnu.org
            Summary|amdgcn offload compiler     |[GCC < 14] amdgcn offload
                   |fails to build with amdgcn  |compiler fails to build
                   |tools based on LLVM 18      |with amdgcn tools based on
                   |                            |LLVM 18

--- Comment #2 from Tobias Burnus  ---
This has been fixed in GCC 14 - and the documentation has been updated
accordingly. See the AMD GCN entry in the GCC 14 and the link to the install
document there:
https://gcc.gnu.org/gcc-14/changes.html

* * *

I think there are two problems with LLVM 18 - both fixed with GCC 14/mainline.

Namely:

(A) Support for 'Fiji' devices was stopped,
i.e. those use AMDHSA Code Object Version 3

→ Solution: Configure with
  --with-arch=gfx900
  --with-multilib-list=gfx900,gfx906,gfx908,gfx90a
i.e. leave out 'fiji' and switch to gfx900 for the default

Cf. r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f


(B) The default version used if no version has been specified changed to Code
Object 5 in LLVM's linker. However, with '-g', mkoffload also produces an
object file, but with version 4.

=> With debugging enabled, there will be an error with LLVM 18,
solved by the commit:
  r14-8449-g4b5650acb3107239867830dc1214b31bdbe3cacd - namely:

  Since LLVM commit 082f87c9d418 (Pull Req. #79038; will become LLVM 18)
"[AMDGPU] Change default AMDHSA Code Object version to 5"
  the default - when no --amdhsa-code-object-version= is used - was bumped.

[Bug libgomp/114335] OpenACC: use of accelerator constant/read-only memory for "readonly" modifier mappings

2024-03-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114335

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization,
                   |                            |openacc
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #1 from Tobias Burnus  ---
It likewise applies for OpenMP for code such as:

  omp target firstprivate(array) allocate(omp_const_mem_alloc : array)

[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #19 from Tobias Burnus  ---
Regarding the LAPACK issue:

Actually, I am inclined to
* regard it as a LAPACK bug
* that was also fixed upstream, see comment 6, to make g95 happy,
as ';' is not a value separator, while ' ;' is fine, where the blank is a
value separator.

My testcase of comment 4 therefore always used a space before the ',' / ';'.

* * *

I have now created an extended testcase, attached to PR105473 as attachment
57695. (Only testing integer/real parsing, not reading the char afterward as in
comment 4.)

The same testcase can also be found at https://godbolt.org/z/14h48167W and
shows the result with gfortran, ifort, ifx and flang. I used this result to add
comments to the testcases.

* * *

For some F2023 wording, see comment 14 above.

And I have to admit that I am rather confused by the results as there does not
seem to be any consistent pattern; there are cases where I agree with
gfortran's error even though neither ifort nor flang show one, while for
others, I think gfortran gets it wrong.

In particular, I think for the following cases:

  call t('point', ';') ! gfortran: no error, others: error
→ IMHO invalid: not a value separator and not an integer.

  call t('point', '5;') ! gfortran: no error shown, others: error
→ This is the LAPACK example but for integers.
  I think ';' is invalid as it is not part of the integer but also not a value
separator.

  call t('comma', '7 ,') ! gfortran: error; others: no error
→ IMHO valid - I think the ' ' as value separator is sufficient.

  call t('point', '3.3,', .true.) ! gfortran/flang: error shown; ifort: no error
→ What's wrong with a comma as value separator?

  call t('comma', '3,3;', .true.) ! gfortran: error shown; others: no error
→ Same, except that ';' is now the value separator


But in the following cases, I think gfortran is *right*:
  call t('point', '5.') ! gfortran/flang: Error shown, ifort: no error
→ '.' is not part of an integer nor a value separator

  call t('comma', '5,') ! gfortran: error; others: no error
→ Likewise for ',' - the ',' is not part of an integer nor a value separator


Disclaimer: I might have easily overlooked some fine print.
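
The reading used in the cases above can be written down as a tiny C predicate.
This is only a sketch of that interpretation, with a made-up function name; it
is not how libgfortran classifies characters, and other separators such as '/'
are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* decimal_comma is true for DECIMAL='comma' and false for
   DECIMAL='point'.  A blank always separates values; ',' does so only in
   'point' mode and ';' only in 'comma' mode, where ',' is the decimal
   symbol instead.  */
static bool is_value_separator(int c, bool decimal_comma)
{
  assert(c != '\0');
  if (c == ' ')
    return true;
  return decimal_comma ? c == ';' : c == ',';
}
```

Under this reading, the ';' in '5;' with DECIMAL='point' and the ',' in '5,'
with DECIMAL='comma' are not separators, while the ',' in '3.3,' with 'point'
and the ';' in '3,3;' with 'comma' are.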

[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'

2024-03-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #57693|0                           |1
        is obsolete|                            |

--- Comment #35 from Tobias Burnus  ---
Created attachment 57695
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57695&action=edit
Extended testcase; comment shows results for gfortran(trunk),ifort,ifx,flang

Fixed testcase – before: too many lines commented, bugs with the floating-point
test string for decimal=comma and the I/O code in general for the floating-point
case.

See also: https://godbolt.org/z/14h48167W

See also some comments at bug 114304 comment 19.

[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'

2024-03-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473

--- Comment #34 from Tobias Burnus  ---
Created attachment 57693
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57693&action=edit
Extended testcase; comment shows results for gfortran(trunk),ifort,ifx,flang

See PR114304 for some related issue where too much is rejected
while this one is about too much getting accepted.

I now tried an extended version of comment 0, see below
and see https://godbolt.org/z/GKzc4sveK for the result with gfortran, ifort,
ifx and flang.

I have to admit that I do not really see a real pattern here, comparing the
compilers.

One quote from F2023 can be found at bug 114304 comment 14.

[Bug libfortran/114304] [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-12 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #14 from Tobias Burnus  ---
Created attachment 57680
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57680&action=edit
Testcase with decimal=COMMA, passes with ifort/ifx/flang - fails with gfortran

> commit r14-9432-g0c179654c3170749f3fb3232f2442fcbc99bffbb
> commit r13-8417-g824a71f609b37a8121793075b175e2bbe14fdb82

Thanks for the fix.

We are now back to the GCC 13 result → comment 4

Namely, attachment 57668 now gives:

   1.23434997   1243.23999   13.238 a
   1.23434997   1243.23999   13.238 a
   1.23434997   1243.23999   13.238
   1.23434997   1243.23999   13.238
At line 33 of file foo.f90 (unit = 99, file = 'foo.inp')

* * *

The question is whether the following should give an error as shown above:
  real :: x(3)
  character(len=1) :: s
  ...
  write(99, '(a)') '1.23435 1243.24 13.24 ;'
  read(99, *) x, s

Or whether reading this line should work, i.e. reading ';' as character – as it
does with ifort and flang.

Or in other words:
* Does ';' count as character, readable by list-directed formatted I/O? (ifort,
ifx, flang)
* Or doesn't it? (gfortran since at least 4.9)

* * *

In F2023 (23-007r1), "13.10.2 Values and value separators":

"A value separator is
 • a comma optionally preceded by one or more contiguous blanks and
   optionally followed by one or more contiguous blanks,
   unless the decimal edit mode is COMMA, in which case a semicolon is
   used in place of the comma,
 • a slash optionally preceded by one or more contiguous blanks and
   optionally followed by one or more contiguous blanks, or
 • one or more contiguous blanks between two nonblank values
   or following the last nonblank value, where a nonblank value
   is a constant, an r*c form, or an r* form."

(where 'r' is a positive integer and 'c' is a literal constant [with ...].)


To me it reads as if the semicolon should be read just fine.
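
As a rough illustration of that reading (a sketch only, not how libgfortran
tokenizes input), the following C helper splits a record into items using the
value separators quoted above. The name nth_item() and its interface are
invented for this example; the r*c forms are not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy item number INDEX (0-based) of RECORD into
   OUT (capacity CAP).  Items are delimited by blanks and by a comma
   (decimal='point') or a semicolon (decimal='comma'); a slash ends the
   record.  Returns false if there is no such item.  */
static bool
nth_item (const char *record, bool decimal_comma, size_t index,
          char *out, size_t cap)
{
  const char sep = decimal_comma ? ';' : ',';
  const char *p = record;

  for (size_t n = 0; ; n++)
    {
      while (*p == ' ')
        p++;
      if (*p == '\0' || *p == '/')
        return false;

      const char *start = p;
      while (*p != '\0' && *p != '/' && *p != ' ' && *p != sep)
        p++;

      if (n == index)
        {
          size_t len = (size_t) (p - start);
          if (len >= cap)
            return false;
          memcpy (out, start, len);
          out[len] = '\0';
          return true;
        }

      /* Skip one value separator: blanks plus at most one comma/semicolon.  */
      while (*p == ' ')
        p++;
      if (*p == sep)
        p++;
    }
}
```

For the record '1.23435 1243.24 13.24 ;' and decimal='point', the fourth item
is the string ";", i.e. something a character variable could receive. With
decimal='comma' the same ';' is consumed as a separator and no fourth item
exists.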

* * *

I now have tried another testcase with decimal=COMMA, which works just fine
with ifort / ifx /flang as shown at
  https://godbolt.org/z/ajeTjzEfY

But with GCC it fails with:
  Fortran runtime error: Comma not allowed as separator with DECIMAL='comma'

See godbolt link above for gfortran vs. ifort vs. ifx. vs. flang
or the attached testcase.

[Bug libfortran/114304] [14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #6 from Tobias Burnus  ---
[For completeness: The LAPACK testsuite change Richard mentioned in comment 2
is
https://github.com/Reference-LAPACK/lapack/commit/64e8a7500d817869e5fcde35afd39af8bc7a8086
- That's for g95 and was applied 2020.]

[Bug libfortran/114304] [14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

Tobias Burnus  changed:

   What|Removed |Added

   Keywords||wrong-code
Summary|[14 Regression] Rejects |[14 Regression] libgfortran
   |lapack test |I/O – bogus "Semicolon not
   ||allowed as separator with
   ||DECIMAL='point'"

--- Comment #5 from Tobias Burnus  ---
I just noticed that the change
  libgfortran: [PR105473] Fix checks for decimal='comma'.
got also backported to
  r13-8411-g7ecea49245bc6aeb6c889a4914961f94417f16e5 
on Thu Mar 7, 2024.

Thus, GCC 13 is now affected as well!

[Disclaimer: I have not checked the spec, but it seems very much like a
wrong-code bug.]

[Bug fortran/105473] semicolon allowed when list-directed read integer with decimal='point'

2024-03-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105473

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #32 from Tobias Burnus  ---
See PR114304 for an issue that was caused by the fix in comment 27.

[Bug libfortran/114304] [14 Regression] Rejects lapack test

2024-03-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #4 from Tobias Burnus  ---
Created attachment 57668
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57668&action=edit
Testcase

Testresults of the attached testcase:
   See also https://godbolt.org/z/q4rG61EvW

The attached testcase shows with ifort and flang:
   1.234350   1243.240   13.24000 a
   1.234350   1243.240   13.24000 a
   1.234350   1243.240   13.24000
   1.234350   1243.240   13.24000
   1.234350   1243.240   13.24000 ;
   1.234350   1243.240   13.24000 ;

With GCC mainline:
   1.23434997   1243.23999   13.238 a
   1.23434997   1243.23999   13.238 a
Fortran runtime error: Semicolon not allowed as separator with DECIMAL='point'

With GCC 13 (and 4.9):
   1.23434997   1243.23999   13.238 a
   1.23434997   1243.23999   13.238 a
   1.23434997   1243.23999   13.238
   1.23434997   1243.23999   13.238
Fortran runtime error: End of file

[Bug libfortran/114304] [14 Regression] Rejects lapack test

2024-03-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

Tobias Burnus  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |---
 CC||burnus at gcc dot gnu.org,
   ||jvdelisle at gcc dot gnu.org

--- Comment #3 from Tobias Burnus  ---
I think the semicolon is not permitted as item separator but if I have as input
  '1.234 134.23 abc'
or likewise:
  '1.234, 134.23 abc'
and
  read (..., * ) x(1:2)

it works – i.e. only reading two floats and then stopping before reading 'abc'.

But if I do the very same but replace ' abc' by ' ;', I get the error, which
seems to be rather inconsistent — what's the difference between 'abc' and ';'
in this case?

* * *

The message is new since
  r14-9050-ga71d87431d0c4e
  libgfortran: [PR105473] Fix checks for decimal='comma'.

Jerry, can you check?

[Bug fortran/114283] New: [OpenMP] Dummy procedures/proc pointers and 'defaultmap', 'default', 'firstprivate' etc.

2024-03-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114283

Bug ID: 114283
   Summary: [OpenMP] Dummy procedures/proc pointers and
'defaultmap', 'default', 'firstprivate' etc.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

See also OpenMP specification Issue #3823 [and slightly related PR 114282].

There are two cases:

(A) Dummy procedures

IMHO those aren't variables and gfortran also rejects them when used in
firstprivate, map, shared etc. ("Object 'f1' is not a variable").

However, gfortran does complain with 'defaultmap(none)':
   Error: 'f1' not specified in enclosing 'target'

Note: 'default(none)' does not diagnose it.
EXPECTED: There is no diagnosis for 'defaultmap(none)'.


(B) Procedure pointers

Here it is unclear whether it should be regarded as variable or not; gfortran
treats those as variables.

Depends on OpenMP specification Issue #3823.

It seems as if handling it as variable, but using 'firstprivate' as default for
implicit mapping makes most sense. – But also treating it as non-variable would
make sense.

[Bug middle-end/114282] New: [OpenMP] Implicit mapping of function/procedure pointers should use 'firstprivate'

2024-03-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114282

Bug ID: 114282
   Summary: [OpenMP] Implicit mapping of function/procedure
pointers should use 'firstprivate'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization, openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

For C/C++ function pointers and Fortran dummy procedures / procedure pointers,
the assumption is that the pointer is either pointing to the desired function
or it has the address of a function on the initial device with 'declare target
indirect'.

GCC produces for implicit mapping:
 map(alloc:MEM[(char *)g] [len: 0]) map(firstprivate:g [pointer assign, bias:
0])

But a simple
  map(firstprivate:g)
would do here.

Likewise for an explicit map, where there is also no need for a pointer
assignment.

NOTE: For Fortran, it might be that explicit 'map' clauses will get disallowed
per pending OpenMP specification Issue #3823, which is tracked elsewhere.


Trivial testcases (for more complex, see 'indirect' testcases inside GCC or
just add an explicit 'map' clause yourself):


void f(){
  void (*g)();
  #pragma omp target
    g();
}


subroutine f(g)
  procedure(), pointer :: g
  !$omp target map(g)
    call g()
  !$omp end target
end

[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled

2024-03-01 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347

Tobias Burnus  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Tobias Burnus  ---
FIXED on mainline (GCC 14).

[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives

2024-03-01 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436

--- Comment #2 from Tobias Burnus  ---
As mentioned in comment 0, PR110347's testcase (r14-9257-g4f82d5a95a244d)
contains '#if 0' code which has to be enabled once this bug is fixed.
Please remember to take care of:

* libgomp/testsuite/libgomp.c++/firstprivate-1.C's and
* libgomp/testsuite/libgomp.c++/private-1.C's

#if 0  /* FIXME: The following is disabled because of PR middle-end/113436.  */

[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-21 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

--- Comment #5 from Tobias Burnus  ---
For:

  int *q;
  #pragma omp target device(y5) map(q, q[:5])

GCC currently generates:
  map(tofrom:q [len: 8]) map(tofrom:*q.4_1 [len: 20]) map(attach:q [bias: 0])

Expected:
  'alloc:' instead of 'attach:'
or even:
  map(tofrom:*q [len: 20]) map(firstprivate:q [pointer assign, bias: 0])

In any case, the first 'tofrom' is pointless!


NOTE: GCC 13 shows:
  error: 'q' appears both in data and map clauses

 * * *

For
  #pragma omp target map(s.p[:5])

GCC should do:
  map(tofrom:s [len: 24][implicit]) map(tofrom:*_5 [len: 16])
map(attach:s.p [bias: 0])

But (regression!) it does:
  map(struct:s [len: 1]) map(alloc:s.p [len: 8]) map(tofrom:*_5 [len: 16])
map(attach:s.p [bias: 0])

Solution:

--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -12381,3 +12381,4 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
           if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
-              || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+              || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH
+              || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH)
             break;


However, unless I messed up, this will cause tons of ICE(segfault).

[Bug fortran/114002] New: [OpenACC][OpenACC 3.3] Add 'acc_attach'/'acc_detach' routine

2024-02-19 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114002

Bug ID: 114002
   Summary: [OpenACC][OpenACC 3.3] Add 'acc_attach'/'acc_detach'
routine
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57466
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57466&action=edit
OpenACC run-time testcase

The problem with acc_attach is that it does not like any temporary variable.

For
   call acc_attach(var%v)

We need at the end:
   acc_attach()

But we will easily get:
   parm.v.data = var.v.data;
   ...
   acc_attach()
which won't work


=> This requires a builtin in the GCC front end to handle this as no Fortran
semantic will handle this. Note that:

subroutine acc_attach (ptr_addr) bind(C)
  type(*), dimension(*), target, optional :: ptr_addr
end subroutine

comes close but gets 'acc_attach (var.v.data)' and not '' as
argument.

[Bug fortran/113997] Bogus 'Warning: Interface mismatch in global procedure' with C binding

2024-02-19 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113997

--- Comment #3 from Tobias Burnus  ---
> Anyway, renaming the binding label, like
>subroutine acc_attach_c(x) bind(C, name="acc_attach_renamed")
> makes the code compile.

Well, the code *does* compile as it is only a warning.

* * *

I think the problem here is a bit that on the Fortran-user side ('acc_attach'
vs. 'acc_attach_c') and on the assembler-level side ('acc_attach_' vs.
'acc_attach') everything is fine (except with -fno-underscore) but, admittedly,
not from the Fortran language side.

(On the other hand, Fortran itself is perfectly happy with:
'subroutine foo' and 'subroutine bar() Bind(C, name='foo_')' but that will
break with most Fortran compilers.)

Thus, the question is whether we (gfortran) want to do something here - or are
happy with issuing the semi-correct/semi-bogus warning here.

* * *

And renaming "acc_attach_c" does not really help as 'acc_attach' with C binding
does exist. In this case it exists as:
 
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgomp/oacc-mem.c;hb=refs/heads/master#l944
and renaming would just add another wrapper around it.


However, an alternative is the following - which is (nearly) identical, except
that GCC does some GFC-CFC and back conversions – independent of whether it is
implemented in C or in Fortran:

subroutine acc_attach(x) bind(C, name="acc_attach_")
  use iso_c_binding, only : c_loc
  implicit none (external, type)

  type(*), dimension(..), target :: x

  interface
    subroutine acc_attach_c(x) bind(C, name="acc_attach")
      use iso_c_binding
      type(c_ptr) :: x
    end subroutine
  end interface

  call acc_attach_c(c_loc(x))
end

[Bug fortran/113997] New: Bogus 'Warning: Interface mismatch in global procedure' with C binding

2024-02-19 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113997

Bug ID: 113997
   Summary: Bogus 'Warning: Interface mismatch in global
procedure' with C binding
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

The following warning is bogus, unless
  -fno-leading-underscore
is used:

8 | subroutine foo_c(x) bind(C, name="foo")
  |1
Warning: Interface mismatch in global procedure 'foo_c' at (1): Type mismatch
in argument 'x' (TYPE(c_ptr)/TYPE(*))

* * *

Because for
  'subroutine acc_attach()'
  'subroutine acc_attach_c(x) bind(C, name="acc_attach")'

(A) the global Fortran name 'acc_attach' differs from the local name
'acc_attach_c'

(B) the actual name (DECL_ASSEMBLER_NAME) differs: 'acc_attach_c' is
'acc_attach' but 'acc_attach' is 'acc_attach_'.
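
The two points can be sketched with a small C model of how the symbol name is
picked. This is a simplification and not gfortran's implementation:
symbol_name() and symbols_clash() are invented names, and only the default
"append one underscore" rule and a verbatim binding label are modelled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model: a binding label is used verbatim; otherwise the
   Fortran name gets a trailing underscore when UNDERSCORING is on.  */
static void
symbol_name (const char *fortran_name, const char *bind_name,
             bool underscoring, char *out, size_t cap)
{
  if (bind_name != NULL)
    snprintf (out, cap, "%s", bind_name);
  else
    snprintf (out, cap, "%s%s", fortran_name, underscoring ? "_" : "");
}

/* Do 'subroutine acc_attach' and
   'subroutine acc_attach_c(x) bind(C, name="acc_attach")' end up with the
   same symbol?  */
static bool
symbols_clash (bool underscoring)
{
  char a[64], b[64];

  symbol_name ("acc_attach", NULL, underscoring, a, sizeof a);
  symbol_name ("acc_attach_c", "acc_attach", underscoring, b, sizeof b);
  return strcmp (a, b) == 0;
}
```

In this model the two procedures only collide when the trailing underscore is
switched off, which matches the "unless" in the first sentence of this report.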

* * *

! The C and Fortran interfaces are part of OpenACC 3.3
! An alternative implementation would be a C implementation using
! ISO_Fortran_binding.h.

subroutine acc_attach(x)
  use iso_c_binding, only : c_loc
  implicit none (external, type)

  type(*), dimension(..), target :: x

  interface
    subroutine acc_attach_c(x) bind(C, name="acc_attach")
      use iso_c_binding
      type(c_ptr) :: x
    end subroutine
  end interface

  call acc_attach_c(c_loc(x))
end

[Bug target/113331] AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols

2024-02-16 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113331

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #2 from Tobias Burnus  ---
All of the following is in except.cc.

The problem is that the count in the label is relative to 'call_site_base'.

In convert_to_eh_region_ranges, those get bumped - but the function resets them
at the end. They do get accumulated via, e.g., dw2_output_call_site_table but,
on GCN, output_function_exception_table exits early because of:

  if (!crtl->uses_eh_lsda
      || targetm_common.except_unwind_info (&global_options) == UI_NONE)
    return;

Thus, the next time convert_to_eh_region_ranges is called, it starts with the
very same numbers.
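
A toy model of that effect, in C. It is only a sketch under the description
above and not the code in except.cc: first_label() and labels_collide() are
invented, and the ".LEHB" spelling is taken from the bug title.

```c
#include <stdio.h>
#include <string.h>

/* Stands in for the file-scope counter in except.cc.  */
static int call_site_base = 0;

/* Toy model: name the first label of a function with N_SITES call sites.
   The counter only advances when the table is actually output.  */
static void
first_label (int n_sites, int table_is_output, char *buf, size_t cap)
{
  snprintf (buf, cap, ".LEHB%d", call_site_base);
  if (table_is_output)
    call_site_base += n_sites;
}

/* Do two consecutive functions end up with the same first label?  */
static int
labels_collide (int table_is_output)
{
  char a[32], b[32];

  call_site_base = 0;
  first_label (3, table_is_output, a, sizeof a);
  first_label (2, table_is_output, b, sizeof b);
  return strcmp (a, b) == 0;
}
```

When the accumulation step is skipped, as with the early return, both
functions start numbering at the same base and the labels are duplicated.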


The reason that this gets produced is because there is an ERT_MUST_NOT_THROW
("MUST_NOT_THROW regions prevent all exceptions from propagating.  This region
type is used in C++ to surround destructors being run inside a CLEANUP
region.")

As there are both "-1" (implies no action) and "-2" (MUST_NOT_THROW), GCN
produces this output. For whatever reason, nvptx has no "-1" actions in the
function; thus, after the change to "-2", there is no flip and, hence, no
output is produced, avoiding the issue.

→ I bet that both gcn and nvptx are affected (unless by luck or when compiled
with -fno-exceptions).

[Bug middle-end/113904] [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904

--- Comment #3 from Tobias Burnus  ---
See comment 1 for remaining to-do items.

I also note that the Fortran resolution comes too early - during parsing - as
the following shows:

module m
implicit none
contains
subroutine test
  !$omp declare variant (foo) match(user={condition(myTrue)})
  !$omp declare variant (bar) match(user={condition(myCond(1).and.myCond(2))})
  logical, parameter :: myTrue = .true.
end
subroutine foo; end
subroutine bar; end
logical function myCond(i)
  integer :: i
  myCond = i < 3
end
end module m


This fails with the completely bogus:

5 |   !$omp declare variant (foo) match(user={condition(myTrue)})
  |   1
Error: property must be a constant logical expression at (1)

As 'myTrue' is a scalar logical PARAMETER.

The problem is just that this is not known when parsing '!$omp' - for that
reason, Fortran separates parsing and resolution, which the current code does
not handle as it comes way too early.

* * *

Otherwise: It looks as if - except for simple variable names (and probably also
for functions calls w/o arguments) - we want to introduce an internal aux
function like:

  logical function __m_MOD_test_DV_cond1() result(res)
 res = myCond(1).and.myCond(2)
  end

which is then called when evaluating the run-time expression.

With header files and, possibly, also C++ modules, we might be able to always
inline the condition - with Fortran modules probably not, such that an aux
function would be needed for the generic case.

[Bug middle-end/113904] [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904

--- Comment #1 from Tobias Burnus  ---
Patch for rejecting non-const arguments in Fortran (wrong-code bit) to bring it
in line with C/C++:

https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645488.html

* * *

TODO as follow up:

* Permit non-constant values for 'condition' and also for 'device_num'
  -> Middle end changes + update all front ends accordingly

* For C/C++, consider rejecting nonconforming device numbers, if
  known at compile time, i.e. only permit positive numbers and
  omp_initial_device_number (= -1) and omp_invalid_device_number (GCC: -4).

  Cf. OpenMP Issue 3832 for the 'conforming' bit.

  [Current spec wording only permits 0 ... < omp_get_num_devices(),
  i.e. neither the host (= omp_initial_device and == omp_get_num_devices())
  nor omp_invalid_device_number is permitted as explicit value;
  however, if absent, it is as if the trait appeared with the
  default-device-var ICV, which permits the discussed values.]

  -> If device_num(-4) (= omp_invalid_device_number), the selector can be
 folded to not matching.

* Possible testcases for some of the features discussed here:
  - https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645472.html
  - the OpenMP 6.0 Examples' program_control/sources/dispatch.1.{c,f90}
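
The compile-time check suggested in the second item could look roughly like
the following C sketch. The enum and function names are invented for this
example; the values -1 and -4 are the ones quoted above, and treating 0 as a
valid device number is an assumption based on the "0 ... <
omp_get_num_devices()" wording.

```c
/* Values as quoted in this comment; the names are local to this sketch.  */
enum { INITIAL_DEVICE = -1, INVALID_DEVICE = -4 };

enum device_num_class
{
  DEVNUM_NONCONFORMING,   /* candidate for a diagnostic */
  DEVNUM_NEVER_MATCHES,   /* selector can be folded to "no match" */
  DEVNUM_OK               /* keep the selector as is */
};

/* Hypothetical classification of a device_num value known at compile time.  */
static enum device_num_class
classify_device_num (int n)
{
  if (n == INVALID_DEVICE)
    return DEVNUM_NEVER_MATCHES;
  if (n >= 0 || n == INITIAL_DEVICE)
    return DEVNUM_OK;
  return DEVNUM_NONCONFORMING;
}
```

Whether a diagnostic or silent folding is the right answer for the
nonconforming values depends on the outcome of the OpenMP issue mentioned
above.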

[Bug middle-end/113906] New: [OpenMP][5.2] 'construct' context selectors lack many constructs

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113906

Bug ID: 113906
   Summary: [OpenMP][5.2] 'construct' context selectors lack many
constructs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, rejects-valid
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: sandra at gcc dot gnu.org
  Target Milestone: ---

GCC only accepts those constructs that are permitted for
5.1 for the 'construct' selector.

Expected: Those of OpenMP 5.2 are supported as well.


OpenMP 5.1 has:

  The 'construct' selector set defines the _construct_ traits that
  should be active in the OpenMP context.  The following selectors
  can be defined in the construct set: 'target'; 'teams'; 'parallel';
  'for' (in C/C++); 'do' (in Fortran); 'simd' and 'dispatch'.
  Each trait-property of the simd selector is a _trait-property-clause._
  The syntax is the same as for a valid clause of the 'declare simd'
  directive and the restrictions on the clauses from that directive
  apply. The construct selector is an ordered list c1, . . . , cN.

OpenMP 5.2 has [and TR12 has]:

  The 'construct' selector set defines the
   construct traits [TR12: construct trait set] that should be
   active in the OpenMP context. Each [trait] selector that can
   be defined in the 'construct' [selector] set is the directive-name
   of a context-matching construct. Each trait-property of the 'simd'
   selector is a trait-property-clause. The syntax is the same as
   for a valid clause of the declare simd directive and the
   restrictions on the clauses from that directive apply. The
   construct selector is an ordered list c1, . . . , cN.

OpenMP TR12 also adds a helpful glossary entry:

'construct trait set' - The trait set that consists of all
  enclosing constructs at a given point in an OpenMP program up
  to a target construct.

[Bug c/113905] New: [OpenMP] Declare variant rejects variant-function re-usage

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113905

Bug ID: 113905
   Summary: [OpenMP] Declare variant rejects variant-function
re-usage
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, rejects-valid
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, parras at gcc dot gnu.org,
sandra at gcc dot gnu.org
  Target Milestone: ---

The attached testcase works with Clang 17 and prints:

Got 42 (OK)
Got 99 (OK)
Got 1 (OK)
Got 2 (OK)
Got 2 (OK)
Got 1 (OK)

Where foo() and bar() share the variant functions 'var1' and 'var2', which
seems to be perfectly valid.


In GCC it fails to compile:

test.c: In function 'bar':
test.c:8:36: error: 'var1' used as a variant with incompatible 'construct'
selector sets
8 | #pragma omp declare variant (var1) match(construct={target})
  |^
test.c:9:36: error: 'var2' used as a variant with incompatible 'construct'
selector sets
9 | #pragma omp declare variant (var2) match(construct={parallel})
  |^


If I only keep the 'declare variant' for 'foo', it compiles. The gimple dump
shows:


__attribute__((omp declare target, omp declare variant variant (parallel )))
int var1 ()

__attribute__((omp declare target, omp declare variant variant (target )))
int var2 ()

__attribute__((omp declare target, omp declare variant base (var2 construct
target ), omp declare variant base (var1 construct parallel )))
int foo ()


I guess the problem is the 'omp declare variant variant' attribute on 'var1'
and 'var2', which causes the issue I am seeing.

[Bug middle-end/113904] New: [OpenMP][5.0][5.1] Dynamic context selector 'user={condition(expr)}' not handled

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113904

Bug ID: 113904
   Summary: [OpenMP][5.0][5.1] Dynamic context selector
'user={condition(expr)}' not handled
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: accepts-invalid, openmp, rejects-valid, wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: parras at gcc dot gnu.org, sandra at gcc dot gnu.org
  Target Milestone: ---

There are two related problems, leading currently to either
  wrong-code (Fortran - alias accepts-invalid OpenMP 5.0)
  or rejects-valid OpenMP 5.1 (C/C++).

* Fortran accepts non constant values - but the ME does not handle them.
  → OpenMP 5.1 feature supported for parsing but not in the ME
  → wrong-code

* C/C++ rejects non-const values
  → Rejecting valid 5.1 code



gfortran happily accepts non constant values - while gcc/g++ reject them

test.c:22:58: error: the value of 'foo_use_var2' is not usable in a constant
expression
   22 | #pragma omp declare variant (var2)
match(user={condition(foo_use_var2)})


While OpenMP 5.0 only permits
   The user selector set defines the condition selector that provides
   additional user-defined conditions.

   C: The condition(boolean-expr) selector defines a constant expression
   that must evaluate to true for the selector to be true.

   C++: The condition(boolean-expr) selector defines a constexpr
   expression that must evaluate to true for the selector to be true.

   Fortran: The condition(logical-expr) selector defines a constant
   expression that must evaluate to true for the selector to be true.


Since OpenMP 5.1:
  The condition selector contains a single trait-property-expression
  that must evaluate to true for the selector to be true.
  Any non-constant expression that is evaluated to determine the
  suitability of a variant is evaluated according to the data state
  trait in the dynamic trait set of the OpenMP context.
  The user selector set is dynamic if the condition selector is present
  and the expression in the condition selector is not a constant
  expression; otherwise, it is static.
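
To illustrate what "dynamic" means here, the following plain C sketch writes
out by hand the dispatch that a run-time 'condition' selector implies. It is
not what GCC generates: foo_use_var2 and var2 are the names from the error
message above, while foo_base() and the explicit wrapper are made up for the
example.

```c
#include <stdbool.h>

/* Set at run time, hence not usable in a constant expression.  */
static bool foo_use_var2 = false;

static int foo_base (void) { return 1; }   /* hypothetical base version */
static int var2 (void) { return 2; }       /* the variant */

/* Hand-written equivalent of calling 'foo' with
   'declare variant (var2) match(user={condition(foo_use_var2)})':
   the condition is evaluated at each call.  */
static int
foo (void)
{
  return foo_use_var2 ? var2 () : foo_base ();
}
```

With a constant condition the choice could be made once at compile time (a
static selector); with a variable like foo_use_var2 it has to be re-evaluated
on every call.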

[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

--- Comment #4 from Tobias Burnus  ---
Also not handled:
  struct s { int *p; } s1;
  ...
  #pragma omp target map(s1.p[:N])
p[0] = p[N-1] = 99;

Here, the pointer attachment is missing.  See also PR113724's attachment 57407
for a testcase for this and (some) other issues.

TODO:
- Fix the extra struct issue (→ this patch or other solution)
- Fix the missing attachment issue (this comment's example)
- Audit whether other changes are required.

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-13 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #6 from Tobias Burnus  ---
Created attachment 57407
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57407&action=edit
C testcase – passes with patch (except for '#if 0'ed PR113867 issues)

DejaGNU-ified testcase for this PR and ('#if 0'-ed for PR113867).

Using the attached patch, it no longer gives an ICE and it works except for the
'#if 0' code but it regresses for:

FAIL: libgomp.c/../libgomp.c-c++-common/baseptrs-1.c (internal compiler error:
Segmentation fault)
FAIL: libgomp.c/../libgomp.c-c++-common/baseptrs-1.c (test for excess errors)
FAIL: libgomp.c/../libgomp.c-c++-common/pr109062.c output pattern test
FAIL: libgomp.c/../libgomp.c-c++-common/target-map-1.c execution test
FAIL: libgomp.c/target-52.c execution test

Thus, there is more work needed.

TODO:
- Create a fix which solves this issue without regressing
- Possibly addressing the PR113867 here

[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-12 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

--- Comment #3 from Tobias Burnus  ---
Created attachment 57398
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57398&action=edit
Patch - handling the libgomp issue

Possible patch - lightly tested. This fixes the issue in libgomp.

While always correct and possibly avoiding other corner cases (if there are
any), an alternative approach is to not create those 'struct: s [len: 1]' +
'alloc:s2.p [len: 0]'.

RFC:
- Is there a reason why we want to have the struct in such a case?
  (GCC <= 13 doesn't create this struct)
- Do we want to have this libgomp change even when not generating the struct
  as extra safety net?

[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-12 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

Tobias Burnus  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |burnus at gcc dot gnu.org
  Component|middle-end  |libgomp

--- Comment #2 from Tobias Burnus  ---
The problem here is in libgomp's gomp_map_vars_internal:

    /* Fallthrough.  */
  case GOMP_MAP_STRUCT:
    first = i + 1;
    last = i + sizes[i];
    cur_node.host_start = (uintptr_t) hostaddrs[i];
    cur_node.host_end = (uintptr_t) hostaddrs[last]
                        + sizes[last];
    if (tgt->list[first].key != NULL)
      continue;
    if (sizes[last] == 0)
      cur_node.host_end++;
    n = splay_tree_lookup (mem_map, &cur_node);
    if (sizes[last] == 0)
      cur_node.host_end--;
    if (n == NULL && cur_node.host_start == cur_node.host_end)
      {
        gomp_mutex_unlock (&devicep->lock);
        gomp_fatal ("Struct pointer member not mapped (%p)",
                    (void*) hostaddrs[first]);
      }
    if (n == NULL)
    ...
    field_tgt_base = (uintptr_t) hostaddrs[first];
    ...
    field_tgt_clear = last;

here: n == NULL and cur_node.host_end - cur_node.host_start = 8 [i.e.
sizeof(void*)?!]:

For i=1, there is no action to be taken due to the
GOMP_MAP_ZERO_LEN_ARRAY_SECTION.

And for i=2,

    if (field_tgt_clear != FIELD_TGT_EMPTY)
      {
        k->tgt_offset = k->host_start - field_tgt_base
                        + field_tgt_offset;

Here, k->tgt_offset = hostaddr of the struct but we are no longer mapping a
struct here. - Clearly, resetting was forgotten ...
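
The stale-state pattern can be reproduced in a few lines of C. This is a toy
model and not libgomp's code: the types and the function below are invented,
only the FIELD_TGT_EMPTY sentinel idea is borrowed, and where exactly a real
fix would do the reset is an assumption.

```c
#include <stddef.h>

#define FIELD_TGT_EMPTY ((size_t) -1)

enum map_kind { MAP_STRUCT, MAP_ZERO_LEN_SECTION, MAP_PLAIN };

struct map_entry
{
  enum map_kind kind;
  size_t size;   /* for MAP_STRUCT: number of member entries that follow */
};

/* Count the entries that get handled as "member of the current struct".
   With RESET_WHEN_SKIPPING == 0, a skipped last member leaves the
   sentinel set, so later unrelated entries are miscounted.  */
static int
count_entries_treated_as_members (const struct map_entry *e, size_t n,
                                  int reset_when_skipping)
{
  size_t field_tgt_clear = FIELD_TGT_EMPTY;
  int treated_as_member = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (e[i].kind == MAP_STRUCT)
        {
          field_tgt_clear = i + e[i].size;   /* index of the last member */
          continue;
        }
      if (e[i].kind == MAP_ZERO_LEN_SECTION)
        {
          /* Nothing to map for this entry.  */
          if (reset_when_skipping && i == field_tgt_clear)
            field_tgt_clear = FIELD_TGT_EMPTY;
          continue;
        }
      if (field_tgt_clear != FIELD_TGT_EMPTY)
        {
          treated_as_member++;
          if (i == field_tgt_clear)
            field_tgt_clear = FIELD_TGT_EMPTY;
        }
    }
  return treated_as_member;
}
```

For the sequence "struct with one member, zero-length section, plain entry",
the plain entry at i=2 is wrongly treated as a struct member unless the
sentinel is also reset on the skip path.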

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-10 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #5 from Tobias Burnus  ---
The runtime issue is now PR113867.

[Bug middle-end/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-10 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

--- Comment #1 from Tobias Burnus  ---
Created attachment 57382
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57382&action=edit
Fortran testcase, kind of, as pointer + pointee mapping cannot be split
(working)

For completeness, a Fortran testcase.
(This Testcase works on GCC 13 and mainline.)

As 'map(ptr, dt%ptr)' in Fortran will always attempt to map both the pointer and
the pointee, it is not possible to split
   map(s2.p[:N])  // map the pointee
   map(s2.p, s2.p[:0]) // map the pointer and try pointer attachment
as in C/C++. And using 'map(s.p)' will prevent a later inner 'map(s.a,...)' as
's' is already partially mapped. Hence, the aux 'ptr' is used, but that kind of
defeats the purpose of this testcase.

[Bug middle-end/113867] New: [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-02-10 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

Bug ID: 113867
   Summary: [14 Regression][OpenMP] Wrong code with mapping
pointers in structs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57381
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57381&action=edit
Testcase, compile with 'gcc -fopenmp' and run with an offload device

Split off from PR113724, which is mainly about an ICE.

The attached program works with GCC 13 but fails with mainline.
(Requires actual offloading; tried here with nvptx.)

Probably due to Julian's mapping patches.


With mainline, it fails for 'g()' when executing

   omp target data map(tofrom: s2.p[:100])

(i.e. GOMP_target_data_ext → gomp_copy_host2dev → gomp_device_copy) with

  libgomp: cuMemGetAddressRange_v2 error: named symbol not found
  libgomp: Copying of host object [0x118c500..0x118c690) to dev object
[0x7f7e721cae00..0x7f7e721caf90) failed

It works using a separate target enter/exit data (i.e. for 'f()').


The mainline dump shows:

map(struct:s2 [len: 1]) map(alloc:s2.p [len: 0]) map(tofrom:*_2 [len: 400])
map(attach:s2.p [bias: 0])

I somehow hadn't expected the

   map(struct:s2 [len: 1]) map(alloc:s2.p [len: 0])

which might or might not be the issue. As it works with 'f()' (i.e. enter/exit
data), it might be a red herring (or not).

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-10 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #4 from Tobias Burnus  ---
Created attachment 57377
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57377&action=edit
Fixes the ICE – might paper over a real issue; doesn't fix the run-time issue →
TODO + 'data'-issue in PR comment 4

The patch fixes the issue

#pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b)

This gets split into the groups (reverse order!) 'S1.b' (i = 0), 'S1.a' (i =
1), 'S1.p' (i = 2) and 'S1.p[:N]' (i = 3; map + attach).

In omp_build_struct_sibling_lists the collecting and reordering happens, but for
'S1.p' there should be an 'alloc', not a 'tofrom'. Hence, 'S1.p' has grp->deleted
set and the attach code will add the 'alloc' code.

Until i = 2, everything is fine:
  *(*group[2])->deleted == true
  *(*group[2])->grp_start == (*group[2])->grp_end

and this encloses 'map(tofrom:S1.p [len: 8])' – which should be removed in
favor of a later (i = 3) added 'map(alloc:S1.p [len: 8])'.

In principle, everything looks fine, until i = 3 calls
omp_accumulate_sibling_list, which in turn calls:

  continue_at
= cl ? omp_siblist_move_concat_nodes_after (cl, tail_chain,
grp_start_p, grp_end,
sc)
 : omp_siblist_move_nodes_after (grp_start_p, grp_end, sc);

where 'cl' != NULL_TREE.

After the call, 'tail_chain' alias 'list_p' looks fine - except for the trailing
'map(tofrom:S1.p [len: 8])'.

In principle, 'groups' is no longer touched - except for the 'grp->deleted'
handling, which fails (deletes the wrong stuff) because grp_start points to the
wrong tree.

Solution: Do the OMP_CLAUSE_DECL nullifying earlier such that messing around
with groups won't cause issues.


TODO: We should really find out WHY i=2's grp_start gets updated. If it happens
just for previously processed grp items, that's fine - but what will happen if
it also affects a still to be processed item? - If that indeed happens,
everything will be messed up again!
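
To make the aliasing concern concrete: the gdb output in this PR shows
grp_start as a 'tree *', i.e. a pointer to a chain slot rather than to a
clause. The following is a minimal sketch of how such a saved slot pointer can
silently start naming a different node once nodes are moved. It uses a plain
singly linked list, not GCC's tree chains; 'struct node' and 'move_after' are
hypothetical, and whether this is the exact mechanism in gimplify.cc is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

struct node { int id; struct node *next; };

/* Unlink the node held in *SLOT and re-insert it right after AFTER.
   Any other saved 'struct node **' equal to SLOT keeps pointing at the
   same slot, which now holds the former successor of the moved node.  */
static void
move_after (struct node **slot, struct node *after)
{
  struct node *moved = *slot;
  assert (moved != NULL && after != NULL && after != moved);
  *slot = moved->next;        /* the slot now names a different node */
  moved->next = after->next;
  after->next = moved;
}
```

If a group start was saved as the slot holding node 2 of the list 1, 2, 3, 4
and node 2 is then moved after node 3, dereferencing the saved slot yields
node 3, so any "delete from group start" loop would begin at the wrong node.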

* * *

The testcase shows another issue:

  target data map(to: S2.p[:N])

gets mapped as:

  map(struct:S2 [len: 1]) map(alloc:S2.p [len: 0])
   map(tofrom:*_14 [len: 400]) map(attach:S2.p [bias: 0])

before:

  map(tofrom:*_14 [len: 400]) map(attach:S2.p [bias: 0])

The problem of the former is of course that 'S' is already partially mapped and
that an alloc of length 0 will then fail already in 'target data' for the
attach:S2.p as 0 bytes aren't sufficient for a pointer attachment.

This applies both to target data and target, except that for 'target', 'S'
might appear implicitly - while for 'data' it can only appear explicitly or not
at all.

[Bug fortran/113840] New: [OpenACC] !$acc loop seq – bogus rejection of Fortran's EXIT/CYCLE + C/C++ break/continue

2024-02-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113840

Bug ID: 113840
   Summary: [OpenACC] !$acc loop seq – bogus rejection of
Fortran's EXIT/CYCLE + C/C++ break/continue
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openacc, rejects-valid
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: tschwinge at gcc dot gnu.org
  Target Milestone: ---

OpenACC seems to permit EXIT and CYCLE in "!$ACC LOOP" if there is the SEQ
clause.

The following quote is from OpenACC 3.2, but it can also be found in 2.5, a bit
less explicitly, and between the lines also in 1.0 and 2.0:

"2.9 Loop Construct" → "Restrictions"

"A loop associated with a loop construct that does not have a seq clause must
be written to
meet all of the following conditions:
– The loop variable must be of integer, C/C++ pointer, or C++ random-access
iterator type.
– The loop variable must monotonically increase or decrease in the direction of
its termination condition.
– The loop trip count must be computable in constant time when entering the
loop construct."

Currently, it fails with:

test.f90:4:6:

4 |   EXIT
  |  1
Error: EXIT statement at (1) terminating !$ACC LOOP loop

or

test.c:5:7: error: break statement used with OpenMP for loop
5 |   break;
  |   ^

 * * *

Testcases:

!$acc parallel
!$acc loop seq
do i=1, 5
  EXIT
end do
!$acc end parallel
end



void f() {
 #pragma acc parallel
  #pragma acc loop seq
for (int i=1; i < 5; i++)
  break;
}


 * * *

It seems as if the loop conditions are also relaxed, which needs to be handled
/ supported. (Not folding to OMP_FOR internally – or still? If not: at least
PRIVATE needs to be handled and the SEQ be honored.)

 * * *

Real-world testcase:

https://gitlab.dkrz.de/icon/icon-model/-/blob/release-2024.01-public/src/diagnostics/mo_tropopause.f90?ref_type=heads#L200-L213

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #3 from Tobias Burnus  ---
Inside omp_build_struct_sibling_lists, the following assignment:

11654   grp->grp_start = new_next;

has on the LHS the [3] array with value:

(gdb) p *grp
$147 = {grp_start = 0x771f9688, grp_end = 0x771f9630, mark = UNVISITED,
deleted = true, reprocess_struct = false, fragile = false, sibling = 0x0, next
= 0x0}

while

(gdb) p new_next
$146 = (tree *) 0x771f96d0

which causes the alias issue we are seeing. Before the assignment:

(gdb) p debug(*(tree*)0x771f9688)
map(tofrom:S1.b) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400])
map(attach_detach:S1.p [bias: 0])

(gdb) p debug(*(tree*)0x771f96d0)
map(tofrom:S1.p) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #2 from Tobias Burnus  ---
Inside:  omp_build_struct_sibling_lists

  new_next 
= omp_accumulate_sibling_list (region_type, code,
   struct_map_to_clause, *grpmap,
   grp_start_p, grp_end, addr_tokens,
   &inner, &fragile_p,
   grp->reprocess_struct, &added_tail);

This processing looks okay. But:

  /* Delete groups marked for deletion above.  At this point the order of the
 groups may no longer correspond to the order of the underlying list,
 which complicates this a little.  First clear out OMP_CLAUSE_DECL for
 deleted nodes...  */

  FOR_EACH_VEC_ELT (*groups, i, grp)
if (grp->deleted)
  for (tree d = *grp->grp_start;
   d != OMP_CLAUSE_CHAIN (grp->grp_end);
   d = OMP_CLAUSE_CHAIN (d))
OMP_CLAUSE_DECL (d) = NULL_TREE;


Where we have the following 4 elements. Note that grp_start is identical for
[1] and [2] – where [2] is deleted = true – which causes the OMP_CLAUSE_DECLs
to be set to NULL. Namely, 'p (*groups)[i]' for i = 0...3 gives:

$86 = (omp_mapping_group &) @0x30f7a48: {grp_start = 0x76c92070, grp_end =
0x771f96c0, mark = UNVISITED, deleted = false, reprocess_struct = false,
fragile = false, sibling = 0x0, next = 0x0}

$91 = (omp_mapping_group &) @0x30f7a70: {grp_start = 0x771f96d0, grp_end =
0x771f9678, mark = UNVISITED, deleted = false, reprocess_struct = false,
fragile = false, sibling = 0x0, next = 0x0}

$92 = (omp_mapping_group &) @0x30f7a98: {grp_start = 0x771f96d0, grp_end =
0x771f9630, mark = UNVISITED, deleted = true, reprocess_struct = false,
fragile = false, sibling = 0x0, next = 0x0}

$93 = (omp_mapping_group &) @0x30f7ac0: {grp_start = 0x771f9640, grp_end =
0x771f9708, mark = UNVISITED, deleted = false, reprocess_struct = false,
fragile = false, sibling = 0x0, next = 0x0}


Where the '*grp_start' values of [0],[1]+[2], [3] are:

map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len:
8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])
map(tofrom:S1.p)

map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p
[bias: 0]) map(tofrom:S1.p)

(gdb) p debug(*(tree*)0x771f9640)


And 'grp_end' for [0]...[3] is:

map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p [len: 400])
map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p)

map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len: 8]) map(tofrom:*S1.p
[len: 400]) map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p)

map(tofrom:S1.p)

map(attach_detach:S1.p [bias: 0]) map(tofrom:S1.p)


BEFORE that deletion loop, the result is:

(gdb) p debug(*list_p)

map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b) map(alloc:S1.p [len:
8]) map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])
map(tofrom:S1.p)

which looks fine. Obviously, after the deletion loop, all entries after
'alloc:S1.p' have OMP_CLAUSE_DECL == NULL_TREE, causing the failure.

* * *

RFC:
* Why are there two 'grp' with the same *grp_start value?
* Why does it get set to 'deleted' while its clauses are actually used?

[Bug tree-optimization/113731] [14 regression] ICE when building libbsd since r14-8768-g85094e2aa6dba7

2024-02-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113731

--- Comment #10 from Tobias Burnus  ---
(In reply to Tamar Christina from comment #9)
> (In reply to Matthias Klose from comment #8)
> > the proposed patch doesn't fix the amdgcn-amdhsa bootstrap.
> 
> So what is the error with the patch? The output can't be the same as the
> function was removed.

For what it is worth, Jakub's patch (based on Tamar's patch email),
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645062.html ,
works here, i.e. I can build GCC with the current in-tree newlib.

[Bug middle-end/113724] [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-06 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

--- Comment #1 from Tobias Burnus  ---
Debugging shows:

In gimplify_adjust_omp_clauses
(line numbers are off by 1 as I have a #pragma GCC optimize("O0") on top of the
file):

13717 groups = omp_gather_mapping_groups (list_p);
...
13720 if (groups)
13721   {
13722 grpmap = omp_index_mapping_groups (groups);
13723
13724 omp_resolve_clause_dependencies (code, groups, grpmap);
13725 omp_build_struct_sibling_lists (code, ctx->region_type,
groups,
13726 &grpmap, list_p);


On the outermost side:

(gdb) p debug(*list_p)
num_teams(-2) thread_limit(0)

(gdb) p debug(*list_p)
map(tofrom:S1.b) map(tofrom:S1.a) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400])
map(attach_detach:S1.p [bias: 0])

The latter goes into the 'if (groups)' and list_p after line 13726 is:

(gdb) p debug(*list_p)
map(struct:S1 [len: 3]) map(tofrom:S1.a) map(tofrom:S1.b)


The later ICE / segfault is because 'map(struct' has len = 3 but only two map
clauses follow.
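
The inconsistency can be expressed as a simple check. This is only a sketch
over a hypothetical clause list ('struct clause', 'enum map_kind' and
'struct_len_is_consistent' are made-up names, not GCC's OMP_CLAUSE
representation): a 'struct' entry announcing LEN members must be followed by
at least LEN further clauses, otherwise a consumer stepping LEN times runs off
the end of the chain.

```c
#include <assert.h>
#include <stddef.h>

enum map_kind { MAP_STRUCT, MAP_TOFROM, MAP_ALLOC, MAP_ATTACH };

struct clause
{
  enum map_kind kind;
  size_t len;                  /* member count; used for MAP_STRUCT only */
  const struct clause *next;
};

/* Return nonzero if the MAP_STRUCT clause S is followed by at least
   S->len further clauses in the chain.  */
static int
struct_len_is_consistent (const struct clause *s)
{
  assert (s != NULL && s->kind == MAP_STRUCT);
  const struct clause *c = s->next;
  for (size_t i = 0; i < s->len; i++)
    {
      if (c == NULL)
        return 0;  /* a blind walk of LEN steps would dereference NULL */
      c = c->next;
    }
  return 1;
}
```

For the list shown above ('struct' with len 3 followed by only 'S1.a' and
'S1.b') the check fails; with len 2 it passes.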

And the other question is: Why is 'S1.p' gone?

* * *

The 'struct' (GOMP_MAP_STRUCT) with initial length is created by
omp_accumulate_sibling_list:

11120 OMP_CLAUSE_SET_MAP_KIND (l, str_kind);
11121 OMP_CLAUSE_DECL (l) = unshare_expr (base);
11122 OMP_CLAUSE_SIZE (l) = size_int (1);

and later updated via

11462 OMP_CLAUSE_SIZE (*osc)
11463   = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
size_one_node);
11464
11465 if (reprocessing_struct)

* * *

This works fine for:
  map(tofrom:S1.b)
-> create struct with len=1

It works also for:
   map(tofrom:S1.a)
-> update struct len to 2 + add 'S1.a'

But not for:
   map(tofrom:S1.p)
-> does update len to 3 but doesn't add 'S1.p'.

I do note that at:

11382   if (attach_detach && sc == grp_start_p)

(gdb) p attach_detach
$139 = true

(gdb) p sc == grp_start_p
$140 = false

(gdb) p debug(*sc)
map(tofrom:S1.b) map(tofrom:S1.p) map(tofrom:*S1.p [len: 400])
map(attach_detach:S1.p [bias: 0])

(gdb) p debug(*grp_start_p)
map(tofrom:*S1.p [len: 400]) map(attach_detach:S1.p [bias: 0])

[Bug middle-end/113771] New: [14 Regression][GCN] ICE during GIMPLE pass: vect in vect_transform_loop tree-vect-loop.cc:11969

2024-02-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113771

Bug ID: 113771
   Summary: [14 Regression][GCN] ICE during GIMPLE pass: vect in
vect_transform_loop tree-vect-loop.cc:11969
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: tnfchris at gcc dot gnu.org
  Target Milestone: ---
Target: gcn

Still to be debugged further. It did work last week, i.e. it is a very recent
regression.

In-tree build of Newlib in the amdgcn-amdhsa build of GCC fails with -O2 (-O1
is okay):



during GIMPLE pass: vect
In file included from /home/tob/repos/gcc/newlib/libc/string/memset.c:29:
/home/tob/repos/gcc/newlib/libc/include/string.h: In function 'memset':
/home/tob/repos/gcc/newlib/libc/include/string.h:33:10: internal compiler
error: Segmentation fault
   33 | void *   memset (void *, int, size_t);
  |  ^~
0x102617f crash_signal
/home/tob/repos/gcc/gcc/toplev.cc:317
0x7efe08c4123f ???
   
/usr/src/debug/glibc-2.39/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x12d28fc gsi_prev(gimple_stmt_iterator*)
/home/tob/repos/gcc/gcc/gimple-iterator.h:236
0x12d28fc move_early_exit_stmts
/home/tob/repos/gcc/gcc/tree-vect-loop.cc:11804
0x12d28fc vect_transform_loop(_loop_vec_info*, gimple*)
/home/tob/repos/gcc/gcc/tree-vect-loop.cc:11969
0x1314321 vect_transform_loops
/home/tob/repos/gcc/gcc/tree-vectorizer.cc:1006
0x131492c try_vectorize_loop_1
/home/tob/repos/gcc/gcc/tree-vectorizer.cc:1152
0x131492c try_vectorize_loop

* * *

In the debugger:

Program received signal SIGSEGV, Segmentation fault.
0x012d28fc in gsi_prev (i=0x7fffc1a0) at
/home/tob/repos/gcc/gcc/gimple-iterator.h:236
236   gimple *prev = i->ptr->prev;


(gdb) p i->ptr
$1 = (gimple_seq_node) 0x0

11802 gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
11803 gsi_move_before (&stmt_gsi, &dest_gsi);
11804 gsi_prev (&dest_gsi);


(gdb) p debug_gimple_stmt(stmt)
# .MEM_49 = VDEF <.MEM_81>
*s_72 = _1;

(gdb) p debug_gimple_stmt(*dest_gsi->seq)
# .MEM_49 = VDEF <.MEM_81>
*s_72 = _1;
$11 = void

(gdb) p debug_gimple_stmt(*stmt_gsi->seq)
# DEBUG BEGIN_STMT

(gdb) p debug_gimple_seq(stmt_gsi->ptr)
# DEBUG s => s_48
# DEBUG n => n_46
# DEBUG BEGIN_STMT
s.3_2 = (long unsigned int) s_48;
_3 = s.3_2 & 7;
if (_3 != 0)
$14 = void
(gdb) p debug_gimple_seq(dest_gsi->ptr)


(gdb) p debug_bb(stmt_gsi->bb)
 [local count: 862990464]:
# DEBUG BEGIN_STMT
s_48 = s_72 + 1;
# DEBUG s => s_48
_1 = (char) c_22(D);
# DEBUG s => s_48
# DEBUG n => n_46
# DEBUG BEGIN_STMT
s.3_2 = (long unsigned int) s_48;
_3 = s.3_2 & 7;
if (_3 != 0)
  goto ; [94.50%]
else
  goto ; [5.50%]

$16 = void
(gdb) p debug_bb(dest_gsi->bb)
 [local count: 815525989]:
*s_72 = _1;
goto ; [100.00%]

[Bug target/113721] [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail when calling 'free' in the target regsion

2024-02-02 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113721

Tobias Burnus  changed:

   What|Removed |Added

 Resolution|--- |WORKSFORME
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Tobias Burnus  ---
Hmm, while I did see it on two systems, bisecting failed (always passing) and
now it passes with both the quicker build and full build.

I did not complete the bisecting as I did not see interesting commits in
between, hence, I cannot rule out that there was an in-between issue very
recently. Nor do I rule out some odd build issue.

Oddly, I had the very same issue on two systems.

Anyway, close as WORKSFORME.

[Bug middle-end/113724] New: [14 Regression][OpenMP] ICE (segfault) when mapping a struct in omp_gather_mapping_groups_1

2024-02-02 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113724

Bug ID: 113724
   Summary: [14 Regression][OpenMP] ICE (segfault) when mapping a
struct in omp_gather_mapping_groups_1
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57295
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57295&action=edit
compile with: gcc -fopenmp target_struct_map.4

The attached testcase from the OpenMP examples document now ICEs in the
new omp_gather_mapping_groups_1 function.

* * *

target_struct_map.4.c: In function ‘main’:
target_struct_map.4.c:46:11: internal compiler error: Segmentation fault
   46 |   #pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b)
  |   ^~~
0x1045382 crash_signal
/home/tburnus/repos/gcc/gcc/toplev.cc:317
0x7fc380e4251f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xcf8263 tree_check(tree_node*, char const*, int, char const*, tree_code)
/home/tburnus/repos/gcc/gcc/tree.h:3611
0xcf8263 omp_gather_mapping_groups_1
/home/tburnus/repos/gcc/gcc/gimplify.cc:9583
0xd0bc32 omp_gather_mapping_groups
/home/tburnus/repos/gcc/gcc/gimplify.cc:9610
0xd0bc32 gimplify_adjust_omp_clauses
/home/tburnus/repos/gcc/gcc/gimplify.cc:13733
0xd23d27 gimplify_omp_workshare

[Bug target/113721] New: [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail when calling 'free' in the target regsion

2024-02-02 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113721

Bug ID: 113721
   Summary: [14 Regression][OpenMP][nvptx] cuCtxSynchronize fail
when calling 'free' in the target regsion
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: tschwinge at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

Created attachment 57294
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57294&action=edit
Compile with -fopenmp + run with nvptx offloading

I have not fully debugged this, but OpenMP Example Document example
  devices/sources/target_ptr_map.1.c
fails now.

It works if one comments out the 'free(ptr3)' line but with it in, it fails with:

libgomp: cuCtxSynchronize error: unspecified launch failure (perhaps abort was
called)
libgomp: cuMemFree_v2 error: unspecified launch failure

* * *

* Fails with today's GCC where nvptx is configured with --with-arch=sm_80.
* Fails also with -foffload=nvptx-none=-march=sm_30

* Works with AMD GPU offload

* Works using the GCC 13 distro compiler
* Works using the GCC 13 distro compiler and
  LD_LIBRARY_PATH set to the mainline compile.

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

--- Comment #4 from Tobias Burnus  ---
Patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644181.html

It fixes this issue, but I still see two other kinds of issues for gfx1100.

[Bug libgomp/110813] [OpenMP] omp_target_memcpy_rect (+ strided 'target update'): Improve GCN performance and contiguous subranges

2024-01-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110813

--- Comment #4 from Tobias Burnus  ---
The GCN specific part has been applied to
GCC 14 mainline in commit:
https://gcc.gnu.org/g:a17299c17afeb92a56ef716d2d6380c8538493c4

Unhandled:

* Strided and optimized strided copy (incl. generic part of the linked comment
3, which still needs to be committed), the former is "[PATCH 0/5] OpenMP:
Array-shaping operator and strided/rectangular 'target update' support",
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629422.html

* Consider also to use a library function for *inter*-device copy if the device
type or the function pointer is the same.

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

Tobias Burnus  changed:

   What|Removed |Added

 CC||ams at gcc dot gnu.org

--- Comment #2 from Tobias Burnus  ---
> I'm seeing a lot of ICEs like this when running libgomp testsuite with
> offloading for gfx1030.

I wonder why Andrew S didn't see them (unless he did?). However, I did get a
similar/the same ICE for the testcase in PR113645.

I have not checked whether anything below applies to the PR as well or not but
as Andrew P has marked it as duplicate ...

* * *

Regarding PR113645: While I have no real idea about GCC backend handling, the
following possible patch SEEMS TO FIX THE ISSUE for the ICE of the testcase
with -O3 for both gfx1030 and gfx1100:

--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -4273,9 +4273,10 @@
 (define_expand "fold_left_plus_<mode>"
  [(match_operand:<SCALAR_MODE> 0 "register_operand")
   (match_operand:<SCALAR_MODE> 1 "gcn_alu_operand")
   (match_operand:V_FP 2 "gcn_alu_operand")]
-  "can_create_pseudo_p ()
+  "!TARGET_RDNA2_PLUS
+   && can_create_pseudo_p ()
&& (flag_openacc || flag_openmp
|| flag_associative_math)"
   {
 rtx dest = operands[0];

[Bug target/113645] New: [amdgcn][gfx1030][gfx1100] ICE in RTL pass: vregs with -O3: unrecognizable insn (vector reductions)

2024-01-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113645

Bug ID: 113645
   Summary: [amdgcn][gfx1030][gfx1100] ICE in RTL pass: vregs with
-O3: unrecognizable insn (vector reductions)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: ams at gcc dot gnu.org
  Target Milestone: ---
Target: amdgcn-amdhsa

Created attachment 57247
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57247&action=edit
Testcase, compile with gcc -fopenmp -O3 -foffload-options=-march=gfx1030 or
gfx1100

Found for BabelStream when compiling with AMD GPU offloading for gfx1030 or
gfx1100 and using -O3.

# git clone https://github.com/UoB-HPC/BabelStream
# cd BabelStream; mkdir build; cd build
# cmake .. -DMODEL=omp
-DCMAKE_CXX_COMPILER=$HOME/projects/gcc-trunk-offload/bin/g++ -DOFFLOAD=ON
-DCXX_EXTRA_FLAGS=-foffload=amdgcn-amdhsa=-march=gfx1100 -fopenmp
# make

 * * *

Simplified testcase attached, compile with:

gcc  -fopenmp -foffload=amdgcn-amdhsa \
 -foffload-options=amdgcn-amdhsa=-march=gfx1030 -O3

or, likewise, gfx1100.

 * * *

foo.c:5:9: error: unrecognizable insn:
5 |   #pragma omp target teams distribute parallel for simd map(tofrom:
sum) reduction(+:sum)
  | ^
(insn 144 143 145 10 (set (reg:V16SF 926)
(unspec:V16SF [
(reg:V16SF 922) repeated x2
(const_int 1 [0x1])
] UNSPEC_PLUS_DPP_SHR)) "foo.c":5:9 -1
 (nil))
during RTL pass: vregs
foo.c:5:9: internal compiler error: in extract_insn, at recog.cc:2812
0x7f6b21 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/tob/repos/gcc/gcc/rtl-error.cc:108
0x7f6b3d _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/tob/repos/gcc/gcc/rtl-error.cc:116
0x7f55e4 extract_insn(rtx_insn*)
/home/tob/repos/gcc/gcc/recog.cc:2812
0xb6b860 instantiate_virtual_regs_in_insn
/home/tob/repos/gcc/gcc/function.cc:1611
0xb6b860 instantiate_virtual_regs
/home/tob/repos/gcc/gcc/function.cc:1994
0xb6b860 execute
/home/tob/repos/gcc/gcc/function.cc:2041

[Bug libgomp/113513] [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c

2024-01-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513

--- Comment #2 from Tobias Burnus  ---
Patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643648.html

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-01-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Tobias Burnus  ---
FIXED on mainline/GCC 14.

[Bug libgomp/113513] [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c

2024-01-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513

--- Comment #1 from Tobias Burnus  ---
Looking at the called GOMP_OFFLOAD_* functions, in the failing case, there is:

...
DEBUG GOMP_OFFLOAD_run
DEBUG GOMP_OFFLOAD_dev2host
DEBUG GOMP_OFFLOAD_free
DEBUG: nvptx_attach_host_thread_to_device - 0

and in the successful case:

DEBUG GOMP_OFFLOAD_fini_device 0  <<< called before unregister
DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_FINALIZED
DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_FINALIZED

and then - in the failing case:

DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_INITIALIZED
DEBUG: GOMP_offload_unregister_ver dev=0; state=GOMP_DEVICE_INITIALIZED
DEBUG: gomp_unload_image_from_device
DEBUG GOMP_OFFLOAD_unload_image, 0, 196609
DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED
DEBUG GOMP_OFFLOAD_fini_device 0
DEBUG: nvptx_attach_host_thread_to_device - 0
libgomp: cuCtxGetDevice error: unknown cuda error


Thus, for some reason, the order of GOMP_OFFLOAD_fini_device and
GOMP_offload_unregister_ver is swapped when
  OMP_DISPLAY_ENV=true and OMP_TARGET_OFFLOAD="mandatory"
are set - but not otherwise.


The call to gomp_target_fini comes from:

  if (atexit (gomp_target_fini) != 0)
gomp_fatal ("atexit failed");


While the call to   GOMP_offload_unregister_ver  comes from mkoffload:

  fprintf (out, "static __attribute__((destructor)) void fini (void)\n"
   "{\n"
   "  GOMP_offload_unregister_ver (%#x, __OFFLOAD_TABLE__,"
   " %d/*NVIDIA_PTX*/, &nvptx_data);\n"
   "};\n",


 * * * 


Actually, the same problem occurs when compiled with:

  -foffload=disable

With that flag + no 'mandatory':

DEBUG GOMP_OFFLOAD_version
DEBUG GOMP_OFFLOAD_get_caps
DEBUG GOMP_OFFLOAD_get_num_devices 0
DEBUG GOMP_OFFLOAD_get_name
DEBUG GOMP_OFFLOAD_get_type
DEBUG GOMP_OFFLOAD_init_device 0
DEBUG: nvptx_open_device - 0
DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED
DEBUG GOMP_OFFLOAD_fini_device 0
DEBUG: nvptx_attach_host_thread_to_device - 0


And with 'mandatory' + OMP_DISPLAY_ENV=verbose:

DEBUG GOMP_OFFLOAD_version
DEBUG GOMP_OFFLOAD_get_caps
DEBUG GOMP_OFFLOAD_get_num_devices 0
DEBUG GOMP_OFFLOAD_get_name
DEBUG GOMP_OFFLOAD_get_type
< omp_display_env output>
DEBUG GOMP_OFFLOAD_init_device 0
DEBUG: nvptx_open_device - 0

libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device cannot be used for
offloading
DEBUG: gomp_target_fini; dev=0, state=GOMP_DEVICE_INITIALIZED
DEBUG GOMP_OFFLOAD_fini_device 0
DEBUG: nvptx_attach_host_thread_to_device - 

libgomp: cuCtxGetDevice error: unknown cuda error

libgomp: device finalization failed


Thus, the error message is the same – but here no offloading code exists and
just gomp_target_fini is called. - However, there is a prior call to
'gomp_fatal', which probably messes things up for the plugin handling - while
the original case has valid code.

 * * *

If there is no offloading code but
  OMP_DISPLAY_ENV=verbose OMP_TARGET_OFFLOAD="mandatory"
is used, it works:

DEBUG GOMP_OFFLOAD_version
DEBUG GOMP_OFFLOAD_get_caps
DEBUG GOMP_OFFLOAD_get_num_devices 0
DEBUG GOMP_OFFLOAD_get_name
DEBUG GOMP_OFFLOAD_get_type

OPENMP DISPLAY ENVIRONMENT BEGIN
 ...
OPENMP DISPLAY ENVIRONMENT END
DEBUG: gomp_target_fini; dev=0, state=0

 * * *

If there is only one or none of the two env vars, there is no need to search
for devices - and, hence, the nvptx plugin is not called at all and it,
obviously, works as well.

[Bug libgomp/113513] New: [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c

2024-01-19 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513

Bug ID: 113513
   Summary: [OpenMP] libgomp: cuCtxGetDevice error with
OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory"
for libgomp.c/target-52.c
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

When using both OMP_DISPLAY_ENV=true and OMP_TARGET_OFFLOAD="mandatory", the
device has to be initialized early as OMP_DEFAULT_DEVICE (either 0 or -4 =
omp_invalid_device) needs to be known before printing the ICVs.

On my system, this causes
  libgomp: cuCtxGetDevice error: unknown cuda error.

That's with "CUDA Version: 12.3" and "NVIDIA RTX A1000 6GB" with
--with-arch=sm_80.

I am somewhat sure that I have manually tested it before; our tester wasn't
able to remotely set the env vars, hence, I don't know whether it did work
there or not - nor whether it is a regression, depends on CUDA, sm_xx, my card
or ...

[Bug middle-end/113439] New: [OpenMP] Add more collapse testcases mixing precisions, in particular (unsigned) int vs. _BigInt

2024-01-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113439

Bug ID: 113439
   Summary: [OpenMP] Add more collapse testcases mixing
precisions, in particular (unsigned) int vs. _BigInt
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Follow up to PR113409 and its testcase testsuite/libgomp.c/bitint-1.c

This is only about adding a testcase.

OpenMP states:

"The iterations of some number of outer associated loops can be collapsed into
one larger logical iteration space that is the collapsed iteration space. The
particular integer type used to compute the iteration count for the collapsed
loop is implementation defined, but its bit precision must be at least that of
the widest type that the implementation would use for the iteration count of
each loop if it was the only associated loop."

Thus, when collapsing two loops with an 'int' and 'long' loop variable, the
iteration-count variable must be (at least) long.

It would be good to ensure that this also works fine when mixing
  (signed/unsigned) int, long, long long, int128_t
with
  _BitInt

in either order (int, _BitInt and _BitInt, int).
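
A minimal sketch of the shape such a testcase could take, assuming plain C
without a bit-precise type so that it also builds with older compilers. The
function name count_collapsed is hypothetical and not taken from the GCC
testsuite; in a real testcase one of the two loop variables would use the
bit-precise integer type instead of 'long long'.

```c
/* Hypothetical helper (not from the GCC testsuite): counts the iterations
   of a collapsed loop nest whose loop variables have different types.  */
long long
count_collapsed (int n_outer, long long n_inner)
{
  long long count = 0;
  /* Without -fopenmp the pragma is ignored and the loops run serially.  */
  #pragma omp parallel for collapse(2) reduction(+:count)
  for (int i = 0; i < n_outer; i++)
    for (long long j = 0; j < n_inner; j++)
      count++;
  return count;
}
```

A call such as count_collapsed (70000, 70000) has 4900000000 logical
iterations, which no longer fits into a 32-bit 'int'; that is the situation
where the iteration-count type has to follow the wider loop variable.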

[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives

2024-01-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436

--- Comment #1 from Tobias Burnus  ---
BTW: The attached testcase misses 'firstprivate', which obviously needs to be
handled as well.

[Bug middle-end/113436] New: [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives

2024-01-17 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436

Bug ID: 113436
   Summary: [OpenMP] 'allocate' clause has no effect for
(first)private on 'target' directives
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57112
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57112&action=edit
C/C++ testcase, compile with -fopenmp

The following code fails with an ABORT for the alignment check in the target
region as there is no 'omp_alloc' added for the privatized variables
(private/firstprivate).

It works in the parallel region.

See testcase.

* * *

For dynamic allocators, it depends on the WIP patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637415.html
but that should be a rather independent issue.

Found while working on PR c++/110347 – the patch being prepared for that PR
contains an '#if 0' testcase, which shall be enabled once this PR is
fixed.

[Bug fortran/108382] [12/13/14 Regression] Incorrect parsing when acc and omp coexist and -fopenmp -fopenacc is used.

2024-01-12 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108382

--- Comment #2 from Tobias Burnus  ---
Fixed-form Fortran likewise fails for:

!$acc enter
!$acc&   data
!$omp flush
!$omp&  RELEASE
 end  ! fails in this line: "Bad continuation line"

[Bug target/113288] [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512

2024-01-09 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288

--- Comment #4 from Tobias Burnus  ---
(In reply to Haochen Jiang from comment #3)
> Adding them are quite straightforward.

I guess so. Note: this PR is about the #define in gcc/config/i386, only.

> But I am not quite sure how the whole
> libgomp patch works.

OpenMP has selectors which permit choosing different functions or OpenMP
directives. Several can be evaluated at compile time, some only at runtime.

example (syntax probably not completely right):
 ... match(implementation={arch(x86_64),isa(sse4)})

Here, it can be evaluated at compile time, which is done via the target hook

TARGET_OMP_DEVICE_KIND_ARCH_ISA

For some, runtime checks are more useful and I am also not sure whether
something like cpuid would make more sense here (in general and especially for
the run-time selector).

But that's a separate issue to this PR.
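
As a rough illustration of such a compile-time selector, here is a minimal
sketch using 'declare variant'. The function names sum and sum_avx2 are
hypothetical; the sketch assumes that 'isa' is used inside the 'device'
selector set and that the ISA name matches the -m option (here: avx2).

```c
/* Hypothetical variant, meant to be picked when compiling with AVX2.  */
int
sum_avx2 (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Base function; with -fopenmp, calls may be redirected to sum_avx2 when
   the selector matches.  Without -fopenmp the pragma is simply ignored.  */
#pragma omp declare variant (sum_avx2) match (device={isa(avx2)})
int
sum (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}
```

Both functions compute the same result here, so the sketch only shows where
the selector goes, not an actual ISA-specific implementation.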

> Is the patch attempt to check whether it is a perfect match for each ISA
> detected from a hardware? If that is the case, we need them to be added.
> BTW, under this scenario, no need to add an if clause for macro __EVEX512__
> and __EVEX256__ in that patch since those two are not true ISAs.

Something like that. It is also more for completeness and consistency.

For OpenMP we just state:
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html

which is rather generic for i386/x86_64. We could do less, but the target hook
made it easy to support all of them... I don't think anyone will use
avx10.1-256 as isa context selector with OpenMP.

[Bug target/113288] New: [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512

2024-01-08 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288

Bug ID: 113288
   Summary: [i386] Missing #define for -mavx10.1-256 and
-mavx10.1-512
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: haochen.jiang at intel dot com, hongyuw at gcc dot gnu.org
  Target Milestone: ---
Target: i386,x86_64

As noted in https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642025.html

There is no #define for -mavx10.1-256 and -mavx10.1-512

By contrast, there is one for, e.g.,

__AVX10_512BIT__ and "avx10-max-512bit"
__AVX10_1__ and "avx10.1"
__AMX_FP16__ and -mamx-fp16
etc.
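
For illustration, this is how user code typically consumes such a #define; a
minimal sketch that only relies on the __AVX10_1__ macro listed above. The
function name built_with_avx10_1 is hypothetical.

```c
/* Returns 1 if this translation unit was compiled with AVX10.1 enabled
   (the compiler then predefines __AVX10_1__), and 0 otherwise.  */
int
built_with_avx10_1 (void)
{
#ifdef __AVX10_1__
  return 1;
#else
  return 0;
#endif
}
```

Without a corresponding macro for -mavx10.1-256 and -mavx10.1-512, code cannot
make the same kind of distinction for those two options.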

[Bug libgomp/113216] New: [OpenMP] Improve omp_target_is_accessible

2024-01-03 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113216

Bug ID: 113216
   Summary: [OpenMP] Improve omp_target_is_accessible
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

For omp_target_is_accessible + unified address:

* it is not 100% correct to assume that any address is accessible on the host,
  it might be a device-only address

(For handling NULL, see also PR 113213)

Likewise for the nonhost device:

* memory might be accessible even if only unified address

* * *

The host case is a bit trickier as no generic documentation seems to be
available albeit some ranges like 0x7F.. seem to denote device addresses +
a superset has to be formed.

* * *

For the device side, an API function might be available to check for it.

* * *

In case of nvptx / CUDA, the following function seems to be suitable:

CUresult cuMemGetAccess ( unsigned long long* flags,
  const CUmemLocation* location, CUdeviceptr ptr )

checking for flags == CU_MEM_ACCESS_FLAGS_PROT_READWRITE, if I understand it
correctly.

typedef enum CUmemAccess_flags_enum {
  CU_MEM_ACCESS_FLAGS_PROT_NONE      = 0x0, /**< Default, make the address range not accessible */
  CU_MEM_ACCESS_FLAGS_PROT_READ      = 0x1, /**< Make the address range read accessible */
  CU_MEM_ACCESS_FLAGS_PROT_READWRITE = 0x3, /**< Make the address range read-write accessible */
  CU_MEM_ACCESS_FLAGS_PROT_MAX       = 0x7FFFFFFF
} CUmemAccess_flags;
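
A minimal sketch of the check described above, kept free of any CUDA header so
that it stands alone. The enum is a local mirror of the three values quoted
above and the function name access_is_readwrite is hypothetical; a real
implementation would pass the flags obtained from cuMemGetAccess.

```c
/* Local mirror of the access-flag values quoted above (assumption: only
   these three values matter for the check).  */
enum mem_access_flags
{
  MEM_ACCESS_NONE = 0x0,
  MEM_ACCESS_READ = 0x1,
  MEM_ACCESS_READWRITE = 0x3
};

/* Returns 1 if FLAGS denotes read-write access, 0 otherwise.  */
int
access_is_readwrite (unsigned long long flags)
{
  return flags == MEM_ACCESS_READWRITE;
}
```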

* * *

In case of HSA/ROCm, I bet there is also some function.

For instance, hipPointerGetAttribute{,s} + hipDrvPointerGetAttributes permit to
query some pointer data.

[Bug libgomp/113213] New: [OpenMP] Update omp_target_is_present / omp_target_is_accessible handling for NULL

2024-01-03 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113213

Bug ID: 113213
   Summary: [OpenMP] Update omp_target_is_present /
omp_target_is_accessible handling for NULL
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Update omp_target_is_present / omp_target_is_accessible handling for NULL

* Including documentation

The non-public issue
   https://github.com/OpenMP/spec/issues/3287
is about to clarify that the result is false.

This has also implications for device == initial device.

[Bug middle-end/113199] New: [14 Regression][GCN] ICE (segfault) when compiling Newlib

2024-01-02 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113199

Bug ID: 113199
   Summary: [14 Regression][GCN] ICE (segfault) when compiling
Newlib
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: ams at gcc dot gnu.org, tnfchris at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56974
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56974&action=edit
Reduced testcase

This is with mainline - with the patch for PR113163 applied, i.e.
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641555.html

Compiling Newlib's libc/time/wcsftime.c fails as follows.

Here for the reduced testcase, compiling it with:

   gcc -O2 input7.i 

during GIMPLE pass: vect
input7.i: In function '__strftime':
input7.i:14:1: internal compiler error: Segmentation fault
   14 | __strftime (wchar_t *s, size_t maxsize, const wchar_t *format,
  | ^~
0x11dcbff crash_signal
src/gcc-mainline/gcc/toplev.cc:316
0x7f0481ea008f ???
   
/build/glibc-wuryBv/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x1222566 contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
src/gcc-mainline/gcc/tree.h:3757
0x1222566 verify_gimple_assign_ternary
src/gcc-mainline/gcc/tree-cfg.cc:4334
0x1227c8e verify_gimple_in_cfg(function*, bool, bool)
src/gcc-mainline/gcc/tree-cfg.cc:5602
0x10cde64 execute_function_todo
src/gcc-mainline/gcc/passes.cc:2088
0x10ce3ab execute_todo
src/gcc-mainline/gcc/passes.cc:2142

  * * *

verify_gimple_assign_ternary (stmt=0x77bdb060)
at
/net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-cfg.cc:4334
4334  tree rhs3_type = TREE_TYPE (rhs3);
(gdb) p rhs3
$1 = (tree) 0x0


(gdb) up
#1  0x01227c8f in verify_gimple_in_cfg (fn=0x77ba4228,
verify_nothrow=verify_nothrow@entry=true, ice=ice@entry=true)
at
/net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-cfg.cc:5602
5602  err2 |= verify_gimple_stmt (stmt);
(gdb) p stmt
$2 = (gimple *) 0x77bdb060
(gdb) p debug_gimple_stmt(stmt)
loop_mask_46 = VEC_PERM_EXPR ;

[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2023-12-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163

--- Comment #6 from Tobias Burnus  ---
and for that condition, we have:

3375  if (!integer_onep (*step_vector))
(gdb) p debug_tree(*step_vector)
  constant 8>

[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2023-12-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163

--- Comment #5 from Tobias Burnus  ---
While higher at the call stack:

#3  0x0148714f in vect_transform_loop
(loop_vinfo=loop_vinfo@entry=0x350f2a0,
loop_vectorized_call=loop_vectorized_call@entry=0x0)
at src/gcc-mainline/gcc/tree-vect-loop.cc:11911
11911 epilogue = vect_do_peeling (loop_vinfo, niters, nitersm1,
&niters_vector,

(gdb) p debug_tree(niters)
  constant 6>


One level down:

#2  0x01498154 in vect_do_peeling
(loop_vinfo=loop_vinfo@entry=0x350f2a0, niters=,
niters@entry=0x77bb2030, 
nitersm1=nitersm1@entry=0x77bb2c78,
niters_vector=niters_vector@entry=0x7fffda60,
step_vector=step_vector@entry=0x7fffda68, 
niters_vector_mult_vf_var=niters_vector_mult_vf_var@entry=0x7fffda70,
th=, check_profitability=, 
niters_no_overflow=, advance=)
at src/gcc-mainline/gcc/tree-vect-loop-manip.cc:3399
3399vect_update_ivs_after_vectorizer (loop_vinfo,
niters_vector_mult_vf,

where niters_vector_mult_vf is the SSA name that fails in the assert.


The variable seems to be generated a few lines up in the same function (line
3375 and following):

  if (!integer_onep (*step_vector))
    {
      /* On exit from the loop we will have an easy way of calculating
         NITERS_VECTOR / STEP * STEP.  Install a dummy definition
         until then.  */
      niters_vector_mult_vf = make_ssa_name (TREE_TYPE (*niters_vector));
      SSA_NAME_DEF_STMT (niters_vector_mult_vf) = gimple_build_nop ();
      *niters_vector_mult_vf_var = niters_vector_mult_vf;
    }
  else
    vect_gen_vector_loop_niters_mult_vf (loop_vinfo, *niters_vector,
                                         &niters_vector_mult_vf);

[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2023-12-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163

--- Comment #2 from Tobias Burnus  ---
Created attachment 56958
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56958&action=edit
Reduced testcase (

$ amdgcn-amdhsa-gcc -g -O2 inp5.i -march=gfx900

during GIMPLE pass: vect
inp5.i: In function '_l64a_r':
inp5.i:4:1: internal compiler error: in vect_peel_nonlinear_iv_init, at
tree-vect-loop.cc:9420
4 | _l64a_r (struct _reent *rptr,
  | ^~~
0x94cc17 vect_peel_nonlinear_iv_init(gimple**, tree_node*, tree_node*,
tree_node*, vect_induction_op_type)
   
/net/build5-fossa-cs/scratch/tburnus/fsf.mainline.x86_64-linux-gnu-amdgcn/src/gcc-mainline/gcc/tree-vect-loop.cc:9420
0x148bb04 vect_update_ivs_after_vectorizer
src/gcc-mainline/gcc/tree-vect-loop-manip.cc:2267
0x1498153 vect_do_peeling(_loop_vec_info*, tree_node*, tree_node*, tree_node**,
tree_node**, tree_node**, int, bool, bool, tree_node**)
src/gcc-mainline/gcc/tree-vect-loop-manip.cc:3399
0x148714e vect_transform_loop(_loop_vec_info*, gimple*)
src/gcc-mainline/gcc/tree-vect-loop.cc:11911
0x14c9544 vect_transform_loops
src/gcc-mainline/gcc/tree-vectorizer.cc:1006
0x14c9bc3 try_vectorize_loop_1
src/gcc-mainline/gcc/tree-vectorizer.cc:1152
0x14c9bc3 try_vectorize_loop
src/gcc-mainline/gcc/tree-vectorizer.cc:1182
0x14ca224 execute
src/gcc-mainline/gcc/tree-vectorizer.cc:1298

 * * *

Breakpoint 1, vect_peel_nonlinear_iv_init (stmts=0x7fffd698,
init_expr=0x77a56948, skip_niters=0x77bb8798, 
step_expr=0x77a64ba0, induction_type=vect_step_op_shr)
at src/gcc-mainline/gcc/tree-vect-loop.cc:9420
9420  gcc_assert (TREE_CODE (skip_niters) == INTEGER_CST);

(gdb) p debug_tree(skip_niters)
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x77b5e3f0 precision:32 min  max >

def_stmt GIMPLE_NOP
version:54>
$2 = void

[Bug middle-end/113163] New: [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2023-12-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163

Bug ID: 113163
   Summary: [14 Regression][GCN] ICE in
vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: tamar.christina at arm dot com
  Target Milestone: ---
Target: amdgcn-amdhsa

The ICE happens when building Newlib with the GCN compiler:

during GIMPLE pass: vect
In file included from src/accel_newlib-mainline/newlib/libc/stdlib/l64a.c:24:
src/accel_newlib-mainline/newlib/libc/include/stdlib.h: In function 'l64a':
src/accel_newlib-mainline/newlib/libc/include/stdlib.h:195:9: internal compiler
error: in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420
  195 | char *  l64a (long __input);
  | ^~~~

[Bug middle-end/113067] [OpenMP][5.1] Context selector - handle 'implementation={requires(...)}'

2023-12-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113067

--- Comment #1 from Tobias Burnus  ---
Created attachment 56901
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56901&action=edit
Simple testcase (C and Fortran) -  as same-directory .diff

[Bug middle-end/113067] New: [OpenMP][5.1] Context selector - handle 'implementation={requires(...)}'

2023-12-18 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113067

Bug ID: 113067
   Summary: [OpenMP][5.1] Context selector - handle
'implementation={requires(...)}'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

OpenMP 5.1 added:
  'implementation={requires(...)}'
where ... = unified_shared_memory or unified_address etc.

OpenMP 5.0 only had, e.g.
  'implementation={unified_shared_memory}'

The former is not yet handled.

* * *

With the about to be committed patch,
  https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640817.html
which is actually at
  https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639797.html
the Fortran parser in principle handles it (when removing the 'sorry') and adds
'unified_shared_memory' and 'requires' according to -fdump-tree-*.

For C/C++, it does ICE - which means that more work is required.

And, in either case, it depends on how we want to handle it in the internal
representation.

=> Attached parse-only testcase.

* * *

Independent of this, I am not sure whether we do handle this requirement
correctly.

Namely, for:

(A)  'implementation={unified_shared_memory}'
i.e. those which change depending on 'omp requires unified_shared_memory'
being set or not.

(B)  'implementation={dynamic_allocators}'
which is currently ignored rather early as it is always true for GCC.

(C)  'implementation={atomic_default_mem_order(acq_rel)}'

The latter is quite interesting as - at least in Fortran - multiple values are
permitted per file (to be checked) and I am not quite sure whether the value is
really handled in the ME.

[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing

2023-12-11 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639

--- Comment #5 from Tobias Burnus  ---
Posted a patch for (A)
  https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639947.html
but it seems as if I might have misunderstood some parts of the example at

  OpenMP spec issue #1796 (TRAC864) / OpenMP Pull Req. #912

Thus, this needs to be rechecked. - It might be that the current state of
mainline is just fine, that some parts of this patch still make sense, or that
more issues exist.

[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing

2023-12-06 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639

--- Comment #4 from Tobias Burnus  ---
There are *two* independent issues:

(A) Predefined firstprivate does not find mappings done in the same
directive, e.g.

  int a[100];
  int *p = &a[0];
  #pragma omp target teams distribute map(a)
p[0] = 5;


(B) The base pointer is not stored, hence, the following fails:

  int a[100];
  int *p = &a[0];
  //   #pragma omp target enter data map(a[10:])  /* same result */
  #pragma omp target teams distribute map(a[10:])
p[15] = 5;

Here,
   map(a[10:])  /* or: map(a[start:n])  */
gives:
   map(tofrom:a[start] [len: _7])
  map(firstprivate:a [pointer assign, bias: D.2943])

But then the basepointer is gone. Thus, any later lookup of an address that
falls between basepointer and first mapped storage location is not found.
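
To make the failing lookup concrete, here is a minimal sketch of an
address-range lookup. It is not the libgomp data structure; struct mapping and
find_mapping are hypothetical names, and the table only records the mapped
section, not the base pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mapped host range [host_start, host_end).  */
struct mapping
{
  uintptr_t host_start;
  uintptr_t host_end;
};

/* Returns the index of the mapping that contains ADDR, or -1 if none.  */
int
find_mapping (const struct mapping *maps, size_t n, uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    if (addr >= maps[i].host_start && addr < maps[i].host_end)
      return (int) i;
  return -1;
}
```

With only a[10:] recorded, a lookup of &a[15] succeeds, while a lookup of
p == &a[0] returns -1 because that address lies before the first mapped
element. Storing the base pointer alongside the range would allow such a
lookup to succeed.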

[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing

2023-12-05 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639

--- Comment #3 from Tobias Burnus  ---
Created attachment 56804
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56804&action=edit
gimplify.cc patch to ensure that GOVD_MAP_0LEN_ARRAY comes last (does not fix
the issue)

I tried the attached patch to see whether it fixes the problem. It doesn't as the
pointer-lookup-for-attachment seems to happen in an earlier 'for' loop than the
'for' loop that does the actual mapping for clauses on the same 'target'
directive (→ gomp_map_vars_internal).

Thus, either this patch is not required - or it is only required in addition;
in any case, it seems as if libgomp/target.c's gomp_map_vars_internal needs to
be modified.

[Bug middle-end/112779] New: [OpenMP] Support omp Metadirectives

2023-11-30 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112779

Bug ID: 112779
   Summary: [OpenMP] Support omp Metadirectives
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

There is a rather complete support for metadirectives on the OG13 branch, i.e.
devel/omp/gcc-13 branch.

Several patches have been already posted.

* * *

Known issues with those patches:

(A) OpenMP 5.2 renamed the 'default' clause: it is now (also) 'otherwise'

(B) ---
ICE (segfault) for "kind: nohost" in default()

og12-offload/testlogs-2023-05-04 shows several 'internal compiler error:
Segmentation fault' for the 'default' clause of the metadirectives

→ sollve_vv's test_metadirective_target_device{,_kind{,_any}}.c testcases.

The problem for one test case at least is in omp-general.cc's omp_dynamic_cond:

'kind_sel' = {purpose = "kind", value = "{ purpose: "nohost", value: NULL}" } -
and accessing TREE_VALUE (TREE_VALUE (kind_sel)).

That's for the following code:



  tree kind_sel = omp_get_context_selector (ctx, "target_device", "kind");
  if (kind_sel)
    {
      const char *str
        = (TREE_VALUE (TREE_VALUE (kind_sel))
           ? TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (kind_sel)))
           : IDENTIFIER_POINTER (TREE_PURPOSE (TREE_VALUE (kind_sel))));
      kind = build_string_literal (strlen (str) + 1, str);
    }



I wonder why that's not already handled in gcc/omp-general.cc's
omp_context_selector_matches (which has some code) – but it might indeed only
be available at run time?!?


(C) --

I wonder whether libgomp/target.c's GOMP_evaluate_target_device lacks a check for
kind == "nohost" — I only see "host" (for the host) and "gpu" (for the GPU) and
the generic "any".
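
A minimal sketch of what a kind check including "nohost" could look like. This
is not the libgomp code; enum device_class and kind_matches are hypothetical
names, and the sketch assumes that "nohost" simply means any device other than
the host.

```c
#include <string.h>

/* Hypothetical classification of the device being evaluated.  */
enum device_class
{
  DEVICE_HOST,
  DEVICE_GPU,
  DEVICE_OTHER
};

/* Returns 1 if the 'kind' trait value KIND matches device class DEV.  */
int
kind_matches (const char *kind, enum device_class dev)
{
  if (strcmp (kind, "any") == 0)
    return 1;
  if (strcmp (kind, "host") == 0)
    return dev == DEVICE_HOST;
  if (strcmp (kind, "nohost") == 0)
    return dev != DEVICE_HOST;
  if (strcmp (kind, "gpu") == 0)
    return dev == DEVICE_GPU;
  return 0;  /* Unknown kind: treat as not matching.  */
}
```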


(D) -

Fortran's DO with do-end-label{}

Fortran permits loops with label instead of a simple END DO, example:



DO 123 i=1,5
  DO 123 j=1,5
123 CONTINUE  ! or '123 END DO' - Note the *shared* end-do-label (which is
invalid/deleted since F2018; before that, deprecated but valid)

Such code is not handled with metadirectives as already indicated at
gcc/fortran/parse.cc's parse_omp_metadirective_body:





case_omp_do:
  st = parse_omp_do (clause->stmt);
  /* TODO: Does st == ST_IMPLIED_ENDDO need special handling?  */
  break; 

The answer seems to be yes — failing testcase is the following ("Error: END DO
statement expected"):



implicit none
integer :: i, j, psi(5,5)
!$omp  metadirective  &
!$omp&when(user={condition(.false. )}: target teams  &
!$omp& distribute parallel do simd collapse(2))  &
!$omp&when(user={condition(.false. )}: target teams  &
!$omp& distribute parallel do)  &
!$omp&default(target teams loop collapse(2))
  DO 50 I=1,5
!$omp  metadirective  &
!$omp& when(user={condition(.false. )}: simd)   
  DO 51 J=1,5
PSI(j,i) = j
   51 CONTINUE
   50 CONTINUE
end 


(E) ---

internal compiler error: in c_parser_omp_metadirective, at c/c-parser.cc:26565
   11 |  #pragma omp metadirective when (user = { condition (USE_GPU == 1) } :
target enter data map(alloc : number[ : SIZE]))

for:

#include <stdio.h>
#include <omp.h>

int main(int argc, char ** argv){

  const int SIZE = 10;
  int USE_GPU = 1;
  double number[SIZE];
  double *number_d;
  #pragma omp metadirective when (user = { condition (USE_GPU == 1) } : target enter data map(alloc : number[ : SIZE]))
  if (USE_GPU)
    number_d = (double *)omp_get_mapped_ptr(number,
                                            omp_get_default_device());
  else
    number_d = number;
  printf("number_d = %pnumber= %p\n", number_d, number);

  return 0;
}


(F) 

For C/C++, begin/end metadirective is not handled - it is for Fortran, where it
is much more useful.

Note: It is less useful than it sounds. From an internal bug tracker:

This was a deliberate design decision. From the OpenMP 5.0 spec (2.3.4):

"The begin metadirective directive behaves identically to the metadirective
directive, except that the directive syntax for the specified directive
variants must accept a paired end directive."

so having 'target enter data' in a 'begin metadirective' is invalid.

The only OpenMP directive supported in GCC that takes an end directive in C/C++
is 'declare target' (is this still true?), and we have already said that we
would not support declarative constructs in metadirectives. So the 'begin/end
metadirective' support was left out in C/C++.

(Invalid) testcase:

#pragma omp begin metadirective  when (user = { condition 

[Bug middle-end/112763] New: [OpenMP] ICE in gimplify_adjust_omp_clauses, at gimplify.cc:13238 – with defaultmap(firstprivate) for C++ member variables

2023-11-29 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112763

Bug ID: 112763
   Summary: [OpenMP] ICE in gimplify_adjust_omp_clauses, at
gimplify.cc:13238 – with defaultmap(firstprivate) for
C++ member variables
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56718
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56718&action=edit
Testcase; compile with 'g++ -fopenmp -DDEFAULT_FIRSTPRIVATE' for this PR or
without -D for PR110347

The following code uses a C++ member variable and 'defaultmap(firstprivate)'
causes an ICE.

I am not quite sure what the result should be but an ICE is surely wrong.

Vaguely related to OpenMP spec issue #3343, strongly related is my email today
to the omp-lang@ spec email list.

* * *

In member function ‘void myClass::tgt()’:
28:14: internal compiler error: in gimplify_adjust_omp_clauses, at
gimplify.cc:13238
   28 |  #pragma omp target defaultmap(firstprivate) private(d) if (0)
  |  ^~~


0x880609 gimplify_adjust_omp_clauses
../../repos/gcc-trunk-commit/gcc/gimplify.cc:13238
0x10094b4 gimplify_omp_workshare
../../repos/gcc-trunk-commit/gcc/gimplify.cc:15783


Compile the attached testcase with:
 g++ -fopenmp -DDEFAULT_FIRSTPRIVATE
Note: it compiles with  -DNON_MEMBER (and fails at runtime, which might be
correct).

See also PR 110347 for the case of unset DEFAULT_FIRSTPRIVATE, i.e. using
  firstprivate(member_var)  instead of   default{,map}(firstprivate)

[Bug middle-end/112667] New: [OpenMP] C++: Handle static local variable in target regions

2023-11-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112667

Bug ID: 112667
   Summary: [OpenMP] C++: Handle static local variable in target
regions
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp, rejects-valid
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

Cross ref, for initialization of static global C++ variables, see:

[PATCH] OpenMP: Constructors and destructors for "declare target" static
aggregates
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618340.html 

* * *

Based on Thomas' email to
https://mailman.openmp.org/mailman/private/omp-lang/2023/018626.html

Assume 'declare target' as needed:

struct S
{
  S() { }
  ~S() { }
};

static void f()
{
  static S s;
}

int main()
{
#pragma omp target
  {
f();
  }
}

This fails in GCC as:

"error: variable ‘_ZGVZL1fvE1s’ has been referenced in offloaded code but
hasn’t been marked to be included in the offloaded code"

where "c++filt _ZGVZL1fvE1s" prints: "guard variable for f()::s"
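
For readers unfamiliar with guard variables, the following is a minimal C
sketch of what the compiler conceptually emits for 'static S s;' inside f().
It is deliberately simplified and not thread-safe (the real C++ code uses the
__cxa_guard_* functions); all names except 'f' and 's' are hypothetical.

```c
struct S
{
  int constructed;
};

/* Counts how often the stand-in "constructor" has run.  */
static int init_calls;

/* Hypothetical stand-in for the C++ constructor S::S().  */
static void
S_construct (struct S *s)
{
  s->constructed = 1;
  init_calls++;
}

/* Conceptual expansion of a function containing 'static S s;'.  */
struct S *
f (void)
{
  static char guard;   /* plays the role of "guard variable for f()::s" */
  static struct S s;
  if (!guard)
    {
      S_construct (&s);
      guard = 1;
    }
  return &s;
}

/* Accessor so that callers can observe the number of initializations.  */
int
get_init_calls (void)
{
  return init_calls;
}
```

Because both 's' and its guard are referenced from f(), both have to be
available in the offloaded code once f() is called inside a target region.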

 * * *

Regarding the validity, Tom replied:

It is valid OpenMP offload code, we have specific language to handle the 
initialization as well actually.  We don't explicitly state that the 
initialization should be protected, but it falls through from the base 
language rules.  For cases where the static has a separate corresponding 
instance on the device, I would expect it to have its own guard variable 
on the device (probably _each_ device actually) as well. For cases with 
unified shared memory, or generally where the static is using the same 
storage as the original, I would expect the guard to use node scope 
rather than device scope.

The challenging question is whether the offload device is allowed to 
actually do the initialization in this case (or language says each 
device initializes its own instance, but this is control flow dependent 
now).  In an ideal world it would probably be something like a reverse 
offload that does it for the unified case, but I'm pretty sure following 
the requirements through would say "whichever thread on whichever device 
gets there first" is the one that does it.

[And continued:]

[...]we have some language about how 
initialization happens.  I can try to dig it out if you like but 
essentially it boils down to this:

* static lifetime variables at global/class scope are initialized before 
code runs in the same TU, by each device which has a separate instance
* at function scope initialized when first encountered

* * *

Jakub remarked:

I believe we should in the omp_discover_* sub-pass handle with
a help of a langhook automatically mark the guard variables (possibly
iff the guarded variable is marked?), or e.g. rtti info (_ZTS*, _ZTI*)
and eventually figure out what we should do about virtual tables (_ZTV*).
The last case is most complicated, as it contains function pointers, and we
need to figure out if we mark all methods, or say replace some pointers in
the virtual table with NULLs or something that errors or terminates if it
isn't marked.

And sure, __cxa_guard_* would need to be implemented in the offloading
libsupc++.a or libstdc++.a.

* * *

Side remark regarding virtual tables: OpenMP since 5.2 has in "13.8 target
Construct" [287:10-12]:

"[C++] Invoking a virtual member function of an object on a device other than
the device on which the object was constructed results in unspecified behavior,
unless the object is accessible and was constructed on the host device."

Thanks to OpenMP 5.1's 'indirect' we already have a means to lookup on the
device the function pointers on the host. (Implicitly assumes unified_address,
which is the case of all offload devices in GCC.)

[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing

2023-11-21 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639

--- Comment #2 from Tobias Burnus  ---
> If 'a' is already present on the device (e.g. 'omp target enter data 
> map(a)'), it works.

This applies both to the comment 0 example, where only a section of 'a' is
mapped (start > 0), and to the comment 1 example, where the whole of 'a' is
mapped.

It also works fine if 'p' points inside 'a'.

* * *

As spec ref:

TR12 states in "14.8 target Construct" [379:8-10]:

"[C/C++] If a list item in a map clause has a base pointer that is
predetermined firstprivate (see Section 6.1.1) and on entry to the target
region the list item is mapped, the firstprivate pointer is updated via
corresponding base pointer initialization."


OpenMP 5.1 has in the mentioned C/C++-only section "2.21.7.2 Pointer
Initialization for Device Data Environments" that is too long to be quoted.


[The TR12 wording 'on entry to the target region' makes it clear that the
mapping effectively has to be ordered before the pointer initialization. The
5.1 wording is a bit unclear about whether the data can be mapped by that very
target construct or whether the storage needs to be present before the target
directive. However, the examples in OpenMP issue #1796 imply that 5.1 also
permits mapping the data and the pointer on the same directive.]

* * *

The implicit handling of the 'p' in this example happens in gimplify.cc's
gimplify_adjust_omp_clauses_1 for 'else if (code == OMP_CLAUSE_MAP && (flags &
GOVD_MAP_0LEN_ARRAY) != 0)'.

[Bug middle-end/110639] [OpenMP][5.1] Predefined firstprivate for pointers - attachment missing

2023-11-21 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110639

--- Comment #1 from Tobias Burnus  ---
Testing shows that the offsets are correctly handled but that there is an
ordering problem. Example:

#include <stdint.h>

int
main ()
{
  int a[100] = {};
  int *p = &a[0];
  uintptr_t iptr;
  #pragma omp target map(a, iptr)
    iptr = (uintptr_t) p;
  return 0;
}

This will fail - as the implicitly added 'firstprivate' arrives too early at
GOMP_target_ext - before 'a' is mapped:

   map(alloc:MEM[(char *)p] [len: 0])
 map(firstprivate:p [pointer assign, bias: 0])
   map(tofrom:iptr [len: 8])
 map(tofrom:a [len: 400])

If 'a' is already present on the device (e.g. 'omp target enter data map(a)'),
it works.

Solution: The implicitly mapped C/C++ pointer variable 'p' must be added at the
end of the clauses.
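
For comparison, here is a minimal C++ sketch of both variants discussed above: mapping 'a' on the same directive as the implicitly firstprivate pointer, and pre-mapping 'a' with 'target enter data'. The function names are hypothetical. If the code is built without OpenMP support, the pragmas are ignored and both functions simply run on the host.

```cpp
#include <cassert>
#include <cstdint>

// Variant from the example above: 'a' is mapped by the same directive
// that implicitly makes the pointer 'p' firstprivate.
std::uintptr_t pointer_in_target_same_directive()
{
  int a[100] = {};
  int *p = &a[0];
  std::uintptr_t iptr = 0;
  #pragma omp target map(a, iptr)
  iptr = (std::uintptr_t) p;
  return iptr;
}

// Workaround variant: 'a' is already present on the device when the
// target region is entered, so the pointer lookup can find it.
std::uintptr_t pointer_in_target_premapped()
{
  int a[100] = {};
  int *p = &a[0];
  std::uintptr_t iptr = 0;
  #pragma omp target enter data map(to: a)
  #pragma omp target map(iptr)
  iptr = (std::uintptr_t) p;
  #pragma omp target exit data map(release: a)
  return iptr;
}

void check_pointer_is_set()
{
  // With host fallback, 'p' keeps its host address; with offloading it
  // should be the device address of 'a'. Either way it is non-null.
  assert(pointer_in_target_premapped() != 0);
}
```

With working offloading, both functions should return the device address of 'a'; according to the description above, only the pre-mapped variant currently does.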

[Bug c++/110347] [OpenMP] private/firstprivate of a C++ member variable mishandled

2023-11-21 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110347

--- Comment #2 from Tobias Burnus  ---
An explicit 'firstprivate(x)' will be turned in the compiler from a FIELD_DECL
to:

int D.2935 [value-expr: ((struct t *) this)->x];
  #pragma omp target firstprivate(D.2934) firstprivate(D.2935)
{
  (void) (D.2935 = 5)

in semantics.cc's omp_privatize_field, called by finish_omp_clauses for
handle_field_decl. The gimple dump then looks like:

  int x [value-expr: ((struct t *) this)->x];
  #pragma omp target ... firstprivate(x)
  map(alloc:MEM[(char *)this] [len: 0])
  map(firstprivate:this [pointer assign, bias: 0])
{
  this->x = 5;

i.e. there is already a pointless 'this' mapping.

For 'private', we could do in omp_privatize_field a simple:
  v = build_decl (input_location, VAR_DECL, DECL_NAME (t),
  TREE_TYPE (t));
but for 'firstprivate' that would miss the initialization - and adding a
pointless assignment is not really the best, especially not for larger objects
(like structs, arrays, reference types).

* * *

And for 'defaultmap(firstprivate)', the current code already adds 'this'
mapping in the original dump:

  #pragma omp target map(tofrom:*(struct t *) this [len: 44])
map(firstprivate:(struct t *) this [pointer assign, bias: 0])
defaultmap(firstprivate:all)
{
  {
  (void) (((struct t *) this)->x = 5);

That is due to 'finish_omp_target_clauses_r' setting
data->this_expr_accessed = true; the 'this' mapping either needs to be
suppressed there or removed again later in the middle end (ME). The former has
the problem that 'defaultmap(firstprivate)' is not handled there; the latter
means a special case, ensuring that the mapping is only removed if all member
accesses are for firstprivatized members.

While 'private' does not exist as defaultmap, compiler-internal handling or
(not checked) predefined/implicit mapping might.
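
A hypothetical reproducer in the spirit of the dumps above might look as follows. The struct layout and member names are assumptions for illustration, not the test case from the bug report. The sketch only relies on the value observed inside the region, since the value of the member afterwards depends on whether the clause is honored.

```cpp
#include <cassert>

struct t {
  int x = 1;
  int pad[10] = {};   // hypothetical padding so that the object is larger than 'x'

  int set_in_target()
  {
    int seen = 0;
    // 'x' is a non-static data member; firstprivate should give the
    // region its own copy and leave this->x untouched.
    #pragma omp target firstprivate(x) map(from: seen)
    {
      x = 5;
      seen = x;
    }
    return seen;
  }
};

int demo()
{
  t obj;
  int seen = obj.set_in_target();
  assert(seen == 5);
  // Expected: 1 if 'x' was privatized, 5 if the pragma was ignored
  // (e.g. when compiling without OpenMP support).
  return obj.x;
}
```

Ideally, the generated code for such a region would not need any mapping of 'this' at all, since only the privatized copy of 'x' is accessed.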

[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:

2023-11-21 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634

--- Comment #5 from Tobias Burnus  ---
@Kostadin: Sebastian posted a patch at
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637451.html
that should be fine as a workaround, even if it is not completely correct, cf.
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637543.html

[Bug fortran/110415] (Re)allocation on assignment to allocatable polymorphic variable from allocatable polymorphic function result

2023-11-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110415

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #3 from Tobias Burnus  ---
Andrew Jenner's submitted patch (gcc-patches@ only):
  https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636671.html
and (fortran@ only):
  https://gcc.gnu.org/pipermail/fortran/2023-November/059928.html
(Replies should go to both lists ...)

* * *

Technically, it is a regression caused by
 https://gcc.gnu.org/r13-6747-gd7caf313525a46f200d7f5db1ba893f853774aee
but before that commit there was no finalization.

Comparing the versions:
  GCC 7+8: ICE in build_function_decl
  GCC 10+11+12: memory leak in 'func'
  GCC 13+mainline: segfault at runtime  (at 'a = func()' in the main program).

* * *

I had analyzed the issue elsewhere; let's copy it here for completeness and
possibly to aid the patch review. (Note: The following was written before the
patch was written and analyzes the status before the patch.)

---_vptr;  // save old value of vptr
D.4328 = func ();   // new value

desc.0.data = (void * restrict) D.4328._data;
// As scalar, there is not really a problem, but an
//desc.0.dtype.elem_len = D.4328->_vptr->size;
// is missing here.
desc.0.span = (integer(kind=8)) desc.0.dtype.elem_len;

if (__builtin_expect ((integer(kind=8)) (a->_data == 0B), 0, 42))
a->_data = (struct p *) __builtin_malloc (MAX_EXPR <(unsigned long)
a->_vptr->_size, 1>);
  // WRONG: That should use D.4328->_vptr->size!

else
  {
if (a->_vptr != D.4349)
  {
__builtin_realloc ((void *) a->_data, a->_vptr->_size);

Likewise: a->_vptr should be D.4328->_vptr.

Alternatively, a->_vptr would have to be updated before the 'if' block.
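
The intended behavior can be modeled in a few lines of C++. This is only a sketch of the logic and not the code gfortran generates; the types vtab and class_obj and the function assign are hypothetical stand-ins for the vtable, the class container, and the (re)allocation on assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct vtab { std::size_t size; };                    // stands in for _vptr->_size
struct class_obj { const vtab *vptr; void *data; };   // stands in for the class container

// Assign 'src' to 'dst', (re)allocating with the size of the SOURCE's
// dynamic type, which is the point of the analysis above.
void assign(class_obj &dst, const class_obj &src)
{
  // Mirrors MAX_EXPR <size, 1>: never request zero bytes.
  std::size_t n = src.vptr->size ? src.vptr->size : 1;

  if (dst.data == nullptr)
    dst.data = std::malloc(n);
  else if (dst.vptr != src.vptr)
    dst.data = std::realloc(dst.data, n);
  assert(dst.data != nullptr);

  dst.vptr = src.vptr;   // update the dynamic type together with the storage
  std::memcpy(dst.data, src.data, src.vptr->size);
}
```

Using the size of the old dynamic type of 'dst' instead, as in the dump above, would under-allocate whenever the function result has a larger dynamic type.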

[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:

2023-11-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634

--- Comment #3 from Tobias Burnus  ---
Breakpoint 6, gen_assign_counter_update (gsi=0x7fffcab0,
call=0x77230b48, func=0x7736cb00, result=0x77200b98, name=0x258f5f2
"PROF_time_profile") at ../../../repos/gcc/gcc/tree-profile.cc:247

(gdb) p debug_gimple_stmt(call)
__atomic_add_fetch_8 (&__gcov_time_profiler_counter, 1, 0);

(gdb) p debug_tree(result)
 
BLK

(gdb) p debug_tree(func)
  32
+ ? BUILT_IN_ATOMIC_ADD_FETCH_8:
+ BUILT_IN_ATOMIC_ADD_FETCH_4);
+  gcall *call = gimple_build_call (f, 3, addr, one, relaxed);
+  gen_assign_counter_update (gsi, call, f, result, name);

with the new gen_assign_counter_update:

+  if (result)
+{
+  tree result_type = TREE_TYPE (TREE_TYPE (func));

+  tree tmp = make_temp_ssa_name (result_type, NULL, name);
+  gimple_set_lhs (call, tmp);
+  gsi_insert_after (gsi, call, GSI_NEW_STMT);
+  gassign *assign = gimple_build_assign (result, tmp);
+  gsi_insert_after (gsi, assign, GSI_NEW_STMT);

* * *

Thus, it looks as if the 'result_type' of f (alias func) is unsigned while
everything else is signed.
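
The kind of mismatch can be illustrated at the source level with the type-generic GCC builtin. This is only a sketch: time_counter is a hypothetical stand-in for a signed 64-bit counter, and the code says nothing about how tree-profile.cc should be fixed. The third argument 0 in the dumped call corresponds to the relaxed memory order ('relaxed' in the patch excerpt).

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical signed 64-bit counter for illustration.
static std::int64_t time_counter = 0;

// The literal 0 passed as memory order in the dump is the relaxed order.
static_assert(__ATOMIC_RELAXED == 0, "relaxed memory order is encoded as 0");

std::int64_t bump_counter()
{
  // With the type-generic builtin, the result type follows the pointed-to
  // type, so assigning it to a signed variable needs no conversion.
  auto updated = __atomic_add_fetch(&time_counter, std::int64_t{1},
                                    __ATOMIC_RELAXED);
  static_assert(std::is_same<decltype(updated), std::int64_t>::value,
                "result type follows the counter type");
  return updated;
}

void check_consecutive_updates()
{
  std::int64_t first = bump_counter();
  std::int64_t second = bump_counter();
  assert(second == first + 1);
}
```

In GIMPLE no such implicit conversion exists, so a call whose function type returns an unsigned value cannot be assigned directly to a signed result without an explicit conversion statement.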

[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:

2023-11-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634

--- Comment #2 from Tobias Burnus  ---
That's r14-5578-ga350a74d6113e3

[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:

2023-11-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sebastian.huber@embedded-brains.de

--- Comment #1 from Tobias Burnus  ---
Bisecting points to:

commit a350a74d6113e3a84943266eb691275951c109d9 (HEAD)
Author: Sebastian Huber 
Date:   Sat Oct 21 15:52:15 2023 +0200

gcov: Add gen_counter_update()
