https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #2 from Tom de Vries ---
(In reply to Tom de Vries from comment #1)
> Can't reproduce.
>
> Is this not fixed by:
> ...
> commit 7862f6ccd85a001e4d70abb00bb95d8c7846ba80
> Author: Tom de Vries
> Date: Wed Feb 23 09:33:33 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #1 from Tom de Vries ---
Can't reproduce.
Is this not fixed by:
...
commit 7862f6ccd85a001e4d70abb00bb95d8c7846ba80
Author: Tom de Vries
Date: Wed Feb 23 09:33:33 2022 +0100
[nvptx] Fix dummy location in gen_comment
...
?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #9 from Tom de Vries ---
Created attachment 52647
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52647&action=edit
Tentative patch with test-cases, rationale and changelog
I'll put this through testing, and submit if no problems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #8 from Tom de Vries ---
(In reply to Jakub Jelinek from comment #6)
> And yes, #c1 is valid.
Thanks for confirming.
> But would be nice to have similar test with && and
> initial result = 2; and arr[] say { 1, 2, 3, 4, 5, 6, 7,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #7 from Tom de Vries ---
Alternative fix that doesn't require fiddling with the 'code' var:
...
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index d932d74cb03..d0ddd4a6142 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6734,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #4 from Tom de Vries ---
This fixes it:
...
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index d932d74cb03..f2ac8f98e32 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6734,7 +6734,21 @@ lower_rec_input_clauses (tree clauses,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #3 from Tom de Vries ---
Hmm, that seems to be actually due to:
...
if (sctx.is_simt)
{
if (!simt_lane)
simt_lane = create_tmp_var
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #2 from Tom de Vries ---
I think the problem can be seen already at omp-lower, in the body of the
butterfly loop.
Let's first look at what we have if we use reduction op '|':
...
D.2173 = .GOMP_SIMT_VF ();
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
Tom de Vries changed:
What|Removed |Added
Keywords||openmp
--- Comment #1 from Tom de Vries
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #3 from Tom de Vries ---
The OvO testsuite passes when run at -O2, because all .alias instances are
inlined.
At -O0 they are not, and it fails. With -foffload=-malias that's fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #2 from Tom de Vries ---
So, what do we get after specifying -malias -mptx=6.3?
Alias attribute only for functions, not variables.
No support for weak alias (allowing this does compile, but we run into
execution fails in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #1 from Tom de Vries ---
Created attachment 52636
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52636&action=edit
Tentative patch
Patch that I'm currently working on.
Adds -malias, off by default.
It's off by default because when
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
[ There are a number of nvptx PRs open about alias support. The focus of this
PR is $subject, rather than supporting some specific
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #4 from Tom de Vries ---
This:
...
$ cat alias.c
void __f ()
{
__builtin_printf ("hello\n");
}
void f () __attribute__ ((alias ("__f")));
int
main (void)
{
f ();
return 0;
}
...
works fine at -O0 and -O1:
...
$ ./gcc.sh -O0
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
At docs for ASM_WEAKEN_LABEL (stream, name) we find:
...
If you don’t define this macro or ASM_WEAKEN_DECL, GCC will not support weak
symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104768
--- Comment #1 from Tom de Vries ---
Hmm, reading about it a bit more, it's more about enabling algorithms that were
not possible before, than about performance improvements.
So, we should aim at having test-cases, both openacc and openmp that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104893
--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> The per-thread call stack is handled for .local memory by the CUDA driver.
>
> For the 'soft stack' that's not the case.
Hmm, actually there's .local memory
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
The switch -muniform-simt attempts to deal with a problem outside simt regions
by rewriting
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #3 from Tom de Vries ---
Anyway, having reread the Volta architecture whitepaper, I think it's ok
to use the solution I already found that does work (see PR104783): add a warp
sync at simt exit.
The tricky bit is that we rely
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
When building with the patch from comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916#c2 , but allowing a shuffle
for V2SImode, we run into an assert when trying to create
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #2 from Tom de Vries ---
Created attachment 52629
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52629&action=edit
Attempt, runs into driver internal error
FTR, this is an attempt at a fix.
It does the "predicate ld/st to only
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
There are some pseudo registers that are used for a specific function, like
cfun->machine->unisimt_outside_simt_predicate.
It would be good to
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
In the ptx isa doc I read:
...
PTX allows the percentage sign as the first character of an identifier. The
percentage sign can be used to avoid name conflicts, e.g., be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #1 from Tom de Vries ---
We could try the same solution as for atomic: predicate ld/st to only execute
in lane 0, and propagate ld result.
Another solution might be to wrap each ld/st in two bar.warp.sync.
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
The problem -muniform-simt is trying to address is to make sure that a register
produced outside an openmp simd region
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
We use -msoft-stack for openmp programs:
...
'-msoft-stack'
Generate code that does not use '.local' memory directly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #3 from Tom de Vries ---
With this additionally:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 1a89c1bc77f..2e1a2dad9fe 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -968,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #2 from Tom de Vries ---
Created attachment 52606
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52606&action=edit
Tentative patch
With this patch and:
- current trunk
- misa default set to sm_75 (so 3.1 multilib disabled, because
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
We currently have in the nvptx port in nvptx_option_override:
...
/* Set flag_no_common, unless explicitly disabled. We
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #6 from Tom de Vries ---
Created attachment 52593
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52593&action=edit
Tentative patch
(In reply to Tom de Vries from comment #4)
> The patch I have works for target boards unix and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104840
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104815
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|REOPENED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104857
--- Comment #1 from Tom de Vries ---
Created attachment 52592
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52592&action=edit
Tentative patch
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Add a macro that can be used to test what .version x.y will be emitted in the
.s file.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #4 from Tom de Vries ---
The patch I have works for target boards unix and unix/-foffload=-mptx=3.1, but
I run into the hang for --target_board=unix/-foffload=-misa=sm_75.
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
[ From the category wild ideas... ]
The current default is sm_35, and soon will revert back to sm_30.
This makes all libraries use sm_30.
We could add multilibs for higher sm_xx, but I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #3 from Tom de Vries ---
Created attachment 52584
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52584&action=edit
Tentative patch
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
The nvptx port has:
...
(define_attr "predicable" "false,true"
(const_string "true"))
...
and here and there:
...
[(set_attr "
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104815
--- Comment #1 from Tom de Vries ---
With the tentative patch, I'm running into:
...
ptxas 2224-1.o, line 72; error : Result discard mode is not allowed for
instruction 'ld'
nvptx-as: ptxas terminated with signal 11 [Segmentation fault],
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Consider this source code:
...
enum memmodel
{
MEMMODEL_RELAXED = 0
};
unsigned long long int *p64;
unsigned long long int v64;
int
main()
{
__atomic_fetch_add
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104780
--- Comment #4 from Tom de Vries ---
(In reply to Andrew Pinski from comment #3)
> So if you file a bug there
Done: https://sourceware.org/bugzilla/show_bug.cgi?id=28945
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #2 from Tom de Vries ---
Hmm, the atom insn sets a register that is not used anywhere. So the shuffle
communicating the result doesn't make much sense.
We can fix that by doing:
...
diff --git a/gcc/config/nvptx/nvptx.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #1 from Tom de Vries ---
Hmm, I wonder if nvptx_reorg_uniform_simt should run in between SIMT_ENTER and
SIMT_EXIT.
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Minimized from
https://github.com/TApplencourt/OvO/blob/master/test_src/cpp/hierarchical_parallelism/atomic_add-float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104780
--- Comment #2 from Tom de Vries ---
(In reply to Tom de Vries from comment #1)
> This looks like a bug in newlib/libc/machine/nvptx/calloc.c:
> ...
> void *
> calloc (size_t size, size_t len)
> {
>   void *p = malloc (size * len);
>   if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104780
Tom de Vries changed:
What|Removed |Added
CC||tschwinge at gcc dot gnu.org
---
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
[ Quadro K2000 with sm_30, driver 470.103.01 ]
With the tentative patch for PR104758 (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
--- Comment #6 from Tom de Vries ---
I'm now looking at:
...
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index c83ceb3568b1..fea99c5d4069 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -53,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
Tom de Vries changed:
What|Removed |Added
Resolution|FIXED |---
Last reconfirmed|
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
The current situation is:
- default: -misa=sm_35 -mptx=6.0
- libraries: -misa=sm_30 -mptx=3.1
There's an open question on whether we need or want multilibs for different
values of misa
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Starting with sm_70, a fundamental change in the architecture occurred, called
"Independent Thread Scheduling". It means war
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
--- Comment #2 from Tom de Vries ---
FWIW, I ordered an sm_30 board, to be able to test this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100408
Tom de Vries changed:
What|Removed |Added
CC||vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
Tom de Vries changed:
What|Removed |Added
CC||tschwinge at gcc dot gnu.org
---
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
I.
Let's start out with some history.
gcc-5, gcc-6, gcc-7, gcc-8:
only sm_30 support
gcc-9, gcc-10:
sm_30 + sm_35 support, default: sm_30
gcc-11:
sm_30 + sm_35 support, default: sm_35
II
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104717
--- Comment #6 from Tom de Vries ---
(In reply to Tom de Vries from comment #5)
> However, somehow the A.3 remains part of the BLOCK_VARS of foo, so when ipa
> inline (activated by pta-ipa, which does node->get_body ()) inlines foo into
> main,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104717
--- Comment #5 from Tom de Vries ---
At original:
...
void foo ()
...
#pragma acc parallel
...
integer(kind=4) A.3[0:D.4266];
...
At gimple:
...
void foo ()
...
#pragma omp target oacc_parallel
...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104717
--- Comment #4 from Tom de Vries ---
(In reply to Tom de Vries from comment #3)
> The mismatch seems to be:
> ...
> (gdb) call debug_generic_expr (name.typed.type)
> integer(kind=4)[0:D.4266] *
> (gdb) call debug_generic_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104717
--- Comment #3 from Tom de Vries ---
The test that is failing, is:
...
if (SSA_NAME_VAR (ssa_name) != NULL_TREE
    && TREE_TYPE (ssa_name) != TREE_TYPE (SSA_NAME_VAR (ssa_name)))
  {
    error ("type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102429
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104714
--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> [ FWIW, it would be great if we could simply specify -march=native, and have
> gcc query the nvidia driver to see what board there is using
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102429
--- Comment #1 from Tom de Vries ---
Created attachment 52524
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52524&action=edit
Tentative patch
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
I'm testing on a couple of boards, with some different settings, and one of
those settings is: test native architecture.
That is, for an NVIDIA T400 with sm_75, test with -misa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97005
--- Comment #11 from Tom de Vries ---
(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #1)
> > Created attachment 52359 [details]
> > Cuda reproducer
>
> Filed at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84958
Tom de Vries changed:
What|Removed |Added
Target|nvptx |gcn
--- Comment #7 from Tom de Vries
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97338
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99555
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596
--- Comment #3 from Tom de Vries ---
Submitted patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590721.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104146
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
Tom de Vries changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #11 from Tom de Vries ---
Posted patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590627.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98321
Tom de Vries changed:
What|Removed |Added
Severity|normal |enhancement
--- Comment #7 from Tom de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596
--- Comment #2 from Tom de Vries ---
(In reply to Andrew Pinski from comment #1)
> I am trying to understand what you are trying to do.
> You want to mark an insn with a comment
One or more insns, yes.
> which is emitted during formation of
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
I wanted to mark some insns in a way that is visible in the assembly, without
having to tinker with the .md file.
The user-level equivalent would be something like
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104580
--- Comment #1 from Tom de Vries ---
Created attachment 52457
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52457&action=edit
Tentative patch
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
I had the following idea:
The prevent_branch_around_nothing workaround was added to force
a uniform warp after
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #10 from Tom de Vries ---
A good thing to note at this point: why doesn't init-regs work here?
The pass works per insn, and when hitting the insn with the problematic use:
...
(gdb) call debug_rtx (insn)
(insn 18 17 19 4 (set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #9 from Tom de Vries ---
(In reply to Tom de Vries from comment #1)
> Tentative patch that fixes example:
> ...
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 5b26c0f4c7dd..4dc154434853 100644
> ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #8 from Tom de Vries ---
Created attachment 52456
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52456&action=edit
Tentative patch, introducing -minit-regs=<0|1|2>
This patch fixes the problem, and survived a standalone build and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104423
--- Comment #5 from Tom de Vries ---
Created attachment 52438
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52438&action=edit
Tentative patch (GOMP_TARGET_ENV_ITER)
A more generic solution using env var GOMP_TARGET_ENV_ITER, which allows us to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104423
--- Comment #4 from Tom de Vries ---
Created attachment 52416
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52416&action=edit
[libgomp, testsuite, nvptx] Add GOMP_NVPTX_JIT_ITER (libgomp.c/c.exp only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104489
Tom de Vries changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104456
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104489
--- Comment #1 from Tom de Vries ---
Created attachment 52407
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52407&action=edit
reproducer
$ xgcc -B/home/vries/nvptx/trunk/build-gcc/./gcc/ -O2 -S mulhc3.c
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
With patch:
...
$ git diff
diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
index edffd088b15a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97005
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Testing on a GT 1030, with driver 510.x, GOMP_NVPTX_JIT=-O0 and -mptx=3.1, I
run into:
...
FAIL: libgomp.oacc-c/../libgom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104422
--- Comment #6 from Tom de Vries ---
(In reply to Tom de Vries from comment #5)
> Still on GT1030, does not reproduce with 470.x, neither the minimal nor the
> complete for-3.c.
And the same for 510.x.
So, I'm parking this for now. This may
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104422
--- Comment #5 from Tom de Vries ---
Still on GT1030, does not reproduce with 470.x, neither the minimal nor the
complete for-3.c.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104422
--- Comment #4 from Tom de Vries ---
(In reply to Tom de Vries from comment #3)
> Reproduces both with and without GOMP_NVPTX_JIT=-O0.
Pff, that was an artefact of having bumped the default ptx isa to 6.3.
So, let's try again ...
Reproduced
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104423
--- Comment #3 from Tom de Vries ---
(In reply to Thomas Schwinge from comment #2)
> For OpenMP test cases, we'd either have to manually mark them up (error
> prone and generally ugly), or scan the source file(s) (error prone and
> generally
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104283
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104422
--- Comment #3 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> While testing libgomp using legacy driver 390.x on a maxwell card, Quadro
> K620 I ran into a for-3.exe execution failure.
Reproduced with 390.147 driver on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104423
--- Comment #1 from Tom de Vries ---
One of the dimensions that I test is env var GOMP_NVPTX_JIT, with values:
- -O0, and
- default (using unset GOMP_NVPTX_JIT), which supposedly is -O4.
Looking at f.i. test-case for-3.c, compilation takes 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104422
--- Comment #2 from Tom de Vries ---
Hmm, I reran on a
(In reply to Tom de Vries from comment #0)
> #pragma distribute simd
omp missing ... I need to reproduce this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104364
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97006
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Component|target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #4 from Tom de Vries ---
(In reply to Andrew Pinski from comment #2)
> I thought there was another bug that reported a similar issue.
You mean related to nvptx, or in general?
FWIW, I do remember looking at this issue before in