[Bug target/116571] [15 Regression] GCN vs. "lower SLP load permutation to interleaving"

2024-09-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116571 --- Comment #6 from Andrew Stubbs --- (In reply to Richard Biener from comment #5) > (In reply to Thomas Schwinge from comment #4) > > The GCN target FAILs that I originally had reported here: > > > > > [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 Andrew Stubbs changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #8 from Andrew Stubbs --- (In reply to Thomas Schwinge from comment #4) > (In reply to Richard Biener from comment #2) > > if (VECTOR_BOOLEAN_TYPE_P (type) > > && SCALAR_INT_MODE_P (TYPE_MODE (type))) > > return true; >

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 --- Comment #4 from Andrew Stubbs --- The problem insn is this: (insn 31 30 32 2 (set (reg:V2SI 711) (ashift:V2SI (reg:V2SI 161 v1) (const_vector:V2SI [ (const_int 3 [0x3]) repeated x2 ]))

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 --- Comment #3 from Andrew Stubbs --- (In reply to Jeffrey A. Law from comment #1) > So, how am I supposed to reproduce this? I don't have an assembler/binutils > for amdgcn and thus libgcc won't configure. Thus I can't extract a testcase. >

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-28 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #18 from Andrew Stubbs --- That should fix the broken validation check. All V32 permutations should work now on RDNA GPUs, I think. V16 and smaller were already working fine.

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #16 from Andrew Stubbs --- On 26/06/2024 14:41, rguenther at suse dot de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #15 from rguenther at suse dot de --- >>> Btw, the above looks quite odd for nelt

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #14 from Andrew Stubbs --- On 26/06/2024 13:34, rguenth at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #13 from Richard Biener --- > (In reply to Richard Biener from comment #12) >>

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #10 from Andrew Stubbs --- On 26/06/2024 12:05, rguenth at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #8 from Richard Biener --- > (In reply to Richard Biener from comment #7) >> I

[Bug target/115640] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #3 from Andrew Stubbs --- (In reply to Richard Biener from comment #2) > If you force GCN to use fixed length vectors (how?), does it work? How's > it behaving on aarch64 with SVE? (the CI was happy, but maybe doesn't > enable SVE)

[Bug target/115631] [15 Regression] GCN: [-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c -O2 execution test

2024-06-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115631 --- Comment #1 from Andrew Stubbs --- It was writing 0 to s12 (scalar register) and then moving the zero to lane zero of v0 (vector register). Now it's writing the 0 directly to v0, of which all but lane zero is masked. These should be identic
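
The equivalence described in the comment above can be illustrated with a small scalar model. This is only a sketch under stated assumptions: the vector register is modeled as a 64-element array, the execution mask as a `uint64_t`, and the names `write_via_scalar`, `write_masked` and `sequences_agree` are hypothetical, not taken from the GCN backend.

```c
#include <stdint.h>
#include <string.h>

#define LANES 64  /* modeled vector width; an assumption for this sketch */

/* Old sequence: place the value in a scalar, then move it to lane zero. */
static void write_via_scalar(int32_t v[LANES], int32_t value)
{
    int32_t s = value;  /* stands in for the scalar register */
    v[0] = s;
}

/* New sequence: write the value to every lane enabled in the mask. */
static void write_masked(int32_t v[LANES], int32_t value, uint64_t exec)
{
    for (unsigned lane = 0; lane < LANES; lane++)
        if ((exec >> lane) & 1u)
            v[lane] = value;
}

/* Returns nonzero when both sequences leave the vector in the same state,
   using a mask that enables lane zero only. */
static int sequences_agree(int32_t value)
{
    int32_t a[LANES], b[LANES];
    for (unsigned lane = 0; lane < LANES; lane++)
        a[lane] = b[lane] = (int32_t)lane + 100;
    write_via_scalar(a, value);
    write_masked(b, value, UINT64_C(1));
    return memcmp(a, b, sizeof a) == 0;
}
```

In this model the two sequences agree for any value, which is the behaviour the comment expects from the hardware; it says nothing about why the test started failing.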

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 --- Comment #11 from Andrew Stubbs --- (In reply to rguent...@suse.de from comment #10) > On Mon, 3 Jun 2024, ams at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 > > > > --- Comme

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 --- Comment #9 from Andrew Stubbs --- (In reply to Richard Biener from comment #6) > The best strategy for GCN would be to gather V4QImode aka SImode into the > V64QImode (or V16SImode) vector. For pix2 we have a gap of 28 elements, > doing co

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717 --- Comment #3 from Andrew Stubbs --- Can this be filtered (safely) in mkoffload? That tool is offload-target-specific, so no problem with "if offload target were to support it".

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302 --- Comment #4 from Andrew Stubbs --- Yes, that's what the simd-math-3* tests do. The simd-math-5* tests are explicitly supposed to be doing this in the context of the autovectorizer. If these tests are being compiled as (newly) intended then

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302 --- Comment #2 from Andrew Stubbs --- The execution test checks that each of the libgcc routines work correctly, and the scan assembler tests make sure that we're getting coverage of all of them. In this case, the failure indicates that we're n

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2024-02-12 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #8 from Andrew Stubbs --- (In reply to seurer from comment #7) > On the BE machine: > > seurer@nilram:~/gcc/git/build/gcc-test$ ulimit -a > real-time non-blocking time (microseconds, -R) unlimited > ... > max locked memory

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2024-02-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #6 from Andrew Stubbs --- (In reply to seurer from comment #5) > I should note that pinned-2 also fails on powerpc64 LE. > > make -k check-target-libgomp RUNTESTFLAGS="c.exp=libgomp.c/alloc-pinned-*" > FAIL: libgomp.c/alloc-pinned-

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615 --- Comment #3 from Andrew Stubbs --- I did see these, but I hadn't had time to chase them up. The proposed patch is exactly the sort of solution I was expecting to find, short term. Have you confirmed that it fixes all the cases? A proper sol

[Bug middle-end/113199] [14 Regression][GCN] ICE (segfault) due to invalid 'loop_mask_46 = VEC_PERM_EXPR' when compiling Newlib's wcsftime.c

2024-01-09 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113199 --- Comment #5 from Andrew Stubbs --- I can confirm that I can now build the amdgcn toolchain once more. :-) Thanks.

[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2024-01-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #11

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2023-12-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #4 from Andrew Stubbs --- It's going to be difficult to make this test work when only one page of locked memory is available. :-( I will look at making it "unsupported".

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2023-12-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #1 from Andrew Stubbs --- That is a typo. I don't want to make it pass on machines that have insufficient memory configured because it will mask the case where it fails for another reason. However, the testcase was originally suppo

[Bug target/113022] GCN offloading bricked by "amdgcn: Work around XNACK register allocation problem"

2023-12-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113022 --- Comment #1 from Andrew Stubbs --- This is what I get for trying to get this done before vacation. :( Yes, there's probably something in mkoffload that has to match the default change from -mxnack=any to -mxnack=off on the older ISAs.

[Bug target/112937] [14 Regression] GCN: FAILs due to unconditional 'f->use_flat_addressing = true;'

2023-12-11 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937 --- Comment #2 from Andrew Stubbs --- Flat addressing *should* be the safe option that always works (although using "global" address space permits slightly more efficient offset options).

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481 Andrew Stubbs changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481 --- Comment #7 from Andrew Stubbs --- Simply changing to OPTAB_WIDEN solves the ICE, but I don't know if it does so in a sensible way for RISC-V. @@ -7489,7 +7489,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-13 Thread ams at gcc dot gnu.org via Gcc-bugs
||2023-11-13 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #4 from Andrew Stubbs --- It fails because optab_handler fails to find an instruction for "and_optab" in SImode.

[Bug target/112308] [14 Regression] GCN: 'error: literal operands are not supported' for 'v_add_co_u32'

2023-11-10 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112308 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/112313] [14 Regression] GCN target 'gcc.dg/pr111082.c' ICE, 'during RTL pass: vregs': 'error: unrecognizable insn'

2023-11-10 Thread ams at gcc dot gnu.org via Gcc-bugs
|RESOLVED Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #2 from Andrew Stubbs --- This is now fixed.

[Bug target/112308] [14 Regression] GCN: 'error: literal operands are not supported' for 'v_add_co_u32'

2023-11-09 Thread ams at gcc dot gnu.org via Gcc-bugs
||2023-11-09 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org

[Bug target/112088] [14 Regression] GCN target testing broken by "amdgcn: add -march=gfx1030 EXPERIMENTAL"

2023-10-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112088 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/112088] [14 Regression] GCN target testing broken by "amdgcn: add -march=gfx1030 EXPERIMENTAL"

2023-10-27 Thread ams at gcc dot gnu.org via Gcc-bugs
|1 Last reconfirmed||2023-10-27 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #1 from Andrew Stubbs --- I'm testing a fix for this.

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #5 from Andrew Stubbs --- One thing that is unusual about the GCN stack pointer is that it's actually two registers. Could this be breaking some cprop assumptions? GCN can't fit an address in one (SImode) register so all (DImode) po

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #3 from Andrew Stubbs --- It's curious that this affects the Fiji target only, and not the newer targets at all. There are some additional register options for multiply instructions, some differences to atomics, but mostly the diffe

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #1 from Andrew Stubbs --- This ICE also affects the following standalone test failures (raw amdgcn, no offloading): gfortran.dg/assumed_rank_21.f90 gfortran.dg/finalize_38.f90 gfortran.dg/finalize_38a.f90

[Bug testsuite/108898] [13 Regression] Test introduced by r13-6278-g3da77f217c8b2089ecba3eb201e727c3fcdcd19d failed on i386

2023-03-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898 --- Comment #4 from Andrew Stubbs --- I did not know there was a way to do that! I'll add this to my to-do list.

[Bug testsuite/108898] [13 Regression] Test introduced by r13-6278-g3da77f217c8b2089ecba3eb201e727c3fcdcd19d failed on i386

2023-02-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898 --- Comment #1 from Andrew Stubbs --- I tested it on i686-pc-linux-gnu before I posted the patch, and it was working then. Can you be more specific what configuration you were testing, please?

[Bug target/107510] gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression]

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863 Bug 89863 depends on bug 107510, which changed state. Bug 107510 Summary: gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 What|R

[Bug target/107510] gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression]

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 Andrew Stubbs changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org

[Bug tree-optimization/107096] Fully masking vectorization with AVX512 ICEs gcc.dg/vect/vect-over-widen-*.c

2022-10-10 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107096 --- Comment #4 from Andrew Stubbs --- I don't understand rgroups, but I can say that GCN masks are very simply one-bit-one-lane. There are always 64-lanes, regardless of the type, so V64QI mode has fewer bytes and bits than V64DImode (when writt
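
The one-bit-one-lane layout described above can be sketched in plain C. This is an illustration only, assuming a 64-bit integer stands in for the mask; the helper names `first_n_lanes`, `lane_active` and `masked_add_i32` are hypothetical and do not come from the GCC sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 64  /* always 64 lanes, regardless of element type */

/* Build a mask that enables the first n lanes (n is clamped to 64). */
static uint64_t first_n_lanes(unsigned n)
{
    return n >= LANES ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Bit i of the mask controls lane i, whatever the element width is. */
static bool lane_active(uint64_t mask, unsigned lane)
{
    return (mask >> lane) & 1u;
}

/* Masked add on 32-bit elements; inactive lanes keep their old value.
   The same mask would serve for 8-bit or 64-bit elements. */
static void masked_add_i32(int32_t *dst, const int32_t *a,
                           const int32_t *b, uint64_t mask)
{
    for (unsigned lane = 0; lane < LANES; lane++)
        if (lane_active(mask, lane))
            dst[lane] = a[lane] + b[lane];
}
```

Because the mask has one bit per lane rather than one bit per byte, its size does not change between byte-sized and 64-bit element types in this model.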

[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64

2022-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #9 from Andrew Stubbs --- I can confirm that the patch fixes the amdgcn build.

[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64

2022-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
-*-* CC||ams at gcc dot gnu.org --- Comment #7 from Andrew Stubbs --- I get the same failure on amdgcn building newlib/libm/math/kf_rem_pio2.c

[Bug tree-optimization/106476] New: ICE generating FOLD_EXTRACT_LAST

2022-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ams at gcc dot gnu.org CC: rguenther at suse dot de Target Milestone: --- Target: amdgcn-amdhsa Commit 8f4d9c1deda "amdgcn: 64-bit not" exposed an ICE in tree-vect_stmts.cc when

[Bug target/105873] [amdgcn][OpenMP] task reductions fail with "team master not responding; slave thread aborting"

2022-06-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105873 --- Comment #4 from Andrew Stubbs --- I think unused threads should be given a no-op function to run, not a null pointer. The GCN implementation cannot tell the difference between a null pointer and an unset pointer (which is what happens when t
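
The suggestion above (hand idle threads a no-op function instead of a null pointer) can be sketched as follows. This is a minimal model, not libgomp code: `thread_slot`, `slot_ready`, `assign_idle` and `noop_task` are hypothetical names, and the assumption is that a null function pointer can only mean "not yet assigned".

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*task_fn)(void *);

/* Per-thread slot; in this model NULL means "nothing assigned yet". */
struct thread_slot {
    task_fn fn;
    void *data;
};

/* A task that deliberately does nothing. */
static void noop_task(void *data)
{
    (void)data;
}

/* A waiting thread can only proceed once its slot holds a function. */
static bool slot_ready(const struct thread_slot *slot)
{
    return slot->fn != NULL;
}

/* Mark a thread as unused without leaving its slot looking unset. */
static void assign_idle(struct thread_slot *slot)
{
    slot->fn = noop_task;
    slot->data = NULL;
}
```

With this arrangement an unused thread sees a ready slot, runs `noop_task`, and moves on, rather than waiting on a pointer that never becomes non-null.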

[Bug target/105246] [amdgcn] Use library call for SQRT with -ffast-math + provide additional option to use single-precsion opcode

2022-04-13 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105246 --- Comment #2 from Andrew Stubbs --- When we first coded this we only had the GCN3 ISA manual, which says nothing about the accuracy. Now I look in the Vega manual (GCN5) I see: Square root with perhaps not the accuracy you were hoping for

[Bug target/100181] hot-cold partitioned code doesn't assemble

2022-02-11 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 --- Comment #13 from Andrew Stubbs --- I've updated the LLVM version documentation at https://gcc.gnu.org/wiki/Offloading#For_AMD_GCN: It's LLVM 9 or 13.0.1 now (nothing in between), and will be 13.0.1+ for the next release (dropping LLVM 9 bec

[Bug middle-end/104026] [12 Regression] ICE in wide_int_to_tree_1, at tree.c:1755 via tree-vect-loop-manip.c:673

2022-01-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #6

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-24 Thread ams at gcc dot gnu.org via Gcc-bugs
|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from Andrew Stubbs --- I think I have a fix for this. It happens when the link register has to be saved because it is used

[Bug target/103201] [12 Regression] trunk 20211111 ftbfs for amdgcn – libgomp/teams.c:49:6: error: 'struct gomp_thread' has no member named 'num_teams'

2021-11-12 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103201 --- Comment #3 from Andrew Stubbs --- I did some preliminary testing on your patch: the libgomp.c/target-teams-1.c testcase runs fine on amdgcn. I presume that that covers most of the existing features of those runtime calls?

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-04 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #8 from Andrew Stubbs --- Did you get the C version to return anything other than "-1"? (The expected result is "2".) I'm still trying to determine if the device is compatible, but the mapping problem looks like a different issue.

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-01 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #5 from Andrew Stubbs --- Sorry, I should have said to compile with -fopenacc. If you did do that, please post the GCN_DEBUG output.

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-01 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #3 from Andrew Stubbs --- That output shows that we have the correct libgomp and rocm is installed and working. Libgomp initialized the GCN plugin, but did not attempt to initialize the device (the next message in the output should h

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #1 from Andrew Stubbs --- Please set "export GCN_DEBUG=1", try it again, and post the output.

[Bug target/102260] amdgcn offload compiler fails to configure, not matching target directive's target id

2021-09-09 Thread ams at gcc dot gnu.org via Gcc-bugs
||2021-09-09 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #1 from Andrew Stubbs --- In addition to changing the amdgcn_target syntax in LLVM 13, the LLVM GCN guys have also renamed the

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #5 from Andrew Stubbs --- [Note: all of my comments refer to the amdgcn case. nvptx has somewhat different support in this area.] (In reply to Jonathan Wakely from comment #4) > But it's a waste of space in the .so to build lots of

[Bug target/100208] amdgcn fails to build with llvm-mc from llvm12

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208 Andrew Stubbs changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #3 from Andrew Stubbs --- The standalone amdgcn configuration does not support C++. There are a number of technical reasons why it doesn't Just Work, but basically it comes down to no-one ever working on it. Our customers were primar

[Bug target/101484] [12 Regression] trunk 20210717 ftbfs for amdgcn-amdhsa (gcn offload)

2021-07-17 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101484 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/97827] bootstrap error building the amdgcn-amdhsa offload compiler with LLVM 11

2021-07-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 Andrew Stubbs changed: What|Removed |Added CC||xw111luoye at gmail dot com --- Comment

[Bug target/95023] Offloading AMD GCN wiki cannot be followed

2021-07-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95023 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-06 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #13 from Andrew Stubbs --- I found a lot more ICEs when testing my patch. They look to be unrelated (TImode comes back to haunt us), but it makes it hard to be sure.

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #9 from Andrew Stubbs --- I found a couple of other places to put force_operand and the full case works now. Running more tests

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #4 from Andrew Stubbs --- Alexandre's patch has this: emit_move_insn (rem, plus_constant (ptr_mode, rem, -blksize)); Is that generally a valid thing to do? It seems like other places do similar things...

[Bug target/100208] amdgcn fails to build with llvm-mc from llvm12

2021-04-22 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208 --- Comment #1 from Andrew Stubbs --- LLVM changed the default parameters, so we either have to change the expectations in the ".amdgcn_target" string (which is basically an assert), or set the attributes we want explicitly on the assembler comm

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 --- Comment #22 from Andrew Stubbs --- (In reply to Andrew Stubbs from comment #21) > (In reply to Richard Biener from comment #19) > > GCN also uses MODE_INT for the mask mode and thus may be similarly affected. > > Andrew - are the bits in the

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 --- Comment #21 from Andrew Stubbs --- (In reply to Richard Biener from comment #19) > GCN also uses MODE_INT for the mask mode and thus may be similarly affected. > Andrew - are the bits in the mask dense? Thus for a V4SImode compare > would th

[Bug tree-optimization/84958] int loads not eliminated against larger stores

2020-10-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84958 --- Comment #6 from Andrew Stubbs --- (In reply to Tom de Vries from comment #5) > I've removed the xfail for nvptx. > > The only remaining xfail is for gcn. Is that one still necessary? The test still fails for gcn.

[Bug libgomp/97332] [gcn] GCN_NUM_GANGS/GCN_NUM_WORKERS override compile-time constants

2020-10-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97332 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #8 from Andrew Stubbs --- I'm loath to enable TImode if it's going to ICE all over the place, and I can't just drop everything else and implement working TImode unless there's an easy solution. It's always been on the nice-to-have lis

[Bug target/95730] GCN offloading ICEs after commit fe7ebef7fe4f9acb79658ed9db0749b07efc3105 "Add support for __builtin_bswap128"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95730 --- Comment #4 from Andrew Stubbs --- In fact default_scalar_mode_supported_p does return *false* for TImode (because LONG_LONG_TYPE_SIZE == 64, and BITS_PER_WORD == 32). Therefore int128_t does not exist, as far as users are concerned. I'm not

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #5 from Andrew Stubbs --- GCC will automatically generate libgcc calls for types up to 2*BITS_PER_WORD, but no further. Since BITS_PER_WORD is 32 on GCN this means no automatic TImode support for anything that would go that route (suc
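
The arithmetic behind the comment above is short enough to write out. The sketch below only restates the rule as quoted (automatic libgcc calls for types up to 2*BITS_PER_WORD); the function `libcall_auto_generated` and the enum names are hypothetical and are not part of GCC.

```c
#include <stdbool.h>

/* Mode widths in bits; BITS_PER_WORD is 32 on GCN per the comment above. */
enum {
    GCN_BITS_PER_WORD = 32,
    DIMODE_BITS = 64,
    TIMODE_BITS = 128
};

/* Restates the quoted rule: a libgcc call is generated automatically
   only for types no wider than twice the word size. */
static bool libcall_auto_generated(int mode_bits, int bits_per_word)
{
    return mode_bits <= 2 * bits_per_word;
}
```

With a 32-bit word the limit is 64 bits, so DImode is covered and TImode (128 bits) is not; a target with 64-bit words would cover TImode under the same rule.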

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #3 from Andrew Stubbs --- TImode was added for use by a few instructions that take two 64-bit values in consecutive registers. It's also useful for the SLP fake vectorization stuff. It wasn't intended for use with user types; I proba

[Bug target/95864] [11 Regression] GCN offloading execution regressions after commit f062c3f11505b70c5275e5bc0e52f3e441f8afbc "amdgcn: Switch to HSACO v3 binary format"

2020-06-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95864 --- Comment #1 from Andrew Stubbs --- I'm aware of these issues. I fixed all the test failures that were definitely bugs in the HSACOv3 implementation, and the ones that remain appear to be either latent bugs uncovered by the new driver configur

[Bug target/95730] GCN offloading ICEs after commit fe7ebef7fe4f9acb79658ed9db0749b07efc3105 "Add support for __builtin_bswap128"

2020-06-18 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95730 --- Comment #3 from Andrew Stubbs --- The GCN port does not define a scalar_mode_supported, and I think the default definition is allowing TImode (as long long int). As I said, the SLP fake-vector load/store use it fine as a substitute for V4SI o

[Bug target/95730] GCN offloading ICEs after commit fe7ebef7fe4f9acb79658ed9db0749b07efc3105 "Add support for __builtin_bswap128"

2020-06-17 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95730 --- Comment #1 from Andrew Stubbs --- GCN uses TImode for a few special purposes, but lacks real TImode support. (Basically, it allows TImode loads and stores for the SLP fake vectorization, and there's one instruction that needs two DImode valu

[Bug middle-end/93488] [OpenACC] ICE in type-cast 'async', 'wait' clauses

2020-04-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93488 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug testsuite/94725] Tests with proprietary license notices

2020-04-23 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94725 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #2

[Bug other/94629] 10 issues located by the PVS-studio static analyzer

2020-04-23 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94629 --- Comment #23 from Andrew Stubbs --- (In reply to Jakub Jelinek from comment #12) > (In reply to Andrew Stubbs from comment #11) > > (In reply to Jakub Jelinek from comment #10) > > > or if instead we should drop the "status = " for the cases w

[Bug target/94282] [amdgcn] ld: error: undefined symbol: __gxx_personality_v0

2020-04-23 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94282 --- Comment #6 from Andrew Stubbs --- I think we've decided to go with Thomas's approach. Thomas, please go ahead and commit.

[Bug target/94278] [amdgcn] Offloading build failures due to 'llvm-mc' SIGSEGV

2020-04-23 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94278 --- Comment #4 from Andrew Stubbs --- Almost all the tests listed in pr81430 pass for me (and the exception I found is a link error). I don't understand what's happening with your build, but from my point of view the patch fixes an issue that do

[Bug target/94248] [amdgcn] Doesn't build with RTL checking

2020-04-22 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94248 --- Comment #7 from Andrew Stubbs --- I'd rather remove the whole if branch, but given you've tested this already then it's probably the best short term fix. Please go ahead.

[Bug target/94278] [amdgcn] Offloading build failures due to 'llvm-mc' SIGSEGV

2020-04-21 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94278 --- Comment #2 from Andrew Stubbs --- Well, it works for me: PASS: libgomp.c/examples-4/async_target-2.c (test for excess errors) PASS: libgomp.c/examples-4/async_target-2.c execution test That's with an unmodified LLVM 9 we built ourselves.

[Bug target/94248] [amdgcn] Doesn't build with RTL checking

2020-04-21 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94248 --- Comment #5 from Andrew Stubbs --- (In reply to Thomas Schwinge from comment #4) > (In reply to Andrew Stubbs from comment #3) > > Actually, I think that recent changes to the register alignment mean that > > this can't happen any more, so the

[Bug other/94629] 10 issues located by the PVS-studio static analyzer

2020-04-17 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94629 --- Comment #11 from Andrew Stubbs --- (In reply to Jakub Jelinek from comment #10) > or if instead we should drop the "status = " for the cases where nothing > checks it. Andrew? I think checking the status is probably good practice, even thoug

[Bug target/94282] [amdgcn] ld: error: undefined symbol: __gxx_personality_v0

2020-03-26 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94282 --- Comment #3 from Andrew Stubbs --- (In reply to Andrew Pinski from comment #2) > (In reply to Tobias Burnus from comment #1) > > The symbol __gxx_personality_v0 is part of libsupc++ – which I believe is > > not build to to lacking/restricted C

[Bug target/94248] [amdgcn] Doesn't build with RTL checking

2020-03-23 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94248 --- Comment #3 from Andrew Stubbs --- Actually, I think that recent changes to the register alignment mean that this can't happen any more, so the whole check is probably obsolete. I thought that --enable-checking=yes was already covering this.

[Bug bootstrap/93409] [10 Regression] gcn libgomp plugin fails to build for x32

2020-01-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93409 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #1

[Bug tree-optimization/92772] wrong code vectorizing masked max

2019-12-17 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92772 Andrew Stubbs changed: What|Removed |Added Priority|P3 |P5 Severity|critical

[Bug tree-optimization/92772] wrong code vectorizing masked max

2019-12-04 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92772 --- Comment #6 from Andrew Stubbs --- (In reply to Richard Biener from comment #4) > Btw, isn't the issue that the reduction looks at all lanes? That is, > I think the code simply assumes that for fully masked loops at least > one iteration is p

[Bug tree-optimization/92772] wrong code vectorizing masked max

2019-12-04 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92772 --- Comment #3 from Andrew Stubbs --- The GCN architecture can handle the masking, but I don't know how we'd represent or apply that in the middle end? I can probably implement extract_last, and that might be more efficient, but I don't see how

[Bug tree-optimization/92772] New: wrong code vectorizing masked max

2019-12-03 Thread ams at gcc dot gnu.org
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ams at gcc dot gnu.org Target Milestone: --- The testcase pr65947-10.c fails on amdgcn because there are more vector lanes than there is data, and the algorithm created doesn't allow for this. (Actually there

[Bug tree-optimization/91198] GCC not generating AVX-512 compress/expand instructions

2019-07-18 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198 --- Comment #1 from Andrew Stubbs --- I don't believe GCC detects that operation automatically. It does support the instruction via intrinsics (builtin functions that correspond to low-level machine features). You should investigate "__builtin_i

[Bug middle-end/90779] Fortran array initialization in offload regions

2019-06-17 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90779 --- Comment #14 from Andrew Stubbs --- (In reply to Jakub Jelinek from comment #7) > if I compile just the first TU without the foo () call in there, and > .global .align 4 .u32 var$lto_priv$1[1] = { 5 }; > .global .align 4 .u32

[Bug middle-end/90779] Fortran array initialization in offload regions

2019-06-14 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90779 --- Comment #8 from Andrew Stubbs --- On GCN I get the lto_priv names, but not the globalization. I think that shows what the expected behaviour is, thanks ... I just need to find that magic. That being so, I think I can confirm that your origin

[Bug middle-end/90779] Fortran array initialization in offload regions

2019-06-14 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90779 --- Comment #6 from Andrew Stubbs --- There's no observable difference. I don't quite follow what the patch is trying to achieve, but it seems like adding the variable to the offload variables does not address the issue here. I've added a hack to
