[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317 --- Comment #5 from jules at gcc dot gnu.org --- Hi, Unfortunately I no longer have access to any PowerPC machines, which limits my ability to help with this somewhat. I guess it's *something* like a CONVERT_EXPR/NOP_EXPR in an unexpected place, and is probably quite easy to fix with a suitable reproducer. Julian
[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #2 from jules at gcc dot gnu.org --- I haven't been able to reproduce this with our local testing infrastructure on the powerpc64le machine available to me (i.e., the test works for me there). I'd be a bit surprised about a machine dependence at this early stage in parsing, but here we go! Let me know if there's any other information that might help track this down.
[Bug target/113331] New: AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113331 Bug ID: 113331 Summary: AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Target Milestone: --- Created attachment 57037 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57037=edit Test case The attached file fails to compile with AMD GCN offloading on current mainline. It also fails on GCC 13.2.0 as packaged in Debian, so this doesn't appear to be a regression. $ g++ -fopenmp -foffload=amdgcn-amdhsa -save-temps dup-syms.cc -o dup-syms ./dup-syms.xamdgcn-amdhsa.mkoffload.2.s:256:1: error: symbol '.LEHB0' is already defined .LEHB0: ^ ./dup-syms.xamdgcn-amdhsa.mkoffload.2.s:258:1: error: symbol '.LEHE0' is already defined .LEHE0: ^ ./dup-syms.xamdgcn-amdhsa.mkoffload.2.s:288:1: error: symbol '.LEHB1' is already defined .LEHB1: ^ ./dup-syms.xamdgcn-amdhsa.mkoffload.2.s:290:1: error: symbol '.LEHE1' is already defined .LEHE1: ^ The test case doesn't trigger with NVPTX offloading, but I don't think that definitely implies that this is something GCN-specific (vs. generically offload-related).
[Bug middle-end/113279] New: OpenMP 5 - Audit GOMP_MAP_FIRSTPRIVATE_REFERENCE handling for references to pointers to pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113279 Bug ID: 113279 Summary: OpenMP 5 - Audit GOMP_MAP_FIRSTPRIVATE_REFERENCE handling for references to pointers to pointers Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Target Milestone: --- The patch (approved and soon to be committed) to support "lvalue" parsing for C++ has a corner-case issue whereby references to pointers to pointers do not quite work interchangeably with pointers to pointers (i.e. when implicitly mapped). See the "ref2ptrptr_offset_decl_member_slice" functions in the libgomp test cases baseptrs-4.C and baseptrs-6.C. https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641924.html It is possible that a slight semantic tweak to GOMP_MAP_FIRSTPRIVATE_REFERENCE handling in gimplify.cc or omp-low.cc might be able to repair these cases, if indeed that is required by the spec.
[Bug middle-end/113006] New: OpenMP 5 - lvalue parsing support for map/to/from clause
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113006 Bug ID: 113006 Summary: OpenMP 5 - lvalue parsing support for map/to/from clause Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Target Milestone: --- The following test: libgomp/testsuite/libgomp.c++/baseptrs-4.C partly depends on the following in-review patches: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629365.html https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629364.html Tobias points out there is currently an unresolved OpenMP issue, 2618 ("Clarify behavior of mapping lvalues on target construct"), which talks about whether code like the following: map(*p = 10) map(x = 20) map(x ? y[0] : p[1]) map(f(y)) is valid or not. He writes that the sentiment was to require that a 'map' clause list item must have a base pointer or a base variable. As far as I know, none of these "problematic" cases are permitted by the above patches, but it might be worth following the OpenMP issue to make sure that nothing diverges.
[Bug middle-end/113004] New: OpenMP 5 - structs are not mapped element-wise by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113004 Bug ID: 113004 Summary: OpenMP 5 - structs are not mapped element-wise by default Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Target Milestone: --- According to e.g. OpenMP 5.2, "5.8.2 Mapper Identifiers and mapper Modifiers": "A structure type T has a predefined default mapper that is defined as if by a declare mapper directive that specifies v in a map clause with the alloc map-type and each structure element of v in a map clause with the tofrom map-type." At present, a variable of struct type will be mapped as as a single block (the pre-OpenMP 5.0 legacy behaviour). This works fine most of the time, but for example fails in the case where one of the struct's members is a reference, demonstrated in the test case: libgomp/testsuite/libgomp.c++/target-49.C Once "declare mapper" support is committed, cases where a struct with an implicit default mapper should invoke a struct with a *non*-default mapper, will also probably be mishandled.
[Bug libgomp/108624] [OpenMP] Map(ptr, ptr[2:N-2]) will cause a segfault, 'map(ptr, ptr[0:N])'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108624 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #1 from jules at gcc dot gnu.org --- This works with the in-review series: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609031.html i.e. for the mapping: #pragma omp target map(ptr1[1:N-1]) map(ptr2[:N]) we get (gimplify): _1 = ptr1 + 4; map(tofrom:*_1 [len: 252]) map(firstprivate:ptr1 [pointer assign, bias: 4]) and for the mapping: #pragma omp target map(ptr1, ptr1[1:N-1]) map(ptr2[:N]) we get (gimplify): _4 = ptr1.2_3 + 4 map(tofrom:ptr1 [len: 8]) map(tofrom:*_4 [len: 252]) map(attach:ptr1 [bias: 4]) the test runs fine either way.
[Bug target/102856] New: [nvptx] Misaligned accesses with cheap vectorization enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102856 Bug ID: 102856 Summary: [nvptx] Misaligned accesses with cheap vectorization enabled Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Target Milestone: --- Since revision 2b8453c401b699ed93c085d0413ab4b5030bcdb8 I am seeing several OpenMP tests fail with misaligned access errors: PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-11.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-12.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-16.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-3.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-5.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-6.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-9.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-11.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-12.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-16.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-3.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-5.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-6.c execution test PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-9.c execution test These look like, e.g.: $ ./for-11.exe libgomp: cuCtxSynchronize error: misaligned address libgomp: cuMemFree_v2 error: misaligned address libgomp: device finalization failed I suspect the reason is that an operation that is now being vectorized (e.g. "st.v2.u64 [%frame], %r28;") requires higher alignment than the original scalar accesses it replaces. I haven't spotted an obvious culprit for the problem in the nvptx backend. This is OpenMP, so it could be the soft stack handling -- or it could be something else.
[Bug libgomp/95590] OpenACC 'attach' behavior if already attached to different data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95590 --- Comment #1 from jules at gcc dot gnu.org --- In discussion: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548679.html https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550623.html https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550654.html
[Bug middle-end/95270] OpenACC 'enter data attach' looks up target memory object displaced by pointer size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95270 --- Comment #2 from jules at gcc dot gnu.org --- TBH I've suspected problems with misuse of the bias for attach/detach before, but I've not come up with a test case. I'll have a look. Thanks!
[Bug middle-end/92929] OpenACC/OpenMP 'target' 'exit data'/'update' optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92929 --- Comment #7 from jules at gcc dot gnu.org --- Test case & further discussion in: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547424.html
[Bug fortran/93025] [OpenACC] ICE with copy(y(1)%cc(:)%i) type of strided access – permitted since commit of OpenACC 2.6 deep copy patch series
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93025 jules at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from jules at gcc dot gnu.org --- I believe this is fixed on trunk now.
[Bug libgomp/93030] [OpenACC] libgomp.oacc-c-c++-common/deep-copy-10.c FAILS on AMDGCN – invalid 'async' usage?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93030 --- Comment #1 from jules at gcc dot gnu.org --- Thanks for the report -- there's a fix for this on the og9 branch, but I'm not sure if I've posted that upstream (the GCN worker-partitioning patches were separated out from the deep-copy patches, so fixes around the intersection of those two features are easy to overlook). I'll dig that up and post it.
[Bug middle-end/92929] OpenACC/OpenMP 'target' 'exit data'/'update' optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92929 --- Comment #6 from jules at gcc dot gnu.org --- Apologies for breakage. This part of the patch was originally from the og9 patch supporting Fortran polymorphic class pointers posted at https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00752.html. The rationale was as follows: [...] OpenACC "enter data" and "exit data" now have GOMP_MAP_POINTER and GOMP_MAP_PSET mappings removed during gimplification. In some circumstances, passing an array to a function/subroutine and then doing an "enter data" on it could leave dangling references to the function's stack, although the actual array data is defined outside the function. In any case, the pointer/pointer-set mappings don't seem to be necessary for OpenACC "enter data". I observed the described problem with a large test program, and unfortunately did not manage to come up with a minimised test case at the time. Someone more familiar with Fortran might be able to do so more easily than me! In terms of backwards compatibility, we can't do anything about the "parasitic binding" to a function's expired stack frame as generated by an older version of the compiler in this case, I don't think. That's most likely going to manifest as an intermittent crash.
[Bug libgomp/92881] [OpenACC] In async context, need to use 'gomp_remove_var_async' instead of 'gomp_remove_var'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92881 --- Comment #1 from jules at gcc dot gnu.org --- Author: jules Date: Fri Dec 13 23:14:15 2019 New Revision: 279388 URL: https://gcc.gnu.org/viewcvs?rev=279388=gcc=rev Log: Fix potential race condition in OpenACC "exit data" operations PR libgomp/92881 libgomp/ * libgomp.h (gomp_remove_var_async): Add prototype. * oacc-mem.c (delete_copyout): Call gomp_remove_var_async instead of gomp_remove_var. * target.c (gomp_unref_tgt): Change return type to bool, indicating whether target_mem_desc was unmapped. (gomp_unref_tgt_void): New. (gomp_remove_var): Reimplement in terms of... (gomp_remove_var_internal): ...this new helper function. (gomp_remove_var_async): New, implemented using above helper function. (gomp_unmap_vars_internal): Use gomp_unref_tgt_void instead of gomp_unref_tgt. Reviewed-by: Thomas Schwinge Modified: trunk/libgomp/ChangeLog trunk/libgomp/libgomp.h trunk/libgomp/oacc-mem.c trunk/libgomp/target.c
[Bug libgomp/92843] [OpenACC] Wrong/missing dynamic reference counting for structured 'REFCOUNT_INFINITY'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92843 --- Comment #12 from jules at gcc dot gnu.org --- For an adaptation of the refcount checking code as alluded to above -- see PR92848.
[Bug libgomp/92848] [OpenACC] Memory leak for simple 'acc_create', 'acc_delete' sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92848 --- Comment #6 from jules at gcc dot gnu.org --- Created attachment 47479 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47479=edit Regressions with rc checking enabled These are the OpenACC tests that regress with refcount checking enabled at present.
[Bug libgomp/92848] [OpenACC] Memory leak for simple 'acc_create', 'acc_delete' sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92848 jules at gcc dot gnu.org changed: What|Removed |Added Attachment #47451|0 |1 is obsolete|| --- Comment #7 from jules at gcc dot gnu.org --- Created attachment 47480 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47480=edit Rebased patch for PR92848 Here's a rebased version of Thomas's patch. Any merge errors are my own!
[Bug libgomp/92848] [OpenACC] Memory leak for simple 'acc_create', 'acc_delete' sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92848 --- Comment #5 from jules at gcc dot gnu.org --- Created attachment 47478 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47478=edit Refcount checking for current trunk Hi, This is a merge of the refcount checking patch to current trunk, with the name of the "virtual_refcount" field changed back to the current "dynamic_refcount". With RC_CHECKING macro defined, this flags consistency errors in a large number of OpenACC tests: how many of those are user-visible problems, I am not sure. In at least some cases, we certainly have "two wrongs making a right" in that the consistency errors do not result in a user-visible problem. An assumption here is that "dynamic_refcount" is supposed to mean the same thing as "virtual_refcount" does in the overhaul patch: those references that represent dynamic data lifetimes without having links in the interlinked splay tree/target_mem_desc structure. Or otherwise, the number which the static refcount should be decreased by on encountering a "finalize" operation for the data item in question. Anyway: I hope this is useful in evaluating the rest of the refcount overhaul patch.
[Bug libgomp/92848] [OpenACC] Memory leak for simple 'acc_create', 'acc_delete' sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92848 --- Comment #4 from jules at gcc dot gnu.org --- (In reply to Thomas Schwinge from comment #3) > (In reply to jules from comment #2) > > Again, please don't change this code under the feet of the refcount overhaul > > patch! > > But why? This here is one independent chance. The big "OpenACC reference > count overhaul" will then either lose some of its changes, or replace this > here by something else ('GOMP_MAP_VARS_OPENACC_ENTER_DATA'), but it's an > incremental step (forward). It's hard for me to evaluate if the changes are correct, given that I don't trust the current baseline, mostly. > > Using the (currently OpenMP-specific) GOMP_MAP_VARS_ENTER_DATA is > > going to end up mighty confusing from OpenACC-specific code. > > Why do you think so? It has exactly! the semantics that we need here, for > this PR. Is the desired behaviour change just the reference count initialized by target.c:gomp_map_vars_internal? I suppose that's OK. The danger is if the semantics needed for OpenMP ever shift away from what we need for OpenACC here. But I guess this code might not be around for long anyway?
[Bug libgomp/92843] [OpenACC] Disallow 'acc_delete' etc. for everything without a dynamic reference count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92843 --- Comment #9 from jules at gcc dot gnu.org --- FWIW, I wrote this at the time wrt. the refcounting changes (some parts refer to a previous iteration of the manual deep copy patches)... Writing a couple of new attach/detach tests, I realised that reference counting for OpenACC was pretty seriously broken. In an attempt to fix that, I've: - written self-checking code to verify reference counts. This walks over the whole memory-mapping splay tree and re-counts references, making sure they match up with the running tallies kept by the mapping/unmapping routines. - removed the dubious "finalize" arguments from the unmap routines you pointed out. Call "detach" in more appropriate places. - rewritten gomp_acc_{insert,remove}_pointer, and change the way "enter data" mapping works in OpenACC to be more like the implementation for OpenMP (gomp_acc_remove_pointer is now modelled on target.c:gomp_exit_data). - ...which means we can get rid of the "acc_dev->openacc.data_environ" list. I think I added that to start with (ICBW!) but now I don't think we need it. That means we avoid a lot of scanning through linear linked lists to find the "right" target_mem_desc to unmap -- which it turns out was buggy, and also probably conceptually wrong. (In GOACC_enter_exit_data, that meant each mapping clause got its own "target_mem_desc" when entering, which isn't how things normally work for gomp_map_vars. That meant that when "exit data"ing, again individual clauses could remove the corresponding target_mem_desc. What *didn't* work is if the mappings for a single target_mem_desc returned by GOACC_enter_exit_data refer to *other* target_mem_descs in their argument lists, rather than pointing back to the same one. A lot of the code that was checking "t->refcount" for "2" or "minrefs" in oacc-mem.c was simply broken.) - lookup_dev scans over the splay tree instead of looking through the data_environ list. It should have done that to start with -- it means it'll be able to find data mapped inside "pragma acc data" blocks, which it wouldn't have been able to previously. It's a lot of changes, but (I hope!) with the RC_CHECKING feature we'll be able to be a little more confident that we've got the memory map refcounting right, at least.
[Bug libgomp/92843] [OpenACC] Disallow 'acc_delete' etc. for everything without a dynamic reference count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92843 --- Comment #8 from jules at gcc dot gnu.org --- (In reply to Thomas Schwinge from comment #7) > We first need to establish a stable baseline, with test cases, and then (or, > as part of that) merge the several independent pieces of the big "OpenACC > reference count overhaul". OK -- I think I misunderstood your strategy here, which is to avoid committing that patch in one go. IMO (:-)) it's a clear improvement overall, particularly with the consistency checking. Test coverage could probably be better -- but, I'm sure many of the new tests introduced by the manual deep-copy patch would not operate correctly without the refcounting fixes underneath (and the two patches were all merged together to start with, of course). > > The existing code was rewritten for a > > reason -- that being, I hit various problems with it (albeit long since > > forgotten) that interfered with the manual deep-copy implementation. > > So you'll have to dig out your notes, and/or we'll have to figure out any > rationale again, now. Patches that change such fundamental things in > libgomp we cannot just commit on the basis that once they made sense to > somebody. They have to come with rationale, and test cases. That's holding the new code to a rather higher standard than some of the existing code! But, that's OK, I suppose. I will see if there are any notes I made at the time that might be helpful. > > We have > > refcount self-checking code for the overhauled code. > > ... which surely can be adapted. If you do that adaptation now, you might get a better idea of the state of the current refcounting implementation! Not sure if that'll be helpful. > And, per my understanding, this is only checking libgomp internal > consistency, but not the semantics exposed to users via OpenACC > directives/API calls etc., which in part you're changing of your patch > (without test cases). This, again, I've spent the best part of the past > weeks understanding, writing test cases for, filing GCC PRs, resolving them > one by one, independently, incrementally, comprehensibly. No, not just internal consistency. At least with the deep copy bits, there was user-visible breakage with the existing code (again, IIRC). > > We can address corner-case bug fixes as follow-ons once the main overhaul > > patch is committed. > > Further bug fixes, yes, but we have to make some reasonable effort to not > introduce new bugs with the big "OpenACC reference count overhaul" changes. That's what the testsuite is for, I guess -- the burden of proof for getting patches approved is lack of regressions in existing tests, generally. Yes there could be more, but this is a real fundamental thing that most of the existing OpenACC tests will touch in some way, so it's not *that* bad.
[Bug libgomp/92854] [OpenACC] Always-true conditional in 'libgomp/oacc-mem.c:acc_unmap_data'?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92854 --- Comment #6 from jules at gcc dot gnu.org --- Created attachment 47453 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47453=edit Patch for acc_map_data-device_already-3.c problem This patch fixes the acc_map_data-device_already-3.c problem, which I guess has probably been broken forever. The single tgt_mem_desc used for all the function and global variable mappings (created in gomp_load_image_to_device) does not have its tgt_start and tgt_end fields set properly, so oacc-mem.c:lookup_dev cannot find any global variable. We can fix this by calculating the total address range covered by (the union of) all offloaded functions and global variables, and set tgt_start and tgt_end appropriately. We then go through each splay tree key's tgt_offset, and adjust it to be an offset within that range. I've yet to run full tests with this, but it works for the test case mentioned.
[Bug libgomp/92848] [OpenACC] Memory leak for simple 'acc_create', 'acc_delete' sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92848 --- Comment #2 from jules at gcc dot gnu.org --- Again, please don't change this code under the feet of the refcount overhaul patch! Using the (currently OpenMP-specific) GOMP_MAP_VARS_ENTER_DATA is going to end up mighty confusing from OpenACC-specific code.
[Bug libgomp/92843] [OpenACC] Disallow 'acc_delete' etc. for everything without a dynamic reference count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92843 --- Comment #6 from jules at gcc dot gnu.org --- Please don't start making changes to the reference-counting code that is being replaced by my overhaul patch. The existing code was rewritten for a reason -- that being, I hit various problems with it (albeit long since forgotten) that interfered with the manual deep-copy implementation. We have refcount self-checking code for the overhauled code. We can address corner-case bug fixes as follow-ons once the main overhaul patch is committed.
[Bug libgomp/92854] [OpenACC] Always-true conditional in 'libgomp/oacc-mem.c:acc_unmap_data'?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92854 --- Comment #5 from jules at gcc dot gnu.org --- Huh, yes, I missed that line in the spec (about existing mappings). I'll have a look at the test case you mentioned.
[Bug libgomp/92840] [OpenACC] Disallow 'acc_unmap_data' for everything other than 'acc_map_data'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92840 --- Comment #2 from jules at gcc dot gnu.org --- I think that looks OK, thanks.
[Bug libgomp/92854] [OpenACC] Always-true conditional in 'libgomp/oacc-mem.c:acc_unmap_data'?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92854 --- Comment #2 from jules at gcc dot gnu.org --- For strictly-paired acc_map_data/acc_unmap_data calls that don't interfere with other mappings -- no, probably not. But (like I guess you noticed too) the existing code feels wrong (or at least ugly) nonetheless. It's probably possible to come up with a legal (but odd) test case in which the condition is false -- maybe something like this (untested!): int arr1[100], arr2[100], arr3[100]; void foo (void) { #pragma acc data copy(arr1) copy(arr2) copy(arr3) { void *darr2 = acc_deviceptr (arr2); acc_map_data (arr2, darr2, sizeof (arr2)); [...] acc_unmap_data (arr2); } } Now I think the acc_map_data call will reuse the tgt for the structured data mapping, which binds together several array copies into a single device block. Forcing the first key's refcount to infinity (as is done in the current implementation of acc_map_data) is also wrong in that case.
[Bug libgomp/92843] [OpenACC] Disallow 'acc_delete' etc. for everything without a dynamic reference count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92843 --- Comment #2 from jules at gcc dot gnu.org --- I don't think your example is valid, but I'm not sure it will be fail in quite the right way with the current version of my refcount overhaul patch. Actually I think the acc_map_data implementation using REFCOUNT_INFINITY is probably wrong too. I will try changing the implementation as follows: - calls to acc_delete with the structured reference count being non-zero (or infinity in the case of '#pragma acc declare'd data) should raise an error. - acc_map_data should use GOMP_MAP_VARS_OPENACC_ENTER_DATA instead of forcing the refcount to infinity (i.e., making the behaviour the same as "enter data (create)"). Does that match your understanding of what the behaviour should be?
[Bug c/91985] Unsupported DFP not diagnosed with constants or built-in functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91985 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #4 from jules at gcc dot gnu.org --- SVN r278684 appears to cause a problem with offloading compilation in lto1 with an nvptx offloading-enabled compiler. The following backtrace is from libgomp/testsuite/libgomp.fortran/nestedfn5.f90, but many other tests fail also. lto1: internal compiler error: in operator[], at vec.h:867 0x894d9a vec::operator[](unsigned int) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/vec.h:867 0x893924 vec::operator[](unsigned int) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/vec.h:1433 0x1366f4a streamer_tree_cache_get_tree /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/tree-streamer.h:98 0x136b605 streamer_get_pickled_tree(lto_input_block*, data_in*) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/tree-streamer-in.c:1110 0xde36fc lto_input_tree_1(lto_input_block*, data_in*, LTO_tags, unsigned int) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1505 0xde3958 lto_input_tree(lto_input_block*, data_in*) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1552 0x136a982 lto_input_ts_list_tree_pointers /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/tree-streamer-in.c:863 0x136b4aa streamer_read_tree_body(lto_input_block*, data_in*, tree_node*) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/tree-streamer-in.c:1075 0xde3303 lto_read_tree_1 /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1375 0xde3480 lto_read_tree /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1416 0xde389e lto_input_tree_1(lto_input_block*, data_in*, LTO_tags, unsigned int) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1528 0xde34fa lto_input_scc(lto_input_block*, data_in*, unsigned int*, unsigned int*) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto-streamer-in.c:1440 0x88dd1c lto_read_decls /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto-common.c:1830 0x88ed54 lto_file_finalize /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto-common.c:2227 0x88edae lto_create_files_from_ids /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto-common.c:2237 0x88ef93 lto_file_read /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto-common.c:2292 0x891ee9 read_cgraph_and_symbols(unsigned int, char const**) /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto-common.c:2744 0x8698f3 lto_main() /scratch/jbrown/nvptx-mainline/src/gcc-mainline/gcc/lto/lto.c:630 It looks like something is unprepared to handle a NULL_TREE or error_mark_node -- I've not finished investigating.
[Bug libgomp/92511] [OpenACC] Support subset subarray mappings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92511 --- Comment #2 from jules at gcc dot gnu.org --- Author: jules Date: Wed Nov 20 17:51:09 2019 New Revision: 278514 URL: https://gcc.gnu.org/viewcvs?rev=278514=gcc=rev Log: OpenACC "present" subarrays: runtime API return value and unmapping fixes PR libgomp/92511 libgomp/ * oacc-mem.c (present_create_copy): Fix device pointer return value in case of "present" subarray. Use tgt->tgt_start instead of tgt->to_free in non-present/create case. (delete_copyout): Change error condition to fail only on copies outside of mapped block. Adjust error message accordingly. * testsuite/libgomp.oacc-c-c++-common/copyin-devptr-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/copyin-devptr-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/lib-20.c: Adjust expected error message. * testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-22.c: Allow test to pass now. * testsuite/libgomp.oacc-c-c++-common/lib-30.c: Likewise. Reviewed-by: Thomas Schwinge Added: trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/copyin-devptr-1.c trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/copyin-devptr-2.c Modified: trunk/libgomp/ChangeLog trunk/libgomp/oacc-mem.c trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-20.c trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-22.c trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-23.c trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-30.c
[Bug libgomp/92503] [OpenACC] Behavior of 'acc_free' if the memory space is still used in a mapping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92503 --- Comment #1 from jules at gcc dot gnu.org --- FWIW, I don't think we should do this implicit unmapping, particularly since it implies an expensive device-to-host-address lookup.
[Bug lto/71959] [OpenACC] lto1: ICE in inline_read_section, at ipa-fnsummary.c:3314
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71959 --- Comment #11 from jules at gcc dot gnu.org --- Author: jules Date: Thu Jan 10 12:32:03 2019 New Revision: 267806 URL: https://gcc.gnu.org/viewcvs?rev=267806=gcc=rev Log: Add testcase from PR71959 libgomp/ PR lto/71959 * testsuite/libgomp.oacc-c++/pr71959-aux.cc: New. * testsuite/libgomp.oacc-c++/pr71959.C: New. Added: trunk/libgomp/testsuite/libgomp.oacc-c++/pr71959-aux.cc trunk/libgomp/testsuite/libgomp.oacc-c++/pr71959.C Modified: trunk/libgomp/ChangeLog
[Bug lto/71959] [OpenACC] lto1: ICE in inline_read_section, at ipa-fnsummary.c:3314
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71959 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #10 from jules at gcc dot gnu.org --- The og7/og8 test case libgomp.oacc-c++/pr71959.C appears to be fixed by: https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01167.html
[Bug middle-end/86336] [9 regression] ICE in omp-low.c:7879
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86336 --- Comment #3 from jules at gcc dot gnu.org --- Author: jules Date: Wed Sep 12 15:21:19 2018 New Revision: 264244 URL: https://gcc.gnu.org/viewcvs?rev=264244=gcc=rev Log: [OpenACC] C++ reference mapping 2018-09-09 Cesar Philippidis Julian Brown PR middle-end/86336 gcc/cp/ * semantics.c (finish_omp_clauses): Treat C++ references the same in OpenACC as OpenMP. gcc/ * gimplify.c (gimplify_scan_omp_clauses): Set target_firstprivatize_array_bases in OpenACC parallel and kernels region contexts. Remove GOMP_MAP_FIRSTPRIVATE_REFERENCE clauses from OpenACC data regions. libgomp/ * testsuite/libgomp.oacc-c++/non-scalar-data.C: Remove XFAIL. Modified: trunk/gcc/ChangeLog trunk/gcc/cp/ChangeLog trunk/gcc/cp/semantics.c trunk/gcc/gimplify.c trunk/libgomp/ChangeLog trunk/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C
[Bug middle-end/86757] [og8,nvptx] gangprivate related regressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86757 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #1 from jules at gcc dot gnu.org --- I believe this is fixed in the patch posted upstream: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00749.html
[Bug libgomp/65742] [5/6 Regression] Several libgomp.oacc-* failures after r221922.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65742 --- Comment #5 from jules at gcc dot gnu.org --- Author: jules Date: Thu May 28 09:29:19 2015 New Revision: 223801 URL: https://gcc.gnu.org/viewcvs?rev=223801root=gccview=rev Log: PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. libgomp/ * oacc-init.c (plugin/plugin-host.h): Include. (acc_on_device): Check whether we're in an offloaded region for host_nonshm plugin. Don't use __builtin_acc_on_device. * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set nonshm_exec flag in thread-local data. (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local data for host_nonshm plugin. (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data for host_nonshm plugin. * plugin/plugin-host.h: New. Added: trunk/libgomp/plugin/plugin-host.h Modified: trunk/gcc/ChangeLog trunk/gcc/builtins.c trunk/libgomp/ChangeLog trunk/libgomp/oacc-init.c trunk/libgomp/plugin/plugin-host.c
[Bug libgomp/65742] [5/6 Regression] Several libgomp.oacc-* failures after r221922.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65742 --- Comment #6 from jules at gcc dot gnu.org --- Author: jules Date: Thu May 28 09:38:40 2015 New Revision: 223802 URL: https://gcc.gnu.org/viewcvs?rev=223802root=gccview=rev Log: PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. libgomp/ * oacc-init.c (plugin/plugin-host.h): Include. (acc_on_device): Check whether we're in an offloaded region for host_nonshm plugin. Don't use __builtin_acc_on_device. * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set nonshm_exec flag in thread-local data. (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local data for host_nonshm plugin. (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data for host_nonshm plugin. * plugin/plugin-host.h: New. Added: branches/gomp-4_0-branch/libgomp/plugin/plugin-host.h Modified: branches/gomp-4_0-branch/gcc/ChangeLog.gomp branches/gomp-4_0-branch/gcc/builtins.c branches/gomp-4_0-branch/libgomp/ChangeLog.gomp branches/gomp-4_0-branch/libgomp/oacc-init.c branches/gomp-4_0-branch/libgomp/plugin/plugin-host.c
[Bug libgomp/65904] Memory corruption with acc_shutdown, nvptx offloading, libgomp.oacc-c-c++-common/asyncwait-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65904 jules at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2015-05-01 Ever confirmed|0 |1 --- Comment #1 from jules at gcc dot gnu.org --- Patch posted: https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00021.html
[Bug target/49423] [4.7/4.8/4.9/4.10 Regression] [arm] internal compiler error: in push_minipool_fix
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49423 --- Comment #26 from jules at gcc dot gnu.org --- The testcase I previously had for this bug no longer reproduces since LRA was enabled by default on ARM. So, it's possible the bug is now dormant, or indeed fixed. The difference in the postreload dump around the insn which previously failed (136) is, using -mno-lra/-mlra on current trunk: -(insn 136 7 12 2 (set (reg:SI 11 fp [orig:231 D.6590 ] [231]) -(zero_extend:SI (mem/u/c:QI (symbol_ref/u:SI (*.LC0) [flags 0x2]) [0 S1 A8]))) ../../aes/aeskey.c:509 182 {*arm_zero_extendqisi2_v6} +(insn 952 7 136 2 (set (reg:SI 12 ip [1488]) +(const_int 0 [0])) ../../aes/aeskey.c:509 666 {*arm_movsi_vfp} + (nil)) +(insn 136 952 12 2 (set (reg:SI 11 fp [orig:231 D.6590 ] [231]) +(zero_extend:SI (reg:QI 12 ip [1488]))) ../../aes/aeskey.c:509 182 {*arm_zero_extendqisi2_v6} Or, does anyone know of a testcase which still causes this ICE with -mlra enabled? If not, we might be able to consider this fixed. (Insn 136 -- of the form which causes breakage -- was generated by reload.) Thanks, Julian
[Bug middle-end/59134] New: Infinite loop between store_fixed_bit_field and store_split_bit_field with STRICT_ALIGNMENT
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59134 Bug ID: 59134 Summary: Infinite loop between store_fixed_bit_field and store_split_bit_field with STRICT_ALIGNMENT Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jules at gcc dot gnu.org Compiling code using packed structures with a single non-zero-sized element and a zero-sized array can lead to segfaults due to stack exhaustion on STRICT_ALIGNMENT targets (at least PowerPC E500 targets are affected). See the following for further details: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01458.html Earlier bugs that are possibly related include: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55438 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748
[Bug target/57717] error: unrecognizable insn compiling ./strtod_l.c from glibc on powerpc-gnuspe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57717 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #7 from jules at gcc dot gnu.org --- Here's another candidate patch: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00668.html
[Bug target/49423] [4.7/4.8/4.9 Regression] [arm] internal compiler error: in push_minipool_fix
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49423 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #20 from jules at gcc dot gnu.org --- I've posted a new potential fix (and a new testcase which breaks on current mainline) here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01191.html
[Bug rtl-optimization/57159] New: Latent bug in RTL GCSE/PRE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57159 Bug #: 57159 Summary: Latent bug in RTL GCSE/PRE Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIRMED Severity: minor Priority: P3 Component: rtl-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: ju...@gcc.gnu.org Created attachment 30018 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30018 testcase We encountered a wrong-code bug in an out-of-tree port (4.6-based, though the affected code doesn't appear to have changed that much) in target-independent code, though I have not been able to reproduce the problem on a currently-supported target. The port in question is unusual in that it allows read-modify-write arithmetic directly on memory locations (for some operations, at least). That means that during RTL expansion, the compiler may attach REG_EQUAL notes to insns that contain memory operands: (insn 16 15 17 (set (reg:SI 70) (mem/s:SI (reg/v/f:SI 68 [ iter ]) [3 iter_4(D)-xhv_riter+0 S4 A32])) hvmin.c:32 -1 (nil)) (insn 17 16 0 (set (reg:SI 56 [ D.4066 ]) (plus:SI (reg:SI 70) (const_int 1 [0x1]))) hvmin.c:32 -1 (expr_list:REG_EQUAL (plus:SI (mem/s:SI (reg/v/f:SI 68 [ iter ]) [3 iter_4(D)-xhv_riter+0 S4 A32]) (const_int 1 [0x1])) (nil))) That in itself is not a problem, but certain parts of gcse.c's PRE code examine only the instruction itself (e.g. when building tables of interesting memory locations -- compute_ld_motion_mems), whereas other parts pay attention to the REG_EQUAL note (hash_scan_set). That means that memory references which are too complicated for PRE to handle properly can leak through to later parts of the algorithm and cause misoptimisations -- in the case of the attached file, the increment in the loop is optimised away, and the loop spins forever. The fix then is to prevent complicated memory references which may be present in REG_EQUAL notes from being considered for load motion, by stopping them from being added to the relevant tables to start with. That's what the attached patch does, and it works for us -- though without a way of reproducing the bug on mainline, it's not quite obvious that it should be applied there. Interestingly, on x86 (which also allows read-modify-write operations), it looks like the bug is latent because a normal addition clobbers the flags register: (insn 28 91 94 4 (parallel [ (set (reg:SI 60 [ D.2122 ]) (plus:SI (reg:SI 74 [ iter_4(D)-xhv_riter ]) (const_int 1 [0x1]))) (clobber (reg:CC 17 flags)) ]) hvmin.c:32 252 {*addsi_1} (expr_list:REG_DEAD (reg:SI 74 [ iter_4(D)-xhv_riter ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_EQUAL (plus:SI (mem/s:SI (reg/v/f:DI 72 [ iter ]) [3 iter_4(D)-xhv_riter+0 S4 A32]) (const_int 1 [0x1])) (nil) This is apparently sufficient to prevent the misoptimisation (probably because of want_to_gcse_p's call to can_assign_to_reg_without_clobbers_p). I tried to reproduce on m68k also, but addsi patterns are expanded differently there so the failing condition doesn't trigger.
[Bug rtl-optimization/57159] Latent bug in RTL GCSE/PRE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57159 --- Comment #1 from jules at gcc dot gnu.org 2013-05-03 11:56:53 UTC --- Created attachment 30019 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30019 patch
[Bug rtl-optimization/57159] Latent bug in RTL GCSE/PRE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57159 --- Comment #4 from jules at gcc dot gnu.org 2013-05-03 19:56:33 UTC --- Created attachment 30029 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30029 Before/after dumps Here are some before/after dumps taken from the out-of-tree port. Notice how in the before pre dump, the results of insns 27 and 28 are considered to be equivalent to the results of insns 16 and 17, despite the store (insn 18) making that equivalence invalid -- this is the misoptimisation alluded to earlier. This happens because oprs_available_p mistakenly thinks that insn 16/17's results are available at the end of their BB, because oprs_unchanged_p - load_killed_in_block_p - mems_conflict_for_gcse_p - find_rtx_in_ldst finds a load/store that it thinks it can deal with ((mem/s:SI (reg/v/f:SI 68 [ iter ]))), but actually that mem is used (in the REG_EQUAL note, as extracted by hash_scan_set) in a way that it cannot handle.
[Bug rtl-optimization/57159] Latent bug in RTL GCSE/PRE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57159 --- Comment #5 from jules at gcc dot gnu.org 2013-05-03 20:05:19 UTC --- Actually the last paragraph might not make sense -- insn 16/17's *operands* are not available at the end of the BB, but only if the REG_EQUAL note contents are substituted for the insn pattern. But that's kind of what happens in hash_scan_set, so I think the overall idea's right.
[Bug testsuite/54622] gcc.dg/vect test failures for arm big-endian
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54622 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #5 from jules at gcc dot gnu.org 2012-09-26 16:25:08 UTC --- Unfortunately (due to the reasons Richard outlined) lots of things don't work wrt. vectorization for Neon in big-endian mode, so are deliberately disabled in the backend. IMO, the right thing to do (at least until that is fixed) is to make the effective-target checks for vectorization features depend on little-endian ARM mode.
[Bug rtl-optimization/47918] [4.6/4.7 Regression] noreturn discovery broke non local gotos on m68k and i386
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47918 --- Comment #12 from jules at gcc dot gnu.org 2011-11-01 18:38:45 UTC --- Author: jules Date: Tue Nov 1 18:38:42 2011 New Revision: 180740 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=180740 Log: PR rtl-optimization/47918 * reload1.c (set_initial_label_offsets): Use initial offsets for labels on the nonlocal_goto_handler_labels chain. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/reload1.c
[Bug rtl-optimization/47918] [4.6/4.7 Regression] noreturn discovery broke non local gotos on m68k and i386
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47918 --- Comment #11 from jules at gcc dot gnu.org 2011-10-28 10:48:36 UTC --- Author: jules Date: Fri Oct 28 10:48:32 2011 New Revision: 180611 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=180611 Log: PR rtl-optimization/47918 * reload1.c (set_initial_label_offsets): Use initial offsets for labels on the nonlocal_goto_handler_labels chain. Modified: trunk/gcc/ChangeLog trunk/gcc/reload1.c
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #12 from jules at gcc dot gnu.org 2011-10-05 12:41:55 UTC --- I don't much like the idea of using builtins for operations as fundamental as integer arithmetic. How about this as a straw-man suggestion: adding new qualifiers for fat integers-with-flags, somewhat in the spirit of the embedded-C fractional/saturating types? So you might have: int x, y; void foo (void) { _Flagged int res; res = (_Flagged int) x + y; if (_Carry (res)) printf (sum carried\n); if (_Overflow (res)) printf (sum overflowed\n); } this avoids problems with global state, and allows for the programming style which (I vaguely get the impression that) people seem to want -- performing a normal integer operation, then querying for carry/overflow flags afterwards. These types wouldn't be allowed to cross ABI boundaries (although of course you could use the magic builtins _Carry/_Overflow to extract values to pass to functions), and _Overflow would only be allowed for signed types. You could also have _Borrow for subtraction maybe (whose meaning would be the inverse of _Carry). Signalling variants could look like, e.g.: void bar (void) { int res; res = (_Signalling _Flagged int) x + y; } Things would get awkward if you tried to use these new constructs in more-complicated expressions, I suppose. Anyway, just an idea.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #13 from jules at gcc dot gnu.org 2011-10-05 13:05:47 UTC --- Coming to think of it, if _Sat were allowed on plain integers too, a _Flagged _Sat int could also be queried for saturation using a similar mechanism, like: int foo (_Sat int x, _Sat int y) { return _Saturated ((_Sat _Flagged) x + y); } I'm probably getting ahead of myself :-).
[Bug rtl-optimization/42575] arm-eabi-gcc 64-bit multiply weirdness
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42575 jules at gcc dot gnu.org changed: What|Removed |Added Status|RESOLVED|REOPENED CC||jules at gcc dot gnu.org Resolution|FIXED | --- Comment #9 from jules at gcc dot gnu.org 2011-09-20 19:03:43 UTC --- This appears to have regressed on mainline. I now get the following assembly output for the test case added by Maxim: longfunc: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. stmfd sp!, {r4, r5} umull r4, r5, r0, r2 mul r3, r0, r3 mla r1, r2, r1, r3 mov r0, r4 add r1, r1, r5 ldmfd sp!, {r4, r5} bx lr
[Bug tree-optimization/45932] execute/pr37573.c fails after Neon misaligned patch.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45932 --- Comment #2 from jules at gcc dot gnu.org 2010-10-08 16:17:58 UTC --- Could you be more specific about testsuite options, etc. needed to observe this apparent regression? I haven't been able to reproduce it using default options, nor by specifying [-mthumb] -march=armv7-a -mfpu=neon -mfloat-abi=softfp to the test run. I do see a failure like so: .../gcc/testsuite/gcc.c-torture/execute/pr37573.c:64:1: sorry, unimplemented: gimple bytecode streams do not support machine specific builtin functions on this target for -flto and -fwhopr, but those happen irrespective of whether the misalignment patch is applied or not. Thanks!