[Bug fortran/42958] Weird temporary array allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958 --- Comment #26 from Thomas Koenig --- (In reply to Dominique d'Humieres from comment #25) > > If you find anything still missing in the library, please let me know. > > I thought I had converted everything to the macros, which are fairly > > easy to change, but I may be mistaken. > > Is there anything left? or could this PR be closed as FIXED? We still do { if ((real(kind=4)[0:] * restrict) a.data != 0B) { __builtin_free ((void *) a.data); (real(kind=4)[0:] * restrict) a.data = 0B; } } Also, we have space array descriptor fields now, so for GCC 10, we can finally implement the "allocated" flag.
[Bug fortran/42958] Weird temporary array allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958 --- Comment #25 from Dominique d'Humieres --- > If you find anything still missing in the library, please let me know. > I thought I had converted everything to the macros, which are fairly > easy to change, but I may be mistaken. Is there anything left? or could this PR be closed as FIXED?
[Bug fortran/42958] Weird temporary array allocation
--- Comment #22 from paul dot richard dot thomas at gmail dot com 2010-04-29 10:02 --- Subject: Re: Weird temporary array allocation As an aside, for the 4.6 array descriptor update, there has been some discussion to move from the current (lbound, ubound, stride) triplet for every dimension to (lbound, stride, extent). Which would change these kinds of expressions. Yes, indeed. As soon as I can free up fortran-dev from the vtable implementation of dynamic dispatch, I will start on this reform. It is my intention to change the array descriptor representation and to start with an API that provides the (lbound, ubound,stride) values. In fact, this API has already been put in place in the FE but is incompletely implemented in the library (or, at least, last time I looked). Following this first step, the performance bottle necks due to the API will have to be weeded out. Having the extent to hand will greatly simplify issues such as this PR. Then the plethora of new descriptor fields will have to be added and the various associated PRs dealt with. At this point, the upgrade should be ready for applying to trunk. Paul -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #23 from jakub at gcc dot gnu dot org 2010-04-29 13:18 --- Created an attachment (id=20514) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20514action=view) d3.c BTW, I've timed different variants of the C code for the tonto loop, and time /tmp/d3 1 real0m0.667s user0m0.666s sys 0m0.001s time /tmp/d3 2 real0m0.640s user0m0.639s sys 0m0.000s time /tmp/d3 3 real0m0.616s user0m0.614s sys 0m0.002s and clearly the last one (versioning the loop if the temporary can be easily avoided for -O3 at least) is the winner here. I've started looking into that yesterday. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #24 from tkoenig at gcc dot gnu dot org 2010-04-29 19:09 --- (In reply to comment #22) It is my intention to change the array descriptor representation and to start with an API that provides the (lbound, ubound,stride) values. In fact, this API has already been put in place in the FE but is incompletely implemented in the library (or, at least, last time I looked). If you find anything still missing in the library, please let me know. I thought I had converted everything to the macros, which are fairly easy to change, but I may be mistaken. -- tkoenig at gcc dot gnu dot org changed: What|Removed |Added CC||tkoenig at gcc dot gnu dot ||org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #18 from rguenth at gcc dot gnu dot org 2010-04-28 14:52 --- Updating the status on this bugreport. I am working on middle-end support for hoisting/sinking malloc/free pairs out of loops (in case the size is loop invariant). The Fortran FE makes this somewhat difficult. For subroutine test0(esss,Ix) integer(kind=kind(1)), dimension(:), pointer :: esss integer(kind=kind(1)), dimension(:), pointer :: Ix esss = Ix + Ix end subroutine It creates D.1552 = ix-dim[0].lbound; D.1553 = ix-dim[0].ubound; ... D.1562 = D.1553 - D.1552; ... D.1571 = D.1562 0; D.1572 = D.1562 + 1; D.1573 = D.1571 ? 0 : D.1572 * 4; D.1574 = (void * restrict) __builtin_malloc (MAX_EXPR D.1573, 1); D.1575 = D.1574; atmp.0.data = D.1575; ... { ... S.1 = 0; while (1) { if (S.1 D.1562) goto L.1; ... } L.1:; S.1 = 0; while (1) { if (S.1 D.1562) goto L.2; ... } L.2:; } { void * D.1576; D.1576 = (void *) atmp.0.data; if (D.1576 != 0B) { __builtin_free (D.1576); } } two unfortunate facts remain. 1) __builtin_free is executed conditionally even though free(0) is well-defined 2) __builtin_malloc (MAX_EXPR D.1571 ? 0 : D.1572 * 4, 1) later causes DOM to jump-thread this into two malloc calls, one malloc(1) and one malloc(D.1573) especially the latter is very hard to get rid of at the tree level later (the conditional free can possibly be made unconditional by optimizers). Thus I would request that for the purpose of allocating array temporaries from the scalarizer you 1) unconditionally free the memory 2) drop the MAX_EXPR, malloc (0) is well-defined and we can just fold that to NULL if DOM still thinks to jump-thread that 3) for the same reason you can also drop the + 1 in computing the allocation size which is currently (ubound - lbound + 1) * 4 even better would be to guard the allocation by the loop entry check (thus ubound - lbound = 0) If that's all acceptable I will work on this soon. -- rguenth at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rguenth at gcc dot gnu dot |dot org |org Status|NEW |ASSIGNED Last reconfirmed|2010-03-27 18:55:47 |2010-04-28 14:52:15 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #19 from amonakov at gcc dot gnu dot org 2010-04-28 15:15 --- (In reply to comment #18) 3) for the same reason you can also drop the + 1 in computing the allocation size which is currently (ubound - lbound + 1) * 4 Sorry, but isn't +1 needed because bounds are inclusive? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #20 from jv244 at cam dot ac dot uk 2010-04-28 15:20 --- (In reply to comment #18) If that's all acceptable I will work on this soon. FYI, this would fix PR38318 and PR21046 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #21 from jb at gcc dot gnu dot org 2010-04-28 15:43 --- (In reply to comment #19) (In reply to comment #18) 3) for the same reason you can also drop the + 1 in computing the allocation size which is currently (ubound - lbound + 1) * 4 Sorry, but isn't +1 needed because bounds are inclusive? Yes. As an aside, for the 4.6 array descriptor update, there has been some discussion to move from the current (lbound, ubound, stride) triplet for every dimension to (lbound, stride, extent). Which would change these kinds of expressions. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #17 from burnus at gcc dot gnu dot org 2010-04-16 08:17 --- Another case where the if NULL check is not needed before the free are automatic arrays: subroutine sub(n) integer :: a(n) a(1) = 0 end Additionally, the dump looks overly complicated and a least two variables are set and not used. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #12 from burnus at gcc dot gnu dot org 2010-03-28 12:57 --- (In reply to comment #8) atmp.5.dim[0].lbound = 0; atmp.5.dim[0].ubound = D.1612; D.1616 = D.1612 0; D.1617 = D.1612 + 1; D.1618 = D.1616 ? 0 : D.1617 * 8; but there's still the conditional on negative D.1612. I think that check is needed: If one has: array(1:0) or array(1:-200) or array(1:100:-1) that means that one has a zero-sized array. If one had no such check one would allocate negative memory which matches - as the malloc argument is unsigned - a large positive number. Maybe there are cases where one knows before hand that the array cannot be zero-sized, but in the general case one cannot. (I am thinking of call f(array(...)) where one knows that f does not allow for a zero-sized array argument. In the other known cases, the upper bound should be already known at compile time and the check should be folded away. Maybe there are more such cases.) Also deallocation is D.1621 = (void *) atmp.5.data; if (D.1621 != 0B) __builtin_free (D.1621); where the check for NULL is not necessary (though the middle end might be able to remove that condition). I think one can remove the check as __builtin_free also can deal with NULL arguments, but it is not known before hand whether the argument is NULL as - in the general case - one can manually deallocate arrays. Well, for generated temporary arrays (which the user cannot directly access), one can know whether the pointer is NULL and thus one could remove the check. Such a patch should be rather simple. another missed middle-end optimization, p = malloc(1); free (p); isn't optimized away I think it would be useful, if the middle end were able to do so. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #13 from burnus at gcc dot gnu dot org 2010-03-28 13:05 --- (In reply to comment #12) (I am thinking of call f(array(...)) where one knows that f does not allow for a zero-sized array argument. s/where/if/ - in the general case one does not know. Especially for dummyArg(n) and dummyArg(*) [and if no interface it known] one does not know it, while for dummyArg(4) one does. I fear, that this is a rare case. For call foo(Ix) in the Tonto example of comment 9, one does not know as the explicit interface interface of foo is not known. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #14 from burnus at gcc dot gnu dot org 2010-03-28 13:56 --- Created an attachment (id=20230) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20230action=view) Draft patch for NULL check for deallocation Draft patch for removing the NULL check before __builtin_free. According to POSIX, passing NULL to free is allowed. The patch removes all the NULL checks where it is likely that the temporary is not-NULL. For procedure arguments (_gfortran_pack/_gfortran_unpack) the chance is high that it is NULL - which is the case if the variable was already contiguous. Thus, I left the check in. The patch removes 5 of the 15 != 0 checks for the test case (attachment 20225). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #15 from burnus at gcc dot gnu dot org 2010-03-28 14:10 --- Actually, I am wondering whether one needs to do D.1620_135 = __builtin_malloc (1); for temporary arrays. For user-accessible ALLOCATABLE arrays one does - because ALLOCATED(array) needs also to be .TRUE. for zero-sized arrays, but for temporary arrays one does not. For those one could just do D.1620_135 = 0B Will the middle-end optimize __builtin_free(NULL) away? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #16 from rguenther at suse dot de 2010-03-28 14:45 --- Subject: Re: Weird temporary array allocation On Sun, 28 Mar 2010, burnus at gcc dot gnu dot org wrote: --- Comment #15 from burnus at gcc dot gnu dot org 2010-03-28 14:10 --- Actually, I am wondering whether one needs to do D.1620_135 = __builtin_malloc (1); for temporary arrays. For user-accessible ALLOCATABLE arrays one does - because ALLOCATED(array) needs also to be .TRUE. for zero-sized arrays, but for temporary arrays one does not. For those one could just do D.1620_135 = 0B Will the middle-end optimize __builtin_free(NULL) away? Not yet, but that's easy to implement. Note that I will be looking in detail into malloc/free optimizations, including moving allocation out of loops. So there's no need to rush anything and it can wait for 4.6. Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #7 from pault at gcc dot gnu dot org 2010-03-27 18:55 --- Tobias, Can this be closed now? Certainly, it can be confirmed! Cheers Paul -- pault at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2010-03-27 18:55:47 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #8 from rguenth at gcc dot gnu dot org 2010-03-27 19:05 --- I suppose I should have provided a testcase. The FE now produces control-flow free allocation like: D.1612 = D.1601 - D.1600; atmp.5.dtype = 537; atmp.5.dim[0].stride = 1; atmp.5.dim[0].lbound = 0; atmp.5.dim[0].ubound = D.1612; D.1616 = D.1612 0; D.1617 = D.1612 + 1; D.1618 = D.1616 ? 0 : D.1617 * 8; D.1619 = (void * restrict) __builtin_malloc (MAX_EXPR D.1618, 1); D.1620 = D.1619; atmp.5.data = D.1620; atmp.5.offset = 0; but there's still the conditional on negative D.1612. Also deallocation is D.1621 = (void *) atmp.5.data; if (D.1621 != 0B) { __builtin_free (D.1621); } where the check for NULL is not necessary (though the middle end might be able to remove that condition). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #9 from rguenth at gcc dot gnu dot org 2010-03-27 19:06 --- Created an attachment (id=20225) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20225action=view) testcase from tonto -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #10 from rguenth at gcc dot gnu dot org 2010-03-27 19:08 --- The gimplifier of course transforms the COND_EXPRs back to control flow, thus the CFG looks like D.1616 = D.1612 0; D.1617 = D.1612 + 1; if (D.1616 != 0) goto bb 32; else goto bb 33; bb 32: iftmp.17 = 0; goto bb 34; bb 33: iftmp.17 = D.1617 * 8; bb 34: D.1618 = iftmp.17; D.1795 = MAX_EXPR D.1618, 1; D.1619 = __builtin_malloc (D.1795); the question is if the check for negative size is really necessary for compiler-generated allocations of array temporaries. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #11 from rguenth at gcc dot gnu dot org 2010-03-27 19:12 --- The middle-end ends up with the following optimized: D.1776_95 = D.1775_94 - D.1591_90; if (D.1776_95 = 0) goto bb 34; else goto bb 33; bb 33: D.1620_204 = __builtin_malloc (D.1795_11); D.1796_281 = iy.dim[1].stride; ... goto bb 35; bb 34: D.1620_135 = __builtin_malloc (1); goto bb 38; bb 35: the loop code bb 38: Invalid sum of incoming frequencies 1062, should be 900 # D.1620_374 = PHI D.1620_204(37), D.1620_135(34) if (D.1620_374 != 0B) goto bb 39; else goto bb 40; bb 39: __builtin_free (D.1620_374); so we are able to prove that if we end up allocating 1 byte only then the loop isn't executed and we just free it (another missed middle-end optimization, p = malloc(1); free (p); isn't optimized away). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #6 from burnus at gcc dot gnu dot org 2010-02-20 08:31 --- Subject: Bug 42958 Author: burnus Date: Sat Feb 20 08:31:25 2010 New Revision: 156923 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=156923 Log: 2010-02-20 Tobias Burnus bur...@net-b.de PR fortran/42958 * libgfortran.h: Add GFC_RTCHECK_MEM. * invoke.texi (-fcheck=): Document -fcheck=mem. * tranc.c (gfc_call_malloc): Remove negative-size run-time error and enable malloc-success check only with -fcheck=mem. * option.c (gfc_handle_runtime_check_option): Add -fcheck=mem. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/invoke.texi trunk/gcc/fortran/libgfortran.h trunk/gcc/fortran/options.c trunk/gcc/fortran/trans.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #5 from jvdelisle at gcc dot gnu dot org 2010-02-17 20:17 --- Tobias, reply to your RFC on fortran list. I think the negative check should go away. Then I think we should consider an option of -fcheck-mem-alloc which we can then make more elaborate and do numerouse useful checks. Default would be -fno-check-mem-alloc. (sorry for posting here, but I have no email access here) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #3 from rguenther at suse dot de 2010-02-05 09:32 --- Subject: Re: Weird temporary array allocation On Fri, 5 Feb 2010, pault at gcc dot gnu dot org wrote: --- Comment #2 from pault at gcc dot gnu dot org 2010-02-05 05:36 --- (In reply to comment #1) Why there is a negative check? Well, I do not know. I also can speculate about a poor man's overflow check, which sometimes indeed works, but often fails. I suspect that you are being generous and that this is rather a sin of omission rather than commission. Paul, what do you think? I think that your arguments are correct. (PS: POSIX Allows ptr = malloc(0); free(ptr), where malloc(0) returns either NULL or a unique pointer.) Indeed. Btw, should there be the same error reporting or if (allocated) behavior on Frontend-generated temporaries? I see this from the temporaries generated by the scalarizer and the introduced control-flow makes it very hard to remove unnecessary temporaries in the middle-end later. Thx, Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #4 from rguenth at gcc dot gnu dot org 2010-02-05 14:23 --- (In reply to comment #3) Subject: Re: Weird temporary array allocation On Fri, 5 Feb 2010, pault at gcc dot gnu dot org wrote: --- Comment #2 from pault at gcc dot gnu dot org 2010-02-05 05:36 --- (In reply to comment #1) Why there is a negative check? Well, I do not know. I also can speculate about a poor man's overflow check, which sometimes indeed works, but often fails. I suspect that you are being generous and that this is rather a sin of omission rather than commission. Paul, what do you think? I think that your arguments are correct. (PS: POSIX Allows ptr = malloc(0); free(ptr), where malloc(0) returns either NULL or a unique pointer.) Indeed. Btw, should there be the same error reporting or if (allocated) behavior on Frontend-generated temporaries? I see this from the temporaries generated by the scalarizer and the introduced control-flow makes it very hard to remove unnecessary temporaries in the middle-end later. Thus, basically adding an argument to gfc_call_malloc () specifying whether we want to do checking at all and shortcut it from at least gfc_trans_allocate_array_storage (). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #1 from burnus at gcc dot gnu dot org 2010-02-04 19:32 --- so, we check if the array-size is empty, ubound - lbound 0. In that case we set size to zero. Otherwise we compute it as (ubound - lbound + 1) *8. Then we check whether size is negative and error out. Then we actually allocate max(1,size). The reason for the setting it to zero is that Fortran allows one to allocate a zero-sized array: ALLOCATE ( Array(0:-2) ) The reason for MAX(1, size) is to make ALLOCATE(A(1:0)); if(ALLOCATED(A)) work. (This was added later than size 0 ? 0 : size.) Why there is a negative check? Well, I do not know. I also can speculate about a poor man's overflow check, which sometimes indeed works, but often fails. * * * Why not simply do size = (ubound - lbound + 1) * 8; malloc (size); I think that would be the future: With the array descriptor (dope vector) reform, we will have an allocated field and thus one can do: descriptor.allocated = true descriptor.data = malloc (max (0,size)) Where if(allocated(A)) translates into if(A.allocated). Actually, maybe one should better use: size = (ubound - lbound + 1) if (size 0) { descriptor.data = malloc(size) if (descriptor.data == NULL) error (allocate failed); } descriptor.allocated = true The allocated member of the descriptor is also needed in order to make the following work properly: integer, target :: int integer, pointer :: ptr ptr = int deallocate(ptr, stat=i) ! shall return i != 0 but not crash Thus, from my side: The negative check should be removed and the simplified version should be applied after the descriptor update. Before the descriptor update (planned for 4.6, breaks ABI) one can use: size = (ubound - lbound + 1) * 8; malloc (max(1,size)); Paul, what do you think? (PS: POSIX Allows ptr = malloc(0); free(ptr), where malloc(0) returns either NULL or a unique pointer.) -- burnus at gcc dot gnu dot org changed: What|Removed |Added CC||pault at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958
[Bug fortran/42958] Weird temporary array allocation
--- Comment #2 from pault at gcc dot gnu dot org 2010-02-05 05:36 --- (In reply to comment #1) Why there is a negative check? Well, I do not know. I also can speculate about a poor man's overflow check, which sometimes indeed works, but often fails. I suspect that you are being generous and that this is rather a sin of omission rather than commission. Paul, what do you think? I think that your arguments are correct. (PS: POSIX Allows ptr = malloc(0); free(ptr), where malloc(0) returns either NULL or a unique pointer.) Indeed. Paul -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958