[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-19 Thread david at westcontrol dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #51 from David Brown  ---
(In reply to M Welinder from comment #48)
> It's your (1).  gcc is changing a program that can rely on errno not being
> changed to one where the C library can change it.  (The current C library or
> any future library that the resulting binary may be dynamically linked
> against.)
> 
> Is there any real-world situation that benefits from introducing these
> calls?  It has the feel of optimizing for a benchmark.

There are several real-world benefits from transforming back and forth between
explicit code and library calls for this kind of small standard library
function.  One is that
turning explicit code into library calls can give smaller code - often of
importance in small embedded targets.  Sometimes it can also result in run-time
improvements, especially for larger data sizes - user-written code might just
copy byte by byte, while the library implementation uses more efficient larger
blocks.

Another is that turning library calls into inlined code can speed up code by
using additional knowledge of sizes, alignment, etc., to get faster results. 
This is most obvious for calls to memcpy() or memmove(), which can sometimes be
required to get the semantics correct for type manipulation, but may generate
no actual code at all.
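
As a minimal sketch of the type-manipulation case (the helper name float_bits
is made up for illustration, and a 32-bit IEEE 754 float is assumed): the
memcpy() call below is what makes the reinterpretation well defined, yet an
optimising compiler will typically reduce it to a plain register move.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the object representation of a float as a
   32-bit unsigned integer.  Assumes float is a 32-bit IEEE 754 type. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    static_assert(sizeof u == sizeof f, "float is assumed to be 32 bits wide");
    /* Well-defined way to reinterpret the bytes; usually compiled to no
       actual library call once the size is known at compile time. */
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Whether the call really disappears depends on the target and optimisation
level, so treat that as the expected outcome rather than a guarantee.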

A "C implementation" consists of a compiler and a standard library in tandem. 
The C library can make use of its knowledge of the C compiler, and any special
features, in its implementation.  (This is, in fact, required - some things in
the standard library cannot be implemented in "pure" C.)  The C compiler can
make use of its knowledge of the library implementation in its code generation
or analysis.  For the most part, compilers only make use of their knowledge of
the specifications of standard library functions, but they might also use
implementation details.

This means it is quite legitimate for the GCC team to say that gcc requires a C
library that does not set errno except for functions that explicitly say so in
their specifications.  Users don't get to mix and match random compilers and
random standard libraries and assume they form a conforming C implementation -
the pair must always be checked for compatibility.

The question then is if this would be an onerous requirement for standard
library implementations - do common existing libraries set errno in functions
that don't require it?  I cannot say, but I would be very surprised if they
did.  Modern thought, AFAIUI, considers errno to be a bad idea which should be
avoided whenever possible - it is a hindrance to optimisation, analysis, and
parallelisation of code, as well as severely limiting C++ constexpr and other
compile-time calculations.

My thoughts here are that GCC should make this library requirement explicit and
public, after first confirming with some "big name" libraries like glibc,
newlib and musl.  They could also add a flag "-funknown-stdlib" to disable any
transforms back or forth between standard library calls, and assume nothing
about the calls (not even what is given in the standards specifications).


(As a note - the paragraph 7.5p3 allowing standard library functions to set
errno is still in the current draft of C23.)

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113082

--- Comment #50 from Richard Biener  ---
Split out to PR113082.

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #49 from Richard Biener  ---
(In reply to M Welinder from comment #48)
> It's your (1).  gcc is changing a program that can rely on errno not being
> changed to one where the C library can change it.  (The current C library or
> any future library that the resulting binary may be dynamically linked
> against.)

Ick.  Standards continue to surprise me ;)

> Is there any real-world situation that benefits from introducing these
> calls?  It has the feel of optimizing for a benchmark.

People are good at writing inefficient code, and replacing, say, an
open-coded strlen by an actual call to strlen enables follow-up transforms
that rely on strlen appearing as strlen and not an open-coded variant
(I realize that technically one might find a way to implement that without
actually emitting a call in the end).

And yes, optimizing (repeated) calls of strlen or replacing an open-coded
large memcpy by a library call to optimized functions can make a noticeable
difference even for non-benchmarks.
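
To illustrate the "repeated calls of strlen" case, here is a small sketch
(count_spaces is a made-up name, not something from GCC or the report).  As
written, strlen is nominally evaluated on every iteration; because the loop
does not modify the string and the compiler knows what strlen does, it may
evaluate the length once instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: count the spaces in a string. */
static size_t count_spaces(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    /* strlen(s) appears in the loop condition, so the source asks for it
       on every iteration; the string is never written inside the loop. */
    for (size_t i = 0; i < strlen(s); i++) {
        if (s[i] == ' ')
            count++;
    }
    return count;
}
```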

We're currently generating calls to memcpy, memmove, memset and strlen.

We are also replacing memmove with memcpy and printf with puts or putchar;
all of those transforms are then invalid because of (1) as well.

We are treating -fno-math-errno as applying to non-math functions and
we don't have any -fno-errno or way of analyzing/annotating whether a
program is interested in the state of errno (not only but mainly because
identifying accesses to errno is non-trivial).

Note this issue (invalid because of (1)) should probably be split out
to a separate bug.

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-18 Thread terra at gnome dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #48 from M Welinder  ---
It's your (1).  gcc is changing a program that can rely on errno not being
changed to one where the C library can change it.  (The current C library or
any future library that the resulting binary may be dynamically linked
against.)

Consider code like this

   fd = open(filename, ...);
   if (fd < 0) {
       fprintf(stderr, "%*s: %s\n",
               MIN(20, mystrlen (filename)),
               filename,
               strerror(errno));
       ...;
   }

If the C library is in a bad mood you will print the wrong error message.

strlen isn't the obvious candidate for a C library function changing errno, but
I can see an instrumented library do it.

Is there any real-world situation that benefits from introducing these calls? 
It has the feel of optimizing for a benchmark.
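
A defensive pattern on the user side, sketched under stated assumptions:
failing_call and noisy_strlen are hypothetical stand-ins (for a failing
open() and for a library routine that clobbers errno even on success), not
real library functions.  Capturing errno immediately after the failure makes
the report independent of whatever later calls do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a failing system call such as open(). */
static int failing_call(void)
{
    errno = ENOENT;
    return -1;
}

/* Hypothetical stand-in for a library routine that leaves errno modified
   even though it succeeds, which C99 7.5p3 permits. */
static size_t noisy_strlen(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    errno = EINVAL;             /* simulated clobber */
    return n;
}

/* Returns the error code of the failed call, unaffected by later calls. */
static int report_failure(const char *name)
{
    if (failing_call() < 0) {
        int saved = errno;          /* capture before anything else runs */
        (void)noisy_strlen(name);   /* may overwrite errno */
        return saved;
    }
    return 0;
}
```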

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-18 Thread david at westcontrol dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #47 from David Brown  ---
(In reply to M Welinder from comment #46)
> Should "-std=c99" imply turning off these optimizations?
> 
> Creating calls to, say, strlen is incompatible with the C99 standard and
> perhaps better limited to "-std=gnu-something" or an opt-in f-flag.

How is it incompatible with C99 to create calls to library functions?  I can
think of two possibilities:

1. If the function implementation plays with errno (allowed in 7.5p3), in a way
that is visible to the code.

2. If the function is called with parameters that may invoke undefined
behaviour (such as calling "strlen" without being sure that the parameter
points to a null-terminated string), where such undefined behaviour is not
already present.

If the user writes code that acts like a call to strlen (let's assume the
implementation knows strlen does not change errno), then the compiler can
replace it with a library call.  Similarly, if the user writes a call to
strlen, then the compiler can replace it with inline code.

As long as there is no difference in the observable behaviour, the
transformation is allowed.

Or am I missing something here?
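
For concreteness, "code that acts like a call to strlen" could look like the
sketch below.  The name my_strlen is illustrative only; whether a given GCC
version actually recognises this loop and emits a strlen call depends on the
version and the optimisation flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical open-coded string length.  The loop has the same observable
   behaviour as strlen for any valid null-terminated string. */
static size_t my_strlen(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```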

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2023-12-18 Thread terra at gnome dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #46 from M Welinder  ---
Should "-std=c99" imply turning off these optimizations?

Creating calls to, say, strlen is incompatible with the C99 standard and
perhaps better limited to "-std=gnu-something" or an opt-in f-flag.

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2022-10-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Andrew Pinski  changed:

   What|Removed |Added

 CC||michael.meier at hexagon dot com

--- Comment #45 from Andrew Pinski  ---
*** Bug 107415 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2022-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Andrew Pinski  changed:

   What|Removed |Added

 CC||hiraditya at msn dot com

--- Comment #44 from Andrew Pinski  ---
*** Bug 105830 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2020-08-16 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Andrew Pinski  changed:

   What|Removed |Added

 CC||rafael_andreas at hotmail dot com

--- Comment #43 from Andrew Pinski  ---
*** Bug 96628 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2019-10-18 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #42 from Eric Gallager  ---
(In reply to Rich Felker from comment #41)
> > Josef Wolf mentioned that he ran into this on the gcc-help mailing list 
> > here: https://gcc.gnu.org/ml/gcc-help/2019-10/msg00079.html
> 
> I don't think that's an instance of this issue.

Well ok, maybe not THAT message specifically; see the rest of the thread
though.

> It's normal/expected that __builtin_foo compiles to a call to foo in the
> absence of factors that lead to it being optimized to something simpler.
> The idiom of using __builtin_foo to get the compiler to emit an optimized
> implementation of foo for you, to serve as the public definition of foo, is
> simply not valid. That's kinda a shame because it would be nice to be able to
> do it for lots of math library functions, but of course in order for this to
> be able to work gcc would have to promise it can generate code for the
> operation for all targets, which is unlikely to be reasonable.

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2019-10-18 Thread bugdal at aerifal dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #41 from Rich Felker  ---
> Josef Wolf mentioned that he ran into this on the gcc-help mailing list here: 
> https://gcc.gnu.org/ml/gcc-help/2019-10/msg00079.html

I don't think that's an instance of this issue. It's normal/expected that
__builtin_foo compiles to a call to foo in the absence of factors that lead to
it being optimized to something simpler. The idiom of using __builtin_foo to
get the compiler to emit an optimized implementation of foo for you, to serve
as the public definition of foo, is simply not valid. That's kinda a shame
because it would be nice to be able to do it for lots of math library
functions, but of course in order for this to be able to work gcc would have to
promise it can generate code for the operation for all targets, which is
unlikely to be reasonable.

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2019-10-18 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #40 from Eric Gallager  ---
Josef Wolf mentioned that he ran into this on the gcc-help mailing list here:
https://gcc.gnu.org/ml/gcc-help/2019-10/msg00079.html

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2019-06-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #39 from Eric Gallager  ---
(In reply to Richard Biener from comment #35)
> Let's try "fixing" this finally for GCC 6.

Uh... for GCC 10 now?

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2017-11-05 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #38 from Marc Glisse  ---
*** Bug 82845 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2017-11-05 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Marc Glisse  changed:

   What|Removed |Added

 CC||david at westcontrol dot com

--- Comment #37 from Marc Glisse  ---
*** Bug 82845 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2016-04-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener  changed:

   What|Removed |Added

 CC||knakahara at netbsd dot org

--- Comment #36 from Richard Biener  ---
*** Bug 70798 has been marked as a duplicate of this bug. ***

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2016-02-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #35 from Richard Biener  ---
Let's try "fixing" this finally for GCC 6.  Still waiting for Honza for comment
#27 (let's put that in a symtab->equal_to (enum built_in_function) function).

Similar issue is present for malloc + memset -> calloc.
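
A sketch of the malloc + memset shape being referred to (zeroed_alloc is a
made-up name).  If a C library implemented its own calloc this way and the
compiler folded the pair back into a calloc call, the result would be the
same kind of self-recursion as in the memcpy case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n bytes and clear them.  The malloc
   followed by a zeroing memset is the pattern a compiler may combine
   into a single calloc call. */
static void *zeroed_alloc(size_t n)
{
    assert(n > 0);
    void *p = malloc(n);
    if (p != NULL)
        memset(p, 0, n);
    return p;
}
```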

[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-11-02 Thread fd935653 at opayq dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Evan Langlois fd935653 at opayq dot com changed:

   What|Removed |Added

 CC||fd935653 at opayq dot com

--- Comment #34 from Evan Langlois fd935653 at opayq dot com ---
Grub-2.00 (grub-mkimage utility) will crash with -O3 because of this bug, using
gcc 4.8.2.  GDB shows it going into an infinite loop calling memset() until it
segfaults.  I added the -fno-tree-loop-distribute-patterns and it created code
that doesn't barf on itself.

This is definitely a bug and a pretty serious one.
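
The same workaround can be applied per function instead of per build, using
GCC's optimize function attribute.  This is only a sketch: my_memset and the
NO_LOOP_PATTERNS macro are illustrative names, and other comments in this
thread report GCC versions where the option did not help.

```c
#include <assert.h>
#include <stddef.h>

/* Only GCC understands this attribute; expand to nothing elsewhere. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_PATTERNS \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_PATTERNS
#endif

/* Hypothetical byte-wise fill.  The attribute asks GCC not to turn the
   loop into a call to memset for this one function. */
NO_LOOP_PATTERNS
void *my_memset(void *s, int c, size_t n)
{
    unsigned char *p = s;
    assert(s != NULL || n == 0);
    while (n-- != 0)
        *p++ = (unsigned char)c;
    return s;
}
```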


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-06-06 Thread terra at gnome dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

M Welinder terra at gnome dot org changed:

   What|Removed |Added

 CC||terra at gnome dot org

--- Comment #31 from M Welinder terra at gnome dot org ---
Extra complication: the C library's memcpy may change errno to any non-zero
value if it so desires.  (C99 section 7.5 #5.)

That means that raw calls to memcpy (and friends) cannot be generated anywhere
where the compiler is unable to prove that the value of errno isn't used.
Extra code to store and restore errno must be emitted otherwise.
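
Written out by hand, "store and restore errno" around a generated call would
amount to something like the following sketch.  The wrapper name
memcpy_keep_errno is hypothetical; nothing like it exists in GCC or in any C
library that I know of.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical wrapper: call memcpy but hide any change it makes to errno,
   so the caller observes errno exactly as it was before the copy. */
static void *memcpy_keep_errno(void *dst, const void *src, size_t n)
{
    assert((dst != NULL && src != NULL) || n == 0);
    int saved = errno;              /* store */
    void *r = memcpy(dst, src, n);
    errno = saved;                  /* restore */
    return r;
}
```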


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-06-06 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #32 from rguenther at suse dot de rguenther at suse dot de ---
On Fri, 6 Jun 2014, terra at gnome dot org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
>
> M Welinder terra at gnome dot org changed:
>
>    What|Removed |Added
>
>  CC||terra at gnome dot org
>
> --- Comment #31 from M Welinder terra at gnome dot org ---
> Extra complication: the C library's memcpy may change errno to any non-zero
> value if it so desires.  (C99 section 7.5 #5.)

That's news to me.

> That means that raw calls to memcpy (and friends) cannot be generated anywhere
> where the compiler is unable to prove that the value of errno isn't used.

That's almost impossible.

> Extra code to store and restore errno must be emitted otherwise.

That is not possible.

Note that the compiler emits calls to memcpy for struct copies anyway,
so if there is a problem it is a long-standing one.
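
A sketch of the struct-copy case (struct packet and copy_packet are made-up
names): no library function is named in the source, but for a large enough
aggregate the compiler may implement the assignment as a call to memcpy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate, large enough that copying it field by field
   would be unattractive. */
struct packet {
    int id;
    char payload[256];
};

static void copy_packet(struct packet *dst, const struct packet *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;    /* plain aggregate assignment; may become a memcpy call */
}
```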


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-06-06 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #33 from Jakub Jelinek jakub at gcc dot gnu.org ---
Yeah, I'd say we could document that gcc doesn't support any implementations
where memcpy/memmove/memset clobber errno.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-05-06 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #29 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Rich Felker from comment #28)
> On Tue, Apr 29, 2014 at 02:16:38PM +, rguenth at gcc dot gnu.org wrote:
> > Honza, is there a more fancy way of doing this?
>
> The only correct way to fix this is to honor -ffreestanding and never
> generate references to hosted-C functions (which include memset) when
> -ffreestanding is used.

Done that for 4.8+ now (bah, forgot to reference the PR in the changelog
so the commits don't appear here).  But I still like to fix the obvious
wrong cases in some way.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-05-06 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #30 from Richard Biener rguenth at gcc dot gnu.org ---
Thus, from 4.8.3, 4.9.1 and 4.10.0 on, -ffreestanding, -fno-hosted and
-fno-builtin will cause -ftree-loop-distribute-patterns to _not_ be enabled
by default with -O3+ (you can still enable it manually).


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-04-29 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||exmortis at yandex dot ru

--- Comment #25 from Richard Biener rguenth at gcc dot gnu.org ---
*** Bug 60998 has been marked as a duplicate of this bug. ***


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-04-29 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #26 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Janosch Rux from comment #24)
> When upgrading our build environment we ran into this. We worked around the
> way mentioned in the comments.
>
> No Problems with: 4.6.3
> Broken with:  4.8.2

-ftree-loop-distribute-patterns is on by default at -O3 since GCC 4.6, a
change from GCC 4.5 and before which needed explicit enabling of this.

More recent GCC may have become more clever in recognizing them though
(for example non-zero memset support is quite recent).


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-04-29 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #27 from Richard Biener rguenth at gcc dot gnu.org ---
Ok, so looking at this again.

We don't have a cgraph node for builtin_decl_(implicit|explicit)
(BUILT_IN_MEMSET).

But it seems that decl has DECL_ASSEMBLER_NAME_SET_P (not sure if set
correctly though).

So we can use symtab_node_for_asm (DECL_ASSEMBLER_NAME ()) and
eventually get to symtab_alias_target of that and check if it
is equal to the current function.

Index: gcc/tree-loop-distribution.c
===================================================================
--- gcc/tree-loop-distribution.c        (revision 209892)
+++ gcc/tree-loop-distribution.c        (working copy)
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "gimple-pretty-print.h"
 #include "tree-vectorizer.h"
+#include "cgraph.h"


 /* A Reduced Dependence Graph (RDG) vertex representing a statement.  */
@@ -1084,6 +1085,15 @@ classify_partition (loop_p loop, struct
          || !dominated_by_p (CDI_DOMINATORS,
                              loop->latch, gimple_bb (stmt)))
        return;
+      tree fn = builtin_decl_implicit (BUILT_IN_MEMSET);
+      if (DECL_ASSEMBLER_NAME_SET_P (fn))
+       {
+         symtab_node *n1 = symtab_node_for_asm (DECL_ASSEMBLER_NAME (fn));
+         symtab_node *n2 = symtab_get_node (cfun->decl);
+         if (n1 == n2
+             || (n1->alias && symtab_alias_target (n1) == n2))
+           return;
+       }
       partition->kind = PKIND_MEMSET;
       partition->main_dr = single_store;
       partition->niter = nb_iter;


fixes the following testcase:

typedef __SIZE_TYPE__ size_t;
extern void *
memset (void *s, int c, size_t n) __attribute__ ((weak, alias("_memset")));

void *
_memset(void *s, int c, size_t n)
{
  char *q = (char *)s;
  while (n != 0)
    {
      *(q++) = c;
      n--;
    }
  return s;
}

it won't fix glibc as that uses

asm(.alias );

for the aliases which we don't parse.

It of course also fixes the very direct recursion.  At least if
the assembler name of the builtin agrees with that of the function.

Honza, is there a more fancy way of doing this?


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-04-29 Thread bugdal at aerifal dot cx
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #28 from Rich Felker bugdal at aerifal dot cx ---
On Tue, Apr 29, 2014 at 02:16:38PM +, rguenth at gcc dot gnu.org wrote:
> Honza, is there a more fancy way of doing this?

The only correct way to fix this is to honor -ffreestanding and never
generate references to hosted-C functions (which include memset) when
-ffreestanding is used.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2014-02-16 Thread janosch.rux at web dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Janosch Rux janosch.rux at web dot de changed:

   What|Removed |Added

 CC||janosch.rux at web dot de

--- Comment #24 from Janosch Rux janosch.rux at web dot de ---
When upgrading our build environment we ran into this. We worked around the way
mentioned in the comments.

No Problems with: 4.6.3 
Broken with:  4.8.2


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-10-02 Thread bernd.edlinger at hotmail dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Bernd Edlinger bernd.edlinger at hotmail dot de changed:

   What|Removed |Added

 CC||bernd.edlinger at hotmail dot de

--- Comment #20 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
Just for the record:
This happens also for eCos on ARM
but only if it is compiled with -O3 and not with -O2.
We certainly need a way to tell GCC if this kind of
optimization is OK for us.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-10-02 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #21 from Richard Biener rguenth at gcc dot gnu.org ---
-fno-tree-loop-distribute-patterns is the reliable way to not transform loops
into library calls.

As for the trivial case of generating a recursion - yes, that's reasonably
easy to avoid in simple cases.  But if you consider

t1.c


mymemcpy_impl (...)
{
  for (...)
   ...
}

t2.c


memcpy ()
{
  mymemcpy_impl ()
}

then it's no longer possible to detect conservatively without severely
restricting the set of functions we can operate on.

Not sure if/how other compilers avoid the above situation (or if they
do this at all or rather use private entries into the library functions).


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-10-02 Thread bernd.edlinger at hotmail dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #22 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
(In reply to Richard Biener from comment #21)
> -fno-tree-loop-distribute-patterns is the reliable way to not transform loops
> into library calls.

Thanks!

Adding this fixed the generated code:

#pragma GCC optimize ("no-tree-loop-distribute-patterns")

BTW their memset.c looks like this:

externC void *
memset( void *s, int c, size_t n ) __attribute__ ((weak, alias("_memset")));

void *
_memset( void *s, int c, size_t n )
{
  while (...)
}


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-10-02 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #23 from rguenther at suse dot de rguenther at suse dot de ---
On Wed, 2 Oct 2013, bernd.edlinger at hotmail dot de wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
>
> --- Comment #22 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
> (In reply to Richard Biener from comment #21)
> > -fno-tree-loop-distribute-patterns is the reliable way to not transform
> > loops into library calls.
>
> Thanks!
>
> Adding this fixed the generated code:
>
> #pragma GCC optimize ("no-tree-loop-distribute-patterns")
>
> BTW their memset.c looks like this:
>
> externC void *
> memset( void *s, int c, size_t n ) __attribute__ ((weak, alias("_memset")));
>
> void *
> _memset( void *s, int c, size_t n )
> {
>   while (...)
> }

I suspect this is the most common form - glibc also uses aliases but
IIRC they are using global asms for them :/

The above would be still detectable with the new symbol table / alias
handling in GCC 4.9 (and maybe 4.8, I'm not sure).  So it may be
worth special-casing the direct recursion case as a QOI measure.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-27 Thread bugdal at aerifal dot cx
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Rich Felker bugdal at aerifal dot cx changed:

   What|Removed |Added

 CC||bugdal at aerifal dot cx

--- Comment #19 from Rich Felker bugdal at aerifal dot cx ---
We are not presently experiencing this issue in musl libc, probably because the
current C memcpy code is sufficiently overcomplicated to avoid getting detected
by the optimizer as memcpy. However, I'm trying to switch to a new simpler
implementation that's much faster when compiled with GCC 4.7.1 (on ARM), but
hit this bug when testing on another system using GCC 4.6.1 (ARM). On the
latter, even -fno-tree-loop-distribute-patterns does not make any difference.
Unless there's a reliable workaround for this bug or at least a known blacklist
of bad GCC versions where this bug can't be worked around, I'm afraid we're
going to have to resort to generating the asm for each supported arch using a
known-good GCC and including that asm in the distribution.

This is EXTREMELY frustrating.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-18 Thread pa...@matos-sorge.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #18 from Paulo J. Matos pa...@matos-sorge.com ---
(In reply to Brooks Moses from comment #12)
>
> Now, if this replacement still happens when you compile with -nostdlib, that
> would be a bug since it becomes legal code in that case.  But that's
> somewhat of a separate issue and should be filed separately if it happens.
> (We should arguably also have a test for it, if we don't already.)


I noticed this in the gcc testsuite with my port. File
./gcc.c-torture/execute/builtins/lib/memset.c contains an implementation of
memset called memset and gcc goes into recursion when it finds this for the
reasons mentioned above. This causes builtin/memset test to fail.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread pa...@matos-sorge.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #11 from Paulo J. Matos pa...@matos-sorge.com ---
(In reply to Brooks Moses from comment #10)
> Other than the documentation issues, this seems like a non-bug.

A non-bug? If you write a memcpy function by hand and call it memcpy, gcc
replaces the function body by a call to memcpy which generates an infinite
loop. How come it's a non-bug?


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread brooks at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #12 from Brooks Moses brooks at gcc dot gnu.org ---
(In reply to Paulo J. Matos from comment #11)
> A non-bug? If you write a memcpy function by hand and call it memcpy, gcc
> replaces the function body by a call to memcpy which generates an infinite
> loop. How come it's a non-bug?

Because if you do that you're invoking undefined behavior.  There's already a
memcpy function in the standard library, so naming your own function memcpy
violates the one-definition-per-function rule.  Even if it worked, naming
your own function memcpy would likely break other standard library functions
that call the real memcpy.

Now, if this replacement still happens when you compile with -nostdlib, that
would be a bug since it becomes legal code in that case.  But that's somewhat
of a separate issue and should be filed separately if it happens.  (We should
arguably also have a test for it, if we don't already.)


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread xanclic at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #13 from Max Reitz xanclic at gmail dot com ---
(In reply to Brooks Moses from comment #12)
> Now, if this replacement still happens when you compile with -nostdlib, that
> would be a bug since it becomes legal code in that case.  But that's
> somewhat of a separate issue and should be filed separately if it happens.
> (We should arguably also have a test for it, if we don't already.)

Actually, that's why I filed this report in the first place. The test case you
request is in fact given in my OP.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread sch...@linux-m68k.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #14 from Andreas Schwab sch...@linux-m68k.org ---
The relevant option is -ffreestanding, not -nostdlib.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread xanclic at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #15 from Max Reitz xanclic at gmail dot com ---
(In reply to Andreas Schwab from comment #14)
> The relevant option is -ffreestanding, not -nostdlib.

If you're referring to me, I'll be glad to cite my OP for you :D

> Compiling the attached trivial memcpy implementation with -O3 -ffreestanding
> -fno-builtin -nodefaultlibs -nostdlib yields a memcpy which calls itself.
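
The attachment itself is not reproduced in this thread.  As a hypothetical
reconstruction of what such a "trivial memcpy implementation" might look
like, consider the byte loop below.  It is named naive_memcpy here so it can
be built next to a normal C library; in the scenario described, the function
is called memcpy, and a loop-to-memcpy transformation then makes it call
itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction, not the original attachment: copy n bytes
   from src to dst one at a time.  The regions must not overlap. */
void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert((dst != NULL && src != NULL) || n == 0);
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}
```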


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread sch...@linux-m68k.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #16 from Andreas Schwab sch...@linux-m68k.org ---
That's exactly what I wrote.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-17 Thread xanclic at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #17 from Max Reitz xanclic at gmail dot com ---
(In reply to Andreas Schwab from comment #16)
> That's exactly what I wrote.

Ah, okay, sorry I misunderstood.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-07-16 Thread brooks at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Brooks Moses brooks at gcc dot gnu.org changed:

   What|Removed |Added

 CC||brooks at gcc dot gnu.org

--- Comment #10 from Brooks Moses brooks at gcc dot gnu.org ---
FWIW, this issue also affected GLIBC.  Pointer to discussion, along with fixes,
here:
http://sourceware.org/ml/libc-alpha/2013-07/msg00306.html

It seems to me -- based on my own experience, as well as Max's -- that the
-ftree-loop-distribute-patterns documentation could be notably improved.  In my
case, I read it clearly and understood it to mean that it was only responsible
for the loop-distribution portion of the rearrangement in the code examples,
and that the replacement of a loop by a memcpy call was some other optimization
pass.

Other than the documentation issues, this seems like a non-bug.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-06-22 Thread j...@deseret-tech.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Jeff Cook j...@deseret-tech.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@deseret-tech.com

--- Comment #9 from Jeff Cook j...@deseret-tech.com ---
FYI this issue affects WINE. A workaround has been contributed as a
modification to WINE's configuration scripts. See
http://bugs.winehq.org/show_bug.cgi?id=33521 .


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-11 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
         AssignedTo|rguenth at gcc dot gnu.org  |unassigned at gcc dot
                   |                            |gnu.org

--- Comment #8 from Richard Biener rguenth at gcc dot gnu.org 2013-04-11 11:29:39 UTC ---
-fno-builtin-XXX does not prevent GCC from emitting calls to XXX.  It only
makes GCC not assume anything about existing calls to XXX.

For example to avoid transforming printf to puts in

extern int printf(const char *, ...);
int main()
{
  printf ("Hello World\n");
  return 0;
}

it does not work to specify -fno-builtin-puts, but instead you need to
provide -fno-builtin-printf.

Note that -fno-builtin only prevents the C family parsers from recognizing
XXX as builtin decls.  Whether -fno-builtin was specified or not
cannot be queried in any way from the middle-end.

I consider the inability to specify this to the GCC middle-end a bug
but I am not going to work on it.  The requirement to be able to
generate calls to memset, memcpy and memmove is deep-rooted into
code-expansion as well for aggregate init and assignment.
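
As a rough illustration of the last point, here is a minimal sketch of
aggregate assignment and initialization.  The type and function names are
hypothetical and the 256-byte size is arbitrary.  Whether a given GCC
version and target really emits a library call for this code is an
assumption, not something the sketch guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type; 256 bytes is an arbitrary size chosen so that a
   block copy is plausible. */
struct big {
    unsigned char bytes[256];
};

/* Aggregate assignment: no library call appears in the source, but the
   compiler may expand this into a call to memcpy or memmove. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}

/* Aggregate initialization: zero-filling the object may be expanded
   into a call to memset. */
unsigned sum_zeroed_big(void)
{
    struct big z = {{0}};
    unsigned sum = 0;

    for (size_t i = 0; i < sizeof z.bytes; i++)
        sum += z.bytes[i];
    return sum;
}

/* Small self-check of the two helpers above. */
void aggregate_demo(void)
{
    struct big a, b;

    for (size_t i = 0; i < sizeof a.bytes; i++)
        a.bytes[i] = (unsigned char)i;
    copy_big(&b, &a);
    for (size_t i = 0; i < sizeof a.bytes; i++)
        assert(b.bytes[i] == a.bytes[i]);
    assert(sum_zeroed_big() == 0);
}
```

Inspecting the output of gcc -S for such a file is one way to see whether
calls to memcpy or memset were generated in a particular configuration.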


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread mikpe at it dot uu.se

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Mikael Pettersson mikpe at it dot uu.se changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mikpe at it dot uu.se

--- Comment #1 from Mikael Pettersson mikpe at it dot uu.se 2013-04-09 07:52:07 UTC ---
I can reproduce the problem on x86-64 Linux with 4.8-20130404.  This issue
would be fatal for one of my projects which includes an embedded libc.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread mikpe at it dot uu.se

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #2 from Mikael Pettersson mikpe at it dot uu.se 2013-04-09 09:59:20 UTC ---
Started with Richard Biener's http://gcc.gnu.org/r188261 aka PR53081 fix, which
added or improved memcpy recognition.  I'm guessing the new code fails to check
for whatever option is supposed to disable this sort of transformation.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2013-04-09 10:01:31 UTC ---
Just add -fno-tree-loop-distribute-patterns to the already long list of options
you need for compilation of certain routines in your C library.
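
For illustration, a minimal sketch of a byte-copy routine that carries the
option as a per-function attribute instead of a command-line flag.  The
names sketch_memcpy and NO_LOOP_TO_LIBCALL are hypothetical; a real libc
would export the function as memcpy.  The sketch assumes GCC's "optimize"
function attribute; other compilers get an empty macro and would need the
flag on the command line.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC's "optimize" function attribute is available.  For
   other compilers the macro expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
# define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("-fno-tree-loop-distribute-patterns")))
#else
# define NO_LOOP_TO_LIBCALL
#endif

/* Hypothetical name, chosen so the sketch does not clash with libc. */
NO_LOOP_TO_LIBCALL
void *sketch_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte-by-byte copy: the loop shape that pattern recognition may
       otherwise replace with a call to memcpy. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}

/* Small self-check: the copy matches the source and dest is returned. */
void sketch_memcpy_demo(void)
{
    char src[] = "freestanding";
    char dst[sizeof src] = {0};
    void *ret = sketch_memcpy(dst, src, sizeof src);

    assert(ret == dst);
    for (size_t i = 0; i < sizeof src; i++)
        assert(dst[i] == src[i]);
}
```

The attribute only affects the function it is attached to, so the rest of
the translation unit keeps the default optimization settings.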


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2013-04-09
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org 2013-04-09 10:01:48 UTC ---
I will have a look.


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread xanclic at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #5 from Max Reitz xanclic at gmail dot com 2013-04-09 13:02:19 UTC ---
(In reply to comment #3)
> Just add -fno-tree-loop-distribute-patterns to the already long list of
> options you need for compilation of certain routines in your C library.

This works for me, however, I don't see this parameter documented in gcc's
manpage (which might prove helpful).


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread rguenther at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #6 from rguenther at suse dot de rguenther at suse dot de 2013-04-09 13:17:10 UTC ---
On Tue, 9 Apr 2013, xanclic at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
>
> --- Comment #5 from Max Reitz xanclic at gmail dot com 2013-04-09 13:02:19 UTC ---
> (In reply to comment #3)
> > Just add -fno-tree-loop-distribute-patterns to the already long list of
> > options you need for compilation of certain routines in your C library.
>
> This works for me, however, I don't see this parameter documented in gcc's
> manpage (which might prove helpful).

It is documented in its positive form, -ftree-loop-distribute-patterns


[Bug middle-end/56888] memcpy implementation optimized as a call to memcpy

2013-04-09 Thread xanclic at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888

--- Comment #7 from Max Reitz xanclic at gmail dot com 2013-04-09 13:20:06 UTC ---
(In reply to comment #6)
> On Tue, 9 Apr 2013, xanclic at gmail dot com wrote:
>
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
> >
> > --- Comment #5 from Max Reitz xanclic at gmail dot com 2013-04-09 13:02:19 UTC ---
> > (In reply to comment #3)
> > > Just add -fno-tree-loop-distribute-patterns to the already long list of
> > > options you need for compilation of certain routines in your C library.
> >
> > This works for me, however, I don't see this parameter documented in gcc's
> > manpage (which might prove helpful).
>
> It is documented in its positive form, -ftree-loop-distribute-patterns

Oh, now that's embarrassing…

Sorry :-/

Well then, this seems to be exactly the thing I've been looking for. Thanks!