Hi Keith,

1)
> >> GCC never inlines a function that calls alloca() or
> >> __builtin_alloca().
>
> With respect, while that may be true of ancient historical versions of
> GCC, my own testing shows it to be untrue for modern releases

Indeed. My statement above is true for gcc 2.95.3, but wrong for gcc 3.1 to
8.3.0. Tested with this program:

=============================================================
#include <alloca.h>
#include <stdio.h>
#include <stdlib.h>

static __attribute__((__always_inline__)) int
square (int x)
{
  alloca (1);
  return x * x;
}

int
main (int argc, char *argv[])
{
  int n = atoi (argv[1]);
  int x = atoi (argv[2]);
  int i;
  for (i = 0; i < n; i++)
    printf ("%d -> %d\n", x, square (x));
  return 0;
}
=============================================================

compiled with "-S -Winline -O -finline-functions foo.c".

2)
> >> The reason is that if you call this function in a loop, then
> >> without inlining it will consume a bounded amount of stack whereas
> >> with inlining it might cause a stack overflow.
>
> I'm sorry to say this, but that just seems disingenuous.

Don't call me "disingenuous", because this is a personal attack. If I make
a statement based on my understanding of the semantics of inline functions,
and you make a different statement based on testing with GCC, that merely
makes my statement wrong. Don't assume bad intentions!!!

> For every
> version of GCC which I have available, *including* mingw32-gcc-3.4.5, if
> I place a call to alloca(), *implemented as a macro*, within a loop,
> then the stack grows on each iteration of the loop.

The stack grows at each iteration of the loop, if
  - alloca is implemented as a macro,
  - or if alloca is implemented as an inline function AND the compiler is GCC.

The stack does NOT grow at each iteration of the loop, if
  - alloca is implemented as an inline function AND the compiler is clang.

Tested with this program:

===========================================================
#include <alloca.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef AS_MACRO
#define my_alloca(x) __builtin_alloca (x)
#else
static __attribute__((__always_inline__)) inline char *
my_alloca (int x)
{
  return __builtin_alloca (x);
}
#endif

int
main (int argc, char *argv[])
{
  int n = atoi (argv[1]);
  int i;
  for (i = 0; i < n; i++)
    {
      char *y = my_alloca (1000);
      printf ("%p\n", y);
    }
  return 0;
}
===========================================================

These invocations show 10 different values for variables on the stack:

$ gcc-version 8.3.0 -m32 -Winline -O2 -finline-functions foo.c && ./a.out 10
$ gcc-version 8.3.0 -m32 -Winline -O2 -finline-functions -DAS_MACRO foo.c && ./a.out 10
$ gcc-version 4.2.4 -m32 -Winline -O2 -finline-functions foo.c && ./a.out 10
$ gcc-version 4.2.4 -m32 -Winline -O2 -finline-functions -DAS_MACRO foo.c && ./a.out 10
$ gcc-version 8.3.0 -m32 -Winline -O0 foo.c && ./a.out 10
$ gcc-version 8.3.0 -m32 -Winline -O0 -DAS_MACRO foo.c && ./a.out 10
$ gcc-version 4.2.4 -m32 -Winline -O0 foo.c && ./a.out 10
$ gcc-version 4.2.4 -m32 -Winline -O0 -DAS_MACRO foo.c && ./a.out 10
$ clang -m32 -O0 -DAS_MACRO foo.c && ./a.out 10
$ clang -m32 -O2 -finline-functions -DAS_MACRO foo.c && ./a.out 10

On the other hand:

$ clang -m32 -O0 foo.c && ./a.out 10
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
0xff992010
$ clang -m32 -O2 -finline-functions foo.c && ./a.out 10
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c
0xffeb513c

In summary, your inline-function based implementation of alloca works
as long as the compiler is GCC. It will break when someone attempts to
use clang with mingw.

3)
Back to Eli's problem.

> [1] Well, I actually can reproduce it, even with mingw32-gcc-8.2.0, but
> *only* when I deliberately introduce the very bug which Eli has noted,
> *and* I compile at -O0; at any other optimization level, this gnulib bug
> seems to be optimized away, with __builtin_alloca() itself being
> expanded to in-line intrinsic code, as intended.

Thanks for the additional detail that it depends on the optimization level.

> The other part of the puzzle is that stdlib.h does this:
>
> # include "alloca.h"
>
> so Gnulib's alloca.h is bypassed.

So, Gnulib's alloca.h and the system's alloca.h are both included, and the
system's alloca.h comes last.

To avoid this kind of trouble, we need to make use of
'#include_next <alloca.h>'.

I think this patch should do it. Can you please review it, Eli?

(Since it includes <alloca.h> only when GCC or clang is present, it can
assume include_next. Since it can assume include_next, it does not need the
absolute file name of the system's <alloca.h>, and therefore it is irrelevant
whether the file is empty after preprocessing or not.)


diff --git a/lib/alloca.in.h b/lib/alloca.in.h
index 8aaf64d..84d92e6 100644
--- a/lib/alloca.in.h
+++ b/lib/alloca.in.h
@@ -36,6 +36,12 @@
 
 #ifndef alloca
 # ifdef __GNUC__
+   /* Some versions of mingw have an <alloca.h> that causes trouble when
+      included after 'alloca' gets defined as a macro.  As a workaround, include
+      this <alloca.h> first and define 'alloca' as a macro afterwards.  */
+#  if (defined _WIN32 && ! defined __CYGWIN__) && @HAVE_ALLOCA_H@
+#   include_next <alloca.h>
+#  endif
 #  define alloca __builtin_alloca
 # elif defined _AIX
 #  define alloca __alloca
diff --git a/m4/alloca.m4 b/m4/alloca.m4
index 46d60f9..29bd289 100644
--- a/m4/alloca.m4
+++ b/m4/alloca.m4
@@ -1,4 +1,4 @@
-# alloca.m4 serial 14
+# alloca.m4 serial 15
 dnl Copyright (C) 2002-2004, 2006-2007, 2009-2019 Free Software Foundation,
 dnl Inc.
 dnl This file is free software; the Free Software Foundation
@@ -37,6 +37,13 @@ AC_DEFUN([gl_FUNC_ALLOCA],
   fi
   AC_SUBST([ALLOCA_H])
   AM_CONDITIONAL([GL_GENERATE_ALLOCA_H], [test -n "$ALLOCA_H"])
+
+  if test $ac_cv_working_alloca_h = yes; then
+    HAVE_ALLOCA_H=1
+  else
+    HAVE_ALLOCA_H=0
+  fi
+  AC_SUBST([HAVE_ALLOCA_H])
 ])
 
 # Prerequisites of lib/alloca.c.
diff --git a/modules/alloca-opt b/modules/alloca-opt
index d4468de..53bb28d 100644
--- a/modules/alloca-opt
+++ b/modules/alloca-opt
@@ -21,7 +21,7 @@ if GL_GENERATE_ALLOCA_H
 alloca.h: alloca.in.h $(top_builddir)/config.status
 	$(AM_V_GEN)rm -f $@-t $@ && \
 	{ echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! */'; \
-	  cat $(srcdir)/alloca.in.h; \
+	  sed -e 's|@''HAVE_ALLOCA_H''@|$(HAVE_ALLOCA_H)|g' < $(srcdir)/alloca.in.h; \
 	} > $@-t && \
 	mv -f $@-t $@
 else