RFC: too aggressive nanosleep replacement on 64 bit Linux?
I noticed that nanosleep() is replaced on 64-bit Linux. This happens because gnulib checks that the function handles the full range of a 64-bit time_t, about 292 billion years, whereas the kernel supports only about 292 years, due to:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/time.h?id=refs/tags/v3.16#n87

Should we be more conservative with our replacement, and be happy with 292 years?

cheers,
Pádraig.
fpieee module adds inappropriate compiler flags to CPPFLAGS
Hi,

we recently encountered a build failure of Octave on alpha [1,2] due to an added indirect dependency on the fpieee gnulib module, which appends compiler options to CPPFLAGS that don't necessarily belong there.

This module apparently adds either -mieee or -ieee to CPPFLAGS when building on certain systems. These are decidedly not preprocessor flags, but are likely added this way as a shortcut to ensure that they are used when compiling any language. However, the working assumption that what works for $(CC) will work for anything that takes CPPFLAGS is not necessarily true. For example, even invoking $(CPP) $(CPPFLAGS) using gcc would fail with an error in this environment.

In the case of Octave, it fails because CPPFLAGS are passed on to other tools (such as Qt moc) that will work with standard preprocessor options like -D and -I, but not -mieee. Our workaround is to simply filter these options out of CPPFLAGS, because Octave has already had its own logic to append the appropriate compiler options to CFLAGS and CXXFLAGS for several years.

I'm not sure what an appropriate fix for this would be in the general case, but ideally it would avoid adding these options to project-wide CPPFLAGS where they really don't belong. If it were possible to add the -mieee or -ieee option to language-specific FLAGS variables used by a given project, that would be a better solution. Less ideally, it could stuff the option into an IEEE_CFLAGS variable, which a project would then have to know about and explicitly use or add to its own CFLAGS/CXXFLAGS/etc, potentially breaking existing uses.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=746924
[2] http://savannah.gnu.org/bugs/?42839

Thoughts?

Thanks,

-- 
mike
[PATCH 1/2] accept: document Solaris 10 type glitch
* doc/posix-functions/accept.texi (accept): Mention that
Solaris 10 'accept' takes void * last arg, not socklen_t *.
---
 ChangeLog                       | 6 ++++++
 doc/posix-functions/accept.texi | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index bdf743a..43b8fe3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2014-08-05  Paul Eggert  <egg...@cs.ucla.edu>
+
+	accept: document Solaris 10 type glitch
+	* doc/posix-functions/accept.texi (accept): Mention that
+	Solaris 10 'accept' takes void * last arg, not socklen_t *.
+
 2014-08-04  Paul Eggert  <egg...@cs.ucla.edu>
 
 	extern-inline: port to FreeBSD, DragonFly
diff --git a/doc/posix-functions/accept.texi b/doc/posix-functions/accept.texi
index b937e15..65dab37 100644
--- a/doc/posix-functions/accept.texi
+++ b/doc/posix-functions/accept.texi
@@ -28,4 +28,8 @@ in calls to @code{read}, @code{write}, and @code{close}; you have to use
 @item
 Some platforms don't have a @code{socklen_t} type; in this case this function's
 third argument type is @samp{int *}.
+@item
+On some platforms, this function's third argument type is @samp{void *},
+not @samp{socklen_t *}:
+Solaris 10.
 @end itemize
-- 
1.9.3
[PATCH 2/2] sys_select: fix FD_ZERO problem on Solaris 10
* lib/sys_select.in.h: Fix Solaris 10 bug where #include
<sys/time.h> followed by #include <sys/select.h> caused FD_ZERO
to expand to an expression that invoked memset without necessarily
including string.h. The problem was that the first include
defined _SYS_TIME_H, causing the second include to short-circuit.
Fix a similar problem with sys/types.h followed by sys/select.h.
Also, fix what appears to be a cut-and-paste typo, by replacing
_GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H with
_GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TYPES_H.
---
 ChangeLog           | 11 +++++++++++
 lib/sys_select.in.h | 15 ++++++++-------
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 43b8fe3..8e74a54 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,16 @@
 2014-08-05  Paul Eggert  <egg...@cs.ucla.edu>
 
+	sys_select: fix FD_ZERO problem on Solaris 10
+	* lib/sys_select.in.h: Fix Solaris 10 bug where #include
+	<sys/time.h> followed by #include <sys/select.h> caused FD_ZERO
+	to expand to an expression that invoked memset without necessarily
+	including string.h. The problem was that the first include
+	defined _SYS_TIME_H, causing the second include to short-circuit.
+	Fix a similar problem with sys/types.h followed by sys/select.h.
+	Also, fix what appears to be a cut-and-paste typo, by replacing
+	_GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H with
+	_GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TYPES_H.
+
 	accept: document Solaris 10 type glitch
 	* doc/posix-functions/accept.texi (accept): Mention that
 	Solaris 10 'accept' takes void * last arg, not socklen_t *.
diff --git a/lib/sys_select.in.h b/lib/sys_select.in.h
index 6ac7b08..1186f68 100644
--- a/lib/sys_select.in.h
+++ b/lib/sys_select.in.h
@@ -24,8 +24,8 @@
    On Cygwin, <sys/time.h> includes <sys/select.h>.
    Simply delegate to the system's header in this case.  */
 #if (@HAVE_SYS_SELECT_H@                                                \
+     && !defined _GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TYPES_H             \
      && ((defined __osf__ && defined _SYS_TYPES_H_                      \
-          && !defined _GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H         \
           && defined _OSF_SOURCE)                                       \
          || (defined __sun && defined _SYS_TYPES_H                      \
              && (! (defined _XOPEN_SOURCE || defined _POSIX_C_SOURCE)   \
@@ -36,12 +36,13 @@
 
 #elif (@HAVE_SYS_SELECT_H@                                              \
        && (defined _CYGWIN_SYS_TIME_H                                   \
-           || (defined __osf__ && defined _SYS_TIME_H_                  \
-               && !defined _GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H    \
-               && defined _OSF_SOURCE)                                  \
-           || (defined __sun && defined _SYS_TIME_H                     \
-               && (! (defined _XOPEN_SOURCE || defined _POSIX_C_SOURCE) \
-                   || defined __EXTENSIONS__))))
+           || (!defined _GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H       \
+               && ((defined __osf__ && defined _SYS_TIME_H_             \
+                    && defined _OSF_SOURCE)                             \
+                   || (defined __sun && defined _SYS_TIME_H             \
+                       && (! (defined _XOPEN_SOURCE                     \
+                              || defined _POSIX_C_SOURCE)               \
+                           || defined __EXTENSIONS__))))))
 
 # define _GL_SYS_SELECT_H_REDIRECT_FROM_SYS_TIME_H
 # @INCLUDE_NEXT@ @NEXT_SYS_SELECT_H@
-- 
1.9.3
Re: RFC: too aggressive nanosleep replacement on 64 bit Linux?
[CC'ing Thomas Gleixner, who maintains the Linux kernel's POSIX clocks and timers. Thomas, this thread started at http://lists.gnu.org/archive/html/bug-gnulib/2014-08/msg5.html.]

Pádraig Brady wrote:
> I noticed that nanosleep() was replaced on 64 bit Linux ...
> Should we be more conservative with our replacement, and be happy with 292 years?

It'd be nicer to get the kernel bug fixed (eventually it's bound to break something when the kernel is off by 293 billion years :-). I'm attaching a program that illustrates the bug on Fedora 20 (kernel 3.15.7-200.fc20.x86_64) and on Ubuntu 14.04.1 (kernel 3.13.0-32-generic #57-Ubuntu x86-64). Running this program on a buggy host outputs something like this:

  Setting alarm for 1 second from now ...
  Sleeping for 9223372036854775807.999999999 seconds...
  After alarm sent off, remaining time is 9223357678.462306617 seconds;
  i.e., nanosleep claimed that it slept for about 293079448610.606445 years.

and the program exits with status 4. Gnulib-using applications have a workaround for this bug, but a workaround shouldn't be necessary.

For what it's worth, the bug is fixed in Solaris 11 (x86-64), though it's present in Solaris 10 (64-bit sparc).

Thomas, are you the right person to get it fixed in the Linux kernel, or should I email a bug report somewhere else? Thanks.
#include <time.h>

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

/* SIGALRM handler: its only job is to interrupt nanosleep.  */
static void
check_for_SIGALRM (int sig)
{
  if (sig != SIGALRM)
    _exit (1);
}

int
main (void)
{
  static struct sigaction act;
  struct timespec forever, remaining;
  /* Largest value representable in a signed time_t.  */
  time_t time_t_max = (1ull << (sizeof time_t_max * CHAR_BIT - 1)) - 1;

  act.sa_handler = check_for_SIGALRM;
  sigemptyset (&act.sa_mask);
  sigaction (SIGALRM, &act, NULL);
  forever.tv_sec = time_t_max;
  forever.tv_nsec = 999999999;
  printf ("Setting alarm for 1 second from now ...\n");
  alarm (1);
  printf ("Sleeping for %lld.%09ld seconds...\n",
          (long long) forever.tv_sec, forever.tv_nsec);
  if (nanosleep (&forever, &remaining) == 0)
    return 2;
  if (errno != EINTR)
    return 3;
  /* After a one-second sleep, the remaining time should be within a few
     seconds of the requested time; anything less means the request was
     silently truncated.  */
  if (remaining.tv_sec < time_t_max - 10)
    {
      printf ("After alarm sent off, remaining time is %lld.%09ld seconds;\n",
              (long long) remaining.tv_sec, remaining.tv_nsec);
      /* Divisor kept as posted; the mean Gregorian year is 365.2425 days.  */
      printf ("i.e., nanosleep claimed that it slept for about %f years.\n",
              (forever.tv_sec - remaining.tv_sec) / (24 * 60 * 60 * 364.2425));
      return 4;
    }
  printf ("ok\n");
  return 0;
}
Re: RFC: too aggressive nanosleep replacement on 64 bit Linux?
On Tue, 5 Aug 2014, Paul Eggert wrote:

> [CC'ing Thomas Gleixner, who maintains the Linux kernel's POSIX clocks and timers. Thomas, this thread started at http://lists.gnu.org/archive/html/bug-gnulib/2014-08/msg5.html.]
>
> Pádraig Brady wrote:
> > I noticed that nanosleep() was replaced on 64 bit Linux ...
> > Should we be more conservative with our replacement, and be happy with 292 years?
>
> It'd be nicer to get the kernel bug fixed (eventually it's bound to break something when the kernel is off by 293 billion years :-). I'm attaching a program that illustrates the bug on Fedora 20 (kernel 3.15.7-200.fc20.x86_64) and on Ubuntu 14.04.1 (kernel 3.13.0-32-generic #57-Ubuntu x86-64). Running this program on a buggy host outputs something like this:
>
>   Setting alarm for 1 second from now ...
>   Sleeping for 9223372036854775807.999999999 seconds...
>   After alarm sent off, remaining time is 9223357678.462306617 seconds;
>   i.e., nanosleep claimed that it slept for about 293079448610.606445 years.
>
> and the program exits with status 4. Gnulib-using applications have a workaround for this bug, but a workaround shouldn't be necessary.
>
> For what it's worth, the bug is fixed in Solaris 11 (x86-64), though it's present in Solaris 10 (64-bit sparc).
>
> Thomas, are you the right person to get it fixed in the Linux kernel, or should I email a bug report somewhere else? Thanks.

I'm the right person, but in general I prefer bug reports sent to LKML.

I'll have a look tomorrow, when brain is awake :)

Thanks,

	tglx
Re: Missing symbols when compiling gettext on OSX 10.8
Paul Eggert <egg...@cs.ucla.edu> writes:

> Thanks for reporting this. Since your patch is against the gettext version of install-reloc, I am CC'ing this to bug-gnu-gettext.
> I don't know why the two versions of install-reloc have diverged -- perhaps it's time to sync gettext back into gnulib? But the problem should be fixed in the upstream version anyway.

As far as I checked, gettext uses build-aux/install-reloc from Gnulib as it is, and the patch still applies to the Gnulib master.

Regards,
-- 
Daiki Ueno