Re: restrict

2020-02-09 Thread Bruno Haible
Tim Rühsen wrote:
> gcc -Wall tells you so (gcc 8 and upwards):
> $ gcc -Wall msg.c -o msg
> msg.c: In function ‘main’:
> msg.c:11:13: warning: passing argument 1 to restrict-qualified parameter
> aliases with argument 4 [-Wrestrict]
>11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
>   | ^~~~~~
> msg.c:11:35: warning: ‘%s’ directive output may be truncated writing 4
> bytes into a region of size between 1 and 11 [-Wformat-truncation=]
>11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
>   |   ^~
> msg.c:11:3: note: ‘snprintf’ output between 5 and 15 bytes into a
> destination of size 11
>11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
>   |   ^~~~
> 

Oh, the 'restrict' keyword is actually useful (other than for micro-
optimizations)! Thanks for this pointer, Tim.

So, it would make sense to use this keyword in function declarations in
public .h files in gnulib.

For internal .h files in gnulib, there's no point - this code is assumed
correct, no need to try to get GCC to produce warnings here.

What does 'restrict' precisely mean? ISO C 99 § 6.7.3.(7):
  "An object that is accessed through a restrict-qualified pointer has a
   special association with that pointer. This association, defined in
   6.7.3.1 below, requires that all accesses to that object use, directly
   or indirectly, the value of that particular pointer.115) The intended
   use of the restrict qualifier (like the register storage class) is to
   promote optimization, and deleting all instances of the qualifier from
   all preprocessing translation units composing a conforming program
   does not change its meaning (i.e., observable behavior)."

Which function declarations could therefore profit from the 'restrict'
keyword?

  - Only parameters of pointer types, excluding function pointer types,
matter.

  - If all pointer parameters are 'const SOMETHING *', the function
is not writing into them, therefore 'restrict' is redundant.
Example: int strcmp (const char *, const char *);

  - If all pointer parameters point to different types, excluding 'void *',
ISO C forbids aliasing anyway, therefore 'restrict' is redundant.
Example: void dfacomp (char const *, ptrdiff_t, struct dfa *, bool);
and https://pubs.opengroup.org/onlinepubs/9699919799/functions/glob.html

  - If, among the parameters, there are N parameters of type
  [const] TYPE *
adding the 'restrict' keyword to N-1 of them is equivalent to adding
the 'restrict' keyword to all N of them.
See ISO C 99 § 6.7.3.1.(7) [Example 1].

GCC's warning facility understands this, regarding the non-const pointer.
Test case:

#include 
extern int smprintf (char *restrict, size_t, const char *, ...)
 __attribute__ ((__format__ (__printf__, 3, 4)));
int main ()
{
  char msg[sizeof ("try a fool")] = "try a ";
  smprintf (msg, sizeof (msg), "%s%s", msg, "fool");
}


  - If a function has an input parameter of type
  const TYPE *
and an output parameter of type
  TYPE *
then the 'restrict' keyword means that the implementation may write
the output before having finished reading all the input.
Example:
  void arctwo_encrypt (arctwo_context *context, const char *inbuf,
   char * restrict outbuf, size_t length);
Whereas the absence of the 'restrict' keyword means that the
implementation will read all the input _before_ starting to store
the output.
Example:
  int hmac_md5 (const void *key, size_t keylen,
const void *buffer, size_t buflen, void *resbuf);

  - 'restrict' should NOT be used for multiple output parameters of the
same type, when only a single word is written through each parameter.
Example: ssize_t copy_file_range (int ifd, off_t *ipos, int ofd, off_t 
*opos,
  size_t len, unsigned flags);
Rationale: It is OK to have ipos and opos point to different elements
of the same off_t[] array.

Is this all right, or did I misinterpret something?

If it is right, then how about this start of a patch? (Not sure whether I need
to add the 'restrict' keyword also in the function definitions?)

Bruno


diff --git a/lib/amemxfrm.h b/lib/amemxfrm.h
index 7cfc283..0edd3d5 100644
--- a/lib/amemxfrm.h
+++ b/lib/amemxfrm.h
@@ -38,7 +38,8 @@ extern "C" {
freshly allocated string is returned.  In both cases, *lengthp is set to the
length of the returned string.
Upon failure, return NULL, with errno set.  */
-extern char * amemxfrm (char *s, size_t n, char *resultbuf, size_t *lengthp);
+extern char * amemxfrm (char * restrict s, size_t n,
+  

Re: dfa.c no longer usable if no 64-bit support

2020-02-09 Thread arnold
Just FYI, gawk's dfa.c is now in sync w/Gnulib's. 

There are still some problems on Vax/VMS. I suspect it's environmental
but will let you know if not.

Thanks!

Arnold

arn...@skeeve.com wrote:

> Paul,
>
> Thanks for this.  I will work on reducing the differences between
> what's in Gnulib and what's in gawk.
>
> Vax/VMS is dead as a commercial system, true. But it remains alive as
> a hobbyist system, especially as it's very easy to run in simulation
> under SIMH.
>
> Thanks!
>
> Arnold
>
> Paul Eggert  wrote:
>
> > On 1/29/20 7:34 AM, Bruno Haible wrote:
> > > I would say that it's not worth the effort - except for the person(s)
> > > who care a lot about Vax/VMS.
> >
> > Normally I'd agree, but if Arnold cares about VAX/VMS and if we want 
> > Gnulib dfa.c to match Gawk dfa.c, then in this particular case it makes 
> > some sense to support 32-bit-only platforms, as it's easy to revert the 
> > recent patch that made dfa.c assume 64-bit. So I installed the attached.
> >
> > However, I see some other parts of departure for Gawk dfa.c:
> >
> > * Gawk dfa.c/dfa.h does not use flexible array members or the 
> > portable-to-7th-edition-Unix substitute provided by Gnulib, so I suggest 
> > that Gawk import Gnulib lib/flexmember.h, and either "#define 
> > FLEXIBLE_ARRAY_MEMBER 1" in config.h or (better) import Gnulib 
> > m4/flexmember.m4.
> >
> > * Gawk dfa.c doesn't use isblank, but instead defines its own is_blank 
> > that is hard-coded to the C locale. Isn't [[:blank:]] supposed to be 
> > locale-dependent? Or are you assuming that space and tab are the only 
> > blank characters in all single-byte locales?
> >
> > * Gawk dfa.c includes mbsupport.h if __DJGPP__ is defined. I suggest 
> > moving this to Gawk config.h so that dfa.c need not worry about it.
> >
> > * Gawk dfa.c replaces "#include " with:
> >
> > #ifndef VMS
> > #include 
> > #else
> > #define SIZE_MAX __INT32_MAX
> > #define PTRDIFF_MAX __INT32_MAX
> > #endif
> >
> > I suppose we could add something like this to Gnulib dfa.c but it's a 
> > bit ugly; is there a cleaner way to do it? Perhaps Gawk could supply its 
> > own little substitute stdint.h on VMS. (Gnulib does this too but I 
> > assume Gnulib's stdint.h is too heavyweight for Gawk.)



Re: Possible testing case of snprintf.

2020-02-09 Thread Tim Rühsen
On 09.02.20 15:44, Mats Erik Andersson wrote:
> Hello there!
> 
> This note has its origin in a report received at bug-inetutils.
> The following test code for snprintf() is a simplyfied detection
> I have implemented as a warning-only test in Gnu Inetutils.
> My point is that Linux/glibc and kfreebsd/glibc triggers this
> warning, but OpenSolaris, OpenIndiana, FreeBSD, OpenBSD, NetBSD,
> and DragonflyBSD do not! Reading the replacement code for the
> Gnulib module snprintf, neither would your function, should it
> undergo the test. In conclusion, this is a case where the native
> glibc function snprintf() behaves worse than does your replacement.
> 
>   #define MESSAGE   "try a fool"
>   #define WRONG_MESSAGE "fool"
> 
>   char msg[sizeof (MESSAGE)] = "try a ";
> 
>   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
> 
>   if (!strcmp (msg, WRONG_MESSAGE))
> printf ("Warning! snprintf got confused!\n");
> 
> Observe that `msg' is target, as well as source. POSIX mentions
> nothing about such a use case, but glibc will produce "fool",
> whereas all BSD unices as well as OpenSolaris descendants will
> produce "try a fool". Tacitly, POSIX would probably cry out
> a statement like "Undefined"!

s(n)printf declaration uses the restrict keyword. That basically means
that each of the pointers in the arguments points to the same block of
memory.

gcc -Wall tells you so (gcc 8 and upwards):
$ gcc -Wall msg.c -o msg
msg.c: In function ‘main’:
msg.c:11:13: warning: passing argument 1 to restrict-qualified parameter
aliases with argument 4 [-Wrestrict]
   11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
  | ^~~~~~
msg.c:11:35: warning: ‘%s’ directive output may be truncated writing 4
bytes into a region of size between 1 and 11 [-Wformat-truncation=]
   11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
  |   ^~
msg.c:11:3: note: ‘snprintf’ output between 5 and 15 bytes into a
destination of size 11
   11 |   snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);
  |   ^~~~

Except in rare cases, compiler warnings indicate that the programmer is
wrong. Turn them all (well, almost all) on !

Regards, Tim



signature.asc
Description: OpenPGP digital signature


Possible testing case of snprintf.

2020-02-09 Thread Mats Erik Andersson
Hello there!

This note has its origin in a report received at bug-inetutils.
The following test code for snprintf() is a simplyfied detection
I have implemented as a warning-only test in Gnu Inetutils.
My point is that Linux/glibc and kfreebsd/glibc triggers this
warning, but OpenSolaris, OpenIndiana, FreeBSD, OpenBSD, NetBSD,
and DragonflyBSD do not! Reading the replacement code for the
Gnulib module snprintf, neither would your function, should it
undergo the test. In conclusion, this is a case where the native
glibc function snprintf() behaves worse than does your replacement.

  #define MESSAGE   "try a fool"
  #define WRONG_MESSAGE "fool"

  char msg[sizeof (MESSAGE)] = "try a ";

  snprintf (msg, sizeof (msg), "%s%s", msg, WRONG_MESSAGE);

  if (!strcmp (msg, WRONG_MESSAGE))
printf ("Warning! snprintf got confused!\n");

Observe that `msg' is target, as well as source. POSIX mentions
nothing about such a use case, but glibc will produce "fool",
whereas all BSD unices as well as OpenSolaris descendants will
produce "try a fool". Tacitly, POSIX would probably cry out
a statement like "Undefined"!

It is my opinion that this discrepancies should at least be documented
in 'snprintf.texi', were you not to take matter so far as to include
this as some sort of test, after due elaboration.

Gnu Inetutils uses legacy code from BSD4.4, where the related use
is present and did not cause troubles, but as the recent report
submitted to us, the code does produce portability issues when
brought to glibc. It has taken a long time to discover this state
of affaires, but now it has surfaced!

On behalf of Gnu Inetutils,

  Mats Erik Andersson.