bug#11675: stty bad C semantics
Edward Schwartz wrote: Hi, I think there is a bug in main() of stty in coreutils 8.17. The gist of the problem is that two structures are initialized: struct termios mode = { 0, }; and struct termios new_mode = { 0, }; They are then both modified, and then compared with memcmp. The problem is that the structs contain padding bytes. The C99 standard says The value of padding bytes when storing values in structures or unions (6.2.6.1). is unspecified, so the padding bytes may not be set to zero. I don't have any problem compiling with gcc. On my machine, gcc initializes the entire struct memory with a loop that writes 0. I came across the bug when compiling coreutils under CIL, which rewrites many C language constructs to make them easier to analyze. CIL writes 0 to each struct field, leaving padding bytes untouched. Both are correct, under my interpretation of the C99 standard. However, CIL's behavior violates the assumptions of stty's memcmp, which assumes padding bytes are set to zero. The problem is easily fixed by using memset, instead of implied initializations. I am attaching a patch that does this. While it won't affect most coreutils users, it might save some time for someone using a non-standard compiler or analysis platform. Thanks, Ed Index: stty.c === --- stty.c(revision 11019) +++ stty.c(working copy) @@ -729,7 +729,8 @@ { /* Initialize to all zeroes so there is no risk memcmp will report a spurious difference in an uninitialized portion of the structure. */ - struct termios mode = { 0, }; + struct termios mode; + memset(mode, 0, sizeof(mode)); enum output_type output_type; int optc; @@ -1002,8 +1003,9 @@ { /* Initialize to all zeroes so there is no risk memcmp will report a spurious difference in an uninitialized portion of the structure. */ - struct termios new_mode = { 0, }; - + struct termios new_mode; + memset(new_mode, 0, sizeof(new_mode)); + if (tcsetattr (STDIN_FILENO, TCSADRAIN, mode)) error (EXIT_FAILURE, errno, %s, device_name); Hi Ed, Thank you for the report and the patch. That has prompted a nicely animated debate ;-) Here's a way to solve the problem that doesn't require restoring the memset calls. It feels slightly hackish, but there's already a comment in each case, so it seems ok. From 5c2181c870f4bc1abaee8ffd0b088ab05f87a61c Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Tue, 12 Jun 2012 16:13:43 +0200 Subject: [PATCH] stty: portability: accommodate CIL * src/stty.c (main): Declare locals mode and new_mode to be static to ensure that each is initialized to zero, *including* all padding. While gcc clears padding of a local automatic initialized to { 0, }, CIL does not, and the C99 standard is not clear on this issue. Reported by Edward Schwartz. See http://bugs.gnu.org/11675 for details. --- THANKS.in | 1 + src/stty.c | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/THANKS.in b/THANKS.in index b9a6c64..51b2c7d 100644 --- a/THANKS.in +++ b/THANKS.in @@ -172,6 +172,7 @@ Doug Coleman cole...@iarc1.ece.utexas.edu Doug McLarendou...@comco.com Dragos Harabor dhara...@us.oracle.com Duncan Roe dunc...@optimation.com.au +Edward Schwartz edmc...@cmu.edu Edward Welbournee...@opera.com Edzer Pebesma edzer.pebe...@rivm.nl Egmont Koblingeregm...@uhulinux.hu diff --git a/src/stty.c b/src/stty.c index a3fc3dd..83b502c 100644 --- a/src/stty.c +++ b/src/stty.c @@ -730,7 +730,7 @@ main (int argc, char **argv) { /* Initialize to all zeroes so there is no risk memcmp will report a spurious difference in an uninitialized portion of the structure. */ - struct termios mode = { 0, }; + static struct termios mode; enum output_type output_type; int optc; @@ -1003,7 +1003,7 @@ main (int argc, char **argv) { /* Initialize to all zeroes so there is no risk memcmp will report a spurious difference in an uninitialized portion of the structure. */ - struct termios new_mode = { 0, }; + static struct termios new_mode; if (tcsetattr (STDIN_FILENO, TCSADRAIN, mode)) error (EXIT_FAILURE, errno, %s, device_name); -- 1.7.11.rc2.5.g68f532f
bug#11675: stty bad C semantics
Jim Meyering wrote: That has prompted a nicely animated debate ;-) ... and it goes on ;-) What about these? find . -name '*.c' | xargs grep -F '0, };' ./src/ls.c:mbstate_t mbstate = { 0, }; ./src/shred.c: struct Options flags = { 0, }; ./src/tr.c: bool in_set[N_CHARS] = { 0, }; ./src/wc.c: mbstate_t state = { 0, }; ./src/pathchk.c: mbstate_t mbstate = { 0, }; Have a nice day, Berny
bug#11675: stty bad C semantics
On 06/12/2012 08:52 AM, Voelker, Bernhard wrote: Jim Meyering wrote: That has prompted a nicely animated debate ;-) ... and it goes on ;-) What about these? find . -name '*.c' | xargs grep -F '0, };' ./src/ls.c:mbstate_t mbstate = { 0, }; ./src/shred.c: struct Options flags = { 0, }; ./src/tr.c: bool in_set[N_CHARS] = { 0, }; ./src/wc.c: mbstate_t state = { 0, }; ./src/pathchk.c: mbstate_t mbstate = { 0, }; What about them? None of them are using memcmp(), so none of them care what the padding bytes (if any) got set to. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
bug#11675: stty bad C semantics
Paul Eggert wrote: On 06/12/2012 07:33 AM, Jim Meyering wrote: Here's a way to solve the problem that doesn't require restoring the memset calls. It feels slightly hackish But it's hackish in a good way! It's a bit faster and smaller and more portable than the existing code, and it's faster and smaller than all the other proposed fixes. It's clearly the best idea yet. Thanks for cutting the Gordian knot. Thanks. With that, I've pushed it and marked this as done.
bug#11675: stty bad C semantics
Hi, I think there is a bug in main() of stty in coreutils 8.17. The gist of the problem is that two structures are initialized: struct termios mode = { 0, }; and struct termios new_mode = { 0, }; They are then both modified, and then compared with memcmp. The problem is that the structs contain padding bytes. The C99 standard says The value of padding bytes when storing values in structures or unions (6.2.6.1). is unspecified, so the padding bytes may not be set to zero. I don't have any problem compiling with gcc. On my machine, gcc initializes the entire struct memory with a loop that writes 0. I came across the bug when compiling coreutils under CIL, which rewrites many C language constructs to make them easier to analyze. CIL writes 0 to each struct field, leaving padding bytes untouched. Both are correct, under my interpretation of the C99 standard. However, CIL's behavior violates the assumptions of stty's memcmp, which assumes padding bytes are set to zero. The problem is easily fixed by using memset, instead of implied initializations. I am attaching a patch that does this. While it won't affect most coreutils users, it might save some time for someone using a non-standard compiler or analysis platform. Thanks, Ed stty.patch Description: Binary data
bug#11675: stty bad C semantics
On 06/11/2012 01:16 PM, Edward Schwartz wrote: Hi, I think there is a bug in main() of stty in coreutils 8.17. The gist of the problem is that two structures are initialized: struct termios mode = { 0, }; and struct termios new_mode = { 0, }; They are then both modified, and then compared with memcmp. The problem is that the structs contain padding bytes. The C99 standard says The value of padding bytes when storing values in structures or unions (6.2.6.1). is unspecified, so the padding bytes may not be set to zero. I disagree with your interpretation of C99, and claim instead that your CIL compiler is buggy. C99 6.7.8 para. 21: If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration. And while there is no way to zero out padding bits by direct assignment, I think we _are_ guaranteed that padding bits are zero when doing the same initialization as static storage duration if we did not call out all the named members. The problem is easily fixed by using memset, instead of implied initializations. I am attaching a patch that does this. While it won't affect most coreutils users, it might save some time for someone using a non-standard compiler or analysis platform. If I understand correctly, gcc is also able to optimize out the memset, so that your patch has no net impact and would allow us to work around your buggy compiler; but I will leave it up to Jim whether it is worth applying. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
bug#11675: stty bad C semantics
On Mon, Jun 11, 2012 at 3:50 PM, Eric Blake ebl...@redhat.com wrote: On 06/11/2012 01:16 PM, Edward Schwartz wrote: I disagree with your interpretation of C99, and claim instead that your CIL compiler is buggy. C99 6.7.8 para. 21: If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration. And while there is no way to zero out padding bits by direct assignment, I think we _are_ guaranteed that padding bits are zero when doing the same initialization as static storage duration if we did not call out all the named members. Why do you think that? The standard is pretty vague here, but I interpret the remainder of the aggregate as the remaining fields, since the initializer list is a list of fields. I just want to make sure the bug is filed in the proper place. If it's a CIL bug, I'll file it with them. The problem is easily fixed by using memset, instead of implied initializations. I am attaching a patch that does this. While it won't affect most coreutils users, it might save some time for someone using a non-standard compiler or analysis platform. If I understand correctly, gcc is also able to optimize out the memset, so that your patch has no net impact and would allow us to work around your buggy compiler; but I will leave it up to Jim whether it is worth applying. Thanks, Ed
bug#11675: stty bad C semantics
Eric Blake ebl...@redhat.com writes: And while there is no way to zero out padding bits by direct assignment, I think we _are_ guaranteed that padding bits are zero when doing the same initialization as static storage duration if we did not call out all the named members. I think the standard is very clear about the state of padding bytes [6.2.6.1#6]: When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. (6.7.8#10 says nothing about the initial value of any padding bytes.) Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
bug#11675: stty bad C semantics
This area is a bit of a mess in the standard, but I think Edward Schwartz is right that initializers of auto variables are not guaranteed to clear padding. As I understand it under Eric Blake's interpretation, a complete initializer for an auto variable is not guaranteed to clear the padding, but an incomplete initializer is guaranteed to do so. This would be pretty strange. I think Edward's interpretation of the incomplete-initializer wording is more likely to be the correct one: namely, that missing initializers are treated as if they were zero. Even if the standard was intended to clear the padding, clearly in practice we have an implementation that doesn't do so, and it no doubt performs better that way, so I'd say something like Edward's patch is advisable.
bug#11675: stty bad C semantics
On 06/11/2012 09:29 PM, Paul Eggert wrote: This area is a bit of a mess in the standard, but I think Edward Schwartz is right that initializers of auto variables are not guaranteed to clear padding. As I understand it under Eric Blake's interpretation, a complete initializer for an auto variable is not guaranteed to clear the padding, but an incomplete initializer is guaranteed to do so. This would be pretty strange. I agree with Eric's interpretation. There are ambiguous statements in the standard, but IMHO the _init_ requirement (C99 6.7.8 para. 21) trumps the _assignment_ requirement (6.2.6.1#6). I see it as a language standard way to init auto vars, so that they match static vars of the same type, and so can be compared. http://www.pixelbeat.org/programming/gcc/auto_init.html I think Edward's interpretation of the incomplete-initializer wording is more likely to be the correct one: namely, that missing initializers are treated as if they were zero. Even if the standard was intended to clear the padding, clearly in practice we have an implementation that doesn't do so, and it no doubt performs better that way, so I'd say something like Edward's patch is advisable. Well we've not had an issue in nearly 6 years with the existing form: http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=e245a66 If this was an isolated statement then I'd change for pragmatic reasons, but since it touches multiple places in the code I'd suggest patching coreutils locally (with essentially the reverse of the above), until CIL is fixed to conform to the standard. cheers, Pádraig.
bug#11675: stty bad C semantics
On 06/11/2012 04:47 PM, Pádraig Brady wrote: it touches multiple places in the code Which other places? We do use = { FOO } to initialize in many places in the code, but I don't recall any other places that rely on initializing padding to zero.
bug#11675: stty bad C semantics
On 06/12/2012 01:13 AM, Paul Eggert wrote: On 06/11/2012 04:47 PM, Pádraig Brady wrote: it touches multiple places in the code Which other places? We do use = { FOO } to initialize in many places in the code, but I don't recall any other places that rely on initializing padding to zero. Oh right, sorry. So if this is the only instance then I guess it's pragmatic to memset here (with an appropriate comment). It's still a bug I think in CIL, so it would be great to report there too. cheers, Pádraig.