Hi Tim, On Sat, Jan 18, 2020 at 01:32:49AM +0100, Tim Duesterhus wrote: > Willy, > > I did not touch the `struct pat_time` in pattern.h which appears to be > completely unused. Please check whether adjustments should be made there > as well or whether it should simply be removed. Feel free to amend my > patch if necessary. > > Best regards > Tim Düsterhus > > Apply with `git am --scissors` to automatically cut the commit message. > > -- >8 -- > Subject: [PATCH] CLEANUP: Consistently `unsigned int` for bitfields > > - Non-ints are not allowed by the C standard.
Hmmm that's not exactly what I'm reading in C99 #6.7.2.1 here, which explicitly permits implementation-defined types: "A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type." And the implementation we rely on (gcc) says: https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html Allowable bit-field types other than _Bool, signed int, and unsigned int (C99 and C11 6.7.2.1). Other integer types, such as long int, and enumerated types are permitted even in strictly conforming mode. I tend to find it extremely confusing to place "unsigned int" just after a char, because it makes one think we're leaving a 2-byte hole before and/or that we can use up to 32 bits without causing misalignment while in practice we can only use the few bits left from the missing bytes. That's what we have in fdtab: void *owner; unsigned char state; <-- 8-aligned unsigned char ev; <-- 8-aligned + 1 unsigned char linger_risk:1 <-- 8-aligned + 2; unsigned char cloned:1; unsigned char initialized:1; With the change it remains the same except that it looks quite non-obvious as the hole will depend on the number of bits added: void *owner; unsigned char state; <-- 8-aligned unsigned char ev; <-- 8-aligned + 1 unsigned int linger_risk:1 <-- 8-aligned + 2; unsigned int cloned:1; unsigned int initialized:1; Interestingly, the change above confuses pahole: struct fdtab { long unsigned int lock; /* 0 8 */ long unsigned int thread_mask; /* 8 8 */ long unsigned int update_mask; /* 16 8 */ struct fdlist_entry update; /* 24 8 */ void (*iocb)(int); /* 32 8 */ void * owner; /* 40 8 */ unsigned char state; /* 48 1 */ unsigned char ev; /* 49 1 */ /* Bitfield combined with previous fields */ unsigned int linger_risk:1; /* 48:15 4 */ unsigned int cloned:1; /* 48:14 4 */ unsigned int initialized:1; /* 48:13 4 */ ^^^ Here it expects to reserve 4 bytes for the field /* size: 56, cachelines: 1, members: 11 */ /* padding: 4 */ /* bit_padding: 29 bits */ /* last cacheline: 56 bytes */ /* BRAIN FART ALERT! 56 != 56 + 0(holes), diff = 0 */ ^^^^^^^^^^^^^^^^^^^^^^^^^^ And this surprizing warning (certainly a bug). Without this change, the output looks more consistent: unsigned char state; /* 48 1 */ unsigned char ev; /* 49 1 */ unsigned char linger_risk:1; /* 50: 7 1 */ unsigned char cloned:1; /* 50: 6 1 */ unsigned char initialized:1; /* 50: 5 1 */ More annoyingly it confuses gdb: Before: (gdb) p &fdtab[0].linger_risk $4 = (unsigned char *) 0x32 After: (gdb) p &fdtab[0].linger_risk $1 = (unsigned int *) 0x30 And this is not surprizing for both tools, because I think that with the change, the unsigned int field is merged with the previous chars, which is particularly dirty. And if we place the field before the chars, it leaves a two-bytes hole between them. Ideally I'd use flags there, though we'd have to place them in an unsigned char. If you're interested in having a look at such a change I'd prefer it, otherwise I'd prefer not to change what exists for this part because it does exactly what we need and the change makes it wrong at least debug-wise. > - Signed bitfields of size `1` hold the values `0` and `-1`, but are > usually assigned `1`, possibly leading to subtle bugs when the value > is explicitely compared against `1`. Excellent point! The problem is even more subtle when checking "> 0". And even nastier, it should fail to match against an enum because the common type will be used for the comparison, thus the enum will be extended and remain unsigned while the bit field will be sign-extended. Look, that's getting funny: enum e { ZERO = 0, ONE = 1, TWO = 2, } e; struct f { int bit:2; } f; void assign(struct f *f, enum e e) { f->bit = e; } int compare(struct f *f, enum e e) { return f->bit == e; } The output code does this: 0000000000000000 <assign>: 0: 40 88 f0 mov %sil,%al 3: 40 8a 37 mov (%rdi),%sil 6: 83 e0 03 and $0x3,%eax 9: 83 e6 fc and $0xfffffffc,%esi c: 09 c6 or %eax,%esi e: 40 88 37 mov %sil,(%rdi) 11: c3 retq ==> the bits are copied verbatim 0000000000000012 <compare>: 12: 8a 07 mov (%rdi),%al 14: c1 e0 06 shl $0x6,%eax 17: c0 f8 06 sar $0x6,%al 1a: 0f be c0 movsbl %al,%eax 1d: 39 f0 cmp %esi,%eax 1f: 0f 94 c0 sete %al 22: 0f b6 c0 movzbl %al,%eax 25: c3 retq The bits are sign-extended before the result is compared, so value "TWO" will turn to -2 and will not be matched after being set. Actually one is at risk, the "verify" struct in ssl_bind. But fortunately among the 3 bits it only uses 2 because the enums set there only use values 0 to 3! For this reason I'd like to get the signed-vs-unsigned part merged. Did you check whether this produced any output code change ? It would be a hint that we might possibly have latent bugs. Thanks, Willy