Hi Fangrui, One year and a half later, I'm looking into this again.
On Sun, 22 Oct 2023 23:23:52 -0700, Fangrui Song wrote: > On Mon, Oct 9, 2023 at 4:08 AM Jean Delvare <[email protected]> wrote: > > The warnings are indeed gone, and I think the code should work fine on > > bigendian systems (unfortunately I can't test that easily at the > > moment). However I have 2 concerns left with your patch: Note: meanwhile I was able to test your code on a big-endian system and can confirm it works well. I also tested the results on my own x86-64 system and the size and performance of the new binary were equivalent to the original, which confirms that gcc 7.5 is able to optimize it properly. > > 1* You detect endianness automatically, based on __BYTE_ORDER__. > > However there's still one place where the explicitly-set BIGENDIAN is > > used. This is inconsistent. I would prefer that you just use BIGENDIAN > > for now, and in a separate patch, add auto-detection code to set > > BIGENDIAN where needed. That way, the code stays consistent, and > > BIGENDIAN can still be set manually if the auto-detection code fails > > for whatever reason. > > > > Which brings a side question: how portable are __BYTE_ORDER__ and > > __ORDER_BIG_ENDIAN__? We would have to check for existence before using > > them if they may not be defined on all systems. > > __BYTE_ORDER__/__ORDER_BIG_ENDIAN__ have great portability as they are > probably > the most popular way to detect endianness. > They are supported by at least GCC, Clang, and tinycc. > Linux and popular *BSD (FreeBSD, NetBSD, OpenBSD, etc) use either GCC or > Clang. Unfortunately I tested a number of old systems, and on a SLES 11 SP4 system (glibc 2.11.3, gcc 4.3), these macros do not exist. Instead, I found that __BYTE_ORDER and __BIG_ENDIAN are defined. These are only consistently defined across systems if I first #include <endian.h>. I don't know if other unix systems (like BSD) have this header file, so I don't know if including it unconditionally would be OK. In this header file, I also noticed that, if __USE_BSD is defined, BYTE_ORDER and BIG_ENDIAN also get defined as an alternative to __BYTE_ORDER and __BIG_ENDIAN, respectively. This makes me suspect that this is what BSD systems use and expect. Unfortunately, I don't have access to any BSD system to verify. On a more recent glibc version (2.38), __USE_MISC is being tested instead of __USE_BSD. I could not find any trace of __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ in the standard C header files on my system, so I suppose they are predefined by gcc itself (if recent enough), contrary to __BYTE_ORDER and __BIG_ENDIAN which come from glibc. This shouldn't be too difficult to handle, because they are all macros, so we can check which of these are defined before using them. We only need to make sure we don't unconditionally include header files which may not exist. > > 2* How portable are __builtin_bswap16 and __builtin_bswap32? Please > > keep in mind that dmidecode is meant to be portable across operating > > systems, therefore it should not depend on glibc- or gcc-specific > > functions. All I can find in the Linux manual pages is bswap in section > > 3, which documents functions bswap_16() and bswap_32(), so different > > from what you used, and describes them as "GNU extensions" so most > > probably not portable. > > GCC and Clang support __builtin_bswap16/__builtin_bswap32. > Since big-endian systems are less common today, so Unfortunately gcc 4.3 did not support it. This is the default compiler on the older systems we still support (until March 2028). On my Linux systems, the manual page documents bswap_16(), bswap_32() and bswap_64(). These are macros, so testing for their presence will be easy. But I don't know if all other supported systems have the header file which defines them (<byteswap.h>). OTOH I see that FreeBSD and NetBSD (according to their man pages, I don't have access to such systems) offer bswap16(), bswap32() and bswap64() instead. Sadly, the required header files are not the same (<sys/endian.h> for FreeBSD, <sys/types.h> and <machine/bswap.h> for NetBSD). What a mess! > (...) Do you know what C compilers dmidecode intends > to support? > If a C compiler doesn't support __builtin_bswap{16,32}, indeed it > could be problematic on a big-endian system, > but I am curious whether such a supported surface is hypothetical. I never clearly defined which C compilers are support or not. I did my best so far to write code that would be as portable as possible so that it would work on as many unix-like operating system as possible and as many architectures as possible. The reason why I am being cautious is that over the years, the number of operating systems where dmidecode was successfully build as grown, and the number of architectures implementing SMBIOS has grown as well, so the risk of breaking compatibility with the proposed change is real. And unfortunately I don't have access to all these systems myself, so I can't test what works and what breaks. > > As a side note, I wonder if we could now get rid of the U64 quirks I > > implemented long ago and use uint64_t instead. If it is portable, that > > would be much better than my custom code. > > I agree. `rg '\.[lh] ' *.c` lists quite a few occurrences. Such a > change should be separate. Great news, I took care of this meanwhile: Stop open-coding the u64 type https://cgit.git.savannah.gnu.org/cgit/dmidecode.git/commit/?id=2fa4ab1a1d37303b81e701d7ed08ae70965ab96c -- Jean Delvare SUSE L3 Support
