On 16/08/19 at 17:22, Étienne Mollier wrote:
> Bonjour,
>
> Woops, this sounds a bit like I might not have used a very clear
> wording. If I were at your place, I would proceed so; but I
> don't have a Piledriver CPU to do actual testing on my side.
> I'm still stuck with an old K10, not to mention my laptop, which
> comes with an old regular Atom. :)
>
> I did try to replace the k8 option by amdfam10 though. In the
> half hundred thousand lines of logs issued by the build, I get
> something like a dozen differences between k8 and k10. There
> were a tremendous amount of warnings too, but some of the ones
> you encountered did not appear: the thing with the missing jump
> target for instance, nor the ANNOTATE_NOSPEC_ALTERNATIVE on the
> retpoline thing. I am running Debian Sid, currently shipping
> with Gcc 9, so this is a difference to take in account though.
> Finally, building an upstream Linux 5.2 kernel instead of
> Buster's 4.19 does not show most of the warnings I encountered,
> as these are being fixed as they come, but probably not as well
> in LTS kernels.
>
> Doing a third run with addition of the tuning options (-mtune)
> made almost no difference at all, except on the build number and
> the CRC hash. It seems to me that the architecture specific
> (-march) option already applies the proper tuning, at least for
> my architecture.
>
> My last manipulation consisted in building Linux upstream 5.2.9,
> released lately, with -march=amdfam10, and this one is running
> quite well so far:
>
>     $ uname -rv
>     5.2.9-k10 #1 SMP PREEMPT Fri Aug 16 16:13:08 CEST 2019
>
> But again, no messages worth mentioning during the compilation.
>
> Do your warnings appear when your build targets k8?
> Or when building a generic x86_64 kernel?

Actually, I am running a kernel built with the "k8" option. It works
fine, and I got no warnings during the compilation.

Following up on your tip about "amdfam10", I checked the GCC options
web page:

https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

The amdfam10 optimization is for Family 10h CPUs, but I have a
Family 15h CPU. I noticed that there is also a "bdver1" option for my
CPU family, so I wanted to give it a try. I compiled the kernel source
with "bdver1" and, surprise, I got no warnings and everything worked
fine. :-)

The command line I use to compile is:

~/linux-source-4.19$ time make -s -j9 ; make -s -j9 modules

> Compilers may have good optimization routines to boost the speed
> of the code in several situations, but in other ones there are
> trade-offs to take between size and performance of the code. I
> personally prefer smaller sized executables (-Os): they fit in
> less pages, so uses less CPU cache, and leave more room for my
> programs to get more of their own data in cache (or I might
> simply have spent too much time on suckless.org. ;)

Do you remember which kernel CONFIG option enables this optimization?

>
> Activating CPU specific options is interesting on some
> particular use cases, but newer instruction often require
> setting up various bits in the CPU before use, which tends to
> inflate the resulting executable. This may be interesting for
> scientific applications, or programs dealing with big data
> arrays in general. In kernel mode however, the only case I can
> think of where CPU specific accelerators would be beneficial are
> disk ciphering and RAID arrays, for which I believe there is
> already some runtime detection of available instructions, even
> with the generic compiler options.

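
For what it's worth, one way to see which of these instruction sets a
CPU advertises is to look at the "flags" line of /proc/cpuinfo. This is
only a minimal sketch, assuming a Linux system with an x86-style
/proc/cpuinfo; the helper names cpu_flags and has_flag and the short
list of flags are illustrative choices of mine, not anything from the
kernel or from the patch:

```shell
# Print the feature flags of the first CPU, one per line.
# Optional argument: an alternative cpuinfo file (defaults to /proc/cpuinfo).
cpu_flags() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null \
        | cut -d: -f2 | tr ' ' '\n' | sed '/^$/d'
}

# Succeed if flag $1 is listed (optionally in cpuinfo file $2).
has_flag() {
    cpu_flags "$2" | grep -qx "$1"
}

# Illustrative, non-exhaustive selection of SIMD/crypto flags.
for f in sse2 ssse3 avx avx2 aes; do
    if has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
done
```

This only shows what the CPU exposes; it does not tell which routines
the kernel actually selected at runtime.
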
I have four disks in a software RAID 5 array on my system, managed by
mdadm. This is my /proc/mdstat file:

$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdb1[1] sdd1[3](S) sdc1[2]
      1953258496 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

>
> To be honest, I don't believe the performance gain to get from
> the compiler is tremendous here. Figures from the author of the
> patch are there to tell us there is a gain indeed; but when you
> investigate in detail the percentage of performance brought by
> the tuning, it is only about 0.03% for the selected benchmark on
> median values. See the "Data" section at the very end of the
> README, and do your own calculations:
>
> https://github.com/graysky2/kernel_gcc_patch/blob/master/README.md
>
> The best you can do here is to do your own measures with your
> own pattern of usage. If you are a developer, you can run timed
> builds of Linux, and see the time it takes. If you are inclined
> toward image rendering speeds, there are a few demo-scenes out
> there where you might get a few figures such as the frame rate
> (careful, glxgears may get capped to 60Hz when some accelerators
> are in use, prefer fancier demos. ;)
>
> There is also this other thread dealing with kernel latency
> measures; you may find a few useful tools listed in this
> discussion:
>
> https://lists.debian.org/debian-user/2019/08/msg00851.html
>
> Or just see how perform your usual programs, if there are
> visible improvements.
>
> Have fun, :)
>

Yes, I agree: the optimization won't affect performance in a way that
is perceptible to a human. There are more important tweaks in the
kernel, such as CONFIG_HZ_1000=y. Out of curiosity, I always measure
the time taken by the kernel compilation.

Thanks again for the tips,
best regards
-- 
Franco Martelli