Re: proposed cpuctl modification
On 2023/03/09 19:21, Robert Elz wrote: > Date:Thu, 9 Mar 2023 16:21:53 +0900 > From:Masanobu SAITOH > Message-ID: <38ae66bd-1b37-c0ef-5a43-52e0c0a2a...@execsw.org> > > | Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it > | has 4 E-cores. Those 4 cores share one microcode image. > > Mine is an i9-12900KS which has 8 of them (2 groups of 4). > > Thanks for the confirmation, that is what looked to be happening, but > I was just guessing from what I observed. I just use intel processors > (and others on occasion) I don't even pretend to understand them. > > | I think your idea is the best. Thank you for your commit. > > No problem. It was not a difficult change to make! > > | Another solutions is that the kernel returns 0 instead of EEXIST if the > | version number is the same as the running microcode's version. > > Yes, I considered that one as well, but as you indicate, doing that just > loses information, and gains nothing - the same number of sys calls (ioctl's) > would be performed, all that would be saved is the check to see if the > error is EEXIST when that happens (ie: peanuts). > > kre > > ps: do your E-cores ever just turn themselves off? On mine, occasionally, > and for no reason I can fathom, the BIOS reports there are none of them. > (and NetBSD doesn't see them either). I've never seen such problem on my machine. > They come back after a power cycle. > This is probably a BIOS issue, but ? It might be... > -- --- SAITOH Masanobu (msai...@execsw.org msai...@netbsd.org)
Re: proposed cpuctl modification
Date:Thu, 9 Mar 2023 16:21:53 +0900 From:Masanobu SAITOH Message-ID: <38ae66bd-1b37-c0ef-5a43-52e0c0a2a...@execsw.org> | Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it | has 4 E-cores. Those 4 cores share one microcode image. Mine is an i9-12900KS which has 8 of them (2 groups of 4). Thanks for the confirmation, that is what looked to be happening, but I was just guessing from what I observed. I just use intel processors (and others on occasion) I don't even pretend to understand them. | I think your idea is the best. Thank you for your commit. No problem. It was not a difficult change to make! | Another solutions is that the kernel returns 0 instead of EEXIST if the | version number is the same as the running microcode's version. Yes, I considered that one as well, but as you indicate, doing that just loses information, and gains nothing - the same number of sys calls (ioctl's) would be performed, all that would be saved is the check to see if the error is EEXIST when that happens (ie: peanuts). kre ps: do your E-cores ever just turn themselves off? On mine, occasionally, and for no reason I can fathom, the BIOS reports there are none of them. (and NetBSD doesn't see them either). They come back after a power cycle. This is probably a BIOS issue, but ?
Re: proposed cpuctl modification
Hello. I'm sorry for the late of this reply. I was busy for a few weeks. On 2023/03/03 6:59, Robert Elz wrote: > This message is about a proposed userland modification, but it seems > more kernelish to me, hence I am asking here on tech-kern, rather than > on tech-userlevel > > When my system boots (intel cpu) it runs the intel-microcode (from pkgsrc) > microcode update. > > Since it is an Intel cpu, that means running the ioctl to perform the > microcode update on every core. The cpu has hyperthreading, and while > I sometimes am inclined to turn that off, I haven't so far. > > Naturally, as they're really the same core, while the base core ucode > update works fine, the hyperthread companion always fails, as there the > microcode is already up to the expected version (surprise surprise, since > we just did that).I guess we could look and skip the update on > the hyperthread companion cores (pseudo-cores) but that's not what I am > proposing, partly because I'd have to work out how to do that, but also > because that by itself would only solve half the problem. > > My processor also has 8 cores, with no hyperthreading, where it looks as > if internally, there's just 2 sets of ucode, one for the first 4, and one > for the second 4. Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it has 4 E-cores. Those 4 cores share one microcode image. > Updates of the other 3 of each group find the ucode > version already installed, and error out. > > So, what I am proposing is to have cpuctl simply ignore EEXIST errors from > the update the microcode" ioctl. unless the -v option (which already exists) > is given. > > Note that this isn't fixing any real problem - everything works as it is now, > and everything (from what I can tell) gets updated correctly. The issue is > more cosmetic - the ucode update currently issues error messages for more than > half of my (apparent) cores, which looks ugly during the boot process, and > could lead to people wondering if there is a problem. I think your idea is the best. Thank you for your commit. Another solutions is that the kernel returns 0 instead of EEXIST if the version number is the same as the running microcode's version. I think it's not good because it hides the information that the running micocode is the latest. > When I first installed the package from pgksrc to make this happen (a while > ago now) I immediately disabled it (in rc.conf) so it ran no more, as the > microcode version offered by the package was what my CPU came equipped with, > and every core complained about the version not actually being updated. > > pkgsrc was updated to a newer version, with newer microcode, I installed > that, and enabled it again - and now I'm stuck with the ugly errors. > > Perhaps someone has a better solution than this, perhaps checking which > microcode version is installed on each core, and not attemmpting to install > the update if it is the same version, but that's beyond my knowledge. > It also seems likely to me that it is simpler to just install anyway, and > ignore the "already there" error if it happens. Probably. I agree with you. Since we do not know what kind of sharing scheme future CPUs will have, I think simple is the best. Thanks. > My proposed patch is appended - it builds, installs, and seems to work > just fine (not too much of a surprise, it is a trivial change). > > Opinions? > > kre > > Index: cpuctl.8 > === > RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.8,v > retrieving revision 1.20 > diff -u -r1.20 cpuctl.8 > --- cpuctl.8 17 May 2019 23:51:35 - 1.20 > +++ cpuctl.8 2 Mar 2023 21:36:06 - > @@ -72,6 +72,10 @@ > .Op Ar file > .Xc > This applies the microcode patch to CPUs. > +Unless > +.Fl v > +was given, errors indicating that the microcode > +already exists on the CPU in question are ignored. > If > .Ar cpu > is not specified or \-1, all CPUs are updated. > Index: cpuctl.c > === > RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.c,v > retrieving revision 1.32 > diff -u -r1.32 cpuctl.c > --- cpuctl.c 1 Feb 2022 10:45:02 - 1.32 > +++ cpuctl.c 2 Mar 2023 21:36:06 - > @@ -247,7 +247,7 @@ > cpuset_destroy(cpuset); > } > error = ioctl(fd, IOC_CPU_UCODE_APPLY, ); > - if (error < 0) { > + if (error < 0 && (verbose || errno != EEXIST)) { > if (uc.fwname[0]) > err(EXIT_FAILURE, "%s", uc.fwname); > else > > -- --- SAITOH Masanobu (msai...@execsw.org msai...@netbsd.org)
Re: proposed cpuctl modification
The `ucode' command in cpuctl.8 may need fixing up: 1. cpuno `-1' to update all CPUs looks like that is for Xen-only (on amd64). 2. cpuno `-2' to update current CPU doesn't get passed to the kernel because cpuset_set() returns EINVAL. -RVP
Re: proposed cpuctl modification
Date:Fri, 03 Mar 2023 14:04:39 +1100 From:matthew green Message-ID: <12620.1677812...@splode.eterna.com.au> | duh. this is user error. Oh. Double duh... "Me too". kre
re: proposed cpuctl modification
matthew green writes: > hmm, someone seems to have broken this recently? on my system > i'm seeing cpu0 and cpu16 both reporting: > >cpu0: SMT ID 0 >cpu16: SMT ID 0 duh. this is user error. it also said: Cannot bind to target CPU. Output may not accurately describe the target. Run as root to allow binding. running as root works fine. .mrg.
Re: proposed cpuctl modification
And a correction, I missed uses of -v in arch/* where it seems to apply only to output from "cpuctl identify", and mostly on aarch64 processors (seems to be very little change on x86, and no changes at all elsewhere, arm (32) sparc sparc64, and definitely nothing on anything else). kre
Re: proposed cpuctl modification
Date:Fri, 03 Mar 2023 09:25:29 +1100 From:matthew green Message-ID: <14071.1677795...@splode.eterna.com.au> | we should do this as well, it should fairly simple. we already | display the relevant info in "cpuctl identify 0" eg: Yes, identify shows all of the relevant info (including ucode version) so all of this can certainly be done - it was just more than I wanted to do to just suppress the annoying error messages... | hmm, someone seems to have broken this recently? Now you point it out, I am seeing that too. | that's odd, but we should handle it saner. My processor is odd - but it is very fast (in some of the cores anyway). | this seems odd to me. verbose should print additional info | but it shouldn't change the actual behaviour here. When -v is given, the code is intended to operate exactly as it did before (it will execute a few more instructions, but the effect should be identical). | eg, if i were to use this with -v on your CPU, i'd want to | have it show it working on cpu0 and cpu4, and EEXIST for | the rest, and this appears that it will exit after cpu1. That is what it has always done. It exits after cpu0 as well. And every other cpu - it only gets asked to update one cpu at a time. In the Intel case, cpuctl is run on each cpu separately, not to attempt to update all of them in one run (I'm not even sure that is possible, or not with the code as it is now). It has never printed anything (with or without -v) for a successful microcode update - before my change, the -v flag appears to do nothing at all, so I wasn't too worried about borrowing it. The kernel logs successful ucode updates however. All the change does, is ignore the EEXIST (and exit 0, not that the exit status seems to matter to anything, at least as used in rc.d) when -v is not given (which rc.d does not do), rather than printing an error and doing exit(1). I don't have an AMD processor, where things are done differently, to test this change on though. kre
re: proposed cpuctl modification
> we just did that).I guess we could look and skip the update on > the hyperthread companion cores (pseudo-cores) but that's not what I am > proposing, partly because I'd have to work out how to do that, but also > because that by itself would only solve half the problem. we should do this as well, it should fairly simple. we already display the relevant info in "cpuctl identify 0" eg: hmm, someone seems to have broken this recently? on my system i'm seeing cpu0 and cpu16 both reporting: cpu0: SMT ID 0 cpu16: SMT ID 0 via cpuctl, though dmesg has them as: cpu0: node 0, package 0, core 0, smt 0 cpu16: node 0, package 0, core 0, smt 1 > My processor also has 8 cores, with no hyperthreading, where it looks as > if internally, there's just 2 sets of ucode, one for the first 4, and one > for the second 4. Updates of the other 3 of each group find the ucode > version already installed, and error out. that's odd, but we should handle it saner. > So, what I am proposing is to have cpuctl simply ignore EEXIST errors from > the update the microcode" ioctl. unless the -v option (which already exists) > is given. i like this idea. > --- cpuctl.c 1 Feb 2022 10:45:02 - 1.32 > +++ cpuctl.c 2 Mar 2023 21:36:06 - > @@ -247,7 +247,7 @@ > cpuset_destroy(cpuset); > } > error = ioctl(fd, IOC_CPU_UCODE_APPLY, ); > - if (error < 0) { > + if (error < 0 && (verbose || errno != EEXIST)) { > if (uc.fwname[0]) > err(EXIT_FAILURE, "%s", uc.fwname); > else this seems odd to me. verbose should print additional info but it shouldn't change the actual behaviour here. eg, if i were to use this with -v on your CPU, i'd want to have it show it working on cpu0 and cpu4, and EEXIST for the rest, and this appears that it will exit after cpu1. feel free to ignore the SMT issue, it's orthogonal to this. thanks! .mrg.