re: proposed cpuctl modification

2023-03-02 Thread matthew green
matthew green writes:
> hmm, someone seems to have broken this recently?  on my system
> i'm seeing cpu0 and cpu16 both reporting:
>
>cpu0: SMT ID 0
>cpu16: SMT ID 0

duh.  this is user error.  it also said:

   Cannot bind to target CPU.  Output may not accurately describe the target.
   Run as root to allow binding.

running as root works fine.


.mrg.


Re: proposed cpuctl modification

2023-03-02 Thread Robert Elz
And a correction, I missed uses of -v in arch/* where it seems to apply
only to output from "cpuctl identify", and mostly on aarch64 processors
(seems to be very little change on x86, and no changes at all elsewhere,
arm (32) sparc sparc64, and definitely nothing on anything else).

kre




Re: proposed cpuctl modification

2023-03-02 Thread Robert Elz
Date:Fri, 03 Mar 2023 09:25:29 +1100
From:matthew green 
Message-ID:  <14071.1677795...@splode.eterna.com.au>

  | we should do this as well, it should fairly simple.  we already
  | display the relevant info in "cpuctl identify 0" eg:

Yes, identify shows all of the relevant info (including ucode version)
so all of this can certainly be done - it was just more than I wanted
to do to just suppress the annoying error messages...

  | hmm, someone seems to have broken this recently?

Now you point it out, I am seeing that too.

  | that's odd, but we should handle it saner.

My processor is odd - but it is very fast (in some of the cores anyway).

  | this seems odd to me.  verbose should print additional info
  | but it shouldn't change the actual behaviour here.

When -v is given, the code is intended to operate exactly as it did
before (it will execute a few more instructions, but the effect should
be identical).

  | eg, if i were to use this with -v on your CPU, i'd want to
  | have it show it working on cpu0 and cpu4, and EEXIST for
  | the rest, and this appears that it will exit after cpu1.

That is what it has always done.   It exits after cpu0 as well.
And every other cpu - it only gets asked to update one cpu at a time.
In the Intel case, cpuctl is run on each cpu separately, not to attempt
to update all of them in one run (I'm not even sure that is possible,
or not with the code as it is now).

It has never printed anything (with or without -v) for a
successful microcode update - before my change, the -v flag
appears to do nothing at all, so I wasn't too worried about
borrowing it.   The kernel logs successful ucode updates however.

All the change does, is ignore the EEXIST (and exit 0, not that the
exit status seems to matter to anything, at least as used in rc.d)
when -v is not given (which rc.d does not do), rather than printing
an error and doing exit(1).

I don't have an AMD processor, where things are done differently,
to test this change on though.

kre



re: proposed cpuctl modification

2023-03-02 Thread matthew green
> we just did that).I guess we could look and skip the update on
> the hyperthread companion cores (pseudo-cores) but that's not what I am
> proposing, partly because I'd have to work out how to do that, but also
> because that by itself would only solve half the problem.

we should do this as well, it should fairly simple.  we already
display the relevant info in "cpuctl identify 0" eg:

hmm, someone seems to have broken this recently?  on my system
i'm seeing cpu0 and cpu16 both reporting:

   cpu0: SMT ID 0
   cpu16: SMT ID 0

via cpuctl, though dmesg has them as:

   cpu0: node 0, package 0, core 0, smt 0
   cpu16: node 0, package 0, core 0, smt 1

> My processor also has 8 cores, with no hyperthreading, where it looks as
> if internally, there's just 2 sets of ucode, one for the first 4, and one
> for the second 4.   Updates of the other 3 of each group find the ucode
> version already installed, and error out.

that's odd, but we should handle it saner.

> So, what I am proposing is to have cpuctl simply ignore EEXIST errors from
> the update the microcode" ioctl. unless the -v option (which already exists)
> is given.

i like this idea.

> --- cpuctl.c  1 Feb 2022 10:45:02 -   1.32
> +++ cpuctl.c  2 Mar 2023 21:36:06 -
> @@ -247,7 +247,7 @@
>   cpuset_destroy(cpuset);
>   }
>   error = ioctl(fd, IOC_CPU_UCODE_APPLY, );
> - if (error < 0) {
> + if (error < 0 && (verbose || errno != EEXIST)) {
>   if (uc.fwname[0])
>   err(EXIT_FAILURE, "%s", uc.fwname);
>   else

this seems odd to me.  verbose should print additional info
but it shouldn't change the actual behaviour here.

eg, if i were to use this with -v on your CPU, i'd want to
have it show it working on cpu0 and cpu4, and EEXIST for
the rest, and this appears that it will exit after cpu1.

feel free to ignore the SMT issue, it's orthogonal to this.

thanks!


.mrg.


proposed cpuctl modification

2023-03-02 Thread Robert Elz
This message is about a proposed userland modification, but it seems
more kernelish to me, hence I am asking here on tech-kern, rather than
on tech-userlevel

When my system boots (intel cpu) it runs the intel-microcode (from pkgsrc)
microcode update.

Since it is an Intel cpu, that means running the ioctl to perform the
microcode update on every core.   The cpu has hyperthreading, and while
I sometimes am inclined to turn that off, I haven't so far.

Naturally, as they're really the same core, while the base core ucode
update works fine, the hyperthread companion always fails, as there the
microcode is already up to the expected version (surprise surprise, since
we just did that).I guess we could look and skip the update on
the hyperthread companion cores (pseudo-cores) but that's not what I am
proposing, partly because I'd have to work out how to do that, but also
because that by itself would only solve half the problem.

My processor also has 8 cores, with no hyperthreading, where it looks as
if internally, there's just 2 sets of ucode, one for the first 4, and one
for the second 4.   Updates of the other 3 of each group find the ucode
version already installed, and error out.

So, what I am proposing is to have cpuctl simply ignore EEXIST errors from
the update the microcode" ioctl. unless the -v option (which already exists)
is given.

Note that this isn't fixing any real problem - everything works as it is now,
and everything (from what I can tell) gets updated correctly.   The issue is
more cosmetic - the ucode update currently issues error messages for more than
half of my (apparent) cores, which looks ugly during the boot process, and
could lead to people wondering if there is a problem.

When I first installed the package from pgksrc to make this happen (a while
ago now) I immediately disabled it (in rc.conf) so it ran no more, as the
microcode version offered by the package was what my CPU came equipped with,
and every core complained about the version not actually being updated.

pkgsrc was updated to a newer version, with newer microcode, I installed
that, and enabled it again - and now I'm stuck with the ugly errors.

Perhaps someone has a better solution than this, perhaps checking which
microcode version is installed on each core, and not attemmpting to install
the update if it is the same version, but that's beyond my knowledge.
It also seems likely to me that it is simpler to just install anyway, and
ignore the "already there" error if it happens.   Probably.

My proposed patch is appended - it builds, installs, and seems to work
just fine (not too much of a surprise, it is a trivial change).

Opinions?

kre

Index: cpuctl.8
===
RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.8,v
retrieving revision 1.20
diff -u -r1.20 cpuctl.8
--- cpuctl.817 May 2019 23:51:35 -  1.20
+++ cpuctl.82 Mar 2023 21:36:06 -
@@ -72,6 +72,10 @@
 .Op Ar file
 .Xc
 This applies the microcode patch to CPUs.
+Unless
+.Fl v
+was given, errors indicating that the microcode
+already exists on the CPU in question are ignored.
 If
 .Ar cpu
 is not specified or \-1, all CPUs are updated.
Index: cpuctl.c
===
RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.c,v
retrieving revision 1.32
diff -u -r1.32 cpuctl.c
--- cpuctl.c1 Feb 2022 10:45:02 -   1.32
+++ cpuctl.c2 Mar 2023 21:36:06 -
@@ -247,7 +247,7 @@
cpuset_destroy(cpuset);
}
error = ioctl(fd, IOC_CPU_UCODE_APPLY, );
-   if (error < 0) {
+   if (error < 0 && (verbose || errno != EEXIST)) {
if (uc.fwname[0])
err(EXIT_FAILURE, "%s", uc.fwname);
else