Re: proposed cpuctl modification

2023-03-10 Thread Masanobu SAITOH



On 2023/03/09 19:21, Robert Elz wrote:
> Date:Thu, 9 Mar 2023 16:21:53 +0900
> From:Masanobu SAITOH 
> Message-ID:  <38ae66bd-1b37-c0ef-5a43-52e0c0a2a...@execsw.org>
> 
>   | Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it
>   | has 4 E-cores. Those 4 cores share one microcode image.
> 
> Mine is an i9-12900KS which has 8 of them (2 groups of 4).
> 
> Thanks for the confirmation, that is what looked to be happening, but
> I was just guessing from what I observed.  I just use intel processors
> (and others on occasion) I don't even pretend to understand them.
> 
>   | I think your idea is the best. Thank you for your commit.
> 
> No problem.  It was not a difficult change to make!
> 
>   | Another solutions is that the kernel returns 0 instead of EEXIST if the
>   | version number is the same as the running microcode's version.
> 
> Yes, I considered that one as well, but as you indicate, doing that just
> loses information, and gains nothing - the same number of sys calls (ioctl's)
> would be performed, all that would be saved is the check to see if the
> error is EEXIST when that happens (ie: peanuts).
> 
> kre
> 
> ps: do your E-cores ever just turn themselves off?   On mine, occasionally,
> and for no reason I can fathom, the BIOS reports there are none of them.
> (and NetBSD doesn't see them either).

I've never seen such problem on my machine.

> They come back after a power cycle.
> This is probably a BIOS issue, but ?

It might be...

> 

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: proposed cpuctl modification

2023-03-09 Thread Robert Elz
Date:Thu, 9 Mar 2023 16:21:53 +0900
From:Masanobu SAITOH 
Message-ID:  <38ae66bd-1b37-c0ef-5a43-52e0c0a2a...@execsw.org>

  | Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it
  | has 4 E-cores. Those 4 cores share one microcode image.

Mine is an i9-12900KS which has 8 of them (2 groups of 4).

Thanks for the confirmation, that is what looked to be happening, but
I was just guessing from what I observed.  I just use intel processors
(and others on occasion) I don't even pretend to understand them.

  | I think your idea is the best. Thank you for your commit.

No problem.  It was not a difficult change to make!

  | Another solutions is that the kernel returns 0 instead of EEXIST if the
  | version number is the same as the running microcode's version.

Yes, I considered that one as well, but as you indicate, doing that just
loses information, and gains nothing - the same number of sys calls (ioctl's)
would be performed, all that would be saved is the check to see if the
error is EEXIST when that happens (ie: peanuts).

kre

ps: do your E-cores ever just turn themselves off?   On mine, occasionally,
and for no reason I can fathom, the BIOS reports there are none of them.
(and NetBSD doesn't see them either). They come back after a power cycle.
This is probably a BIOS issue, but ?




Re: proposed cpuctl modification

2023-03-08 Thread Masanobu SAITOH
Hello.

I'm sorry for the late of this reply. I was busy for a few weeks.

On 2023/03/03 6:59, Robert Elz wrote:
> This message is about a proposed userland modification, but it seems
> more kernelish to me, hence I am asking here on tech-kern, rather than
> on tech-userlevel
> 
> When my system boots (intel cpu) it runs the intel-microcode (from pkgsrc)
> microcode update.
> 
> Since it is an Intel cpu, that means running the ioctl to perform the
> microcode update on every core.   The cpu has hyperthreading, and while
> I sometimes am inclined to turn that off, I haven't so far.
> 
> Naturally, as they're really the same core, while the base core ucode
> update works fine, the hyperthread companion always fails, as there the
> microcode is already up to the expected version (surprise surprise, since
> we just did that).I guess we could look and skip the update on
> the hyperthread companion cores (pseudo-cores) but that's not what I am
> proposing, partly because I'd have to work out how to do that, but also
> because that by itself would only solve half the problem.
> 
> My processor also has 8 cores, with no hyperthreading, where it looks as
> if internally, there's just 2 sets of ucode, one for the first 4, and one
> for the second 4.

Alder Lake-N? 4 E-cores share one microcode image. I have i7-12700 and it
has 4 E-cores. Those 4 cores share one microcode image.

>   Updates of the other 3 of each group find the ucode
> version already installed, and error out.
> 
> So, what I am proposing is to have cpuctl simply ignore EEXIST errors from
> the update the microcode" ioctl. unless the -v option (which already exists)
> is given.
> 
> Note that this isn't fixing any real problem - everything works as it is now,
> and everything (from what I can tell) gets updated correctly.   The issue is
> more cosmetic - the ucode update currently issues error messages for more than
> half of my (apparent) cores, which looks ugly during the boot process, and
> could lead to people wondering if there is a problem.

I think your idea is the best. Thank you for your commit.

Another solutions is that the kernel returns 0 instead of EEXIST if the
version number is the same as the running microcode's version. I think it's not
good because it hides the information that the running micocode is the latest.

> When I first installed the package from pgksrc to make this happen (a while
> ago now) I immediately disabled it (in rc.conf) so it ran no more, as the
> microcode version offered by the package was what my CPU came equipped with,
> and every core complained about the version not actually being updated.
> 
> pkgsrc was updated to a newer version, with newer microcode, I installed
> that, and enabled it again - and now I'm stuck with the ugly errors.
> 
> Perhaps someone has a better solution than this, perhaps checking which
> microcode version is installed on each core, and not attemmpting to install
> the update if it is the same version, but that's beyond my knowledge.
> It also seems likely to me that it is simpler to just install anyway, and
> ignore the "already there" error if it happens.   Probably.

I agree with you. Since we do not know what kind of sharing scheme future
CPUs will have, I think simple is the best.

 Thanks.


> My proposed patch is appended - it builds, installs, and seems to work
> just fine (not too much of a surprise, it is a trivial change).
> 
> Opinions?
> 
> kre
> 
> Index: cpuctl.8
> ===
> RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.8,v
> retrieving revision 1.20
> diff -u -r1.20 cpuctl.8
> --- cpuctl.8  17 May 2019 23:51:35 -  1.20
> +++ cpuctl.8  2 Mar 2023 21:36:06 -
> @@ -72,6 +72,10 @@
>  .Op Ar file
>  .Xc
>  This applies the microcode patch to CPUs.
> +Unless
> +.Fl v
> +was given, errors indicating that the microcode
> +already exists on the CPU in question are ignored.
>  If
>  .Ar cpu
>  is not specified or \-1, all CPUs are updated.
> Index: cpuctl.c
> ===
> RCS file: /cvsroot/src/usr.sbin/cpuctl/cpuctl.c,v
> retrieving revision 1.32
> diff -u -r1.32 cpuctl.c
> --- cpuctl.c  1 Feb 2022 10:45:02 -   1.32
> +++ cpuctl.c  2 Mar 2023 21:36:06 -
> @@ -247,7 +247,7 @@
>   cpuset_destroy(cpuset);
>   }
>   error = ioctl(fd, IOC_CPU_UCODE_APPLY, );
> - if (error < 0) {
> + if (error < 0 && (verbose || errno != EEXIST)) {
>   if (uc.fwname[0])
>   err(EXIT_FAILURE, "%s", uc.fwname);
>   else
> 
> 

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: proposed cpuctl modification

2023-03-03 Thread RVP

The `ucode' command in cpuctl.8 may need fixing up:

1. cpuno `-1' to update all CPUs looks like that is for Xen-only (on
amd64).

2. cpuno `-2' to update current CPU doesn't get passed to the kernel
   because cpuset_set() returns EINVAL.

-RVP


Re: proposed cpuctl modification

2023-03-03 Thread Robert Elz
Date:Fri, 03 Mar 2023 14:04:39 +1100
From:matthew green 
Message-ID:  <12620.1677812...@splode.eterna.com.au>

  | duh.  this is user error.

Oh.   Double duh...   "Me too".

kre



re: proposed cpuctl modification

2023-03-02 Thread matthew green
matthew green writes:
> hmm, someone seems to have broken this recently?  on my system
> i'm seeing cpu0 and cpu16 both reporting:
>
>cpu0: SMT ID 0
>cpu16: SMT ID 0

duh.  this is user error.  it also said:

   Cannot bind to target CPU.  Output may not accurately describe the target.
   Run as root to allow binding.

running as root works fine.


.mrg.


Re: proposed cpuctl modification

2023-03-02 Thread Robert Elz
And a correction, I missed uses of -v in arch/* where it seems to apply
only to output from "cpuctl identify", and mostly on aarch64 processors
(seems to be very little change on x86, and no changes at all elsewhere,
arm (32) sparc sparc64, and definitely nothing on anything else).

kre




Re: proposed cpuctl modification

2023-03-02 Thread Robert Elz
Date:Fri, 03 Mar 2023 09:25:29 +1100
From:matthew green 
Message-ID:  <14071.1677795...@splode.eterna.com.au>

  | we should do this as well, it should fairly simple.  we already
  | display the relevant info in "cpuctl identify 0" eg:

Yes, identify shows all of the relevant info (including ucode version)
so all of this can certainly be done - it was just more than I wanted
to do to just suppress the annoying error messages...

  | hmm, someone seems to have broken this recently?

Now you point it out, I am seeing that too.

  | that's odd, but we should handle it saner.

My processor is odd - but it is very fast (in some of the cores anyway).

  | this seems odd to me.  verbose should print additional info
  | but it shouldn't change the actual behaviour here.

When -v is given, the code is intended to operate exactly as it did
before (it will execute a few more instructions, but the effect should
be identical).

  | eg, if i were to use this with -v on your CPU, i'd want to
  | have it show it working on cpu0 and cpu4, and EEXIST for
  | the rest, and this appears that it will exit after cpu1.

That is what it has always done.   It exits after cpu0 as well.
And every other cpu - it only gets asked to update one cpu at a time.
In the Intel case, cpuctl is run on each cpu separately, not to attempt
to update all of them in one run (I'm not even sure that is possible,
or not with the code as it is now).

It has never printed anything (with or without -v) for a
successful microcode update - before my change, the -v flag
appears to do nothing at all, so I wasn't too worried about
borrowing it.   The kernel logs successful ucode updates however.

All the change does, is ignore the EEXIST (and exit 0, not that the
exit status seems to matter to anything, at least as used in rc.d)
when -v is not given (which rc.d does not do), rather than printing
an error and doing exit(1).

I don't have an AMD processor, where things are done differently,
to test this change on though.

kre



re: proposed cpuctl modification

2023-03-02 Thread matthew green
> we just did that).I guess we could look and skip the update on
> the hyperthread companion cores (pseudo-cores) but that's not what I am
> proposing, partly because I'd have to work out how to do that, but also
> because that by itself would only solve half the problem.

we should do this as well, it should fairly simple.  we already
display the relevant info in "cpuctl identify 0" eg:

hmm, someone seems to have broken this recently?  on my system
i'm seeing cpu0 and cpu16 both reporting:

   cpu0: SMT ID 0
   cpu16: SMT ID 0

via cpuctl, though dmesg has them as:

   cpu0: node 0, package 0, core 0, smt 0
   cpu16: node 0, package 0, core 0, smt 1

> My processor also has 8 cores, with no hyperthreading, where it looks as
> if internally, there's just 2 sets of ucode, one for the first 4, and one
> for the second 4.   Updates of the other 3 of each group find the ucode
> version already installed, and error out.

that's odd, but we should handle it saner.

> So, what I am proposing is to have cpuctl simply ignore EEXIST errors from
> the update the microcode" ioctl. unless the -v option (which already exists)
> is given.

i like this idea.

> --- cpuctl.c  1 Feb 2022 10:45:02 -   1.32
> +++ cpuctl.c  2 Mar 2023 21:36:06 -
> @@ -247,7 +247,7 @@
>   cpuset_destroy(cpuset);
>   }
>   error = ioctl(fd, IOC_CPU_UCODE_APPLY, );
> - if (error < 0) {
> + if (error < 0 && (verbose || errno != EEXIST)) {
>   if (uc.fwname[0])
>   err(EXIT_FAILURE, "%s", uc.fwname);
>   else

this seems odd to me.  verbose should print additional info
but it shouldn't change the actual behaviour here.

eg, if i were to use this with -v on your CPU, i'd want to
have it show it working on cpu0 and cpu4, and EEXIST for
the rest, and this appears that it will exit after cpu1.

feel free to ignore the SMT issue, it's orthogonal to this.

thanks!


.mrg.