On Mon, 7 Sep 2020 17:17:52 +0200 Laszlo Ersek <ler...@redhat.com> wrote:
> Hi Igor,
> 
> On 09/07/20 13:23, Igor Mammedov wrote:
> > In case firmware has negotiated CPU hotplug SMI feature, generate
> > AML to describe SMI IO port region and send SMI to firmware
> > on each CPU hotplug SCI in case new CPUs were hotplugged.
> > 
> > Since new CPUs can be hotplugged while CPU_SCAN_METHOD is running
> > we can't send SMI before new CPUs are fetched from QEMU as it
> > could cause sending Notify to a CPU that firmware hasn't seen yet.
> > So fetch new CPUs into local cache first, then send SMI and
> > after that send Notify events to cached CPUs. This should ensure
> > that Notify is sent only to CPUs which were processed by firmware
> > first.
> > Any CPUs that were hotplugged after caching will be processed
> > by the next CPU_SCAN_METHOD, when pending SCI is handled.
> > 
> > Signed-off-by: Igor Mammedov <imamm...@redhat.com>
> > ---
> > v5:
> >   - fix hotplug on Windows when there is more than 256 possible CPUs
> >     (Windows isn't able to handle VarPackage over 255 elements
> >      so process CPUs in batches)
> >   - (Laszlo Ersek <ler...@redhat.com>)
> >       - fix off-by-one in package length
> >       - fix not selecting CPU before clearing insert event
> >       - use aml_lgreater() instead of aml_lnot(aml_equal(num_added_cpus, zero)
> >       - split off in separate patches:
> >           - making smi_negotiated_features a property
> >           - introduction of AcpiPmInfo.smi_on_cpuhp
> >           - introduction of PCI0.SMI0 ACPI device
> > v2:
> >   - clear insert event after firmware has returned
> >     control from SMI. (Laszlo Ersek <ler...@redhat.com>)
> > v1:
> >   - make sure that Notify is sent only to CPUs seen by FW
> >   - (Laszlo Ersek <ler...@redhat.com>)
> >       - use macro instead of x-smi-negotiated-features
> >       - style fixes
> >       - reserve whole SMI block (0xB2, 0xB3)
> > v0:
> >   - s/aml_string/aml_eisaid/
> >     /fixes Windows BSOD, on nonsense value/ (Laszlo Ersek <ler...@redhat.com>)
> >   - put SMI device under PCI0 like the rest of hotplug IO port
> >   - do not generate SMI AML if CPU hotplug SMI feature hasn't been
> >     negotiated
> > ---
> >  hw/acpi/cpu.c | 156 +++++++++++++++++++++++++++++++++++++++++---------
> >  1 file changed, 129 insertions(+), 27 deletions(-)
> > 
> > diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> > index 3d6a500fb7..1283972001 100644
> > --- a/hw/acpi/cpu.c
> > +++ b/hw/acpi/cpu.c
> > @@ -14,6 +14,8 @@
> >  #define ACPI_CPU_CMD_DATA_OFFSET_RW 8
> >  #define ACPI_CPU_CMD_DATA2_OFFSET_R 0
> >  
> > +#define OVMF_CPUHP_SMI_CMD 4
> > +
> >  enum {
> >      CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
> >      CPHP_OST_EVENT_CMD = 1,
> > @@ -321,6 +323,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
> >  #define CPU_NOTIFY_METHOD "CTFY"
> >  #define CPU_EJECT_METHOD  "CEJ0"
> >  #define CPU_OST_METHOD    "COST"
> > +#define CPU_ADDED_LIST    "CNEW"
> >  
> >  #define CPU_ENABLED       "CPEN"
> >  #define CPU_SELECTOR      "CSEL"
> > @@ -465,42 +468,141 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
> >  
> >      method = aml_method(CPU_SCAN_METHOD, 0, AML_SERIALIZED);
> >      {
> > +        const uint8_t max_cpus_per_pass = 255;
> >          Aml *else_ctx;
> > -        Aml *while_ctx;
> > +        Aml *while_ctx, *while_ctx2;
> >          Aml *has_event = aml_local(0);
> >          Aml *dev_chk = aml_int(1);
> >          Aml *eject_req = aml_int(3);
> >          Aml *next_cpu_cmd = aml_int(CPHP_GET_NEXT_CPU_WITH_EVENT_CMD);
> > +        Aml *num_added_cpus = aml_local(1);
> > +        Aml *cpu_idx = aml_local(2);
> > +        Aml *uid = aml_local(3);
> > +        Aml *has_job = aml_local(4);
> > +        Aml *new_cpus = aml_name(CPU_ADDED_LIST);
> >  
> >          aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
> > -        aml_append(method, aml_store(one, has_event));
> > -        while_ctx = aml_while(aml_equal(has_event, one));
> > +
> > +        /*
> > +         * Windows versions newer than XP (including Windows 10/Windows
> > +         * Server 2019), do support* VarPackageOp but, it is cripled to hold
> > +         * the same elements number as old PackageOp.
> > +         * For compatibility with Windows XP (so it won't crash) use ACPI1.0
> > +         * PackageOp which can hold max 255 elements.
> > +         *
> > +         * use named package as old Windows don't support it in local var
> > +         */
> > +        aml_append(method, aml_name_decl(CPU_ADDED_LIST,
> > +                                         aml_package(max_cpus_per_pass)));
> > +
> > +        aml_append(method, aml_store(zero, uid));
> > +        aml_append(method, aml_store(one, has_job));
> > +        /*
> > +         * CPU_ADDED_LIST can hold limited number of elements, outer loop
> > +         * allows to process CPUs in batches which let us to handle more
> > +         * CPUs than CPU_ADDED_LIST can hold.
> > +         */
> > +        while_ctx2 = aml_while(aml_equal(has_job, one));
> >          {
> > -            /* clear loop exit condition, ins_evt/rm_evt checks
> > -             * will set it to 1 while next_cpu_cmd returns a CPU
> > -             * with events */
> > -            aml_append(while_ctx, aml_store(zero, has_event));
> > -            aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd));
> > -            ifctx = aml_if(aml_equal(ins_evt, one));
> > -            {
> > -                aml_append(ifctx,
> > -                           aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
> > -                aml_append(ifctx, aml_store(one, ins_evt));
> > -                aml_append(ifctx, aml_store(one, has_event));
> > -            }
> > -            aml_append(while_ctx, ifctx);
> > -            else_ctx = aml_else();
> > -            ifctx = aml_if(aml_equal(rm_evt, one));
> > -            {
> > -                aml_append(ifctx,
> > -                           aml_call2(CPU_NOTIFY_METHOD, cpu_data, eject_req));
> > -                aml_append(ifctx, aml_store(one, rm_evt));
> > -                aml_append(ifctx, aml_store(one, has_event));
> > -            }
> > -            aml_append(else_ctx, ifctx);
> > -            aml_append(while_ctx, else_ctx);
> > +            aml_append(while_ctx2, aml_store(zero, has_job));
> > +
> > +            aml_append(while_ctx2, aml_store(one, has_event));
> > +            aml_append(while_ctx2, aml_store(zero, num_added_cpus));
> > +
> > +            /*
> > +             * Scan CPUs, till there are CPUs with events or
> > +             * CPU_ADDED_LIST capacity is exhausted
> > +             */
> > +            while_ctx = aml_while(aml_land(aml_equal(has_event, one),
> > +                                  aml_lless(uid, aml_int(arch_ids->len))));
> > +            {
> > +                /*
> > +                 * clear loop exit condition, ins_evt/rm_evt checks will
> > +                 * set it to 1 while next_cpu_cmd returns a CPU with events
> > +                 */
> > +                aml_append(while_ctx, aml_store(zero, has_event));
> > +
> > +                aml_append(while_ctx, aml_store(uid, cpu_selector));
> > +                aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd));
> > +
> > +                /*
> > +                 * wrap around case, scan is complete, exit loop.
> > +                 * It happens since events are not cleared in scan loop,
> > +                 * so next_cpu_cmd continues to find already processed CPUs
> > +                 */
> > +                ifctx = aml_if(aml_lless(cpu_data, uid));
> > +                {
> > +                    aml_append(ifctx, aml_break());
> > +                }
> > +                aml_append(while_ctx, ifctx);
> > +
> > +                /*
> > +                 * if CPU_ADDED_LIST is full, exit inner loop and process
> > +                 * collected CPUs
> > +                 */
> > +                ifctx = aml_if(
> > +                    aml_equal(num_added_cpus, aml_int(max_cpus_per_pass)));
> > +                {
> > +                    aml_append(ifctx, aml_store(one, has_job));
> > +                    aml_append(ifctx, aml_break());
> > +                }
> > +                aml_append(while_ctx, ifctx);
> > +
> > +                aml_append(while_ctx, aml_store(cpu_data, uid));
> > +                ifctx = aml_if(aml_equal(ins_evt, one));
> > +                {
> > +                    /* cache added CPUs to Notify/Wakeup later */
> > +                    aml_append(ifctx, aml_store(uid,
> > +                        aml_index(new_cpus, num_added_cpus)));
> > +                    aml_append(ifctx, aml_increment(num_added_cpus));
> > +                    aml_append(ifctx, aml_store(one, has_event));
> > +                }
> > +                aml_append(while_ctx, ifctx);
> > +                else_ctx = aml_else();
> > +                ifctx = aml_if(aml_equal(rm_evt, one));
> > +                {
> > +                    aml_append(ifctx,
> > +                        aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
> > +                    aml_append(ifctx, aml_store(one, rm_evt));
> > +                    aml_append(ifctx, aml_store(one, has_event));
> > +                }
> > +                aml_append(else_ctx, ifctx);
> > +                aml_append(while_ctx, else_ctx);
> > +                aml_append(while_ctx, aml_increment(uid));
> > +            }
> > +            aml_append(while_ctx2, while_ctx);
> > +
> > +            /*
> > +             * in case FW negotiated ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT,
> > +             * make upcall to FW, so it can pull in new CPUs before
> > +             * OS is notified and wakes them up
> > +             */
> > +            if (opts.smi_path) {
> > +                ifctx = aml_if(aml_lgreater(num_added_cpus, zero));
> > +                {
> > +                    aml_append(ifctx, aml_store(aml_int(OVMF_CPUHP_SMI_CMD),
> > +                        aml_name("%s", opts.smi_path)));
> > +                }
> > +                aml_append(while_ctx2, ifctx);
> > +            }
> > +
> > +            /* Notify OSPM about new CPUs and clear insert events */
> > +            aml_append(while_ctx2, aml_store(zero, cpu_idx));
> > +            while_ctx = aml_while(aml_lless(cpu_idx, num_added_cpus));
> > +            {
> > +                aml_append(while_ctx,
> > +                    aml_store(aml_derefof(aml_index(new_cpus, cpu_idx)),
> > +                              uid));
> > +                aml_append(while_ctx,
> > +                    aml_call2(CPU_NOTIFY_METHOD, uid, dev_chk));
> > +                aml_append(while_ctx, aml_store(uid, aml_debug()));
> > +                aml_append(while_ctx, aml_store(uid, cpu_selector));
> > +                aml_append(while_ctx, aml_store(one, ins_evt));
> > +                aml_append(while_ctx, aml_increment(cpu_idx));
> > +            }
> > +            aml_append(while_ctx2, while_ctx);
> >          }
> > -        aml_append(method, while_ctx);
> > +        aml_append(method, while_ctx2);
> >          aml_append(method, aml_release(ctrl_lock));
> >      }
> >      aml_append(cpus_dev, method);
> > 
> 
> Here's the ASL decompiled, and using the meaningful local variable names
> (and 8 possible VCPUs):
> 
>    1 Method (CSCN, 0, Serialized)
>    2 {
>    3     Acquire (\_SB.PCI0.PRES.CPLK, 0xFFFF)
>    4     Name (CNEW, Package (0xFF){})
>    5     Local_Uid = Zero
>    6     Local_HasJob = One
>    7     While ((Local_HasJob == One))
>    8     {
>    9         Local_HasJob = Zero
>   10         Local_HasEvent = One
>   11         Local_NumAddedCpus = Zero
>   12         While (((Local_HasEvent == One) && (Local_Uid < 0x08)))
>   13         {
>   14             Local_HasEvent = Zero
>   15             \_SB.PCI0.PRES.CSEL = Local_Uid
>   16             \_SB.PCI0.PRES.CCMD = Zero
>   17             If ((\_SB.PCI0.PRES.CDAT < Local_Uid))
>   18             {
>   19                 Break
>   20             }
>   21 
>   22             If ((Local_NumAddedCpus == 0xFF))
>   23             {
>   24                 Local_HasJob = One
>   25                 Break
>   26             }
>   27 
>   28             Local_Uid = \_SB.PCI0.PRES.CDAT
>   29             If ((\_SB.PCI0.PRES.CINS == One))
>   30             {
>   31                 CNEW [Local_NumAddedCpus] = Local_Uid
>   32                 Local_NumAddedCpus++
>   33                 Local_HasEvent = One
>   34             }
>   35             ElseIf ((\_SB.PCI0.PRES.CRMV == One))
>   36             {
>   37                 CTFY (Local_Uid, 0x03)
>   38                 \_SB.PCI0.PRES.CRMV = One
>   39                 Local_HasEvent = One
>   40             }
>   41 
>   42             Local_Uid++
>   43         }
>   44 
>   45         If ((Local_NumAddedCpus > Zero))
>   46         {
>   47             \_SB.PCI0.SMI0.SMIC = 0x04
>   48         }
>   49 
>   50         Local_CpuIdx = Zero
>   51         While ((Local_CpuIdx < Local_NumAddedCpus))
>   52         {
>   53             Local_Uid = DerefOf (CNEW [Local_CpuIdx])
>   54             CTFY (Local_Uid, One)
>   55             Debug = Local_Uid
>   56             \_SB.PCI0.PRES.CSEL = Local_Uid
>   57             \_SB.PCI0.PRES.CINS = One
>   58             Local_CpuIdx++
>   59         }
>   60     }
>   61 
>   62     Release (\_SB.PCI0.PRES.CPLK)
>   63 }
> 
> When we take the Break on line 25, then:
> 
> (a) on line 25, the following equality holds:
> 
>       Local_Uid == CNEW[Local_NumAddedCpus - 1] + 1
> 
> (b) on line 60, the following equality holds:
> 
>       Local_Uid == CNEW[Local_NumAddedCpus - 1]
> 
> This means that, when we write Local_Uid to CSEL on line 15 again, then:
> 
> - we effectively re-investigate the last-cached CPU (with selector value
>   CNEW[Local_NumAddedCpus - 1])
> 
> - rather than resuming the scanning right after it -- that is, with
>   selector value (CNEW[Local_NumAddedCpus - 1] + 1) --, in the spirit of
>   line 42.
> 
> My question is: is this "rewind" intentional?
> 
> Now, I don't see a functionality problem with this, as on line 57, we
> clear the pending insert event for the last-cached CPU, so when we
> re-check it, the "get pending" command will simply seek past it.
> 
> But I'd like to understand if this is *precisely* your intent, or if
> it's an oversight and it just ends up working OK.

It's the latter (it should not have any adverse effects), so I didn't care
much about restarting from the last processed CPU.

How about moving

  22             If ((Local_NumAddedCpus == 0xFF))
  23             {
  24                 Local_HasJob = One
  25                 Break
  26             }

right after

  40             }
  41 
  42             Local_Uid++

instead of adding an extra 'if' at the end of the outer loop?
(a rough sketch of the resulting inner loop is at the end of this mail)

> 
> Basically my question is whether we should add, on top:
> 
> > --- cscn.v5      2020-09-07 15:02:13.401663487 +0200
> > +++ cscn.v5.incr 2020-09-07 16:52:22.133843710 +0200
> > @@ -57,6 +57,10 @@
> >                  \_SB.PCI0.PRES.CINS = One
> >                  Local_CpuIdx++
> >              }
> > +
> > +            if ((Local_HasJob == One)) {
> > +                Local_Uid++
> > +            }
> >          }
> > 
> >          Release (\_SB.PCI0.PRES.CPLK)
> 
> If not -- that is, the currently proposed patch is intentional --, then
> I think we should add a comment, about the effective "rewind" being
> intentional & okay.
> 
> (Note: it's certainly valid and necessary to re-write CSEL on line 15
> after raising the SMI on line 47; the question is not whether we should
> or should not re-write CSEL (we must!), but the specific value that we
> write to CSEL.)
> 
> So:
> 
> - If the outer loop body is entered only once, then the patch looks
>   fine, from both the review side, and the testing side (I tested it
>   with 4-8 possible VCPUs).
> 
> - If the outer loop body is entered twice or more, then from the review
>   side, please see my question above. From the testing side: do you have
>   an environment where I could test this with OVMF?
> 
>   (I expect it to work OK. Upon the first SMI, the firmware will likely
>   pick up more VCPUs than what's in CNEW. But edk2 commit 020bb4b46d6f
>   ("OvmfPkg/CpuHotplugSmm: fix CPU hotplug race just before SMI
>   broadcast", 2020-08-27) should deal with that.)

It should work, as the firmware doesn't have to jump through hoops to
accommodate Windows quirks.
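To spell the "rewind" out with concrete numbers (a hypothetical trace,
pretending max_cpus_per_pass were 3 and CPUs 0..7 got hotplugged at once):

  pass 1, scan loop:    CNEW = {0, 1, 2}, Local_NumAddedCpus == 3;
                        on the next iteration CDAT returns 3, the capacity
                        check on line 22 fires and we Break with
                        Local_Uid == 3, i.e. equality (a)
  pass 1, notify loop:  ends with Local_Uid == CNEW[2] == 2 and the insert
                        event of CPU 2 cleared, i.e. equality (b)
  pass 2, scan loop:    starts with CSEL = 2, so the last-cached CPU 2 is
                        selected again; since its insert event is already
                        clear, the "get next CPU with event" command simply
                        seeks past it to CPU 3

so the rewind only re-selects one already-handled CPU per outer pass, and
nothing gets notified twice.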
> Hmm, actually, there's no need for a special environment: I can patch
> QEMU and lower "max_cpus_per_pass" to something small, such as "3" or
> whatever, for testing the outer loop multiple times. But first I'd like
> to know your thoughts on the "rewind".

Modulo OVMF, I've tested both:
 - hacking max_cpus_per_pass to simulate multiple passes of the outer loop
 - with maxcpus=288 to cross the 255 threshold (2 passes of the outer loop),
   to check that WS2019 is still working.

> Thanks!
> Laszlo
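PS: to illustrate the reordering suggested above (an untested sketch, only
to show the intent), the inner scan loop from your decompiled listing would
become roughly:

      While (((Local_HasEvent == One) && (Local_Uid < 0x08)))
      {
          Local_HasEvent = Zero
          \_SB.PCI0.PRES.CSEL = Local_Uid
          \_SB.PCI0.PRES.CCMD = Zero
          If ((\_SB.PCI0.PRES.CDAT < Local_Uid))
          {
              Break
          }

          Local_Uid = \_SB.PCI0.PRES.CDAT
          If ((\_SB.PCI0.PRES.CINS == One))
          {
              CNEW [Local_NumAddedCpus] = Local_Uid
              Local_NumAddedCpus++
              Local_HasEvent = One
          }
          ElseIf ((\_SB.PCI0.PRES.CRMV == One))
          {
              CTFY (Local_Uid, 0x03)
              \_SB.PCI0.PRES.CRMV = One
              Local_HasEvent = One
          }

          Local_Uid++
          If ((Local_NumAddedCpus == 0xFF))
          {
              Local_HasJob = One
              Break
          }
      }

With the capacity check done after the increment, a Break on a full CNEW
leaves Local_Uid pointing one past the last cached CPU, so the next outer
pass resumes the scan right after it and the "rewind" goes away.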