Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Hi Jiewen,

If a hot-added CPU needs to run any code before the first SMI, I would
recommend that it only execute code from a write-protected FLASH range,
without a stack, and then wait for the first SMI.

For this OVMF use case, is any CPU init required before the first SMI?

From Paolo's list of steps, are steps (8a) and (8b) really required?
Can the SMI monarch use the Local APIC to send a directed SMI to the
hot-added CPU? The SMI monarch needs to know the APIC ID of the
hot-added CPU.

Do we also need to handle the case where multiple CPUs are added at
once? I think we would need to serialize the use of 3000:8000 for the
SMM rebase operation on each hot-added CPU. It would be simpler if we
could guarantee that only one CPU can be added or removed at a time,
and that the complete flow of adding a CPU to SMM and the OS is
finished before another add/remove event is processed.

Mike

> -----Original Message-----
> From: Yao, Jiewen
> Sent: Thursday, August 22, 2019 10:00 PM
> To: Kinney, Michael D; Paolo Bonzini; Laszlo Ersek;
> r...@edk2.groups.io
> Cc: Alex Williamson; de...@edk2.groups.io; qemu devel list
> <de...@nongnu.org>; Igor Mammedov; Chen, Yingwen; Nakajima, Jun;
> Boris Ostrovsky; Joao Marcal Lemos Martins; Phillip Goerl
> Subject: RE: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> Thank you Mike!
>
> That is a good reference on the real hardware behavior.
> (Glad it is public.)
>
> For the threat model, the unique part in a virtual environment is
> temp RAM. The temp RAM on a real platform is per-CPU cache, while the
> temp RAM on a virtual platform is global memory. That brings one more
> potential attack surface in a virtual environment, if the hot-added
> CPU needs to run code with a stack or heap before the SMI rebase.
>
> Other threats, such as SMRAM or DMA, are the same.
>
> Thank you
> Yao Jiewen
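The serialization concern raised above can be illustrated with a toy
model. This is only a sketch under stated assumptions: the names
HotAddedCpu and rebase_one_at_a_time and the TSEG addresses are
hypothetical, and the shared dictionary merely stands in for the single
relocation slot at 3000:8000. It is not OVMF or QEMU code.

```python
DEFAULT_SMBASE = 0x30000  # architectural default; the first SMI vectors to 3000:8000


class HotAddedCpu:
    """Hypothetical model of a hot-added CPU that has not been rebased yet."""

    def __init__(self, apic_id):
        self.apic_id = apic_id
        self.smbase = DEFAULT_SMBASE

    def handle_first_smi(self, shared_slot):
        # The relocation handler at the default SMBASE reads its target value
        # from the one shared location; the new value takes effect on RSM.
        self.smbase = shared_slot["new_smbase"]


def rebase_one_at_a_time(cpus, smbase_for):
    """Monarch loop: fill the shared slot, SMI one CPU, wait, then move on."""
    shared_slot = {}
    for cpu in cpus:
        shared_slot["new_smbase"] = smbase_for[cpu.apic_id]
        cpu.handle_first_smi(shared_slot)  # finishes before the next CPU starts
    return {cpu.apic_id: cpu.smbase for cpu in cpus}
```

Because the shared slot is rewritten only after the previous CPU has
consumed it, each CPU ends up with its own SMBASE. If two CPUs took
their first SMI concurrently, both would read whatever value happened
to be in the slot, which is the hazard that serialization avoids.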
Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Paolo,

I found the following links related to the discussions here, along with
one example feature called GENPROTRANGE.

https://csrc.nist.gov/CSRC/media/Presentations/The-Whole-is-Greater/images-media/day1_trusted-computing_200-250.pdf
https://cansecwest.com/slides/2017/CSW2017_Cuauhtemoc-Rene_CPU_Hot-Add_flow.pdf
https://www.mouser.com/ds/2/612/5520-5500-chipset-ioh-datasheet-1131292.pdf

Best regards,

Mike

> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, August 22, 2019 4:12 PM
> To: Kinney, Michael D; Laszlo Ersek; r...@edk2.groups.io;
> Yao, Jiewen
> Cc: Alex Williamson; de...@edk2.groups.io; qemu devel list
> <de...@nongnu.org>; Igor Mammedov; Chen, Yingwen; Nakajima, Jun;
> Boris Ostrovsky; Joao Marcal Lemos Martins; Phillip Goerl
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> On 23/08/19 00:32, Kinney, Michael D wrote:
> > Paolo,
> >
> > It is my understanding that real HW hot plug uses the SDM defined
> > methods. Meaning the initial SMI is to 3000:8000 and they rebase to
> > TSEG in the first SMI. They must have chipset specific methods to
> > protect 3000:8000 from DMA.
>
> It would be great if you could check.
>
> > Can we add a chipset feature to prevent DMA to 64KB range from
> > 0x30000-0x3FFFF and the UEFI Memory Map and ACPI content can be
> > updated so the Guest OS knows to not use that range for DMA?
>
> If real hardware does it at the chipset level, we will probably use
> Igor's suggestion of aliasing A-seg to 3000:0000. Before starting the
> new CPU, the SMI handler can prepare the SMBASE relocation trampoline
> at A000:8000 and the hot-plugged CPU will find it at 3000:8000 when it
> receives the initial SMI. Because this is backed by RAM at
> 0xA0000-0xAFFFF, DMA cannot access it and would still go through to
> RAM at 0x30000.
>
> Paolo
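The address arithmetic behind the aliasing idea quoted above can be
sketched as follows. The real-mode segment:offset translation is
standard; the resolve() decoder, its cpu_sees_alias flag, and the
constant names are hypothetical and only model the proposal, not any
existing QEMU behavior.

```python
ALIAS_BASE = 0x30000  # 64KB window around the default SMBASE (3000:0000)
ASEG_BASE = 0xA0000   # A-segment SMRAM (A000:0000)
WINDOW = 0x10000      # 64KB


def real_mode_linear(segment, offset):
    """Translate a real-mode segment:offset pair to a linear address."""
    return (segment << 4) + offset


def resolve(addr, cpu_sees_alias):
    """Hypothetical decoder: the hot-plugged CPU's accesses to the 3000:0000
    window are redirected to A-seg; DMA and all other accesses hit plain RAM."""
    if cpu_sees_alias and ALIAS_BASE <= addr < ALIAS_BASE + WINDOW:
        return ASEG_BASE + (addr - ALIAS_BASE)
    return addr


entry = real_mode_linear(0x3000, 0x8000)         # 0x38000
print(hex(resolve(entry, cpu_sees_alias=True)))   # lands in A-seg
print(hex(resolve(entry, cpu_sees_alias=False)))  # DMA view: ordinary RAM
```

In this model the trampoline written at A000:8000 is what the new CPU
fetches at 3000:8000, while a device doing DMA to 0x38000 never reaches
A-seg.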
Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Paolo,

It is my understanding that real HW hot plug uses the SDM defined
methods. Meaning the initial SMI is to 3000:8000 and they rebase to
TSEG in the first SMI. They must have chipset specific methods to
protect 3000:8000 from DMA.

Can we add a chipset feature to prevent DMA to the 64KB range
0x30000-0x3FFFF, with the UEFI Memory Map and ACPI content updated so
the Guest OS knows not to use that range for DMA?

Thanks,

Mike

> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, August 22, 2019 3:18 PM
> To: Kinney, Michael D; Laszlo Ersek; r...@edk2.groups.io;
> Yao, Jiewen
> Cc: Alex Williamson; de...@edk2.groups.io; qemu devel list
> <de...@nongnu.org>; Igor Mammedov; Chen, Yingwen; Nakajima, Jun;
> Boris Ostrovsky; Joao Marcal Lemos Martins; Phillip Goerl
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> On 22/08/19 22:06, Kinney, Michael D wrote:
> > The SMBASE register is internal and cannot be directly accessed by
> > any CPU. There is an SMBASE field that is a member of the SMM Save
> > State area and can only be modified from SMM and requires the
> > execution of an RSM instruction from SMM for the SMBASE register to
> > be updated from the current SMBASE field value. The new SMBASE
> > register value is only used on the next SMI.
>
> Actually there is also an SMBASE MSR, even though in current silicon
> it's read-only and its use is theoretically limited to SMM-transfer
> monitors. If that MSR could be made accessible somehow outside SMM,
> that would be great.
>
> > Once all the CPUs have been initialized for SMM, the CPUs that are
> > not needed can be hot removed. As noted above, the SMBASE value does
> > not change on an INIT. So as long as the hot add operation does not
> > do a RESET, the SMBASE value must be preserved.
>
> IIRC, hot-remove + hot-add will unplug/plug a completely different
> CPU.
>
> > Another idea is to emulate this behavior. The hot plug controller
> > could provide registers (only accessible from SMM) to assign the
> > SMBASE address for every CPU. When a CPU is hot added, QEMU can set
> > the internal SMBASE register value from the hot plug controller
> > register value. If the SMM Monarch sends an INIT or an SMI from the
> > Local APIC to the hot added CPU, then the SMBASE register should not
> > be modified and the CPU starts execution within TSEG the first time
> > it receives an SMI.
>
> Yes, this would work. But again, if the issue is real on current
> hardware too, I'd rather have a matching solution for virtual
> platforms.
>
> If the current hardware for example remembers INIT-preserved state
> across hot-remove/hot-add, we could emulate that.
>
> I guess the fundamental question is: how do bare metal platforms avoid
> this issue, or plan to avoid this issue? Once we know that, we can use
> that information to find a way to implement it in KVM. Only if it is
> impossible will we have a different strategy that is specific to our
> platform.
>
> Paolo
>
> > Jiewen and I can collect specific questions on this topic and
> > continue the discussion here. For example, I do not think there is
> > any method other than what I referenced above to program the SMBASE
> > register, but I can ask if there are any other methods.
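The SMBASE update rules discussed in the quoted text above (the field
is written in SMM, latched on RSM, used on the next SMI, preserved by
INIT, cleared by RESET) can be modelled in a few lines. This is a
minimal sketch; the class SmbaseModel and its method names are
hypothetical and do not correspond to any QEMU or edk2 interface.

```python
DEFAULT_SMBASE = 0x30000


class SmbaseModel:
    """Toy model of one CPU's internal SMBASE register and save state field."""

    def __init__(self):
        self.smbase_register = DEFAULT_SMBASE    # internal, not directly accessible
        self.save_state_smbase = DEFAULT_SMBASE  # SMBASE field in the save state map

    def smi(self):
        # On SMM entry the current register value is saved, and the handler
        # entry point is SMBASE + 8000H.
        self.save_state_smbase = self.smbase_register
        return self.smbase_register + 0x8000

    def write_save_state_smbase(self, value):
        # Only possible from inside SMM; has no effect until RSM.
        self.save_state_smbase = value

    def rsm(self):
        # RSM reloads the internal register from the save state field.
        self.smbase_register = self.save_state_smbase

    def init(self):
        pass  # INIT leaves the internal SMBASE register unchanged

    def reset(self):
        self.smbase_register = DEFAULT_SMBASE  # RESET restores the default
```

With this model, a CPU that was rebased once and then receives only an
INIT still enters SMM at its relocated address, whereas a RESET sends
it back to 3000:8000. That difference is why the hot-remove/hot-add
question matters.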
Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Laszlo,

I believe all the code for the AP startup vector is already in edk2.

It is a combination of the reset vector code in
UefiCpuPkg/ResetVector/Vtf0 and an IA32/X64 specific feature in the
GenFv tool. It sets up a 4KB aligned location near 4GB which can be
used to start an AP using INIT-SIPI-SIPI.

DI is set to 'AP' if the processor is not the BSP. This can be used to
choose to put the APs into a wait loop executing from the protected
FLASH region.

On a hot add event, the SMM Monarch can use the Local APIC to send an
INIT-SIPI-SIPI to wake the AP at the 4KB startup vector in FLASH.
Later, the SMM Monarch can use the Local APIC to send an SMI to pull
the hot added CPU into SMM. It is not clear whether we have to do both
the SIPI followed by the SMI, or whether we can just do the SMI.

Best regards,

Mike

> -----Original Message-----
> From: de...@edk2.groups.io [mailto:de...@edk2.groups.io] On Behalf Of
> Laszlo Ersek
> Sent: Thursday, August 22, 2019 11:29 AM
> To: Paolo Bonzini; Kinney, Michael D; r...@edk2.groups.io;
> Yao, Jiewen
> Cc: Alex Williamson; de...@edk2.groups.io; qemu devel list
> <de...@nongnu.org>; Igor Mammedov; Chen, Yingwen; Nakajima, Jun;
> Boris Ostrovsky; Joao Marcal Lemos Martins; Phillip Goerl
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> On 08/22/19 08:18, Paolo Bonzini wrote:
> > On 21/08/19 22:17, Kinney, Michael D wrote:
> >> Paolo,
> >>
> >> It makes sense to match real HW.
> >
> > Note that it'd also be fine to match some kind of official Intel
> > specification even if no processor (currently?) supports it.
>
> I agree, because...
>
> >> That puts us back to the reset vector and handling the initial SMI
> >> at 3000:8000. That is all workable from a FW implementation
> >> perspective.
>
> ...that would suggest that matching reset vector code already exists,
> and it would "only" need to be upstreamed to edk2. :)
>
> >> It looks like the only issue left is DMA.
> >>
> >> DMA protection of memory ranges is a chipset feature. For the
> >> current QEMU implementation, what ranges of memory are guaranteed
> >> to be protected from DMA? Is it only A/B seg and TSEG?
> >
> > Yes.
>
> (
>
> This thread (esp. Jiewen's and Mike's messages) is the first time that
> I've heard about the *existence* of such RAM ranges / the chipset
> feature. :)
>
> Out of interest (independently of virtualization), how is a general
> purpose OS informed by the firmware, "never try to set up DMA to this
> RAM area"? Is this communicated through ACPI _CRS perhaps?
>
> ... Ah, almost: ACPI 6.2 specifies _DMA, in "6.2.4 _DMA (Direct Memory
> Access)". It writes,
>
>     For example, if a platform implements a PCI bus that cannot access
>     all of physical memory, it has a _DMA object under that PCI bus
>     that describes the ranges of physical memory that can be accessed
>     by devices on that bus.
>
> Sorry about the digression, and also about being late to this thread,
> continually -- I'm primarily following and learning.
>
> )
>
> Thanks!
> Laszlo
Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Paolo,

The SMBASE register is internal and cannot be directly accessed by any
CPU. There is an SMBASE field that is a member of the SMM Save State
area and can only be modified from SMM and requires the execution of an
RSM instruction from SMM for the SMBASE register to be updated from the
current SMBASE field value. The new SMBASE register value is only used
on the next SMI.

https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

Vol 3C - Section 34.11

    The default base address for the SMRAM is 30000H. This value is
    contained in an internal processor register called the SMBASE
    register. The operating system or executive can relocate the SMRAM
    by setting the SMBASE field in the saved state map (at offset
    7EF8H) to a new value (see Figure 34-4). The RSM instruction
    reloads the internal SMBASE register with the value in the SMBASE
    field each time it exits SMM. All subsequent SMI requests will use
    the new SMBASE value to find the starting address for the SMI
    handler (at SMBASE + 8000H) and the SMRAM state save area (from
    SMBASE + FE00H to SMBASE + FFFFH). (The processor resets the value
    in its internal SMBASE register to 30000H on a RESET, but does not
    change it on an INIT.)

One idea to work around these issues is to start up OVMF with the
maximum number of CPUs. All the CPUs will be assigned an SMBASE address
at a time when it is safe to assign the SMBASE values using the initial
3000:8000 SMI vector, because there is a guarantee of no DMA at that
point in the FW init.

Once all the CPUs have been initialized for SMM, the CPUs that are not
needed can be hot removed. As noted above, the SMBASE value does not
change on an INIT. So as long as the hot add operation does not do a
RESET, the SMBASE value must be preserved.

Of course, this is not a good idea from a boot performance perspective,
especially if the max CPUs is a large value.

Another idea is to emulate this behavior. The hot plug controller could
provide registers (only accessible from SMM) to assign the SMBASE
address for every CPU. When a CPU is hot added, QEMU can set the
internal SMBASE register value from the hot plug controller register
value. If the SMM Monarch sends an INIT or an SMI from the Local APIC
to the hot added CPU, then the SMBASE register should not be modified
and the CPU starts execution within TSEG the first time it receives an
SMI.

Jiewen and I can collect specific questions on this topic and continue
the discussion here. For example, I do not think there is any method
other than what I referenced above to program the SMBASE register, but
I can ask if there are any other methods.

Thanks,

Mike

> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Thursday, August 22, 2019 11:43 AM
> To: Laszlo Ersek; Kinney, Michael D; r...@edk2.groups.io;
> Yao, Jiewen
> Cc: Alex Williamson; de...@edk2.groups.io; qemu devel list
> <de...@nongnu.org>; Igor Mammedov; Chen, Yingwen; Nakajima, Jun;
> Boris Ostrovsky; Joao Marcal Lemos Martins; Phillip Goerl
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> On 22/08/19 19:59, Laszlo Ersek wrote:
> > The firmware and QEMU could agree on a formula, which would compute
> > the CPU-specific SMBASE from a value pre-programmed by the firmware,
> > and the initial APIC ID of the hot-added CPU.
> >
> > Yes, it would duplicate code -- the calculation -- between QEMU and
> > edk2. While that's not optimal, it wouldn't be a first.
>
> No, that would be unmaintainable. The best solution to me seems to be
> to make SMBASE programmable from non-SMM code if some special
> conditions hold. Michael, would it be possible to get in contact with
> the Intel architects?
>
> Paolo
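The SMRAM layout from the SDM excerpt in Mike's message above reduces
to a handful of additions. The sketch below only restates the offsets
quoted there (8000H, FE00H, FFFFH, and the SMBASE field at offset 7EF8H
in the saved state map, taken here as relative to SMBASE + 8000H). The
function name smram_layout is hypothetical.

```python
def smram_layout(smbase):
    """Return the key SMRAM addresses for a given SMBASE value."""
    return {
        "handler_entry": smbase + 0x8000,          # first instruction run on SMI
        "save_state_start": smbase + 0xFE00,       # start of the state save area
        "save_state_end": smbase + 0xFFFF,         # last byte of the state save area
        "smbase_field": smbase + 0x8000 + 0x7EF8,  # SMBASE slot in the saved state map
    }


for name, addr in smram_layout(0x30000).items():
    print(f"{name}: {addr:#x}")
```

For the default SMBASE of 30000H, the handler entry is 0x38000, which
is the 3000:8000 address used throughout this thread. Relocating into
TSEG changes only the smbase argument.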
Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Paolo,

It makes sense to match real HW. That puts us back to the reset vector
and handling the initial SMI at 3000:8000. That is all workable from a
FW implementation perspective. It looks like the only issue left is
DMA.

DMA protection of memory ranges is a chipset feature. For the current
QEMU implementation, what ranges of memory are guaranteed to be
protected from DMA? Is it only A/B seg and TSEG?

Thanks,

Mike

> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Wednesday, August 21, 2019 10:40 AM
> To: Kinney, Michael D; r...@edk2.groups.io; Yao, Jiewen
> Cc: Alex Williamson; Laszlo Ersek; de...@edk2.groups.io; qemu devel
> list; Igor Mammedov; Chen, Yingwen; Nakajima, Jun; Boris Ostrovsky;
> Joao Marcal Lemos Martins; Phillip Goerl
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with
> QEMU+OVMF
>
> On 21/08/19 19:25, Kinney, Michael D wrote:
> > Could we have an initial SMBASE that is within TSEG?
> >
> > If we bring in hot plug CPUs one at a time, then initial SMBASE in
> > TSEG can reprogram the SMBASE to the correct value for that CPU.
> >
> > Can we add a register to the hot plug controller that allows the BSP
> > to set the initial SMBASE value for a hot added CPU? The default can
> > be 3000:8000 for compatibility.
> >
> > Another idea is when the SMI handler runs for a hot add CPU event,
> > the SMM monarch programs the hot plug controller register with the
> > SMBASE to use for the CPU that is being added. As each CPU is added,
> > a different SMBASE value can be programmed by the SMM Monarch.
>
> Yes, all of these would work. Again, I'm interested in having
> something that has a hope of being implemented in real hardware.
>
> Another, far easier to implement possibility could be a lockable MSR
> (could be the existing MSR_SMM_FEATURE_CONTROL) that allows
> programming the SMBASE outside SMM. It would be nice if such a bit
> could be defined by Intel.
>
> Paolo
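The idea in the quoted text above of programming a different SMBASE
for each added CPU can be sketched as a simple allocator. Everything
here is an assumption made for illustration: the function
assign_smbase, the 64KB stride, and the example TSEG base and size are
hypothetical. The sketch conservatively keeps the whole 64KB SMBASE
window inside TSEG, which real firmware may not need to do.

```python
SMBASE_WINDOW = 0x10000  # handler at +8000H, state save area ends at +FFFFH


def assign_smbase(tseg_base, tseg_size, cpu_index, stride=SMBASE_WINDOW):
    """Pick a per-CPU SMBASE inside TSEG, or fail if it would not fit."""
    smbase = tseg_base + cpu_index * stride
    if smbase + SMBASE_WINDOW > tseg_base + tseg_size:
        raise ValueError("SMBASE window does not fit in TSEG")
    return smbase


# Example: the value the SMM monarch would write to the (hypothetical)
# hot plug controller register before the new CPU takes its first SMI.
tseg_base, tseg_size = 0x7F000000, 0x800000  # made-up 8MB TSEG
for index in range(3):
    print(index, hex(assign_smbase(tseg_base, tseg_size, index)))
```

A distinct value per CPU keeps the state save areas from overlapping,
which is the property any of the proposed mechanisms (controller
register, MSR, or formula) would have to provide.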