Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Laszlo Ersek
On 05/04/17 16:52, Gerd Hoffmann wrote:
>   Hi,
> 
>> If we invent such a new register, it should be in a location that is
>> either read-only, or zeroed-on-reset, in current QEMU. Otherwise, new
>> firmware running on old QEMU could be misled by a guest OS that writes
>> to this register, and then either reboots or enters S3.
> 
> Good point, we need to be quite careful here to not open security holes.
> Current state is that pretty much all pci config space is writable and
> not cleared on reset.  So no easy way out.
> 
>> ... With this in mind, I don't oppose "having to write somewhere to read
>> back the result", but then let's please make that write access as well
>> to the same new qemu-specific register, and not to MCH_ESMRAMC.
> 
> That should work, yes.  Write '1' to the register, then read back.  If
> it is still '1' -> no big tseg support.  Otherwise it returns the tseg
> size in some form, and "11b" in ESMRAMC can be used to pick that.

My thoughts exactly!

Thank you,
Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Laszlo Ersek
On 05/04/17 16:50, Igor Mammedov wrote:
> On Thu, 04 May 2017 16:41:00 +0200
> Gerd Hoffmann  wrote:
> 
>>   Hi,
>>
>>> There is another thing to consider here, when vm is migrated to newer
>>> qemu(with newer firmware version) then it might not boot on the next
>>> restart due to hitting old set limit.  
>>
>> Firmware is part of the migration data, so you have the same version on
>> both ends.
> true, until VM is shut down. After that it loads new firmware
> from target host. So user will have see error that tells what is wrong
> and manually fix limit value if vm startup config to another value
> that would work for this specific fw/config combo.

I've been generally promoting the idea that firmware logs should be
captured by libvirt by default, and exposed to the user similarly to
QEMU error messages / error logs... Some rate limiting and/or log
rotation would be necessary, so that guest code cannot flood the host
system with infinite logs, but that wouldn't prevent saving error
messages from non-malicious firmware.

Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Gerd Hoffmann
  Hi,

> If we invent such a new register, it should be in a location that is
> either read-only, or zeroed-on-reset, in current QEMU. Otherwise, new
> firmware running on old QEMU could be misled by a guest OS that writes
> to this register, and then either reboots or enters S3.

Good point, we need to be quite careful here to not open security holes.
Current state is that pretty much all pci config space is writable and
not cleared on reset.  So no easy way out.

> ... With this in mind, I don't oppose "having to write somewhere to read
> back the result", but then let's please make that write access as well
> to the same new qemu-specific register, and not to MCH_ESMRAMC.

That should work, yes.  Write '1' to the register, then read back.  If
it is still '1' -> no big tseg support.  Otherwise it returns the tseg
size in some form, and "11b" in ESMRAMC can be used to pick that.

cheers,
  Gerd

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Igor Mammedov
On Thu, 04 May 2017 16:41:00 +0200
Gerd Hoffmann  wrote:

>   Hi,
> 
> > There is another thing to consider here, when vm is migrated to newer
> > qemu(with newer firmware version) then it might not boot on the next
> > restart due to hitting old set limit.  
> 
> Firmware is part of the migration data, so you have the same version on
> both ends.
true, until VM is shut down. After that it loads new firmware
from target host. So user will have see error that tells what is wrong
and manually fix limit value if vm startup config to another value
that would work for this specific fw/config combo.

> 
> cheers,
>   Gerd
> 

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Gerd Hoffmann
  Hi,

> There is another thing to consider here, when vm is migrated to newer
> qemu(with newer firmware version) then it might not boot on the next
> restart due to hitting old set limit.

Firmware is part of the migration data, so you have the same version on
both ends.

cheers,
  Gerd

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Igor Mammedov
On Thu, 4 May 2017 13:34:13 +0200
Laszlo Ersek  wrote:

> On 05/04/17 10:23, Igor Mammedov wrote:
> > On Thu, 4 May 2017 00:33:27 +0200
> > Laszlo Ersek  wrote:
> > 
> > [...]   
> >> If we invented a read-only, side-effect-free PCI config space register
> >> that gave me this value plain and simple (similarly to how a new fw_cfg
> >> file would do), that would be a lot cleaner for me.  
> > Just a thought,
> > have you considered firmware setting size it needs explicitly?
> > That way we won't have to bump that value on qemu side when
> > qemu dictated size becomes too small and won't need compat cruft
> > around it.  
> 
> The problem is that I don't know what size to set. The per-VCPU SMRAM
> demand varies (mostly, grows) over time as edk2's SMM stack gets new
> features and/or is refactored occasionally. The size hint would have to
> come from OvmfPkg (platform code) while the overwhelming majority of the
> SMM stack lives outside of OvmfPkg.
> 
> Also, it's not just data that is allocated from SMRAM, it's also the SMM
> driver binaries themselves. The size of those varies even with the
> compiler toolchain that you use to build OVMF -- for example, with
> gcc-5, link-time optimization is enabled in edk2, which results in
> significantly smaller binaries --, and whether you build OVMF for NOOPT
> / DEBUG / RELEASE. This kind of difference is likely not significant per
> se, but it could be the difference between working with N*100 VCPUs, or
> only with N*100-5 VCPUs.
looks complicated, but still it would be the best option.
Anyways, I don't insist.
 
> So I continue to think of SMRAM size as another board property, like
> plain RAM size. If the guest payload doesn't fit, make QEMU provide more
> of it, be it disk space, normal RAM, or SMRAM. In fact I think the SMRAM
> size property should not be an X-* property but something that users
> could *validly* override on the command line, if they wanted to. Even
> exposing it on the libvirt level wouldn't be wrong, I believe; the same
> way we already expose whether SMM emulation is enabled at all.
Agreed, if it's a public property set by management layers and
firmware will crash with clear message it would work as well.

There is another thing to consider here, when vm is migrated to newer
qemu(with newer firmware version) then it might not boot on the next
restart due to hitting old set limit.

> Thanks
> Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Laszlo Ersek
On 05/04/17 10:23, Igor Mammedov wrote:
> On Thu, 4 May 2017 00:33:27 +0200
> Laszlo Ersek  wrote:
> 
> [...] 
>> If we invented a read-only, side-effect-free PCI config space register
>> that gave me this value plain and simple (similarly to how a new fw_cfg
>> file would do), that would be a lot cleaner for me.
> Just a thought,
> have you considered firmware setting size it needs explicitly?
> That way we won't have to bump that value on qemu side when
> qemu dictated size becomes too small and won't need compat cruft
> around it.

The problem is that I don't know what size to set. The per-VCPU SMRAM
demand varies (mostly, grows) over time as edk2's SMM stack gets new
features and/or is refactored occasionally. The size hint would have to
come from OvmfPkg (platform code) while the overwhelming majority of the
SMM stack lives outside of OvmfPkg.

Also, it's not just data that is allocated from SMRAM, it's also the SMM
driver binaries themselves. The size of those varies even with the
compiler toolchain that you use to build OVMF -- for example, with
gcc-5, link-time optimization is enabled in edk2, which results in
significantly smaller binaries --, and whether you build OVMF for NOOPT
/ DEBUG / RELEASE. This kind of difference is likely not significant per
se, but it could be the difference between working with N*100 VCPUs, or
only with N*100-5 VCPUs.

So I continue to think of SMRAM size as another board property, like
plain RAM size. If the guest payload doesn't fit, make QEMU provide more
of it, be it disk space, normal RAM, or SMRAM. In fact I think the SMRAM
size property should not be an X-* property but something that users
could *validly* override on the command line, if they wanted to. Even
exposing it on the libvirt level wouldn't be wrong, I believe; the same
way we already expose whether SMM emulation is enabled at all.

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Igor Mammedov
On Thu, 4 May 2017 00:33:27 +0200
Laszlo Ersek  wrote:

[...] 
> If we invented a read-only, side-effect-free PCI config space register
> that gave me this value plain and simple (similarly to how a new fw_cfg
> file would do), that would be a lot cleaner for me.
Just a thought,
have you considered firmware setting size it needs explicitly?
That way we won't have to bump that value on qemu side when
qemu dictated size becomes too small and won't need compat cruft
around it.

> 
> Thanks!
> Laszlo
> ___
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-04 Thread Gerd Hoffmann
  Hi,

> > The problem with this is that I need the TSEG size in another module as
> > well, namely PlatformPei. The dispatch order between PlatformPei and
> > SmmAccessPei is unspecified (they have both TRUE for DEPEX). If
> > PlatformPei gets dispatched second, I really wouldn't want to write to
> > MCH_ESMRAMC again, just to query MCH_TSEGMB. (I couldn't just read
> > MCH_TSEGMB because if PlatformPei were dispatched first, then MCH_TSEGMB
> > would still be unset).
> > 
> > In other words, I wouldn't like to write a register just to receive the
> > information; I need the information in two places whose relative
> > ordering is unspecified, and one of those already writes the register in
> > question, genuinely. I guess I could hack it around with another dynamic
> > PCD, but that's kind of ugly.

Hmm, this unspecified ordering works nicely today because tseg size is a
compile time constant.  If we make this dynamic we either need a pcd or
must probe tseg size twice (no matter whenever this is some pci cfg
space register or fw_cfg file), which is kind of ugly too ...

> > If we invented a read-only, side-effect-free PCI config space register
> > that gave me this value plain and simple (similarly to how a new fw_cfg
> > file would do), that would be a lot cleaner for me.

That makes the "probe twice" thing easier indeed.

> > I think this would
> > match your earlier alternative where you wrote "Alternatively we could
> > add some qemu-specific register".

Yes.

cheers,
  Gerd


___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/04/17 00:33, Laszlo Ersek wrote:
> On 05/03/17 15:44, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> I propose the following: add a new fw_cfg file which communicates how
>>> much memory (how many megabytes) the "11b" value in the tseg size
>>> register will configure.
>>
>> I'd prefer to keep fw_cfg out of the picture, and I think we can do it
>> without too much problems.
>>
>> We have a TSEGMB (tseg memory base) register.  qemu doesn't implement
>> it (but I think you can write to it and read back what you've
>> written).
> 
> OVMF already uses this register, for the exact purpose you mention.
> 
> (1) OVMF sets the register in SmmAccessPeiEntryPoint(), which is the
> entry point of the PEI-phase module that produces PEI_SMM_ACCESS_PPI.
> 
> (2) OVMF reads the register back in SmramAccessGetCapabilities(), which
> is built into two modules:
> 
> - the same PEIM as mentioned above, for producing
> PEI_SMM_ACCESS_PPI.GetCapabilities(),
> 
> - a similar-purpose DXE driver, for producing
> EFI_SMM_ACCESS2_PROTOCOL.GetCapabilities().
> 
> Those GetCapabilities() member functions basically return the SMRAM map
> to the caller.
> 
>> We could make qemu update that register in case "11b" is written to
>> the tseg size.  The firmware then can read TSEGMB and figure whenever
>> a large tseg is supported and if so how big it is.
> 
> Do you mean that QEMU would at once calculate the absolute base of TSEG,
> downwards from the top of low RAM, expressed in MB?
> 
> Excerpt from (1):
> 
>>   //
>>   // Set TSEG Memory Base.
>>   //
>>   PciWrite32 (DRAMC_REGISTER_Q35 (MCH_TSEGMB),
>> (TopOfLowRamMb - FixedPcdGet8 (PcdQ35TsegMbytes)) << 
>> MCH_TSEGMB_MB_SHIFT);
>>
>>   //
>>   // Set TSEG size, and disable TSEG visibility outside of SMM. Note that the
>>   // T_EN bit has inverse meaning; when T_EN is set, then TSEG visibility is
>>   // *restricted* to SMM.
>>   //
>>   EsmramcVal &= ~(UINT32)MCH_ESMRAMC_TSEG_MASK;
>>   EsmramcVal |= FixedPcdGet8 (PcdQ35TsegMbytes) == 8 ? MCH_ESMRAMC_TSEG_8MB :
>> FixedPcdGet8 (PcdQ35TsegMbytes) == 2 ? MCH_ESMRAMC_TSEG_2MB :
>> MCH_ESMRAMC_TSEG_1MB;
>>   EsmramcVal |= MCH_ESMRAMC_T_EN;
>>   PciWrite8 (DRAMC_REGISTER_Q35 (MCH_ESMRAMC), EsmramcVal);
> 
> I guess I could update this to:
> - set MCH_TSEGMB like now, but also remember the written value
> - write 11b to MCH_ESMRAMC, and read back MCH_TSEGMB
> - if the read back value differs from the written one, use that value
> for tseg size going forward, plus write 11b | T_EN to MCH_ESMRAMC
> - otherwise, set MCH_ESMRAMC like now.
> 
> The problem with this is that I need the TSEG size in another module as
> well, namely PlatformPei. The dispatch order between PlatformPei and
> SmmAccessPei is unspecified (they have both TRUE for DEPEX). If
> PlatformPei gets dispatched second, I really wouldn't want to write to
> MCH_ESMRAMC again, just to query MCH_TSEGMB. (I couldn't just read
> MCH_TSEGMB because if PlatformPei were dispatched first, then MCH_TSEGMB
> would still be unset).
> 
> In other words, I wouldn't like to write a register just to receive the
> information; I need the information in two places whose relative
> ordering is unspecified, and one of those already writes the register in
> question, genuinely. I guess I could hack it around with another dynamic
> PCD, but that's kind of ugly.
> 
> If we invented a read-only, side-effect-free PCI config space register
> that gave me this value plain and simple (similarly to how a new fw_cfg
> file would do), that would be a lot cleaner for me. I think this would
> match your earlier alternative where you wrote "Alternatively we could
> add some qemu-specific register". That's my next preference after
> fw_cfg.

If we invent such a new register, it should be in a location that is
either read-only, or zeroed-on-reset, in current QEMU. Otherwise, new
firmware running on old QEMU could be misled by a guest OS that writes
to this register, and then either reboots or enters S3.

... With this in mind, I don't oppose "having to write somewhere to read
back the result", but then let's please make that write access as well
to the same new qemu-specific register, and not to MCH_ESMRAMC.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/03/17 15:55, Paolo Bonzini wrote:
> 
> 
> On 03/05/2017 15:35, Laszlo Ersek wrote:
>>> I see.  In my other answer I tried to keep it as intact as possible.
>>>
>>> I'm a bit worried about the limits on the number of fw-cfg files.
>> We've promoted that to a device property in QEMU commit e12f3a13e2e1
>> ("fw-cfg: turn FW_CFG_FILE_SLOTS into a device property", 2017-01-12),
>> and we've raised the count to 0x20 for 2.9 machtypes, in commit
>> a5b3ebfd23bc ("fw-cfg: bump "x-file-slots" to 0x20 for 2.9+ machine
>> types", 2017-01-12).
>>
>> ... Or does your concern already account for those?
> 
> I was aware it had been bumped, though not exactly how much.  32 is
> better than before, but still on the lowish side...

I tried to be frugal on purpose, but we can bump it again for 2.10,
can't we?

(Anyway, I'm not pushing for this too hard, just seeking to keep it open
as an option.)

Thanks,
Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/03/17 15:44, Gerd Hoffmann wrote:
>   Hi,
>
>> I propose the following: add a new fw_cfg file which communicates how
>> much memory (how many megabytes) the "11b" value in the tseg size
>> register will configure.
>
> I'd prefer to keep fw_cfg out of the picture, and I think we can do it
> without too much problems.
>
> We have a TSEGMB (tseg memory base) register.  qemu doesn't implement
> it (but I think you can write to it and read back what you've
> written).

OVMF already uses this register, for the exact purpose you mention.

(1) OVMF sets the register in SmmAccessPeiEntryPoint(), which is the
entry point of the PEI-phase module that produces PEI_SMM_ACCESS_PPI.

(2) OVMF reads the register back in SmramAccessGetCapabilities(), which
is built into two modules:

- the same PEIM as mentioned above, for producing
PEI_SMM_ACCESS_PPI.GetCapabilities(),

- a similar-purpose DXE driver, for producing
EFI_SMM_ACCESS2_PROTOCOL.GetCapabilities().

Those GetCapabilities() member functions basically return the SMRAM map
to the caller.

> We could make qemu update that register in case "11b" is written to
> the tseg size.  The firmware then can read TSEGMB and figure whenever
> a large tseg is supported and if so how big it is.

Do you mean that QEMU would at once calculate the absolute base of TSEG,
downwards from the top of low RAM, expressed in MB?

Excerpt from (1):

>   //
>   // Set TSEG Memory Base.
>   //
>   PciWrite32 (DRAMC_REGISTER_Q35 (MCH_TSEGMB),
> (TopOfLowRamMb - FixedPcdGet8 (PcdQ35TsegMbytes)) << MCH_TSEGMB_MB_SHIFT);
>
>   //
>   // Set TSEG size, and disable TSEG visibility outside of SMM. Note that the
>   // T_EN bit has inverse meaning; when T_EN is set, then TSEG visibility is
>   // *restricted* to SMM.
>   //
>   EsmramcVal &= ~(UINT32)MCH_ESMRAMC_TSEG_MASK;
>   EsmramcVal |= FixedPcdGet8 (PcdQ35TsegMbytes) == 8 ? MCH_ESMRAMC_TSEG_8MB :
> FixedPcdGet8 (PcdQ35TsegMbytes) == 2 ? MCH_ESMRAMC_TSEG_2MB :
> MCH_ESMRAMC_TSEG_1MB;
>   EsmramcVal |= MCH_ESMRAMC_T_EN;
>   PciWrite8 (DRAMC_REGISTER_Q35 (MCH_ESMRAMC), EsmramcVal);

I guess I could update this to:
- set MCH_TSEGMB like now, but also remember the written value
- write 11b to MCH_ESMRAMC, and read back MCH_TSEGMB
- if the read back value differs from the written one, use that value
for tseg size going forward, plus write 11b | T_EN to MCH_ESMRAMC
- otherwise, set MCH_ESMRAMC like now.

The problem with this is that I need the TSEG size in another module as
well, namely PlatformPei. The dispatch order between PlatformPei and
SmmAccessPei is unspecified (they have both TRUE for DEPEX). If
PlatformPei gets dispatched second, I really wouldn't want to write to
MCH_ESMRAMC again, just to query MCH_TSEGMB. (I couldn't just read
MCH_TSEGMB because if PlatformPei were dispatched first, then MCH_TSEGMB
would still be unset).

In other words, I wouldn't like to write a register just to receive the
information; I need the information in two places whose relative
ordering is unspecified, and one of those already writes the register in
question, genuinely. I guess I could hack it around with another dynamic
PCD, but that's kind of ugly.

If we invented a read-only, side-effect-free PCI config space register
that gave me this value plain and simple (similarly to how a new fw_cfg
file would do), that would be a lot cleaner for me. I think this would
match your earlier alternative where you wrote "Alternatively we could
add some qemu-specific register". That's my next preference after
fw_cfg.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Paolo Bonzini


On 03/05/2017 15:35, Laszlo Ersek wrote:
>> I see.  In my other answer I tried to keep it as intact as possible.
>>
>> I'm a bit worried about the limits on the number of fw-cfg files.
> We've promoted that to a device property in QEMU commit e12f3a13e2e1
> ("fw-cfg: turn FW_CFG_FILE_SLOTS into a device property", 2017-01-12),
> and we've raised the count to 0x20 for 2.9 machtypes, in commit
> a5b3ebfd23bc ("fw-cfg: bump "x-file-slots" to 0x20 for 2.9+ machine
> types", 2017-01-12).
> 
> ... Or does your concern already account for those?

I was aware it had been bumped, though not exactly how much.  32 is
better than before, but still on the lowish side...

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Gerd Hoffmann
  Hi,

> I propose the following: add a new fw_cfg file which communicates how
> much memory (how many megabytes) the "11b" value in the tseg size
> register will configure.

I'd prefer to keep fw_cfg out of the picture, and I think we can do it
without too much problems.

We have a TSEGMB (tseg memory base) register.  qemu doesn't implement it
(but I think you can write to it and read back what you've written).

We could make qemu update that register in case "11b" is written to the
tseg size.  The firmware then can read TSEGMB and figure whenever a
large tseg is supported and if so how big it is.

cheers,
  Gerd



___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/03/17 15:26, Paolo Bonzini wrote:
> 
> 
> On 03/05/2017 15:14, Laszlo Ersek wrote:
>> I'd prefer a solution that would keep the fw logic / code flow related
>> to register configuration intact, and would just replace a few numbers /
>> constants if possible.
> 
> I see.  In my other answer I tried to keep it as intact as possible.
> 
> I'm a bit worried about the limits on the number of fw-cfg files.

We've promoted that to a device property in QEMU commit e12f3a13e2e1
("fw-cfg: turn FW_CFG_FILE_SLOTS into a device property", 2017-01-12),
and we've raised the count to 0x20 for 2.9 machtypes, in commit
a5b3ebfd23bc ("fw-cfg: bump "x-file-slots" to 0x20 for 2.9+ machine
types", 2017-01-12).

... Or does your concern already account for those?

Thank you,
Laszlo


>> And, whether the "largest TSEG size" (number of MBs) that QEMU exposed
>> in the new fw_cfg file depended *only* on the machine type, or on other
>> config elements as well (such as max VCPU count), that would be QEMU's
>> prerogative of course.
>>
>> To me personally, the ability (via fw_cfg) to ask / request the
>> following looks best:
>>
>> - Is there a dynamic largest? (= does the fw_cfg file exist?)
>> - What is it exactly? (= what are its contents?)
>> - Please give me it. (= write 11b)

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Paolo Bonzini


On 03/05/2017 15:14, Laszlo Ersek wrote:
> I'd prefer a solution that would keep the fw logic / code flow related
> to register configuration intact, and would just replace a few numbers /
> constants if possible.

I see.  In my other answer I tried to keep it as intact as possible.

I'm a bit worried about the limits on the number of fw-cfg files.

Paolo

> And, whether the "largest TSEG size" (number of MBs) that QEMU exposed
> in the new fw_cfg file depended *only* on the machine type, or on other
> config elements as well (such as max VCPU count), that would be QEMU's
> prerogative of course.
> 
> To me personally, the ability (via fw_cfg) to ask / request the
> following looks best:
> 
> - Is there a dynamic largest? (= does the fw_cfg file exist?)
> - What is it exactly? (= what are its contents?)
> - Please give me it. (= write 11b)
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/03/17 14:56, Paolo Bonzini wrote:
> 
> 
> On 03/05/2017 08:57, Gerd Hoffmann wrote:
>> qemu implements what physical q35 support.  The extended smram register
>> has two bits for the tseg size, three out of the four values are used
>> (for 1, 2, 8 MB sizes).  "11" is reserved in the specs.  We could use
>> "11" to implement a bigger tseg.  Current code sets the tseg size to
>> zero for "11".  Alternatively we could add some qemu-specific register.
> 
> If you can set TSEG while SMRAM is closed, you could detect that in
> edk2.  According to Laszlo 32 MB should be more than enough, and we
> could enable it only for >192 CPUs.

(Seeing this just now.)

I'd prefer a solution that would keep the fw logic / code flow related
to register configuration intact, and would just replace a few numbers /
constants if possible.

And, whether the "largest TSEG size" (number of MBs) that QEMU exposed
in the new fw_cfg file depended *only* on the machine type, or on other
config elements as well (such as max VCPU count), that would be QEMU's
prerogative of course.

To me personally, the ability (via fw_cfg) to ask / request the
following looks best:

- Is there a dynamic largest? (= does the fw_cfg file exist?)
- What is it exactly? (= what are its contents?)
- Please give me it. (= write 11b)

Thanks,
Laszlo

>> When implementing this in qemu we will have to do it runtime-switchable,
>> for backward compatibility with older qemu versions.  So ideally
>> firmware would detect somehow whenever qemu supports a bigger tseg or
>> not and adapt at runtime.  If edk2 can't do this we would need two edk2
>> builds ...

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Laszlo Ersek
On 05/03/17 08:57, Gerd Hoffmann wrote:
> On Di, 2017-05-02 at 20:49 +, Kinney, Michael D wrote:
>> Laszlo,
>>
>> Is it possible to add more TSEG sizes to the Q35 board?
> 
> qemu implements what physical q35 support.  The extended smram register
> has two bits for the tseg size, three out of the four values are used
> (for 1, 2, 8 MB sizes).  "11" is reserved in the specs.  We could use
> "11" to implement a bigger tseg.  Current code sets the tseg size to
> zero for "11".  Alternatively we could add some qemu-specific register.
> 
> When implementing this in qemu we will have to do it runtime-switchable,
> for backward compatibility with older qemu versions.  So ideally
> firmware would detect somehow whenever qemu supports a bigger tseg or
> not and adapt at runtime.  If edk2 can't do this we would need two edk2
> builds ...

Thanks everyone for the feedback!

After reading Mike's and Jiewen's replies, I've come to the same
conclusion as Gerd. QEMU should provide a larger SMRAM, and it must be
guest-detectable.

I propose the following: add a new fw_cfg file which communicates how
much memory (how many megabytes) the "11b" value in the tseg size
register will configure. For starters (pc-q35-2.10), I recommend 16MB.
For future Q35 board versions (pc-q35-2.11 and later), we can bump it to
32MB if necessary. The firmware does not have to *dictate* the size, it
just needs to be able to see that "11b" is a valid selection (for "give
me largest"), and what that "largest" exactly corresponds to.

In the firmware, the TSEG size is handled internally to platform code
(i.e., in OvmfPkg); it is selected by the PcdQ35TsegMbytes
fixed-at-build PCD. The "OvmfPkg/PlatformPei" and
"OvmfPkg/SmmAccess/SmmAccessPei" PEIMs consume it directly. Everything
else gets the SMRAM size from the abstractions provided by those PEIMs.

These PEIMs can consume fw_cfg. If the fw_cfg file is missing, then the
firmware knows it can't select "11b". New firmware on old Q35 machine
type versions will not find the fw_cfg file, hence OVMF will not use
"11b". Old firmware on new Q35 machine type versions will ignore the
fw_cfg file, and stick with the PcdQ35TsegMbytes preference.

This will basically demote "PcdQ35TsegMbytes" from "TSEG size selection"
to "TSEG size selection unless QEMU dictates it".

There is one place in the code where the MCH_ESMRAMC register is read
back to see the TSEG size. This is built into both SmmAccessPei and
SmmAccess2Dxe. In the entry points of both SmmAccessPei and
SmmAccess2Dxe, we can interrogate fw_cfg, and stash the size
corresponding to 11b in a common global variable. Then we can rebase the
common code to that global variable, in case it reads back 11b.

So I think it can be made work. Does it sound good?

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Paolo Bonzini


On 03/05/2017 08:57, Gerd Hoffmann wrote:
> qemu implements what physical q35 support.  The extended smram register
> has two bits for the tseg size, three out of the four values are used
> (for 1, 2, 8 MB sizes).  "11" is reserved in the specs.  We could use
> "11" to implement a bigger tseg.  Current code sets the tseg size to
> zero for "11".  Alternatively we could add some qemu-specific register.

If you can set TSEG while SMRAM is closed, you could detect that in
edk2.  According to Laszlo 32 MB should be more than enough, and we
could enable it only for >192 CPUs.

Paolo

> When implementing this in qemu we will have to do it runtime-switchable,
> for backward compatibility with older qemu versions.  So ideally
> firmware would detect somehow whenever qemu supports a bigger tseg or
> not and adapt at runtime.  If edk2 can't do this we would need two edk2
> builds ...
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-03 Thread Gerd Hoffmann
On Di, 2017-05-02 at 20:49 +, Kinney, Michael D wrote:
> Laszlo,
> 
> Is it possible to add more TSEG sizes to the Q35 board?

qemu implements what physical q35 support.  The extended smram register
has two bits for the tseg size, three out of the four values are used
(for 1, 2, 8 MB sizes).  "11" is reserved in the specs.  We could use
"11" to implement a bigger tseg.  Current code sets the tseg size to
zero for "11".  Alternatively we could add some qemu-specific register.

When implementing this in qemu we will have to do it runtime-switchable,
for backward compatibility with older qemu versions.  So ideally
firmware would detect somehow whenever qemu supports a bigger tseg or
not and adapt at runtime.  If edk2 can't do this we would need two edk2
builds ...

cheers,
  Gerd

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-02 Thread Yao, Jiewen
Yes, increasing TSEG might be an easy way.

I have seen a physical board using 16M TSEG or even 32M TSEG on a large memory 
and many core system.

Thank you
Yao Jiewen

From: Kinney, Michael D
Sent: Wednesday, May 3, 2017 4:49 AM
To: Laszlo Ersek ; Fan, Jeff ; Yao, 
Jiewen ; Kinney, Michael D 
Cc: edk2-devel-01 ; Gerd Hoffmann ; 
Paolo Bonzini 
Subject: RE: SMRAM sizes on large hosts

Laszlo,

Is it possible to add more TSEG sizes to the Q35 board?

There may be things we can do to reduce the per CPU SMRAM
overhead, but those will likely take some time, be more
complex, and require significant validation.  And...we may
run into the same issue again if there continues to be
requirements to increase the number of VCPUs.

Thanks,

Mike

> -Original Message-
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Tuesday, May 2, 2017 11:16 AM
> To: Fan, Jeff >; Kinney, 
> Michael D >;
> Yao, Jiewen >
> Cc: edk2-devel-01 >; 
> Gerd Hoffmann >; Paolo
> Bonzini >
> Subject: SMRAM sizes on large hosts
>
> Hi All,
>
> in your experience, how much SMRAM do "big hosts" provide? (Machines
> that have, say, ~300 CPU cores.)
>
> With QEMU's Q35 board, which provides 8MB of SMRAM (TSEG), we're hitting
> various out-of-SMRAM conditions with OVMF at around 230-240 VCPUs. We'd
> like to go to a higher VCPU count than that.
>
> * So, in your experience, how much SMRAM do physical boards, that carry
> such a high number of cores, provide?
>
> * Perhaps we'll have to do something about the SMRAM size on QEMU in the
> longer term, but until then, can you guys recommend various "cheap
> tricks" to decrease per-VCPU SMRAM usage?
>
> For example, in OVMF we have a 16KB SMM stack per VCPU, and we also
> enable the SMM stack overflow guard page -- we had been hit by an SMM
> stack overflow with the original 8KB stack size, and so we increased
> both the stack size and enabled the guard page; see commits
>
>   509f8425b75d UefiCpuPkg: change PcdCpuSmmStackGuard default to TRUE
>   0d0c245dfb14 OvmfPkg: set SMM stack size to 16KB
>
> I've now tried to decrease the stack size to the "middle point" 12KB.
> That stack size does not reproduce the SMM stack overflow seen
> originally, but it also doesn't help with the SMRAM exhaustion -- we
> cannot go to any higher VCPU count with it.
>
> Are there any other "tweakables" (PCDs) we could massage to see the
> per-VCPU SMRAM usage go down?
>
> Here's a (somewhat indiscriminate) list of PCDs, from the OVMF Ia32X64
> build report file, where each PCD's name contains "smm":
>
> PcdCpuSmmBlockStartupThisAp :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmDebug  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmFeatureControlMsrLock  :   FLAG  (BOOLEAN) = 1
> PcdCpuSmmProfileEnable  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmProfileRingBuffer  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmStackGuard :   FLAG  (BOOLEAN) = 1
>  *P PcdCpuSmmEnableBspElection  :   FLAG  (BOOLEAN) = 0
>  *P PcdSmmSmramRequire  :   FLAG  (BOOLEAN) = 1
>
> PcdCpuSmmCodeAccessCheckEnable  :  FIXED  (BOOLEAN) = 1
> PcdCpuSmmProfileSize:  FIXED   (UINT32) = 0x20
> PcdCpuSmmStaticPageTable:  FIXED  (BOOLEAN) = 1
>  *P PcdCpuSmmStackSize  :  FIXED   (UINT32) = 0x4000
>
> PcdLoadFixAddressSmmCodePageNumber  :  PATCH   (UINT32) = 0
>
> PcdS3BootScriptTablePrivateSmmDataPtr   :DYN   (UINT64) = 0x0
>  *P PcdCpuSmmApSyncTimeout  :DYN   (UINT64) = 10
>  *P PcdCpuSmmSyncMode   :DYN(UINT8) = 0x01
>
> (BTW, security features should not be disabled, even if they saved some
> SMRAM.)
>
> Thank you,
> Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] SMRAM sizes on large hosts

2017-05-02 Thread Kinney, Michael D
Laszlo,

Is it possible to add more TSEG sizes to the Q35 board?

There may be things we can do to reduce the per CPU SMRAM
overhead, but those will likely take some time, be more
complex, and require significant validation.  And...we may
run into the same issue again if there continues to be 
requirements to increase the number of VCPUs.

Thanks,

Mike

> -Original Message-
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Tuesday, May 2, 2017 11:16 AM
> To: Fan, Jeff ; Kinney, Michael D 
> ;
> Yao, Jiewen 
> Cc: edk2-devel-01 ; Gerd Hoffmann 
> ; Paolo
> Bonzini 
> Subject: SMRAM sizes on large hosts
> 
> Hi All,
> 
> in your experience, how much SMRAM do "big hosts" provide? (Machines
> that have, say, ~300 CPU cores.)
> 
> With QEMU's Q35 board, which provides 8MB of SMRAM (TSEG), we're hitting
> various out-of-SMRAM conditions with OVMF at around 230-240 VCPUs. We'd
> like to go to a higher VCPU count than that.
> 
> * So, in your experience, how much SMRAM do physical boards, that carry
> such a high number of cores, provide?
> 
> * Perhaps we'll have to do something about the SMRAM size on QEMU in the
> longer term, but until then, can you guys recommend various "cheap
> tricks" to decrease per-VCPU SMRAM usage?
> 
> For example, in OVMF we have a 16KB SMM stack per VCPU, and we also
> enable the SMM stack overflow guard page -- we had been hit by an SMM
> stack overflow with the original 8KB stack size, and so we increased
> both the stack size and enabled the guard page; see commits
> 
>   509f8425b75d UefiCpuPkg: change PcdCpuSmmStackGuard default to TRUE
>   0d0c245dfb14 OvmfPkg: set SMM stack size to 16KB
> 
> I've now tried to decrease the stack size to the "middle point" 12KB.
> That stack size does not reproduce the SMM stack overflow seen
> originally, but it also doesn't help with the SMRAM exhaustion -- we
> cannot go to any higher VCPU count with it.
> 
> Are there any other "tweakables" (PCDs) we could massage to see the
> per-VCPU SMRAM usage go down?
> 
> Here's a (somewhat indiscriminate) list of PCDs, from the OVMF Ia32X64
> build report file, where each PCD's name contains "smm":
> 
> PcdCpuSmmBlockStartupThisAp :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmDebug  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmFeatureControlMsrLock  :   FLAG  (BOOLEAN) = 1
> PcdCpuSmmProfileEnable  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmProfileRingBuffer  :   FLAG  (BOOLEAN) = 0
> PcdCpuSmmStackGuard :   FLAG  (BOOLEAN) = 1
>  *P PcdCpuSmmEnableBspElection  :   FLAG  (BOOLEAN) = 0
>  *P PcdSmmSmramRequire  :   FLAG  (BOOLEAN) = 1
> 
> PcdCpuSmmCodeAccessCheckEnable  :  FIXED  (BOOLEAN) = 1
> PcdCpuSmmProfileSize:  FIXED   (UINT32) = 0x20
> PcdCpuSmmStaticPageTable:  FIXED  (BOOLEAN) = 1
>  *P PcdCpuSmmStackSize  :  FIXED   (UINT32) = 0x4000
> 
> PcdLoadFixAddressSmmCodePageNumber  :  PATCH   (UINT32) = 0
> 
> PcdS3BootScriptTablePrivateSmmDataPtr   :DYN   (UINT64) = 0x0
>  *P PcdCpuSmmApSyncTimeout  :DYN   (UINT64) = 10
>  *P PcdCpuSmmSyncMode   :DYN(UINT8) = 0x01
> 
> (BTW, security features should not be disabled, even if they saved some
> SMRAM.)
> 
> Thank you,
> Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] SMRAM sizes on large hosts

2017-05-02 Thread Laszlo Ersek
Hi All,

in your experience, how much SMRAM do "big hosts" provide? (Machines
that have, say, ~300 CPU cores.)

With QEMU's Q35 board, which provides 8MB of SMRAM (TSEG), we're hitting
various out-of-SMRAM conditions with OVMF at around 230-240 VCPUs. We'd
like to go to a higher VCPU count than that.

* So, in your experience, how much SMRAM do physical boards, that carry
such a high number of cores, provide?

* Perhaps we'll have to do something about the SMRAM size on QEMU in the
longer term, but until then, can you guys recommend various "cheap
tricks" to decrease per-VCPU SMRAM usage?

For example, in OVMF we have a 16KB SMM stack per VCPU, and we also
enable the SMM stack overflow guard page -- we had been hit by an SMM
stack overflow with the original 8KB stack size, and so we increased
both the stack size and enabled the guard page; see commits

  509f8425b75d UefiCpuPkg: change PcdCpuSmmStackGuard default to TRUE
  0d0c245dfb14 OvmfPkg: set SMM stack size to 16KB

I've now tried to decrease the stack size to the "middle point" 12KB.
That stack size does not reproduce the SMM stack overflow seen
originally, but it also doesn't help with the SMRAM exhaustion -- we
cannot go to any higher VCPU count with it.

Are there any other "tweakables" (PCDs) we could massage to see the
per-VCPU SMRAM usage go down?

Here's a (somewhat indiscriminate) list of PCDs, from the OVMF Ia32X64
build report file, where each PCD's name contains "smm":

PcdCpuSmmBlockStartupThisAp :   FLAG  (BOOLEAN) = 0
PcdCpuSmmDebug  :   FLAG  (BOOLEAN) = 0
PcdCpuSmmFeatureControlMsrLock  :   FLAG  (BOOLEAN) = 1
PcdCpuSmmProfileEnable  :   FLAG  (BOOLEAN) = 0
PcdCpuSmmProfileRingBuffer  :   FLAG  (BOOLEAN) = 0
PcdCpuSmmStackGuard :   FLAG  (BOOLEAN) = 1
 *P PcdCpuSmmEnableBspElection  :   FLAG  (BOOLEAN) = 0
 *P PcdSmmSmramRequire  :   FLAG  (BOOLEAN) = 1

PcdCpuSmmCodeAccessCheckEnable  :  FIXED  (BOOLEAN) = 1
PcdCpuSmmProfileSize:  FIXED   (UINT32) = 0x20
PcdCpuSmmStaticPageTable:  FIXED  (BOOLEAN) = 1
 *P PcdCpuSmmStackSize  :  FIXED   (UINT32) = 0x4000

PcdLoadFixAddressSmmCodePageNumber  :  PATCH   (UINT32) = 0

PcdS3BootScriptTablePrivateSmmDataPtr   :DYN   (UINT64) = 0x0
 *P PcdCpuSmmApSyncTimeout  :DYN   (UINT64) = 10
 *P PcdCpuSmmSyncMode   :DYN(UINT8) = 0x01

(BTW, security features should not be disabled, even if they saved some
SMRAM.)

Thank you,
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel