[SeaBIOS] Re: [EXTERNAL] Re: How to change Seabios configuration to disable Legacy Bios and enable UEFI bios

2021-03-24 Thread Laszlo Ersek
On 03/24/21 18:24, Fatemi (US), Afsheen K wrote:
> Thanks for your response, Keith.
> 
> What do I need to do to switch to OVMF BIOS?

There is *zero support* for the below, first because the RH-provided
"ovmf" SRPM in RHEL7 is Tech Preview in all minor releases of RHEL7 [1],
second because in base RHEL7 (in the qemu-kvm package) there is no QEMU
binary that's capable of executing the OVMF binary mentioned previously
[2]. For that, you either need the qemu-kvm-rhev package from the RHV or
RHOSP layered products (not base RHEL7), or the qemu-kvm-ev package from
CentOS 7.

[1] https://access.redhat.com/discussions/2958371#comment-1155681
[2] https://access.redhat.com/solutions/3364131

So the minimal, strictly technical-sense answer to your question, in an
environment that's "as much RHEL7 as possible", is:

- upgrade the host to the latest RHEL7 minor release (for sake of having
the latest RHEL7 KVM bits in the kernel)

- install qemu-kvm-ev from CentOS 7

- install the latest "OVMF" package (built from the "ovmf" SRPM),
currently "OVMF-20180508-6.gitee3198e672e2.el7.noarch.rpm". Available in
the RHEL7 Server product only.

- In virt-manager, define a new domain, and select "customize before
install", as one of the last steps. Then go to Overview, and select Q35
as chipset, and UEFI as firmware.

If you search the web for some of the above terms (ovmf centos
qemu-kvm-ev), you'll find some more info. My memories are a bit rusty
because I've done this a very long time ago on RHEL7.

Anyway, you'd be much-much better off using a more recent distribution
(for example, RHEL-8).

If you'd like to continue this discussion, please let's not generate
more noise about OVMF on the SeaBIOS development list. I recommend this
list instead:

  disc...@edk2.groups.io
  https://edk2.groups.io/g/discuss/

Thanks,
Laszlo

> 
> Afsheen
> 
> From: Keith Hui [mailto:buu...@gmail.com]
> Sent: Wednesday, March 24, 2021 10:08 AM
> To: Fatemi (US), Afsheen K 
> Cc: seabios@seabios.org
> Subject: [EXTERNAL] Re: [SeaBIOS] How to change Seabios configuration to 
> disable Legacy Bios and enable UEFI bios
> 
> 
> EXT email: be mindful of links/attachments.
> 
> 
> 
> 
> Hi Afsheen,
> 
> SeaBIOS IS legacy BIOS. Although Windows 10 should still work with it, if you 
> want UEFI only, you need to look at a different solution such as OVMF.
> 
> Regards
> Keith
> 
> On Wed., Mar. 24, 2021, 11:01 Fatemi (US), Afsheen K, 
> <afsheen.k.fat...@boeing.com> wrote:
> Dear SeaBios Support,
> 
> I’m using KVM on a Redhat 7.3. I’m having difficulty creating a Windows 10 VM 
> as it complains in the beginning of the installation that the BIOS is set to 
> Legacy which should be disabled and set to UEFI. The BIOS of the physical 
> host is indeed set to UEFI but the BIOS of the virtual box that KVM is 
> using(Seabios version 1.9.1-5.el7) appears to be set to Legacy. I don’t know 
> how to reconfigure Seabios to change it to UEFI. The VM boot menu doesn’t 
> have any options to go to its bios. Apparently there should be some “kemu” or 
> similar command with correct parameters which might do the trick. Any help 
> would be appreciated very much.
> 
> Regards,
> 
> Afsheen
> 
> 
> ___
> SeaBIOS mailing list -- seabios@seabios.org
> To unsubscribe send an email to 
> seabios-le...@seabios.org
> 
> 
> 
> ___
> SeaBIOS mailing list -- seabios@seabios.org
> To unsubscribe send an email to seabios-le...@seabios.org
> 

___
SeaBIOS mailing list -- seabios@seabios.org
To unsubscribe send an email to seabios-le...@seabios.org


[SeaBIOS] Re: How to boot from DVD/CD automatically, instead of pressing a key by hand?

2021-03-01 Thread Laszlo Ersek
On 03/01/21 10:31, Gerd Hoffmann wrote:
> On Sun, Feb 28, 2021 at 02:22:53PM -, Yiguang Chen wrote:
>> Most time, When a vm with seabios start. The bios will display such info:
>> -
>> Seabios (version rel-1.13-0 ..)
>> Machine UUID ...
>>
>> IPXE ..
>>
>> IPXE..
>>
>> Booting from DVD/CD...
>> Press any key to boot from CD or DVD.
>> ---
>>
>> It means that have a bootable cdrom to boot. But If we want to boot
>> from cdrom, we must press any key as what the warning had said. If it
>> is possible to boot from DVD/CD automatically, instead of pressing a
>> key by hand?
> 
> This is a windows install iso, right?
> 
> This isn't seabios, the windows boot loader does that.  I think windows
> does this only in case it finds a bootable hard disk.  So when booting
> the guest with a fresh & blank virtual hard disk it should boot the
> windows installer without asking for a key press.

This doesn't match my experience with (UEFI) Windows installs -- even if
the hard disk is blank (not even partitioned), the windows boot loader
asks for a key press. (IIRC)

Laszlo



[SeaBIOS] Re: [PATCH 1/3] boot: cache HALT priority

2020-01-14 Thread Laszlo Ersek
On 01/14/20 10:25, Gerd Hoffmann wrote:
> Call find_prio("HALT") only once, on first is_bootprio_strict() call.
> Store the result in a variable and reuse it on subsequent calls.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  src/boot.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/src/boot.c b/src/boot.c
> index 5182ab426b9f..afeb36a3319a 100644
> --- a/src/boot.c
> +++ b/src/boot.c
> @@ -297,7 +297,11 @@ find_prio(const char *glob)
>  
>  u8 is_bootprio_strict(void)
>  {
> -    return find_prio("HALT") >= 0;
> +    static int prio_halt = -2;
> +
> +    if (prio_halt == -2)
> +        prio_halt = find_prio("HALT");
> +    return prio_halt >= 0;
>  }
>  
>  int bootprio_find_pci_device(struct pci_device *pci)
> 

General question. Is it safe to use static variables in this file with
initializers, i.e. without assigning their initial values through
statements?

What happens at reset? In particular, the "bootorder" fw_cfg file may
change across reset. (I'm not certain if specifically "HALT" can change
in "bootorder" across reset.)

I've found two static variables in this file:

static BootDevice *BootDevices VARVERIFY32INIT;
static int BootDeviceCount;

They seem to be set explicitly (in assignment statements) in the
loadBootDevices() function. "BootDeviceCount" seems to support my
concern, because

BootDeviceCount = 0;

is otherwise redundant, given:

static int BootDeviceCount;

I realize this is a very basic question to someone closely familiar with
the SeaBIOS architecture. Thanks for bearing with me!
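
To make the concern concrete, here is a minimal sketch in plain C. It is
not SeaBIOS code: every name in it (fake_find_prio, fake_set_bootorder,
fake_boot_setup, cached_is_strict) is hypothetical. It assumes the image
is not reloaded across a reset, so the initializer of a static takes
effect only once, and it shows the cached value going stale until a
setup routine reassigns the sentinel explicitly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the "bootorder" contents; not SeaBIOS code. */
static const char *fake_bootorder = "";
static int lookups;

/* Simulates the "bootorder" file changing across a reset. */
void fake_set_bootorder(const char *contents)
{
    fake_bootorder = contents;
}

/* Hypothetical lookup: 0 if the entry is present, -1 otherwise. */
static int fake_find_prio(const char *glob)
{
    assert(glob != NULL);
    lookups++;
    return strstr(fake_bootorder, glob) ? 0 : -1;
}

/* -2 means "not looked up yet"; set by the initializer only once. */
static int prio_halt_cache = -2;

/* Explicit re-initialization, as a per-boot setup routine would do. */
void fake_boot_setup(void)
{
    prio_halt_cache = -2;
}

int cached_is_strict(void)
{
    if (prio_halt_cache == -2)
        prio_halt_cache = fake_find_prio("HALT");
    return prio_halt_cache >= 0;
}

int fake_lookup_count(void)
{
    return lookups;
}
```

If fake_set_bootorder() is called without a following fake_boot_setup(),
cached_is_strict() keeps returning the answer computed for the previous
contents. Whether the real firmware needs such an explicit reassignment
is exactly the open question above.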

Laszlo


[SeaBIOS] Re: strange behavior (regression?) with SeaBIOS + iPXE + WDSNBP.COM

2020-01-02 Thread Laszlo Ersek
Hi All,

On 11/18/19 18:58, Laszlo Ersek wrote:
> On 11/16/19 01:59, Michael Brown wrote:
>> On 15/11/2019 21:49, Laszlo Ersek wrote:
>>> (Right now, my own env is a minimal "mock" setup, with a semi-random
>>> WDSNBP.COM binary extracted from a long-term Windows Server 2012 R2
>>> virtual machine of mine. Nothing beyond serving that binary up with
>>> libvirt / dnsmasq is configured, so the setup is not nearly a "real" WDS
>>> one.)
>>
>> I'm very happy to work in a minimal mock setup, if it's sufficient to
>> reproduce the problem.  Most of my test setups are built that way already.
> 
> Apologies, I was unclear. My personal env is minimal to the point of not
> reproducing the problem. I used my environment only for checking the
> DBGC output (code and data size) in pxe_start_nbp(). In my env, the
> above-referenced WDSNBP.COM binary tries to contact the WDS server, and
> then cleanly gives up. I don't see any looping.
> 
> I've sent out some requests internally, for more information.

This issue has now been resolved. The problem was related to WDSNBP.COM.

There are apparently multiple versions of WDSNBP.COM in common (?) use. For 
example:

(1) size: 31,140 bytes
sha256: 75ccf88f9ceefcf02089b6f859ebbdb39eba05f63ebeda48c3f7cc318e4bf2b4
shipped with: Windows Server 2008 R2 (possibly as an upgrade?)

(2) size: 31,124 bytes
sha256: 44d07502bb87c9e89c68f0d101fb33636dc389b5607e6c173524e6506bcb2f1c
shipped with: Windows Server 2008 R2 (possibly as an upgrade?)

(3) size: 30,832 bytes
sha256: 2b2fb3a7cfba1ef640bcb5d75050e57c79ff639ec621b0162f837c2c889ca178
shipped with: Windows Server 2012 R2

With WDSNBP.COM consistently upgraded to version (3), in the WDS environment 
that originally experienced the issue, the symptoms have disappeared.

Thank you Michael for your help!
Laszlo


[SeaBIOS] Re: strange behavior (regression?) with SeaBIOS + iPXE + WDSNBP.COM

2019-11-18 Thread Laszlo Ersek
On 11/16/19 01:59, Michael Brown wrote:
> On 15/11/2019 21:49, Laszlo Ersek wrote:
>> (Right now, my own env is a minimal "mock" setup, with a semi-random
>> WDSNBP.COM binary extracted from a long-term Windows Server 2012 R2
>> virtual machine of mine. Nothing beyond serving that binary up with
>> libvirt / dnsmasq is configured, so the setup is not nearly a "real" WDS
>> one.)
> 
> I'm very happy to work in a minimal mock setup, if it's sufficient to
> reproduce the problem.  Most of my test setups are built that way already.

Apologies, I was unclear. My personal env is minimal to the point of not
reproducing the problem. I used my environment only for checking the
DBGC output (code and data size) in pxe_start_nbp(). In my env, the
above-referenced WDSNBP.COM binary tries to contact the WDS server, and
then cleanly gives up. I don't see any looping.

I've sent out some requests internally, for more information.

Thank you!
Laszlo


[SeaBIOS] Re: strange behavior (regression?) with SeaBIOS + iPXE + WDSNBP.COM

2019-11-15 Thread Laszlo Ersek
Hello Michael,

On 11/15/19 18:03, Michael Brown wrote:
> On 15/11/2019 08:45, Laszlo Ersek wrote:
>> (1) There is a functional iPXE + WDS setup, with iPXE built as a
>> traditional BIOS PCI option ROM, using CONFIG=qemu. Accordingly the
>> platform is qemu, with SeaBIOS, and the NIC is virtio-net-pci.
>>
>> I don't know anything about the particulars of the WDS setup at this
>> point, only that the boot loader program it exposes is WDSNBP.COM.
>>
>> 
>>
>> Any hints as to what could be going wrong?
> 
> Your analysis appears correct to me throughout.  No idea what might be
> the root cause at this stage.
> 
> Do you have an easy set of instructions for reproducing the problem?  A
> copy of the precise version of WDSNBP.COM that you are using may be
> sufficient.

Thank you very much for responding!

I will get to work on collecting the details of the actual WDS
environment. It will probably take some time. I'll make an attempt to
reproduce the issue directly in my environment too, so I can provide
working instructions.

(Right now, my own env is a minimal "mock" setup, with a semi-random
WDSNBP.COM binary extracted from a long-term Windows Server 2012 R2
virtual machine of mine. Nothing beyond serving that binary up with
libvirt / dnsmasq is configured, so the setup is not nearly a "real" WDS
one.)

I'll report back.

Thank you again!
Laszlo


[SeaBIOS] strange behavior (regression?) with SeaBIOS + iPXE + WDSNBP.COM

2019-11-15 Thread Laszlo Ersek
Hi Michael & Lists,

I'd like to ask for ideas with the following problem we have.


(1) There is a functional iPXE + WDS setup, with iPXE built as a
traditional BIOS PCI option ROM, using CONFIG=qemu. Accordingly the
platform is qemu, with SeaBIOS, and the NIC is virtio-net-pci.

I don't know anything about the particulars of the WDS setup at this
point, only that the boot loader program it exposes is WDSNBP.COM.


(2) The setup works fine when iPXE is built at commit 4e85b2708fa0
("[virtio] Use host-specified MTU when available", 2017-01-23).


(3) When iPXE is built at commit 133f4c47baef ("[build] Handle
R_X86_64_PLT32 from binutils 2.31", 2018-09-17), the setup breaks.

The symptom is that iPXE fetches WDSNBP.COM just fine, but WDSNBP.COM,
rather than doing whatever it does otherwise, keeps PXE-booting itself
(3+ times), and finally aborts.

Consider the following log output (my understanding is that all this is
logged by WDSNBP.COM):

> Downloaded WDSNBP...
>
> Press F12 for network service boot
> Architecture: x64
> WDSNBP started using DHCP Referral.
> Contacting Server: ... (Gateway: ...)
> Contacting Server: ...
> TFTP Download: boot\x86\wdsnbp.com

This block repeats approx. 3 times, after which the following is
displayed:

> Windows Deployment Services: PXE Boot Aborted.
> Could not boot image: Error 0x7f8d8101 (http://ipxe.org/7f8d8101)
> No more network devices
>
> No bootable device

My understanding is that the first line from this last block is printed
by WDSNBP.COM, the second line by iPXE (in pxe_start_nbp()), the third
line also by iPXE, and the last one by SeaBIOS.

This seems to indicate that WDSNBP.COM exits with an error code, and
pxe_start_nbp() logs it as "Error 0x7f8d8101".


(4) Now, after a bit of searching the web, I've found the following
articles, which indicate that the WDS (= server side) setup is
incorrect:

(4a) "disable NetBios over TCPIP, on the WDS server"

  
  https://techthoughts.info/pxe-booting-wds-dhcp-scope-vs-ip-helpers/#comment-4307
  https://social.technet.microsoft.com/Forums/ie/en-US/f3883e8b-1039-477d-999d-73d9a6973fc4/wds-pxe-boot-tftp-download-loop-4-times-f12

(4b) "cover all combinations of forward and backwards slashes in
ReadFilter, on the WDS server"

   http://ipxe.org/appnote/chainload_wds#tftp_loops

However: the regression appears to be a function of *only* the git
commit at which we build iPXE. It seems so deterministic that we
bisected commit range 4e85b2708fa0..133f4c47baef. (Hence we have not
captured the network traffic yet, nor have we investigated the WDS
server config.)

The "culprit" commit is ea29122a70c6 ("[http] Include error messages for
4xx and 5xx response codes", 2017-12-28).


(5) Which makes no sense to me, unfortunately. :(

Commit ea29122a70c6 adds the "http_errors" array to the code. According
to

  src/include/ipxe/tables.h

and the build artifact

  src/bin/1af41000.rom.tmp.map

this new array is placed in a new section called

  .textdata.tbl.errortab.01

Trying to retro-fit those facts to the symptom encountered, I came up
with the idea that *maybe* the new array (or section) causes a memory
allocation failure in WDSNBP.COM -- due to increased memory footprint of
iPXE. Which then leads to the misbehavior of WDSNBP.COM.

After all, WDSNBP.COM is a 16-bit real-mode program:

  
  https://support.microsoft.com/en-us/help/4468601/pxe-boot-in-configuration-manager

so it could be susceptible to the size & fragmentation of the RAM that
is under 640KB.


(6) Unfortunately, this "low RAM exhaustion" idea doesn't seem to hold
water. There are at least two counter-arguments:

(6a) if I revert commit ea29122a70c6 on top of commit 133f4c47baef, then
the issue does *not* go away.

(The issue also does not go away if I remove the "netdev_errors" array,
also on top of commit 133f4c47baef -- that's a larger array.)

(... In theory anyway, this might not necessarily disprove the memory
exhaustion idea. What if the iPXE footprint grows so much over the
ea29122a70c6..133f4c47baef range, for independent reasons, that
reverting ea29122a70c6 at the end cannot compensate for that increase?)

(6b) I added "DEBUG=pxe_call:1" to the "make" command, and compared the
debug messages printed by pxe_start_nbp(), between 4e85b2708fa0 and
133f4c47baef. Alas, the debug messages are identical:

> PXE NBP starting with netdev net0, code 9c6c:0802, data 9cf0:2ce0

which to me suggests that there is no change in the amount of memory
that is made available to WDSNBP.COM -- its code and data continue to
start at 0x9_CEC2 and 0x9_FBE0, respectively.
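
The arithmetic behind those two addresses can be sketched in a few lines
of standalone C. In real mode, linear = segment * 16 + offset. The
function names are made up for this illustration and are not part of
iPXE.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Checks the two segment:offset pairs from the debug line above. */
void check_nbp_addresses(void)
{
    assert(real_mode_linear(0x9c6c, 0x0802) == 0x9cec2); /* code */
    assert(real_mode_linear(0x9cf0, 0x2ce0) == 0x9fbe0); /* data */
    /* Both lie below the 640KB boundary at 0xa0000. */
    assert(real_mode_linear(0x9cf0, 0x2ce0) < 0xa0000);
}
```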


Any hints as to what could be going wrong?

Thanks!
Laszlo


[SeaBIOS] Re: [PATCH] ahci: zero-initialize port struct

2019-11-13 Thread Laszlo Ersek
On 11/13/19 10:18, Gerd Hoffmann wrote:
> Specifically port->driver.lchs needs clearing, otherwise seabios will

s/driver/drive/

> try interpret whatever random crap happens to be there as disk geometry,
> which may or may not break boot depending on how lucky you are.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  src/hw/ahci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/hw/ahci.c b/src/hw/ahci.c
> index 97a072a1ca81..d45b4307ec68 100644
> --- a/src/hw/ahci.c
> +++ b/src/hw/ahci.c
> @@ -345,6 +345,7 @@ ahci_port_alloc(struct ahci_ctrl_s *ctrl, u32 pnr)
>          warn_noalloc();
>          return NULL;
>      }
> +    memset(port, 0, sizeof(*port));
>      port->pnr = pnr;
>      port->ctrl = ctrl;
>      port->list = memalign_tmp(1024, 1024);
> 

Reviewed-by: Laszlo Ersek 
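
For readers less familiar with the pattern, a minimal sketch in standard
C follows. It uses malloc() from the C library in place of the SeaBIOS
allocator, and the struct and function names are hypothetical, so treat
it as an illustration of the fix rather than the SeaBIOS code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the port and geometry structs. */
struct demo_lchs { unsigned cylinders, heads, sectors; };
struct demo_port {
    struct { struct demo_lchs lchs; } drive;
    unsigned pnr;
    void *ctrl;
};

struct demo_port *demo_port_alloc(void *ctrl, unsigned pnr)
{
    /* malloc() returns memory with indeterminate contents. */
    struct demo_port *port = malloc(sizeof(*port));
    if (!port)
        return NULL;
    /* Clear everything first, so fields not set below read as zero. */
    memset(port, 0, sizeof(*port));
    port->pnr = pnr;
    port->ctrl = ctrl;
    return port;
}

void demo_usage(void)
{
    struct demo_port *port = demo_port_alloc(NULL, 3);
    assert(port != NULL);
    /* The geometry was never assigned, yet it is reliably zero. */
    assert(port->drive.lchs.heads == 0);
    free(port);
}
```

Without the memset(), reading port->drive.lchs before assigning it would
pick up whatever the allocator happened to return.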


[SeaBIOS] Re: [PATCH] docs: Add developer-certificate-of-origin

2019-10-21 Thread Laszlo Ersek
On 10/21/19 17:31, Kevin O'Connor wrote:
> Update the documentation to be explicit about the signed-off-by
> convention.
> 
> Signed-off-by: Kevin O'Connor 
> ---
>  docs/Contributing.md |  5 
>  docs/developer-certificate-of-origin | 37 
>  2 files changed, 42 insertions(+)
>  create mode 100644 docs/developer-certificate-of-origin
> 
> diff --git a/docs/Contributing.md b/docs/Contributing.md
> index d0f2b5b..8d7 100644
> --- a/docs/Contributing.md
> +++ b/docs/Contributing.md
> @@ -18,3 +18,8 @@ submit patches. The SeaBIOS C code does follow a slightly different
>  coding style from QEMU (eg, mixed code and C99 style variable
>  declarations are encouraged, braces are not required around single
>  statement blocks), however patches in the QEMU style are acceptable.
> +
> +As with QEMU, commits should contain a "Signed-off-by" line using your
> +real name (sorry, no pseudonyms or anonymous contributions) and a
> +current email address. It indicates agreement with the terms of the
> +[developer certificate of origin](https://git.seabios.org/cgit/seabios.git/tree/docs/developer-certificate-of-origin).
> diff --git a/docs/developer-certificate-of-origin b/docs/developer-certificate-of-origin
> new file mode 100644
> index 000..8201f99
> --- /dev/null
> +++ b/docs/developer-certificate-of-origin
> @@ -0,0 +1,37 @@
> +Developer Certificate of Origin
> +Version 1.1
> +
> +Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
> +1 Letterman Drive
> +Suite D4700
> +San Francisco, CA, 94129
> +
> +Everyone is permitted to copy and distribute verbatim copies of this
> +license document, but changing it is not allowed.
> +
> +
> +Developer's Certificate of Origin 1.1
> +
> +By making a contribution to this project, I certify that:
> +
> +(a) The contribution was created in whole or in part by me and I
> +have the right to submit it under the open source license
> +indicated in the file; or
> +
> +(b) The contribution is based upon previous work that, to the best
> +of my knowledge, is covered under an appropriate open source
> +license and I have the right under that license to submit that
> +work with modifications, whether created in whole or in part
> +by me, under the same open source license (unless I am
> +permitted to submit under a different license), as indicated
> +in the file; or
> +
> +(c) The contribution was provided directly to me by some other
> +person who certified (a), (b) or (c) and I have not modified
> +it.
> +
> +(d) I understand and agree that this project and the contribution
> +are public and that a record of the contribution (including all
> +personal information I submit with it, including my sign-off) is
> +maintained indefinitely and may be redistributed consistent with
> +this project or the open source license(s) involved.
> 

(I don't mean to derail this discussion, so feel free to ignore my
comments.)

I've grown to dislike URLs, pointing into git WebUIs, that lack a commit
hash. They basically mean "look at this file at the current master HEAD"
-- but that's a moving target.

I can see two ways to fix that:

- add the DCO in a separate commit, and then hard-code the commit hash
in the next patch (the one that adds the URL to Contributing.md)

- Capture the version of the DCO (1.1) in the file name
("docs/developer-certificate-of-origin-1.1"), and update the URL
accordingly. Assuming the DCO is upgraded, or changed otherwise, at a
later point, the DCO version part in the filename should change as well.
This will at least *break* old links (i.e. when looking at the link in
an old checkout of "docs/Contributing.md"), and warn users that they
have to find the DCO themselves that matches "Contributing.md" (such as,
check out the whole tree).

I don't know if a URL format exists that says,

  look at file "docs/developer-certificate-of-origin" at the same commit
  hash at which you are looking at "docs/Contributing.md" right now

(Because that's what you normally get with a plain local "git checkout
HASH" command.)

But, again, if this feels overly cautious, feel free to ignore.
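
As an illustration of the first option, a commit-pinned link could be
built as sketched below. This assumes the cgit web UI accepts an "id"
query parameter naming the commit. The helper name is hypothetical, and
any hash passed to it in an example is a placeholder, not a real SeaBIOS
commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Builds "<repo>/tree/<path>?id=<commit>" into buf.
 * Returns 0 on success, -1 on an empty commit or if buf is too small.
 */
int pinned_tree_url(char *buf, size_t len, const char *repo,
                    const char *path, const char *commit)
{
    int n;

    assert(buf && repo && path && commit);
    if (strlen(commit) == 0)
        return -1;
    n = snprintf(buf, len, "%s/tree/%s?id=%s", repo, path, commit);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Called with the repository base "https://git.seabios.org/cgit/seabios.git",
the path "docs/developer-certificate-of-origin" and a commit hash, this
yields a link that keeps pointing at the same file contents even after
master moves on. It does not answer the question of a link relative to
"the commit you are looking at right now".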

Thanks
Laszlo


[SeaBIOS] Re: [PATCH v7 7/8] bootdevice: FW_CFG interface for LCHS values

2019-10-16 Thread Laszlo Ersek
Hi Sam,

On 10/16/19 13:02, Sam Eiderman wrote:
> Gentle Ping,
>
> Philippe, John?
>
> Just wondering if the series is okay, as Gerd pointed out this series
> is a blocker for the corresponding changes in SeaBIOS for v 1.13

The QEMU series is still not merged, due to a bug in the last patch
(namely, the test case, "hd-geo-test: Add tests for lchs override").

To my knowledge, SeaBIOS prefers to merge patches with the underlying
QEMU patches merged first, so you'll likely have to fix that QEMU issue
first.

I explained the bug in the QEMU test case here:

  http://mid.mail-archive.com/6b00dc74-7267-8ce8-3271-5db269edb1b7@redhat.com
  http://mid.mail-archive.com/700cd594-1446-e478-fb03-d2e6b862dc6c@redhat.com

(Alternative links to the same:

  https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg01790.html
  https://lists.gnu.org/archive/html/qemu-devel/2019-10/msg01793.html
)

I've never received feedback to those messages, and I think you must
have missed them.

FWIW, when I hit "Reply All" in that thread, you were on the "To:" list
with:

  Sam Eiderman 

but here you are present with

  Sam Eiderman 

In addition, when I posted those messages, I got the following
auto-response ("Undelivered Mail Returned to Sender"):

> This is the mail system at host mx1.redhat.com.
>
> I'm sorry to have to inform you that your message could not
> be delivered to one or more recipients. It's attached below.
>
> For further assistance, please send mail to postmaster.
>
> If you do so, please include this problem report. You can
> delete your own text from the attached returned message.
>
>The mail system
>
> : host
> aserp2030.oracle.com[141.146.126.74] said:
> 550 5.1.1 Unknown recipient address. (in reply to RCPT TO command)

I didn't know your new address, so I could only hope you'd find my
feedback on qemu-devel.

Thanks
Laszlo


[SeaBIOS] Re: [Qemu-block] [QEMU] [PATCH v5 0/8] Add Qemu to SeaBIOS LCHS interface

2019-07-25 Thread Laszlo Ersek
On 07/25/19 02:50, John Snow wrote:
> 
> 
> On 7/24/19 8:47 PM, John Snow wrote:
>>
>>
>> On 7/19/19 6:10 AM, Sam Eiderman wrote:
>>> Well, this patch introduces 3 command line parameters (“lcyls”, “lheads”, 
>>> “lsecs”)
>>> to “scsi-hd” “ide-hd” and “virtio-pci-blk” so this somehow has something to 
>>> do with
>>> block.
>>>
>>> This patch also adds fw_cfg interface to send these parameters to SeaBIOS.
>>>
>>> "scripts/get_maintainer.pl -f hw/nvram/fw_cfg.c” gives
>>>
>>> "Philippe Mathieu-Daudé"  (supporter:Firmware 
>>> configur...)
>>> Laszlo Ersek  (reviewer:Firmware configur...)
>>> Gerd Hoffmann  (reviewer:Firmware configur…)
>>>
>>> And this was already Reviewed-by Gerd.
>>>
>>> How should I proceed?
>>>
>>> Sam
>>>
>>
>> I feel like it would be up to Gerd as the general SeaBIOS point of contact?
>>
> 
> ...ah, who is offline for vacation.
> 
> We're in freeze right now anyway, so I would think that Gerd and/or
> Kevin can work out who ought to stage this for a PR when the tree opens
> again.
> 

I think the sole patch in the series that modifies "hw/nvram/fw_cfg.c" is

 [Qemu-devel] [QEMU] [PATCH v5 7/8] bootdevice: FW_CFG interface for LCHS values
  http://mid.mail-archive.com/20190626123948.10199-8-shmuel.eiderman@oracle.com

and neither Phil nor myself seem to be CC'd on it (I've found the message in my 
list folder only).

Regarding fw_cfg, I only review Phil's fw_cfg patches (so that whenever he 
posts patches, he can count on my review); other than that, I generally skip 
fw_cfg patches. And, I totally don't have a tree for collecting such patches.

Now, while Phil does:

  T: git https://github.com/philmd/qemu.git fw_cfg-next

I still don't think that tree would be the best for queueing this series, given 
the diffstat:

 bootdevice.c             | 148 +---
 hw/block/virtio-blk.c    |   6 +
 hw/ide/qdev.c            |   7 +-
 hw/nvram/fw_cfg.c        |  14 +-
 hw/scsi/scsi-bus.c       |  15 ++
 hw/scsi/scsi-disk.c      |  14 ++
 include/hw/block/block.h |  22 +-
 include/hw/scsi/scsi.h   |   1 +
 include/sysemu/sysemu.h  |   4 +
 tests/Makefile.include   |   2 +-
 tests/hd-geo-test.c      | 582 +++
 11 files changed, 774 insertions(+), 41 deletions(-)

Just my two cents.

Thanks
Laszlo


[SeaBIOS] Re: [Qemu-devel] [QEMU] [PATCH 7/8] bootdevice: FW_CFG interface for LCHS values

2019-06-12 Thread Laszlo Ersek
On 06/12/19 11:42, Sam Eiderman wrote:
> Using fw_cfg, supply logical CHS values directly from QEMU to the BIOS.
> 
> Non-standard logical geometries break under QEMU.
> 
> A virtual disk which contains an operating system which depends on
> logical geometries (consistent values being reported from BIOS INT13
> AH=08) will most likely break under QEMU/SeaBIOS if it has non-standard
> logical geometries - for example 56 SPT (sectors per track).
> No matter what QEMU will report - SeaBIOS, for large enough disks - will
> use LBA translation, which will report 63 SPT instead.
> 
> In addition we cannot force SeaBIOS to rely on physical geometries at
> all. A virtio-blk-pci virtual disk with 255 physical heads cannot
> report more than 16 physical heads when moved to an IDE controller,
> since the ATA spec allows a maximum of 16 heads - this is an artifact of
> virtualization.
> 
> By supplying the logical geometries directly we are able to support such
> "exotic" disks.
> 
> We serialize this information in a similar way to the "bootorder"
> interface.
> The fw_cfg entry is "bootdevices" and it serializes a struct.
> At the moment the struct holds the values of logical CHS values but it
> can be expanded easily due to the extendable ABI implemented.
> 
> (In the future, we can pass the bootindex through "bootdevices" instead
> "bootorder" - unifying all bootdevice information in one fw_cfg value)

I would disagree with that. UEFI guest firmware doesn't seem to have any
use for this new type of information ("logical CHS values"), so the
current interface (the "bootorder" fw_cfg file) should continue to work.
The ArmVirtQemu and OVMF platform firmwares (built from the edk2
project, and bundled with QEMU 4.1+) implement some serious parsing and
processing for "bootorder".

Independently, another comment:

> The PV interface through fw_cfg could have also been implemented using
> device specific keys, e.g.: "/etc/bootdevice/%s/logical_geometry" where
> %s is the device name QEMU produces - but this implementation would
> require much more code refactoring, both in QEMU and SeaBIOS, so the
> current implementation was chosen.
> 
> Reviewed-by: Karl Heubaum 
> Reviewed-by: Arbel Moshe 
> Signed-off-by: Sam Eiderman 
> ---
>  bootdevice.c| 42 ++
>  hw/nvram/fw_cfg.c   | 14 +++---
>  include/sysemu/sysemu.h |  1 +
>  3 files changed, 54 insertions(+), 3 deletions(-)
> 
> diff --git a/bootdevice.c b/bootdevice.c
> index 2b12fb85a4..84c2a83f25 100644
> --- a/bootdevice.c
> +++ b/bootdevice.c
> @@ -405,3 +405,45 @@ void del_boot_device_lchs(DeviceState *dev, const char *suffix)
>  }
>  }
>  }
> +
> +typedef struct QEMU_PACKED BootDeviceEntrySerialized {
> +/* Do not change field order - add new fields below */
> +uint32_t lcyls;
> +uint32_t lheads;
> +uint32_t lsecs;
> +} BootDeviceEntrySerialized;
> +
> +/* Serialized as: struct size (4) + (device name\0 + device struct) x devices */
> +char *get_boot_devices_info(size_t *size)
> +{
> +FWLCHSEntry *i;
> +BootDeviceEntrySerialized s;
> +size_t total = 0;
> +char *list = NULL;
> +
> +list = g_malloc0(sizeof(uint32_t));
> +*((uint32_t *)list) = (uint32_t)sizeof(s);
> +total = sizeof(uint32_t);
> +
> +QTAILQ_FOREACH(i, &fw_lchs, link) {
> +char *bootpath;
> +size_t len;
> +
> +bootpath = get_boot_device_path(i->dev, false, i->suffix);
> +s.lcyls = i->lcyls;
> +s.lheads = i->lheads;
> +s.lsecs = i->lsecs;

You should document the endianness of the fields in
BootDeviceEntrySerialized, and then call byte order conversion functions
here accordingly (most probably cpu_to_le32()).

As written, this code would break if you ran qemu-system-x86_64 /
qemu-system-i386 (with TCG acceleration) on a big endian host.
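
A minimal sketch of the idea in standalone C follows. The helper names
(put_le32, get_le32, serialize_lchs) are hypothetical. In QEMU itself
one would presumably call cpu_to_le32() on each field instead of
open-coding the byte stores. The point is only that the bytes written
into the fw_cfg blob have one documented order, whatever the host is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores v at dst in little-endian order, independent of host order. */
void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
    dst[2] = (uint8_t)((v >> 16) & 0xff);
    dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Reads a little-endian 32-bit value, as a consumer of the blob would. */
uint32_t get_le32(const uint8_t *src)
{
    return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Writes lcyls, lheads, lsecs as three LE32 fields; returns bytes used. */
size_t serialize_lchs(uint8_t *out, uint32_t lcyls, uint32_t lheads,
                      uint32_t lsecs)
{
    assert(out != NULL);
    put_le32(out + 0, lcyls);
    put_le32(out + 4, lheads);
    put_le32(out + 8, lsecs);
    return 12;
}
```

With a plain memcpy() of the host-order struct, a big endian host would
emit the same fields byte-swapped, and a little endian guest firmware
would read garbage geometry.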

Thanks
Laszlo

> +
> +len = strlen(bootpath) + 1;
> +list = g_realloc(list, total + len + sizeof(s));
> +memcpy(&list[total], bootpath, len);
> +memcpy(&list[total + len], &s, sizeof(s));
> +total += len + sizeof(s);
> +
> +g_free(bootpath);
> +}
> +
> +*size = total;
> +
> +return list;
> +}
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 9f7b7789bc..008b21542f 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -916,13 +916,21 @@ void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
>  
>  static void fw_cfg_machine_reset(void *opaque)
>  {
> +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +FWCfgState *s = opaque;
>  void *ptr;
>  size_t len;
> -FWCfgState *s = opaque;
> -char *bootindex = get_boot_devices_list(&len);
> +char *buf;
>  
> -ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
> +buf = get_boot_devices_list(&len);
> +ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)buf, len);
>  g_free(ptr);
> +
> +if (!mc->legacy_fw_cfg_order) {
> 

[SeaBIOS] Re: Mailing list update

2019-01-22 Thread Laszlo Ersek
On 01/21/19 19:19, Kevin O'Connor wrote:
> On Mon, Jan 21, 2019 at 07:03:57PM +0100, Laszlo Ersek wrote:
>> On 01/20/19 18:07, Kevin O'Connor wrote:
>>> On Thu, Jan 10, 2019 at 07:14:14PM +0100, Laszlo Ersek wrote:
>>>>   https://www.mail-archive.com/seabios@seabios.org/
>>>>   https://www.mail-archive.com/seabios@seabios.org/info.html
>>>>
>>>> So, I'd suggest incorporating these links (as secondary archive URLs)
>>>> into the wiki article as well.
>>>
>>> Thanks.  I made the suggested changes (commit d62ca8c9).
>>
>> Thank you, Kevin!
>>
>> While the article at <https://www.seabios.org/Mailinglist> looks good,
>> the third link, in:
>>
>>   Messages prior to January 2019 are archived at:
>>   http://www.seabios.org/mailman/listinfo/seabios/
>>
>> is still broken; it is redirected to
>> <https://mail.coreboot.org/mailman/listinfo/seabios/>, and there I get
>>
>>   Page not found
>>   This page either doesn't exist, or it moved somewhere else.
> 
> Oops.  Thanks for pointing it out.  Should be fixed now.

It is, thanks!
Laszlo
___
SeaBIOS mailing list -- seabios@seabios.org
To unsubscribe send an email to seabios-le...@seabios.org


[SeaBIOS] Re: Mailing list update

2019-01-21 Thread Laszlo Ersek
On 01/20/19 18:07, Kevin O'Connor wrote:
> On Thu, Jan 10, 2019 at 07:14:14PM +0100, Laszlo Ersek wrote:
>>   https://www.mail-archive.com/seabios@seabios.org/
>>   https://www.mail-archive.com/seabios@seabios.org/info.html
>>
>> So, I'd suggest incorporating these links (as secondary archive URLs)
>> into the wiki article as well.
> 
> Thanks.  I made the suggested changes (commit d62ca8c9).

Thank you, Kevin!

While the article at <https://www.seabios.org/Mailinglist> looks good,
the third link, in:

  Messages prior to January 2019 are archived at:
  http://www.seabios.org/mailman/listinfo/seabios/

is still broken; it is redirected to
<https://mail.coreboot.org/mailman/listinfo/seabios/>, and there I get

  Page not found
  This page either doesn't exist, or it moved somewhere else.

Thanks!
Laszlo


[SeaBIOS] Re: Mailing list update

2019-01-10 Thread Laszlo Ersek
On 01/10/19 12:10, Patrick Georgi wrote:
> Hi Kevin, Laszlo,
> 
> On 9 January 2019 at 19:38, "Kevin O'Connor" wrote:
>> Patrick, can you confirm that we need to update the above links?
> Yes, the links changed because the entire architecture changed.
> 
> I kept pipermail as a read-only copy so old links into the archive keep 
> working (as far as possible: pipermail links are notoriously unreliable).
> 
>> Also, a few people asked if we could support the format of the older
>> mail archives - is that possible?
> It seems that mailman3 allows multiple message archive systems, with 
> hyperkitty being the only one supported right now.
> 
> A pipermail-style backend that creates a plain html archive would certainly 
> be a welcome addition, as Laszlo definitely isn't the only "dinosaur" (I'd 
> propose some changes though: for example basing the URL on the hashed 
> message-id, like hyperkitty does, instead of pipermail's plain counter, 
> improves URL stability), but I'm not sure if my python-fu is up to the levels 
> to quickly build something like that.
> 
> Once it exists, I'll happily integrate it, though.

Message-Id-based archive URLs are the best. GMANE used to support that,
but GMANE is dead. mail-archive.com and public-inbox.org also support
Message-Id-based archive URLs, and I think we could simply subscribe at
least mail-archive.com's agent to the seabios list:

  https://www.mail-archive.com/faq.html#newlist

Of course, it would be best if the list's own hosting provided the
MID-based search / URLs too.

... Wait, am I dumb? Yeah, I'm dumb. Because, mail-archive.com is
already subscribed to the seabios list:

  https://www.mail-archive.com/seabios@seabios.org/
  https://www.mail-archive.com/seabios@seabios.org/info.html

So, I'd suggest incorporating these links (as secondary archive URLs)
into the wiki article as well.

And, for example, here's a Message-ID based URL pointing to the initial
message of the current thread:

  http://mid.mail-archive.com/20190108225218.GA13236@morn.lan

Thanks!
Laszlo
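
For illustration, building such a Message-ID based URL can be sketched in C as follows. The URL prefix is taken from the example link in this message; mid_url() is a hypothetical helper and not part of any existing tool, and the sketch does no URL escaping of unusual characters in the ID.

```c
#include <stdio.h>
#include <string.h>

/* Assumed URL prefix, copied from the example link in the message. */
#define MID_PREFIX "http://mid.mail-archive.com/"

/*
 * Build a Message-ID based archive URL into 'out'.
 * Accepts the ID with or without the surrounding angle brackets.
 * Returns 0 on success, -1 if the ID is empty or the buffer is too small.
 * (mid_url() is a hypothetical helper used only for this sketch.)
 */
static int mid_url(const char *msgid, char *out, size_t outlen)
{
    size_t len = strlen(msgid);

    /* Strip "<...>" if present. */
    if (len >= 2 && msgid[0] == '<' && msgid[len - 1] == '>') {
        msgid++;
        len -= 2;
    }
    if (len == 0)
        return -1;

    int n = snprintf(out, outlen, "%s%.*s", MID_PREFIX, (int)len, msgid);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called with the Message-ID of the initial message of this thread, the helper reproduces the link shown above.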


[SeaBIOS] Re: Mailing list update

2019-01-09 Thread Laszlo Ersek
On 01/08/19 23:52, Kevin O'Connor wrote:
> FYI, the SeaBIOS mailing list backend was recently updated (upgrade to
> Mailman3).  Service should not be impacted, but if anyone does
> experience an issue then please let me know.

Indeed, I ran into a problem with this recently. I wanted to reference
an earlier SeaBIOS thread -- that I do have locally -- on edk2-devel,
but it took too much effort until I found the right archive URL. Namely:

(1) The Wiki article at

  https://www.seabios.org/Mailinglist

is now out of date; the link
 is stale. I think it
should be
 instead.

(2) The other link in the same article,
, while functional, points to
an archive that does not seem to include messages past December 2018
(such as your message I'm responding to). We should probably add
 as well.

(3) Postorius, the new web interface [*], is an atrocity. But, I'm a
dinosaur, so do ignore this complaint...

[*] I used to think that the new (mailman3) WebUI was called HyperKitty,
but apparently HyperKitty is a back-end component:
.

Thanks,
Laszlo


Re: [SeaBIOS] [PATCH] optionrom: disallow int19 redirect for pnp roms.

2018-11-28 Thread Laszlo Ersek
On 11/28/18 19:33, Kevin O'Connor wrote:
> On Wed, Nov 28, 2018 at 06:50:50PM +0100, Laszlo Ersek wrote:
>> On 11/28/18 16:51, Kevin O'Connor wrote:
>>> If we could do it safely that would be fine.  My fear is that it
>>> introduces a regression.  A new config option would be okay, but it
>>> doesn't sound like that will help, as it seems that once one narrows
>>> down the problem to a bad behaving optionrom, one could just as easily
>>> block that optionrom instead..
>>
>> Do you mean that a "blacklist" should be added (a static array of
>> checksums, of known-bad ROM images)?
> 
> If I understand the bugzilla report correctly, it would be possible to
> avoid this issue by using  in libvirt.  It appears the
> issue is identifying the problem and then there are further issues
> with changing that config.
> 
> Implementing a default blacklist is a thought that I had.  If we feel
> the software we control is working as intended and it is the optionrom
> that is broken, then perhaps the focus should be on not running that
> optionrom.  (Effectively changing the default to run only known good
> optionroms on pci passthrough.)  I don't think SeaBIOS would be the
> place to maintain a blacklist/whitelist though, so it's an easy
> proposal for me to make..  I understand if it is not viable.

So, if I understand correctly:

- doing something generic in SeaBIOS is too risky / heavy-handed,
- discriminating individual oproms in SeaBIOS is out of scope.

Well... If we put it like that, I can't say that I disagree. I'll try to
carry this over to the RHBZ.

Thanks,
Laszlo

___
SeaBIOS mailing list
SeaBIOS@seabios.org
https://mail.coreboot.org/mailman/listinfo/seabios


Re: [SeaBIOS] [PATCH] SMBIOS: Add SMBIOS Type 6 Memory Module Information

2018-11-28 Thread Laszlo Ersek
On 11/28/18 16:54, Liran Alon wrote:
> From: Arbel Moshe 
> 
> Add support for obsolete SMBIOS Type 6 which describes the speed, type,
> size and error status of each system memory module.
> 
> This is required by some guests to boot successfully.
> 
> Such an example is Cisco NGFW appliance which has a script which
> runs every boot that parses this SMBIOS Type 6 information and if
> it doesn't exist, it just fails to boot with an error of
> "Unable to parse dmidecode. Restarting now!".
> 
> Reviewed-by: Liran Alon 
> Reviewed-by: Ross Philipson 
> Signed-off-by: Arbel Moshe 
> ---
>  src/fw/smbios.c  | 46 ++

This file starts with the following two lines:

// smbios table generation (on emulators)
// DO NOT ADD NEW FEATURES HERE.  (See paravirt.c / biostables.c instead.)

Please consider extending the SMBIOS generator in QEMU instead.

Thanks
Laszlo



Re: [SeaBIOS] [PATCH] optionrom: disallow int19 redirect for pnp roms.

2018-11-28 Thread Laszlo Ersek
On 11/28/18 16:51, Kevin O'Connor wrote:
> On Wed, Nov 28, 2018 at 12:14:07PM +0100, Laszlo Ersek wrote:
>> Right. Before I raised my short question about *not* short-circuiting
>> get_pnp_rom() with "isvga" set, I had read through the BZ, and I was
>> *very* tempted to say "this is what's wrong with our industry". :) The
>> oprom in question is mind-bogglingly broken, from the discussion /
>> analysis in the BZ.
>>
>> (I'm sure somewhere deep in an internal bug tracking system at the card
>> vendor there is a ticket about some broken platform BIOS where the BEV
>> wouldn't work, and they had to hook Int19h.)
> 
> Right - fundamental to X86 booting is the idea that firmware
> developers write code, PC manufacturers write code, peripheral
> manufacturers write code, and only users test all the code together.
> It's a broken workflow.  It's been nearly 40 years and X86 is still
> stuck in this broken workflow.
> 
>>>> I'm leery of making a change like this, because there's a good chance
>>>> it will break something in some other obscure software.
>>>
>>> I've added a rather verbose message printing some information about the
>>> rom because of that.
>>>
>>>> I think fixing this in iPXE would be preferable if possible.
>>>
>>> See above. ipxe doesn't need fixing.
>>
>> I support the addition of this "safety code", and I tend to agree (with
>> the BZ discussion) that making it *dynamically* configurable could be
>> difficult and/or overkill.
>>
>> Kevin, would you feel easier about the Int19h vector restoration if it
>> were controlled by a new, static, config knob?
> 
> If we could do it safely that would be fine.  My fear is that it
> introduces a regression.  A new config option would be okay, but it
> doesn't sound like that will help, as it seems that once one narrows
> down the problem to a bad behaving optionrom, one could just as easily
> block that optionrom instead..

Do you mean that a "blacklist" should be added (a static array of
checksums, of known-bad ROM images)?

Thanks,
Laszlo

> I'm not sure what the best choice is.
> 
> -Kevin
> 
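
To make the question concrete, a blacklist of the kind asked about (a static array of checksums of known-bad ROM images) could be sketched as below. This is only an illustration under assumptions: nothing like it exists in SeaBIOS, the FNV-1a hash is chosen purely for brevity, and rom_hash() and rom_is_blacklisted() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash; picked only because it is short. A real list
 * would presumably want a stronger digest. */
static uint32_t rom_hash(const uint8_t *rom, size_t len)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= rom[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/*
 * Return 1 if the hash of the ROM image appears in the 'bad' list,
 * 0 otherwise. The list would be a static array of known-bad hashes.
 */
static int rom_is_blacklisted(const uint8_t *rom, size_t len,
                              const uint32_t *bad, size_t nbad)
{
    uint32_t h = rom_hash(rom, len);

    for (size_t i = 0; i < nbad; i++) {
        if (bad[i] == h)
            return 1;
    }
    return 0;
}
```

The lookup itself is trivial; the open question in the thread is who would maintain the list of hashes, not how to consult it.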




Re: [SeaBIOS] [PATCH] optionrom: disallow int19 redirect for pnp roms.

2018-11-28 Thread Laszlo Ersek
On 11/28/18 07:24, Gerd Hoffmann wrote:
> On Tue, Nov 27, 2018 at 09:19:09PM -0500, Kevin O'Connor wrote:
>> On Tue, Nov 27, 2018 at 01:10:38PM +0100, Gerd Hoffmann wrote:
>>> Check whether pnp roms attempt to redirect int19, and if one does,
>>> log a message and undo the redirect.
>>>
>>> A pnp rom should not need this, we have BEVs and BCVs for that.
>>> Nevertheless there are roms in the wild which are redirecting int19.
>>> At least some BIOS implementations for physical hardware have a config
>>> option in the setup to allow/disallow int19 redirections, so just not
>>> allowing this seems to be the way to deal with this situation.
>>>
>>> Buglink: https://bugzilla.redhat.com//show_bug.cgi?id=1642135
>>
>> That is very odd.  I'm pretty sure iPXE normally does register itself
>> as a BEV - any idea why it's now hooking int19?
> 
> It's not ipxe.
> 
> It is the rom of an Intel NIC, attached to a guest via pci passthrough.
> It does both, register bev and hook int19.  No clue why.  The only
> reason I can think of is backward compatibility to firmware so old that
> it doesn't know pnp roms.  Which is a silly thing in pci express
> hardware.  Maybe they have carried that code forward for decades ...

Right. Before I raised my short question about *not* short-circuiting
get_pnp_rom() with "isvga" set, I had read through the BZ, and I was
*very* tempted to say "this is what's wrong with our industry". :) The
oprom in question is mind-bogglingly broken, from the discussion /
analysis in the BZ.

(I'm sure somewhere deep in an internal bug tracking system at the card
vendor there is a ticket about some broken platform BIOS where the BEV
wouldn't work, and they had to hook Int19h.)

>> I'm leery of making a change like this, because there's a good chance
>> it will break something in some other obscure software.
> 
> I've added a rather verbose message printing some information about the
> rom because of that.
> 
>> I think fixing this in iPXE would be preferable if possible.
> 
> See above. ipxe doesn't need fixing.

I support the addition of this "safety code", and I tend to agree (with
the BZ discussion) that making it *dynamically* configurable could be
difficult and/or overkill.

Kevin, would you feel easier about the Int19h vector restoration if it
were controlled by a new, static, config knob?

Thanks,
Laszlo



Re: [SeaBIOS] [PATCH] optionrom: disallow int19 redirect for pnp roms.

2018-11-27 Thread Laszlo Ersek
On 11/27/18 13:10, Gerd Hoffmann wrote:
> Check whether pnp roms attempt to redirect int19, and if one does,
> log a message and undo the redirect.
> 
> A pnp rom should not need this, we have BEVs and BCVs for that.
> Nevertheless there are roms in the wild which are redirecting int19.
> At least some BIOS implementations for physical hardware have a config
> option in the setup to allow/disallow int19 redirections, so just not
> allowing this seems to be the way to deal with this situation.
> 
> Buglink: https://bugzilla.redhat.com//show_bug.cgi?id=1642135
> Signed-off-by: Gerd Hoffmann 
> ---
>  src/optionroms.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/src/optionroms.c b/src/optionroms.c
> index fc992f649f..4ec5504ca9 100644
> --- a/src/optionroms.c
> +++ b/src/optionroms.c
> @@ -8,6 +8,7 @@
>  #include "bregs.h" // struct bregs
>  #include "config.h" // CONFIG_*
>  #include "farptr.h" // FLATPTR_TO_SEG
> +#include "biosvar.h" // GET_IVT
>  #include "hw/pci.h" // pci_config_readl
>  #include "hw/pcidevice.h" // foreachpci
>  #include "hw/pci_ids.h" // PCI_CLASS_DISPLAY_VGA
> @@ -136,9 +137,24 @@ init_optionrom(struct rom_header *rom, u16 bdf, int isvga)
>  
>  tpm_option_rom(newrom, rom->size * 512);
>  
> -if (isvga || get_pnp_rom(newrom))
> +struct pnp_data *pnp = get_pnp_rom(newrom);
> +if (isvga || pnp) {
> +struct segoff_s old19, new19;
>  // Only init vga and PnP roms here.
> +old19 = GET_IVT(0x19);
>  callrom(newrom, bdf);
> +new19 = GET_IVT(0x19);
> +if (old19.seg != new19.seg ||
> +old19.offset != new19.offset) {
> +dprintf(1, "WARNING! rom tried to hijack int19 "
> +"(vec %04x:%04x, pnp %s, bev %s, bvc %s)\n",
> +new19.seg, new19.offset,
> +pnp ? "yes" : "no",
> +pnp && pnp->bev ? "yes" : "no",
> +pnp && pnp->bcv ? "yes" : "no");
> +SET_IVT(0x19, old19);
> +}
> +}
>  
>  return rom_confirm(newrom->size * 512);
>  }
> 

Is it OK to call get_pnp_rom() when "isvga" is set?

Thanks
Laszlo
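
For readers following along, the save / compare / restore pattern that the patch wraps around callrom() can be sketched in isolation as below. The interrupt vector table is simulated by a plain array and every name in the sketch is hypothetical; it does not address the question above about calling get_pnp_rom() for VGA ROMs, which the patch now does unconditionally.

```c
#include <stdint.h>

/* Real-mode far pointer, mirroring the segment:offset pair in the patch. */
struct segoff {
    uint16_t offset;
    uint16_t seg;
};

/* Stand-in for the 256-entry interrupt vector table. */
static struct segoff ivt[256];

typedef void (*rom_init_fn)(void);

/*
 * Run a ROM's init entry point and undo any change it made to the
 * given vector. Returns 1 if a redirect was detected and reverted.
 */
static int call_rom_guarded(rom_init_fn init, uint8_t vector)
{
    struct segoff saved = ivt[vector];

    init();

    if (ivt[vector].seg != saved.seg || ivt[vector].offset != saved.offset) {
        ivt[vector] = saved;   /* restore the original handler */
        return 1;
    }
    return 0;
}

/* Two toy "ROMs": one well behaved, one that hooks int 19h. */
static void rom_ok(void)
{
}

static void rom_hooks_int19(void)
{
    ivt[0x19].seg = 0xc800;
    ivt[0x19].offset = 0x0123;
}
```

A static config knob, as discussed elsewhere in this thread, would simply make the restore step conditional at build time.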



Re: [SeaBIOS] [PATCH] virtio-blk/scsi: enable multi-queues support when starting device

2018-09-26 Thread Laszlo Ersek
On 09/26/18 10:16, Liu, Changpeng wrote:
> I posted the patch again, because I didn't get any response since several 
> months ago... :).

Indeed, you didn't receive any comments under that (July) posting,
regrettably:

http://mid.mail-archive.com/1531201226-4099-1-git-send-email-changpeng.liu@intel.com

However, prior to that, the topic had been discussed several times. QEMU
commit fb20fbb76 is correct, and the idea behind the present patch is wrong.

Obviously I'm not a SeaBIOS maintainer, so take my opinion for what it's
worth. However, I will definitely not accept a similar patch for OVMF --
it was tried, and I rejected it:

https://lists.01.org/pipermail/edk2-devel/2017-December/019131.html

(In fact, although the author of the OVMF posting and the author of the
QEMU patch were different persons, and it's also unclear whether they
worked for the same organization, I suspect that the QEMU patch was
actually the direct result of the OVMF discussion.)

>> -Original Message-
>> From: Liu, Changpeng
>> Sent: Wednesday, September 26, 2018 4:24 PM
>> To: seabios@seabios.org
>> Cc: stefa...@redhat.com; Liu, Changpeng ; Harris,
>> James R ; Zedlewski, Piotr
>> ; marcandre.lur...@redhat.com
>> Subject: [PATCH] virtio-blk/scsi: enable multi-queues support when starting
>> device
>>
>> QEMU will not start all the queues since commit fb20fbb76
>> "vhost: avoid to start/stop virtqueue which is not ready",
>> because seabios only use one queue when starting, this will
>> not work for some vhost slave targets which expect the exact
>> number of queues defined in virtio-pci configuration space,

This expectation that you spell out in the commit message is what's
wrong, in those "vhost slave targets".

Thanks
Laszlo
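
As an illustration of the alternative, a backend that tolerates a driver using fewer queues than the device advertises might look like the sketch below. The structures and names are hypothetical simplifications invented for this example; they are not taken from QEMU or from any vhost implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_QUEUES 8   /* arbitrary bound for this sketch */

/* Hypothetical, simplified per-queue state as seen by a backend. */
struct vq_state {
    bool enabled;   /* the driver set up and enabled this queue */
    bool started;   /* the backend is processing this queue */
};

struct vdev {
    size_t num_queues;              /* queues advertised by the device */
    struct vq_state vq[MAX_QUEUES];
};

/*
 * Start only the queues the driver enabled and return how many were
 * started. A backend written this way keeps working when the firmware
 * driver uses a single queue.
 */
static size_t backend_start(struct vdev *dev)
{
    size_t started = 0;

    for (size_t i = 0; i < dev->num_queues && i < MAX_QUEUES; i++) {
        if (!dev->vq[i].enabled)
            continue;               /* unused queue: leave it alone */
        dev->vq[i].started = true;
        started++;
    }
    return started;
}
```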

>> while here, we also enable those queues in the BIOS phase.
>>
>> Signed-off-by: Changpeng Liu 
>> ---
>>  src/hw/virtio-blk.c  | 26 +++---
>>  src/hw/virtio-ring.h |  1 +
>>  src/hw/virtio-scsi.c | 28 +++-
>>  3 files changed, 39 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
>> index 88d7e54..79638ec 100644
>> --- a/src/hw/virtio-blk.c
>> +++ b/src/hw/virtio-blk.c
>> @@ -25,7 +25,7 @@
>>
>>  struct virtiodrive_s {
>>  struct drive_s drive;
>> -struct vring_virtqueue *vq;
>> +struct vring_virtqueue *vq[MAX_NUM_QUEUES];
>>  struct vp_device vp;
>>  };
>>
>> @@ -34,7 +34,7 @@ virtio_blk_op(struct disk_op_s *op, int write)
>>  {
>>  struct virtiodrive_s *vdrive =
>>  container_of(op->drive_fl, struct virtiodrive_s, drive);
>> -struct vring_virtqueue *vq = vdrive->vq;
>> +struct vring_virtqueue *vq = vdrive->vq[0];
>>  struct virtio_blk_outhdr hdr = {
>>  .type = write ? VIRTIO_BLK_T_OUT : VIRTIO_BLK_T_IN,
>>  .ioprio = 0,
>> @@ -96,6 +96,7 @@ virtio_blk_process_op(struct disk_op_s *op)
>>  static void
>>  init_virtio_blk(void *data)
>>  {
>> +u32 i, num_queues = 1;
>>  struct pci_device *pci = data;
>>  u8 status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
>>  dprintf(1, "found virtio-blk at %pP\n", pci);
>> @@ -109,10 +110,6 @@ init_virtio_blk(void *data)
>>  vdrive->drive.cntl_id = pci->bdf;
>>
>>  vp_init_simple(&vdrive->vp, pci);
>> -if (vp_find_vq(&vdrive->vp, 0, &vdrive->vq) < 0 ) {
>> -dprintf(1, "fail to find vq for virtio-blk %pP\n", pci);
>> -goto fail;
>> -}
>>
>>  if (vdrive->vp.use_modern) {
>>  struct vp_device *vp = &vdrive->vp;
>> @@ -156,6 +153,11 @@ init_virtio_blk(void *data)
>>  vp_read(&vp->device, struct virtio_blk_config, heads);
>>  vdrive->drive.pchs.sector =
>>  vp_read(&vp->device, struct virtio_blk_config, sectors);
>> +
>> +num_queues = vp_read(&vp->common, virtio_pci_common_cfg, num_queues);
>> +if (num_queues < 1 || num_queues > MAX_NUM_QUEUES) {
>> + num_queues = 1;
>> +}
>>  } else {
>>  struct virtio_blk_config cfg;
>>  vp_get_legacy(&vdrive->vp, 0, &cfg, sizeof(cfg));
>> @@ -178,6 +180,13 @@ init_virtio_blk(void *data)
>>  vdrive->drive.pchs.sector = cfg.sectors;
>>  }
>>
>> +for (i = 0; i < num_queues; i++) {
>> +if (vp_find_vq(&vdrive->vp, i, &vdrive->vq[i]) < 0 ) {
>> +dprintf(1, "fail to find vq %u for virtio-blk %pP\n", i, pci);
>> +goto fail_vq;
>> +}
>> +}
>> +
>>  char *desc = znprintf(MAXDESCSIZE, "Virtio disk PCI:%pP", pci);
>>  boot_add_hd(&vdrive->drive, desc, bootprio_find_pci_device(pci));
>>
>> @@ -185,9 +194,12 @@ init_virtio_blk(void *data)
>>  vp_set_status(&vdrive->vp, status);
>>  return;
>>
>> +fail_vq:
>> +for (i = 0; i < num_queues; i++) {
>> +free(vdrive->vq[i]);
>> +}
>>  fail:
>>  vp_reset(&vdrive->vp);
>> -free(vdrive->vq);
>>  free(vdrive);
>>  }
>>
>> diff --git a/src/hw/virtio-ring.h b/src/hw/virtio-ring.h
>> index 8604a01..3c8a2d1 100644
>> --- a/src/hw/virtio-ring.h

Re: [SeaBIOS] [RFC v3] pciinit: setup mcfg for pxb-pcie to support multiple pci domains

2018-09-26 Thread Laszlo Ersek
On 09/26/18 06:44, Gerd Hoffmann wrote:
>   Hi,
> 
>> Second, the v5 RFC doesn't actually address the alleged bus number
>> shortage. IIUC, it supports a low number of ECAM ranges under 4GB, but
>> those are (individually) limited in the bus number ranges they can
>> accommodate (due to 32-bit address space shortage). So more or less the
>> current approach just fragments the bus number space we already have, to
>> multiple domains.
> 
> Haven't looked at the qemu side too closely yet, but as I understand things
> the firmware programs the ECAM location (similar to the q35 mmconf bar),
> and this is just a limitation of the current seabios patch.
> 
> So, no, *that* part wouldn't be messy in ovmf, you can simply place the
> ECAMs where you want.

Figuring out "wherever I want" is the problem. It's not simple. The
64-bit MMIO aperture (for BAR allocation) can also be placed mostly
"wherever the firmware wants", and that wasn't simple either. All these
things end up depending on each other.

https://bugzilla.redhat.com/show_bug.cgi?id=1353591#c8

(

The placement of the q35 MMCONF BAR was difficult too; I needed your
help with the low RAM split that QEMU would choose.

v1 discussion:

http://mid.mail-archive.com/1457340448.25423.43.camel@redhat.com

v2 patch (ended up as commit 7b8fe63561b4):

http://mid.mail-archive.com/1457446804-18892-4-git-send-email-lersek@redhat.com

These things add up :(

)

Laszlo



Re: [SeaBIOS] [RFC v3] pciinit: setup mcfg for pxb-pcie to support multiple pci domains

2018-09-25 Thread Laszlo Ersek
On 09/25/18 17:38, Kevin O'Connor wrote:
> On Mon, Sep 17, 2018 at 11:02:59PM +0800, Zihan Yang wrote:
>> To support multiple pci domains of pxb-pcie device in qemu, we need to setup
>> mcfg range in seabios. We use [0x8000, 0xb000) to hold new domain 
>> mcfg
>> table for now, and we need to retrieve the desired mcfg size of each pxb-pcie
>> from a hidden bar because they may not need the whole 256 busses, which also
>> enables us to support more domains within a limited range (768MB)
> 
> At a highlevel, this looks okay to me.  I'd like to see additional
> reviews from others more familiar with the QEMU PCI code, though.
> 
> Is the plan to do the same thing for OVMF?

I remain entirely unconvinced that this feature is useful. (I've stated
so before.)

I believe the latest QEMU RFC posting (v5) is here:

[Qemu-devel] [RFC v5 0/6] pci_expander_brdige: support separate pci
domain for pxb-pcie

http://mid.mail-archive.com/1537196258-12581-1-git-send-email-whois.zihan.yang@gmail.com

First, I fail to see the use case where ~256 PCI bus numbers aren't
enough. If I strain myself, perhaps I can imagine using ~200 PCIe root
ports on Q35 (each of which requires a separate bus number), so that we
can independently hot-plug 200 devices then. And that's supposedly not
enough, because we want... 300? 400? A thousand? Doesn't sound realistic
to me. (This is not meant to be a strawman argument, I really have no
idea what the feature would be useful for.)

Second, the v5 RFC doesn't actually address the alleged bus number
shortage. IIUC, it supports a low number of ECAM ranges under 4GB, but
those are (individually) limited in the bus number ranges they can
accommodate (due to 32-bit address space shortage). So more or less the
current approach just fragments the bus number space we already have, to
multiple domains.
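
As a rough check on the numbers behind this second point, the ECAM arithmetic can be sketched as below. The figures of 4 KiB per function, 8 functions per device and 32 devices per bus are the standard PCIe ECAM geometry; the 768 MB window size is taken from the quoted patch description and is otherwise an assumption, as are the helper names.

```c
#include <stdint.h>

/* Standard PCIe ECAM geometry: 4 KiB of config space per function. */
#define ECAM_FUNC_SIZE   4096u
#define FUNCS_PER_DEV    8u
#define DEVS_PER_BUS     32u
#define ECAM_BUS_SIZE    (ECAM_FUNC_SIZE * FUNCS_PER_DEV * DEVS_PER_BUS)

/* ECAM bytes needed to decode 'buses' bus numbers. */
static uint64_t ecam_size(uint32_t buses)
{
    return (uint64_t)buses * ECAM_BUS_SIZE;
}

/* How many bus numbers fit into an ECAM window of 'bytes' bytes. */
static uint32_t ecam_buses(uint64_t bytes)
{
    return (uint32_t)(bytes / ECAM_BUS_SIZE);
}
```

With these figures one bus number costs 1 MiB of ECAM, a full 256-bus domain costs 256 MiB, and a 768 MB window can decode at most 768 bus numbers in total, however they are split across domains.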

Third, should a subsequent iteration of the QEMU series put those extra
ECAMs above 4GB, with the intent to leave the enumeration of those
hierarchies to the "guest OS", it would present an incredible
implementation mess for OVMF. If people gained the ability to attach
storage or network to those domains, on the QEMU command line, they
would expect to boot off of them, using UEFI. Then OVMF would have to
make sure the controllers could be bound by their respective UEFI
drivers. That in turn would require functional config space access
(ECAM) at semi-random 64-bit addresses.

The layout of the 64-bit address space is already pretty darn
complicated in OVMF; it depends on guest RAM size, DIMM hotplug area
size, and it already dictates the base of the 64-bit MMIO aperture for
BAR allocations. It also dictates how high up the DXE Core should build
the page tables (you can only access 64-bit addresses if the 1:1 page
tables built by the DXE Core cover them).

Obviously, UEFI on physical machines does support multiple PCI domains.
There are a number of differences however:

- Both the ECAM ranges, and the MMIO apertures (for BAR allocation) of
the disparate host bridges are distinct. On QEMU / OVMF, the latter part
is not true (about the MMIO apertures), and this has already required us
to write some nasty quirks for supposedly platform-independent core code
in edk2.

- The same address ranges mentioned in the previous bullet are known in
advance (they are "static"). That's a *huge* simplification opportunity
to physical platform code (which is most of the time not even open
source, let alone upstreamed to edk2), because the engineers can lay out
the 64-bit address range, and deduce all the related artifacts from that
layout, on paper, at the office.

It's a proper mess with a lot of opportunity for regressions, and I just
don't see the bang for the buck.

(I didn't mean to re-hash my opinion yet again -- in the QEMU RFC v5
thread, I saw references to OVMF, stating that OVMF would not support
this. I was 100% fine with those mentions, but here you asked
explicitly... Some ideas just make me rant, my apologies.)

Thanks
Laszlo

> -Kevin
> 
>>
>> Signed-off-by: Zihan Yang 
>> ---
>>  src/fw/dev-q35.h |  7 +++
>>  src/fw/pciinit.c | 32 
>>  src/hw/pci_ids.h |  1 +
>>  3 files changed, 40 insertions(+)
>>
>> diff --git a/src/fw/dev-q35.h b/src/fw/dev-q35.h
>> index 201825d..229cd81 100644
>> --- a/src/fw/dev-q35.h
>> +++ b/src/fw/dev-q35.h
>> @@ -49,4 +49,11 @@
>>  #define ICH9_APM_ACPI_ENABLE   0x2
>>  #define ICH9_APM_ACPI_DISABLE  0x3
>>  
>> +#define PXB_PCIE_HOST_BRIDGE_MCFG_BAR  0x50    /* 64bit register */
>> +#define PXB_PCIE_HOST_BRIDGE_MCFG_SIZE 0x58    /* 32bit register */
>> +#define PXB_PCIE_HOST_BRIDGE_ENABLE    Q35_HOST_BRIDGE_PCIEXBAREN
>> +/* pxb-pcie can use [0x8000, 0xb000), be careful not to overflow */
>> +#define PXB_PCIE_HOST_BRIDGE_MCFG_SIZE_ADDR    0x8000
>> +#define PXB_PCIE_HOST_BRIDGE_MCFG_SIZE_ADDR_UPPER Q35_HOST_BRIDGE_PCIEXBAR_ADDR
>> +
>>  #endif // 

Re: [SeaBIOS] [PATCH v3 3/3] pci: recognize RH PCI legacy bridge resource reservation capability

2018-08-24 Thread Laszlo Ersek
On 08/24/18 10:53, Jing Liu wrote:
> Enable the firmware recognizing RedHat legacy PCI bridge device ID,
> so QEMU can reserve additional PCI bridge resource capability.
> Change the debug level lower to 3 when it is non-QEMU bridge.
> 
> Signed-off-by: Jing Liu 
> ---
>  src/fw/pciinit.c | 50 +-
>  src/hw/pci_ids.h |  1 +
>  2 files changed, 30 insertions(+), 21 deletions(-)
> 
> diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c
> index 62a32f1..c0634bc 100644
> --- a/src/fw/pciinit.c
> +++ b/src/fw/pciinit.c
> @@ -525,30 +525,38 @@ static void pci_bios_init_platform(void)
>  
>  static u8 pci_find_resource_reserve_capability(u16 bdf)
>  {
> -if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT &&
> -pci_config_readw(bdf, PCI_DEVICE_ID) ==
> -PCI_DEVICE_ID_REDHAT_ROOT_PORT) {
> -u8 cap = 0;
> -do {
> -cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap);
> -} while (cap &&
> - pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) !=
> -REDHAT_CAP_RESOURCE_RESERVE);
> -if (cap) {
> -u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS);
> -if (cap_len < RES_RESERVE_CAP_SIZE) {
> -dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n",
> -cap_len);
> -return 0;
> -}
> -} else {
> -dprintf(1, "PCI: QEMU resource reserve cap not found\n");
> +u16 device_id;
> +
> +if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) {
> +dprintf(3, "PCI: This is non-QEMU bridge.\n");

I think I liked the previous language slightly more ("PCI: QEMU resource
reserve cap vendor ID doesn't match."), but that shouldn't be a problem.

Series
Reviewed-by: Laszlo Ersek 

Thanks
Laszlo


> +return 0;
> +}
> +
> +device_id = pci_config_readw(bdf, PCI_DEVICE_ID);
> +
> +if (device_id != PCI_DEVICE_ID_REDHAT_ROOT_PORT &&
> +device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) {
> +dprintf(1, "PCI: QEMU resource reserve cap device ID doesn't match.\n");
> +return 0;
> +}
> +u8 cap = 0;
> +
> +do {
> +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap);
> +} while (cap &&
> + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) !=
> +  REDHAT_CAP_RESOURCE_RESERVE);
> +if (cap) {
> +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS);
> +if (cap_len < RES_RESERVE_CAP_SIZE) {
> +dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n",
> +cap_len);
> +return 0;
>  }
> -return cap;
>  } else {
> -dprintf(1, "PCI: QEMU resource reserve cap VID or DID doesn't match.\n");
> -return 0;
> +dprintf(1, "PCI: QEMU resource reserve cap not found\n");
>  }
> +return cap;
>  }
>  
>  /
> diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h
> index 38fa2ca..1096461 100644
> --- a/src/hw/pci_ids.h
> +++ b/src/hw/pci_ids.h
> @@ -2265,6 +2265,7 @@
>  
>  #define PCI_VENDOR_ID_REDHAT 0x1b36
>  #define PCI_DEVICE_ID_REDHAT_ROOT_PORT   0x000C
> +#define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
>  
>  #define PCI_VENDOR_ID_TEKRAM 0x1de1
>  #define PCI_DEVICE_ID_TEKRAM_DC290   0xdc29
> 
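
The capability lookup in the quoted function can be tried outside the firmware with a sketch like the one below, which walks the capability list of a 256-byte config space image held in memory. The standard offsets (status register, capability pointer, vendor-specific capability ID) are from the PCI specification; the layout of the vendor capability (length byte at +2, type byte at +3, type 1 for resource reservation) is an assumption made for this sketch, and the function names are hypothetical. The length check against RES_RESERVE_CAP_SIZE from the patch is left out.

```c
#include <stdint.h>

/* Standard PCI config-space offsets and IDs. */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10
#define PCI_CAPABILITY_LIST   0x34
#define PCI_CAP_LIST_NEXT     1
#define PCI_CAP_ID_VNDR       0x09

/* Assumed vendor capability layout, for this sketch only. */
#define CAP_LEN_OFFSET        2
#define CAP_TYPE_OFFSET       3
#define CAP_TYPE_RES_RESERVE  1

/*
 * Find the next capability with the given ID in a 256-byte config
 * space image, starting after 'prev' (0 means start of the list).
 * Returns the capability offset, or 0 if there is none.
 */
static uint8_t find_cap(const uint8_t *cfg, uint8_t id, uint8_t prev)
{
    uint8_t pos;

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;
    pos = prev ? cfg[prev + PCI_CAP_LIST_NEXT] : cfg[PCI_CAPABILITY_LIST];
    pos &= 0xfc;                       /* low two bits are reserved */

    /* Bound the walk so a looping list cannot hang the search. */
    for (int i = 0; pos && i < 48; i++) {
        if (cfg[pos] == id)
            return pos;
        pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xfc;
    }
    return 0;
}

/* Same loop shape as the patch: skip vendor caps of the wrong type. */
static uint8_t find_res_reserve_cap(const uint8_t *cfg)
{
    uint8_t cap = 0;

    do {
        cap = find_cap(cfg, PCI_CAP_ID_VNDR, cap);
    } while (cap && cfg[cap + CAP_TYPE_OFFSET] != CAP_TYPE_RES_RESERVE);
    return cap;
}
```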




Re: [SeaBIOS] [PATCH v2 3/3] pci: recognize RH PCI legacy bridge resource reservation capability

2018-08-24 Thread Laszlo Ersek
On 08/24/18 09:48, Liu, Jing2 wrote:
> 
> 
> On 8/24/2018 3:12 PM, Laszlo Ersek wrote:
>> On 08/24/18 04:23, Liu, Jing2 wrote:
>>> Hi Laszlo,
>>>
>>> On 8/22/2018 5:13 PM, Laszlo Ersek wrote:
>>>> On 08/16/18 12:43, Liu, Jing2 wrote:
>>>>>
>>>>>
>>>>> On 8/16/2018 3:18 PM, Gerd Hoffmann wrote:
>>>>>>  Hi,
>>>>>>
>>>>>>> +    if (pci_config_readw(bdf, PCI_VENDOR_ID) !=
>>>>>>> PCI_VENDOR_ID_REDHAT) {
>>>>>>> +    dprintf(1, "PCI: QEMU resource reserve cap vendor ID
>>>>>>> doesn't
>>>>>>> match.\n");
>>>>>>
>>>>>> I'd suggest to use a higher debug level for this one, 3 would be a
>>>>>> good
>>>>>> pick I think.  level 1 messages are printed by default, and we should
>>>>>> not spam the log just because there is a non-qemu bridge present
>>>>>> in the
>>>>>> system.
>>>>> OK. Will do that.
>>>>
>>>> With the debug level update, I'm ready to give my R-b for this series.
>>>>
>>> Thanks for your feedback!
>>> So do I need update another version and with your R-b?
>>
>> I imagine you'd post v3 with the update Gerd requested for the debug
>> level(s), and then I'd respond with my R-b. (Obviously I'm not a SeaBIOS
>> maintainer so that'll not be "decisive" by any means.)
>>
> Oh, BTW, I am considering, if only dismatch vendor-id stands for
> "non-qemu bridge" or dismatch both vid and did? I guess I need change both.

I don't understand. Can you post an incremental diff in this thread just
for illustration?

Thanks
Laszlo


Re: [SeaBIOS] [PATCH v2 3/3] pci: recognize RH PCI legacy bridge resource reservation capability

2018-08-24 Thread Laszlo Ersek
On 08/24/18 04:23, Liu, Jing2 wrote:
> Hi Laszlo,
> 
> On 8/22/2018 5:13 PM, Laszlo Ersek wrote:
>> On 08/16/18 12:43, Liu, Jing2 wrote:
>>>
>>>
>>> On 8/16/2018 3:18 PM, Gerd Hoffmann wrote:
>>>>     Hi,
>>>>
>>>>> +    if (pci_config_readw(bdf, PCI_VENDOR_ID) !=
>>>>> PCI_VENDOR_ID_REDHAT) {
>>>>> +    dprintf(1, "PCI: QEMU resource reserve cap vendor ID doesn't
>>>>> match.\n");
>>>>
>>>> I'd suggest to use a higher debug level for this one, 3 would be a good
>>>> pick I think.  level 1 messages are printed by default, and we should
>>>> not spam the log just because there is a non-qemu bridge present in the
>>>> system.
>>> OK. Will do that.
>>
>> With the debug level update, I'm ready to give my R-b for this series.
>>
> Thanks for your feedback!
> So do I need update another version and with your R-b?

I imagine you'd post v3 with the update Gerd requested for the debug
level(s), and then I'd respond with my R-b. (Obviously I'm not a SeaBIOS
maintainer so that'll not be "decisive" by any means.)

Thanks
Laszlo


Re: [SeaBIOS] [PATCH v2 3/3] pci: recognize RH PCI legacy bridge resource reservation capability

2018-08-22 Thread Laszlo Ersek
On 08/16/18 12:43, Liu, Jing2 wrote:
> 
> 
> On 8/16/2018 3:18 PM, Gerd Hoffmann wrote:
>>    Hi,
>>
>>> +    if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) {
>>> +    dprintf(1, "PCI: QEMU resource reserve cap vendor ID doesn't
>>> match.\n");
>>
>> I'd suggest to use a higher debug level for this one, 3 would be a good
>> pick I think.  level 1 messages are printed by default, and we should
>> not spam the log just because there is a non-qemu bridge present in the
>> system.
> OK. Will do that.

With the debug level update, I'm ready to give my R-b for this series.
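
The effect of the debug level Gerd asks for can be sketched as follows. This is only an illustration, not SeaBIOS's dprintf(): debug_print(), DEBUG_LEVEL and the "emitted" counter are made-up names, and the default level of 1 is taken from Gerd's remark that level 1 messages are printed by default.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define DEBUG_LEVEL 1          /* assumed default verbosity */

static unsigned emitted;       /* number of messages actually printed */

/* Hypothetical level-gated logger: a message is printed only when its
 * level does not exceed the configured verbosity. */
void debug_print(int level, const char *fmt, ...)
{
    va_list args;

    assert(fmt != NULL);
    if (level > DEBUG_LEVEL)
        return;                /* level 3 is silent at the default level */
    emitted++;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
}
```

With the vendor ID mismatch message at level 3, a system with non-QEMU bridges would no longer log one line per bridge at the default verbosity.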

Thanks,
Laszlo


Re: [SeaBIOS] [Qemu-devel] [PATCH v2 0/3] hw/pci: PCI resource reserve capability

2018-08-16 Thread Laszlo Ersek
Hi,

On 08/16/18 11:28, Jing Liu wrote:
> This patch series is about PCI resource reserve capability.
> 
> First patch refactors the resource reserve fields in GenPCIERootPort
> structure out to another new structure, called "PCIResReserve". Modify
> the parameter list of pci_bridge_qemu_reserve_cap_init().
> 
> Then we add the teardown function called pci_bridge_qemu_reserve_cap_uninit().
> 
> Last we enable the resource reserve capability for legacy PCI bridge
> so that firmware can reserve additional resources for the bridge.
> 
> Change Log:
> v2 -> v1
> * add refactoring patch
> * add teardown function
> * some other fixes
> 
> Jing Liu (3):
>   hw/pci: factor PCI reserve resources to a separate structure
>   hw/pci: add teardown function for PCI resource reserve capability
>   hw/pci: add PCI resource reserve capability to legacy PCI bridge
> 
>  hw/pci-bridge/gen_pcie_root_port.c | 32 +-
>  hw/pci-bridge/pci_bridge_dev.c     | 25
>  hw/pci/pci_bridge.c                | 47 +-
>  include/hw/pci/pci_bridge.h        | 18 +++
>  4 files changed, 80 insertions(+), 42 deletions(-)
> 

just some meta comments for now:

- I've added Marcel; please keep him CC'd on this set (and the SeaBIOS
counterpart, [SeaBIOS] [PATCH v2 0/3] pci: resource reserve capability
found)

- my task queue has blown up and I'm unsure when I'll get to reviewing
this set. The same applies to the SeaBIOS counterpart.

This is just to say that you should feel free to go ahead and work with
the (sub)maintainers; I'll try to get back to this as time allows, but
don't wait for me.

If further versions are necessary, I'd appreciate being CC'd on those,
just so I know what to look at when I find the time again.

Thanks!
Laszlo



Re: [SeaBIOS] [PATCH v2 0/3] pci: resource reserve capability found

2018-08-14 Thread Laszlo Ersek
On 08/13/18 09:49, Jing Liu wrote:
> This patch series is about finding the QEMU resource reserve capability
> in firmware.
>
> Firstly, this fixes a logic bug. When the capability is truncated,
> return zero instead of the truncated offset. Secondly, this modifies
> the debug messages when the capability is not found and when the vendor
> ID or device ID doesn't match the REDHAT special ones.
>
> Last, this enables the firmware to recognize the REDHAT PCI BRIDGE device ID,
> so that QEMU can also reserve PCI bridge resource capability.
>
> Jing Liu (3):
>   pci: fix the return value for truncated capability
>   pci: clean up the debug message for pci capability found
>   pci: recognize RH PCI legacy bridge resource reservation capability
>
>  src/fw/pciinit.c | 45 -
>  src/hw/pci_ids.h |  1 +
>  2 files changed, 29 insertions(+), 17 deletions(-)

Something looks wrong with this patch set -- the shortlog above
indicates that the 3rd patch is called "pci: recognize RH PCI legacy
bridge resource reservation capability", but in the actual series I see:

  [SeaBIOS][PATCH v2 0/3] pci: resource reserve capability found
  [SeaBIOS][PATCH v2 1/3] pci: fix the return value for truncated capability
  [SeaBIOS][PATCH v2 2/3] pci: clean up the debug message for pci capability found
  [SeaBIOS][PATCH v2] pci: recognize RH PCI legacy bridge resource

For the last patch, the "3/3" counter is missing in the subject prefix,
plus the subject doesn't match the short log.

I also see a separate message

  [SeaBIOS] [PATCH v2 3/3] pci: recognize RH PCI legacy bridge resource

which is apparently not threaded under v2 0/3, and whose subject line
also doesn't match the shortlog.

Not sure what Kevin and Gerd prefer, but if these patches are meant to
form a series, I'd suggest to repost with correct numbering, threading,
and subject lines.

Thanks
Laszlo



Re: [SeaBIOS] Marvell 88SE9230 passthrough in KVM takes long time to boot

2018-08-09 Thread Laszlo Ersek
+Andrea; comments below

On 08/08/18 23:11, Alex Williamson wrote:
> On Wed, 8 Aug 2018 14:11:16 +0200
> Gerd Hoffmann wrote:
> 
>> On Sun, Jul 29, 2018 at 01:49:10PM +0200, Konrad Eisele wrote:
>>> I'm passing through a Marvell 88SE9230 card to a KVM guest under
>>> Ubuntu 18.04. The card is a SATA controller with 4 ports.
>>> On a normal boot, the option ROM of the Marvell 88SE9230 card shows a
>>> BIOS screen. By pressing CTRL-m quickly enough, you can interrupt the
>>> boot process and enter a menu where you can define RAID
>>> arrays.
>>>
>>> When booting SeaBIOS inside KVM, the boot process is very slow.
>>> There is a 1 min hold time during which the CPU is at about 30%. The
>>> screen is black with only the SeaBIOS version string shown. I suspect that
>>> the passed-through Marvell 88SE9230 card's option ROM causes this
>>> behaviour.
>>> Maybe the scanning for option ROMs causes the slow boot process?
>>>
>>> In the SeaBIOS boot case no BIOS menu is shown; after
>>> around 1 min the boot continues.
>>>
>>> Is it possible to disable the option ROM processing? Is there some
>>> documentation about this (how can I configure it for Ubuntu)?
>>
>> Setting the romfile option to the empty string (for vfio-pci, on the qemu
>> command line) should do that (qemu will not expose the rom to the guest
>> then).
> 
> Typically rombar=0 on the QEMU command line is how to disable the option
> ROM for an assigned device, or
> 
>   <rom bar='off'/>
> 
> in libvirt.  Thanks,

On a tangent:

Not sure about assigned devices, but for emulated devices, romfile=''
"scales" better. With rombar=0, the oprom is still loaded into fw_cfg as
a "genrom", and if multiple devices of the same type attempt to do that,
the genroms will conflict, and QEMU will not launch:

  https://bugzilla.redhat.com/show_bug.cgi?id=1425058#c19

For this reason (as well), libvirt now supports a further attribute for
the <rom> element, namely @enabled. It maps to the "romfile" property:

  https://bugzilla.redhat.com/show_bug.cgi?id=1425058#c34

By now I've updated all my long-term domains that used to specify
<rom bar='off'/>, to <rom enabled='no'/>.

Thanks!
Laszlo



Re: [SeaBIOS] [PATCH] pci: add RedHat PCI BRIDGE capability

2018-08-08 Thread Laszlo Ersek
On 08/08/18 05:24, Liu, Jing2 wrote:
> On 8/7/2018 7:43 PM, Laszlo Ersek wrote:
>> On 08/07/18 09:20, Jing Liu wrote:

[snip]

>>> -    if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT &&
>>> -    pci_config_readw(bdf, PCI_DEVICE_ID) ==
>>> -    PCI_DEVICE_ID_REDHAT_ROOT_PORT) {
>>> +    u16 vendor_id = pci_config_readw(bdf, PCI_VENDOR_ID);
>>> +    u16 device_id = pci_config_readw(bdf, PCI_DEVICE_ID);

[snip]

>> (5) Regarding the code: I'm not sure how careful SeaBIOS is about
>>  unnecessary config space accesses (i.e., unnecessary traps to the
>>  host). Personally I would prefer if we didn't unconditionally read
>>  the device ID post-patch either -- that is, if the vendor ID doesn't
>>  match, we shouldn't read the device ID. Something like:
>>
> Do you mean we need to prevent the compiler from reading the device ID in
> advance when the vendor ID does not match?
> If not, why would the original code read the device ID when the vendor ID
> check fails?

What I mean is that the original code (see in the context above) uses
the "logical and" (&&) operator; if the Vendor ID does not match
PCI_VENDOR_ID_REDHAT, then the Device ID is not read from config space
at all. After the patch (see above again), the Device ID is read
unconditionally, even if we later find that the Vendor ID is a mismatch,
and so throw away the Device ID. In that case, the Device ID read (which
is a trap from the guest to KVM to QEMU) is wasted.

It's likely not extremely important to be as frugal as possible with
config space accesses (traps); however, if it's not a big complication
code-wise to avoid possibly wasted reads (traps), then I think we should
be frugal.
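
To make the difference concrete, here is a minimal sketch that counts config space reads for both shapes of the check. It is not SeaBIOS code: pci_config_readw() is replaced by a stand-in that serves a fake two-word config space, and fake_cfg, reads and count_reads() are made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_VENDOR_ID                  0x00
#define PCI_DEVICE_ID                  0x02
#define PCI_VENDOR_ID_REDHAT           0x1b36
#define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000c

/* Stand-in for SeaBIOS's pci_config_readw(): it serves a fake config
 * space and counts the reads, each of which would be a trap to the
 * host in a real guest. */
static uint16_t fake_cfg[2];
static unsigned reads;

static uint16_t pci_config_readw(uint16_t bdf, uint32_t addr)
{
    (void)bdf;
    assert(addr == PCI_VENDOR_ID || addr == PCI_DEVICE_ID);
    reads++;
    return fake_cfg[addr / 2];
}

/* Pre-patch shape: "&&" skips the device ID read on a vendor mismatch. */
int match_short_circuit(uint16_t bdf)
{
    return pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT &&
           pci_config_readw(bdf, PCI_DEVICE_ID) ==
               PCI_DEVICE_ID_REDHAT_ROOT_PORT;
}

/* Post-patch shape: both IDs are read before either is compared. */
int match_unconditional(uint16_t bdf)
{
    uint16_t vendor_id = pci_config_readw(bdf, PCI_VENDOR_ID);
    uint16_t device_id = pci_config_readw(bdf, PCI_DEVICE_ID);

    return vendor_id == PCI_VENDOR_ID_REDHAT &&
           device_id == PCI_DEVICE_ID_REDHAT_ROOT_PORT;
}

/* Run one matcher against the fake device and report how many config
 * space reads it issued. */
unsigned count_reads(int (*match)(uint16_t))
{
    reads = 0;
    match(0);
    return reads;
}
```

For a device from another vendor, the first variant issues one read and the second issues two; for a matching device both issue two.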

Thanks
Laszlo


Re: [SeaBIOS] [PATCH] pci: add RedHat PCI BRIDGE capability

2018-08-07 Thread Laszlo Ersek
adding Marcel; comments at the bottom

On 08/07/18 09:20, Jing Liu wrote:
> Add a device-specific capability for the RedHat PCI BRIDGE
> to enable reserving additional resources.
>
> Signed-off-by: Jing Liu 
> ---
>  src/fw/pciinit.c | 9 ++---
>  src/hw/pci_ids.h | 1 +
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c
> index 3a2f747..0265e9d 100644
> --- a/src/fw/pciinit.c
> +++ b/src/fw/pciinit.c
> @@ -525,9 +525,12 @@ static void pci_bios_init_platform(void)
>
>  static u8 pci_find_resource_reserve_capability(u16 bdf)
>  {
> -    if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT &&
> -        pci_config_readw(bdf, PCI_DEVICE_ID) ==
> -                PCI_DEVICE_ID_REDHAT_ROOT_PORT) {
> +    u16 vendor_id = pci_config_readw(bdf, PCI_VENDOR_ID);
> +    u16 device_id = pci_config_readw(bdf, PCI_DEVICE_ID);
> +
> +    if (vendor_id == PCI_VENDOR_ID_REDHAT &&
> +        (device_id == PCI_DEVICE_ID_REDHAT_ROOT_PORT ||
> +         device_id == PCI_DEVICE_ID_REDHAT_BRIDGE)) {
>          u8 cap = 0;
>          do {
>              cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap);
> diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h
> index 38fa2ca..1096461 100644
> --- a/src/hw/pci_ids.h
> +++ b/src/hw/pci_ids.h
> @@ -2265,6 +2265,7 @@
>
>  #define PCI_VENDOR_ID_REDHAT 0x1b36
>  #define PCI_DEVICE_ID_REDHAT_ROOT_PORT   0x000C
> +#define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
>
>  #define PCI_VENDOR_ID_TEKRAM 0x1de1
>  #define PCI_DEVICE_ID_TEKRAM_DC290   0xdc29
>

Given that you are touching this function, it's a good opportunity to
clean up its issues that I pointed out earlier, noted as "side
comments (a), (b) and (c)". I'll re-iterate:

* The "PCI: QEMU resource reserve cap not found" debug message is
  printed under wrong conditions. Namely, it is both printed when it
  makes no sense (i.e., when the vendor-id/device-id don't match and we
  don't even go looking for the capability), and it's *not* printed when
  it does make sense (the search loop completes without finding the
  capability).

* There is a logic bug: if we find the capability but it's truncated, we
  print a good error message, but then go ahead and return the offset of
  the broken (truncated) capability just the same. In this case, we
  should return zero.
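
The second point can be sketched against a fake config space as below. This is an illustration under stated assumptions, not the SeaBIOS function: the byte offsets of the length and type fields, the type value and the minimum size are assumed for the example, and cfg, find_capability(), find_reserve_capability() and install_reserve_cap() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34
#define PCI_CAP_ID_VNDR     0x09

/* Assumed layout for illustration: byte 2 = length, byte 3 = type. */
#define CAP_LEN_OFFSET   2
#define CAP_TYPE_OFFSET  3
#define CAP_TYPE_RESERVE 1
#define CAP_MIN_SIZE     32

static uint8_t cfg[256];   /* fake config space of one PCI function */

/* Find the next capability with the given ID, starting after "cap"
 * (or from the head of the list when cap == 0). Returns 0 if none. */
uint8_t find_capability(uint8_t cap_id, uint8_t cap)
{
    uint8_t pos = (cap ? cfg[cap + 1] : cfg[PCI_CAPABILITY_LIST]) & 0xfc;
    int guard;

    for (guard = 0; pos && guard < 48; guard++) {  /* bound broken lists */
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xfc;
    }
    return 0;
}

/* Return the offset of a complete resource reserve capability, or 0. */
uint8_t find_reserve_capability(void)
{
    uint8_t cap = 0;

    do {
        cap = find_capability(PCI_CAP_ID_VNDR, cap);
    } while (cap && cfg[cap + CAP_TYPE_OFFSET] != CAP_TYPE_RESERVE);

    if (cap && cfg[cap + CAP_LEN_OFFSET] < CAP_MIN_SIZE)
        return 0;   /* truncated: report "not found", not a broken offset */
    return cap;
}

/* Test helper: place a single vendor capability of the given length. */
void install_reserve_cap(uint8_t offset, uint8_t len)
{
    assert(offset >= 0x40 && (offset & 3) == 0);
    cfg[PCI_CAPABILITY_LIST] = offset;
    cfg[offset] = PCI_CAP_ID_VNDR;
    cfg[offset + 1] = 0;                       /* end of list */
    cfg[offset + CAP_LEN_OFFSET] = len;
    cfg[offset + CAP_TYPE_OFFSET] = CAP_TYPE_RESERVE;
}
```

The point is the early "return 0": a caller that only tests the return value for non-zero can then never go on to read fields beyond the end of a truncated capability.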

So, I suggest that you please:

(1) send a patch that fixes the logic bug,

(2) send another patch that cleans up the debug messages,

(3) send yet another patch that recognizes the capability in question on
the traditional bridge device too. (I.e., a variant of the current
patch, rebased to (1) and (2)).

More comments for this patch:

(4) the subject line should be clarified, such as:

  pci: recognize RH resource reservation capability on traditional bridges

(72 characters). The commit message body should be updated
accordingly -- we're not adding the capability, just matching it on
another device.

(5) Regarding the code: I'm not sure how careful SeaBIOS is about
unnecessary config space accesses (i.e., unnecessary traps to the
host). Personally I would prefer if we didn't unconditionally read
the device ID post-patch either -- that is, if the vendor ID doesn't
match, we shouldn't read the device ID. Something like:

>   u16 device_id;
>
>   if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) {
>       return 0;
>   }
>
>   device_id = pci_config_readw(bdf, PCI_DEVICE_ID);
>   if (device_id != PCI_DEVICE_ID_REDHAT_ROOT_PORT &&
>       device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) {
>       return 0;
>   }
>
>   /* success / failure messages are justified after this point */
>   ...

The vendor-id/device-id checks should likely be reorganized as shown
above in patch (2), as a part of the debug message cleanup. And then the
device-id check can be extended to cover PCI_DEVICE_ID_REDHAT_BRIDGE in
patch (3).

Just my opinion of course; I'm not a SeaBIOS maintainer.

Thanks,
Laszlo



Re: [SeaBIOS] Marvell 88SE9230 passthrough in KVM takes long time to boot

2018-07-30 Thread Laszlo Ersek
side comment:

On 07/29/18 21:43, Konrad Eisele wrote:

> changed /usr/bin/kvm (which
> libvirtd is calling at boot)
> qemu-system-x86_64 -enable-kvm -bios /usr/lib/qemu/bios.bin "$@"

This should not be necessary. Libvirt lets you specify the desired
firmware in the domain XML:

https://libvirt.org/formatdomain.html#elementsOSBIOS

In this case, it should be

  <os>
    <loader>/path/to/your/bios.bin</loader>
  </os>

Thanks
Laszlo



Re: [SeaBIOS] hotplug failure issue on pci-bridge

2018-07-17 Thread Laszlo Ersek
On 07/16/18 11:45, Liu, Jing2 wrote:
> Hi Laszlo,
> 
> On 7/12/2018 3:29 PM, Laszlo Ersek wrote:
>> On 07/12/18 07:43, Liu, Jing2 wrote:
>>> Yep, thanks for the advice.
>>> But hotplugging on pci-bridge is the actual use case
>>> request so we would better solve and fix this.
>>
>>
>> You can cold-plug a PCI Express Root Port in the Q35 root complex
>> (pcie.0), reserving the MMIO resources you want, cold-plug a PCIE-PCI
>> Bridge in that root port, and then hot-plug the desired endpoint into
>> that PCIE-PCI Bridge. 
> 
> I'm trying this. But actual results show that,
> when the pcie-pci-bridge has no cold-plugged device, it shows None for each
> window.
> 01:00.0 PCI bridge: Red Hat, Inc. Device 000e (prog-if 00 [Normal decode])
>     I/O behind bridge: None
>     Memory behind bridge: None
>     Prefetchable memory behind bridge: None

Can you check /proc/iomem, and dmesg?

What is your exact QEMU command line?

BTW, there's also <https://bugzilla.redhat.com/show_bug.cgi?id=1536147>.
It might be relevant here.

(I haven't personally tested SeaBIOS in the hotplug scenario at hand,
but it's been my understanding that, as long as you cold-plug the
PCIe-PCI bridge itself, RHBZ#1536147 shouldn't apply, and the hotplug
into the bridge should just work. Personally I've tested the same
with OVMF only.)

Adding Marcel and Alexander to the thread (likely belatedly; sorry about
that).

Thanks
Laszlo

> Only if I cold-plug some device (e.g. e1000) under it might a subsequent
> hotplug of another device be successful.
> 
> BTW, I enabled the guest kernel config CONFIG_PCI_REALLOC_ENABLE_AUTO=y,
> but it doesn't work.
> I'm not sure if there are some other issues I forgot?
> 
> Jing
> 
>> This is one of the exact examples that
>> "docs/pcie_pci_bridge.txt" provides. (The other example is when the
>> PCIE-PCI bridge itself is hot-plugged into the root port, for which bus
>> number reservation is necessary too, at the root port level.)
>>
>> If you want more than that, e.g. do something similar on i440fx, that
>> will take QEMU work as well, not just SeaBIOS.
>>
>> Laszlo
>>
> 



Re: [SeaBIOS] hotplug failure issue on pci-bridge

2018-07-12 Thread Laszlo Ersek
On 07/12/18 07:43, Liu, Jing2 wrote:
> Yep, thanks for the advice.
> But hotplugging on pci-bridge is the actual use case
> request so we would better solve and fix this.

It doesn't take a bugfix, but a feature. The firmware needs to be told
to reserve PCI resources in advance, in preparation for the hotplug. The
firmware might reserve some amount of resources by default, but those
defaults cannot be arbitrarily large. The reservation sizes need to come
from QEMU.
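
As a rough sketch of why the hint matters: in the original report the bridge window is sized as ALIGN(sum, align) over the devices present at boot, so an empty bridge gets an empty window. The code below shows one plausible way a reservation hint could enlarge that window. It is an assumption for illustration, not the actual SeaBIOS logic; align_up() and bridge_window_size() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Round "value" up to a multiple of "align" (a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Size of a bridge window: the aligned sum of the BARs present at boot,
 * grown to the reservation hint when one is given (0 = no hint). */
uint64_t bridge_window_size(uint64_t sum, uint64_t align, uint64_t reserve)
{
    uint64_t size = align_up(sum, align);
    uint64_t hint = align_up(reserve, align);

    return size > hint ? size : hint;
}
```

With no cold-plugged devices and no hint the window is 0 bytes, so a 256M BAR cannot be placed at hotplug time; with a 1G hint the same bridge gets a 1G window up front.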

The most robust method for this was deemed to be placing a vendor
capability in the hotplug controller's PCI config space. In QEMU this is
implemented with the pci_bridge_qemu_reserve_cap_init() function. Right
now, the function is only called from the PCI Express Root Port device
model, in "hw/pci-bridge/gen_pcie_root_port.c".

You can cold-plug a PCI Express Root Port in the Q35 root complex
(pcie.0), reserving the MMIO resources you want, cold-plug a PCIE-PCI
Bridge in that root port, and then hot-plug the desired endpoint into
that PCIE-PCI Bridge. This is one of the exact examples that
"docs/pcie_pci_bridge.txt" provides. (The other example is when the
PCIE-PCI bridge itself is hot-plugged into the root port, for which bus
number reservation is necessary too, at the root port level.)

If you want more than that, e.g. do something similar on i440fx, that
will take QEMU work as well, not just SeaBIOS.

Laszlo

> 
> On 7/11/2018 6:38 PM, Laszlo Ersek wrote:
>> On 07/11/18 05:12, Liu, Jing2 wrote:
>>> Hi,
>>>
>>> Recently, we looked into some hotplug issues. The case is: when hot-plugging
>>> a device (e.g. iGPU) onto a pci-bridge after the guest has booted, the guest
>>> reports "BAR 2: no space for [mem size 0x4000 64bit pref]" etc.
>>>
>>> SeaBIOS checks all the devices under the pci-bridge when qemu launches
>>> the guest, and only allocates "size=ALIGN(sum, align)" of memory space
>>> for the pci-bridge mem and pref-mem windows. So if we hotplug a big pci
>>> device like an Intel GPU which needs 256M mem/pref-mem or bigger, it will
>>> fail.
>>>
>>>
>>> If my understanding is right, we may need some other logic of the memory
>>> allocation in seabios?
>>>
>>> Looking forward to the advice.
>>
>> I suggest using the Q35 machine type, and hot-plugging the device into a
>> PCI Express Root Port ("pcie-root-port"). The latter has properties
>> dedicated to reserving various PCI resources specifically for hotplug
>> purposes:
>>
>> pcie-root-port.mem-reserve=size
>> pcie-root-port.pref32-reserve=size
>> pcie-root-port.bus-reserve=uint32
>> pcie-root-port.pref64-reserve=size
>> pcie-root-port.io-reserve=size
>>
>> In order to address the issue you report at the top, you would use
>>
>>    -device pcie-root-port,bus=pcie.0,id=root-port-XXX,pref64-reserve=1G
>>
>> then hot-plug the device into "root-port-XXX".
>>
>>
>> Please see the following two files in the QEMU tree:
>> - docs/pcie.txt
>> - docs/pcie_pci_bridge.txt
>>
>> The first provides guidelines on the PCI Express hierarchy in general,
>> and also on hotplug in particular (see section 5).
>>
>> The second is relevant here because it describes the vendor capability
>> that QEMU and SeaBIOS (and OVMF) use, for passing the resource
>> reservation hints from QEMU to guest firmware.
>>
>> The 2nd document mainly focuses on hot-plugging a PCIE-PCI bridge into a
>> PCIE root port, and on reserving a bus number range for the
>> sub-hierarchy behind said hot-plugged PCIE-PCI bridge. However, the
>> reservation mechanism is the same for other types of PCI resources, and
>> for other types of child devices that are hot-plugged into root ports.
>>
>> HTH,
>> Laszlo
>>



Re: [SeaBIOS] hotplug failure issue on pci-bridge

2018-07-11 Thread Laszlo Ersek
On 07/11/18 05:12, Liu, Jing2 wrote:
> Hi,
> 
> Recently, we looked into some hotplug issues. The case is: when hot-plugging
> a device (e.g. iGPU) onto a pci-bridge after the guest has booted, the guest
> reports "BAR 2: no space for [mem size 0x4000 64bit pref]" etc.
> 
> SeaBIOS checks all the devices under the pci-bridge when qemu launches
> the guest, and only allocates "size=ALIGN(sum, align)" of memory space
> for the pci-bridge mem and pref-mem windows. So if we hotplug a big pci
> device like an Intel GPU which needs 256M mem/pref-mem or bigger, it will
> fail.
> 
> 
> If my understanding is right, we may need some other logic of the memory
> allocation in seabios?
> 
> Looking forward to the advice.

I suggest using the Q35 machine type, and hot-plugging the device into a
PCI Express Root Port ("pcie-root-port"). The latter has properties
dedicated to reserving various PCI resources specifically for hotplug
purposes:

pcie-root-port.mem-reserve=size
pcie-root-port.pref32-reserve=size
pcie-root-port.bus-reserve=uint32
pcie-root-port.pref64-reserve=size
pcie-root-port.io-reserve=size

In order to address the issue you report at the top, you would use

  -device pcie-root-port,bus=pcie.0,id=root-port-XXX,pref64-reserve=1G

then hot-plug the device into "root-port-XXX".


Please see the following two files in the QEMU tree:
- docs/pcie.txt
- docs/pcie_pci_bridge.txt

The first provides guidelines on the PCI Express hierarchy in general,
and also on hotplug in particular (see section 5).

The second is relevant here because it describes the vendor capability
that QEMU and SeaBIOS (and OVMF) use, for passing the resource
reservation hints from QEMU to guest firmware.

The 2nd document mainly focuses on hot-plugging a PCIE-PCI bridge into a
PCIE root port, and on reserving a bus number range for the
sub-hierarchy behind said hot-plugged PCIE-PCI bridge. However, the
reservation mechanism is the same for other types of PCI resources, and
for other types of child devices that are hot-plugged into root ports.

HTH,
Laszlo



Re: [SeaBIOS] 1TB Guest changes to receive phys bits >=40

2018-06-11 Thread Laszlo Ersek
On 06/11/18 15:52, Gerd Hoffmann wrote:
>   Hi,
>
>> The change [2] itself is rather old, so I wondered if I'm missing
>> that this was implemented in a totally different way. Do I have to
>> switch/set options these days instead of using that patch?
>
> It should just work.  qemu passes ram regions to the firmware using
> fw_cfg (etc/e820) these days, and both seabios and ovmf support this
> for years already.

Small pedantic correction (not affecting the core statement): OVMF has
supported >=1TB guests only since commit 1fceaddb12b5
("OvmfPkg/PlatformPei: support >=1TB high RAM, and discontiguous high
RAM", 2017-08-05) [1] [2].

And, enabling such with -D SMM_REQUIRE had taken surprisingly messy
pre-requisites; see [3]. Large guest RAM requires guest firmware (that
intends to identity map all of that RAM) to build large page tables. For
SMM, a separate set of page tables is built in SMRAM (TSEG); however,
TSEG used to be limited. This was one of the reasons we had to introduce
"extended TSEG", for exposing which libvirt has gained support just a
few days ago [4].

[1] https://github.com/tianocore/edk2/commit/1fceaddb12b5
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1468526#c19
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1447027#c20
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1469338#c22
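
The page table cost mentioned above can be estimated with some simple arithmetic. The sketch below assumes 4-level x86-64 paging with 4 KiB tables of 512 entries each, and counts the tables needed to identity map a given amount of address space with either 2 MiB or 1 GiB leaf pages. It is a back-of-the-envelope model, not OVMF's page table builder; the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define ENTRIES   512ULL          /* 8-byte entries per 4 KiB table */

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

/* Number of 4 KiB page-table pages needed to identity map "bytes" of
 * address space with 4-level paging, using 2 MiB leaf pages
 * (use_1g == 0) or 1 GiB leaf pages (use_1g != 0). */
uint64_t identity_map_table_pages(uint64_t bytes, int use_1g)
{
    const uint64_t gib = 1ULL << 30;
    uint64_t pd, pdpt, pml4;

    assert(bytes > 0 && bytes <= ENTRIES * ENTRIES * gib); /* one PML4 */
    pd   = use_1g ? 0 : div_round_up(bytes, gib);  /* one PD per 1 GiB */
    pdpt = div_round_up(bytes, ENTRIES * gib);     /* one PDPT per 512 GiB */
    pml4 = 1;
    return pd + pdpt + pml4;
}

/* Same figure expressed in bytes of memory consumed by the tables. */
uint64_t identity_map_table_bytes(uint64_t bytes, int use_1g)
{
    return identity_map_table_pages(bytes, use_1g) * PAGE_SIZE;
}
```

For 1 TiB this gives 1024 + 2 + 1 = 1027 tables, a little over 4 MiB, with 2 MiB pages, versus 3 tables with 1 GiB pages. That is the kind of footprint that becomes a problem when the tables have to live in a small SMRAM area.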

Thanks
Laszlo



Re: [SeaBIOS] Confirm the status about BZ#1377575

2018-02-14 Thread Laszlo Ersek
Hello Shouta-san,

On 02/14/18 05:48, shouta.ueh...@jp.yokogawa.com wrote:
> Dear SeaBIOS development members,
> 
> I'm Shota Uehara, and I work for Yokogawa Electric Corporation.
> I would like to confirm the status of the bug fix for BZ#1377575.
> 
> I use a Windows Server 2016 virtual machine on qemu-kvm with CentOS 7.
> I am now attempting to add a dedicated virtual device to qemu-kvm, and to
> develop its device driver for WS2016.
> 
> For debugging the Windows driver I want to get a crash dump on WS2016;
> however, it is not created.
> According to the Red Hat knowledge base, that is a Windows incompatibility
> issue with SeaBIOS, known as BZ#1377575.
>  
> 
> Therefore, I have 2 questions, as follows:
> (1) Could you tell me the current status of the bug fix for BZ#1377575?
> (2) If possible, could you tell me a workaround to avoid this bug and
> get a crash dump?

Please refer to the following blog post by my (then-)colleague Ladi
Prosek, who worked around the Windows issue (BZ#1377575) in the RHEL7
SeaBIOS package:

  https://ladipro.wordpress.com/2016/10/21/windows-vga-bug/

As stated in the blog post, both abandoning Cirrus video, and patching
SeaBIOS, are necessary for working around the Windows issue. The SeaBIOS
patch was sent to the SeaBIOS mailing list; please see the discussion
thread at:

  [SeaBIOS] [PATCH] vgabios: Reorder video modes to work around a
Windows bug
  https://www.coreboot.org/pipermail/seabios/2016-October/010963.html

The thread seems to have petered out back then. For using an upstream
SeaBIOS build, you might have to apply the patch manually. However, when
using SeaBIOS from CentOS 7, the issue should already be fixed:

- in 7.3.z, starting with seabios-1.9.1-5.el7_3.1
  (BZ#1392028),
- in 7.4, starting with seabios-1.10.1-1.el7
  (BZ#1377575).

Can you check if you are using a CentOS 7 SeaBIOS package that is at
least this recent?

... In fact, I see a private (Red Hat only) reminder under the Knowledge
Base article that we should update the "Known Issues" text, regarding
BZ#1377575, once the related updates are out. Given that both BZs above
are in CLOSED ERRATA status, I'm going to ping our docs guys now. :)

... Additionally, I've found a note in BZ#1377575 that Windows 10 RS2,
version 1703+ (possibly earlier), should no longer have this bug. Not
sure how (or if) that maps to Windows Server 2016.

Thank you!
Laszlo



Re: [SeaBIOS] Saving a few bytes across a reboot

2018-02-07 Thread Laszlo Ersek
On 02/07/18 17:44, Stefan Berger wrote:
> On 02/07/2018 10:50 AM, Laszlo Ersek wrote:

>> OK, but if the OS is allowed to modify this set of "queued operations",
>> then what protection is expected of SMM? Whether you can modify the TPM
>> directly, or queue random commands for it at liberty, what's the
>> difference?
> 
> 
> On the OS level it is presumably an operation that is reserved to the
> admin to queue the operation.
> 
> I am not that familiar with UEFI and who is allowed to run code there
> and what code it can execute. But UEFI seems to lock the variable that
> holds that PPI code that tells it what to do after next reboot. So
> presumably a UEFI module cannot modify that variable but can only read
> it (and hopefully not manipulate NVRAM directly). If PPI was implemented
> through a memory location where the code gets written to it could do
> that likely easily (unless memory protections are setup by UEFI, which I
> don't know), cause a reset and have UEFI execute on that code.

This makes sense... but then it doesn't make sense :)

Assume that the variable is indeed "locked" (so that random UEFI drivers
/ apps cannot rewrite it using the UEFI variable service). Then,

- if the lock is enforced in SMM, then the variable will be locked from
  the OS as well, not just from 3rd party UEFI apps, so no PPI
  operations can ever be queued,

- if the lock is "simulated" in ACPI or in non-SMM firmware code (= in
  the "runtime DXE driver" layer), then the lock can be circumvented by
  both 3rd party UEFI apps and the OS.


>> Again, we have to see where the barrier is, between OS and firmware, or
>> between OS-level users:
>>
>> - In both cases, 3rd party UEFI apps / driver are considered equally
>>    privileged to the OS kernel;
>>
>> - in the OS<->firmware barrier case, SMM is required, and UEFI apps and
>>    the OS kernel are similarly restricted to submitting requests to SMM,
>>    and all the business verification belongs in SMM,
> 
> So SMM can verify whether the parameters it gets are valid. Whether now
> the user wanted to set operation 0 but the ACPI code submitted 5 (Clear
> TPM), would be a matter of verifying the ACPI code that's in-between. Is
> an attack via ACPI manipulation through some UEFI module possible?

Yes, it is possible.

There are dedicated UEFI (and PI -- "platform init") services for
installing new ACPI tables, and even for locating and parsing -- albeit
in a *very* cumbersome way -- existing ACPI tables (AML too). Once the
right ACPI objects are found, they can be overwritten.


> On the OS level it must remain a privileged operation of an admin to
> issue these PPI codes. That it is a privileged operation is implemented
> by the OS and I don't think we need to do anything. What we would want
> to prevent is abuse by a module that the firmware executes for example.
> I think this is the driving force for a UEFI variable and the fact that
> it's being locked (and later on unlocked so SMM mode can write to it ?)

This unlocking intrigues me. Assuming it happens in SMM, I have no idea
how the implementation tells apart the requestors (3rd party UEFI app
vs. OS).


> As for the use case, I would say it's automation on the OS level. From
> that perspective it's support could probably be deferred, which may
> eliminate at least the SMM part. However, UEFI uses the PPI mechanisms
> itself to issue certain commands when interacting with its menu. I am
> not sure whether SMM code is involved here... but for being able to use
> UEFI and TPM 2 at least for the UEFI support the PPI part needs to be
> there, otherwise the menu items one gets won't do anything. [The
> question is: does UEFI execute ACPI or write directly to the UEFI
> variable? My guess is the latter.]

I'm sorry, I'm out of my depth here. Can we re-have this discussion on
edk2-devel? (A bit later though, please, because currently I'm unable to
send email to edk2-devel. The 01.org list server recently dislikes
something about my emails and keeps rejecting them.)


>> (Sorry if this email is too long and confusing! I'm confused.)
> 
> Me too. I am not clear on specifics in UEFI, such as memory protections
> setup while a module is running in UEFI. Is NVRAM protected from
> overwrite?

Only SMRAM and pflash (aka NVRAM aka UEFI variables, on QEMU anyway) are
protected from direct hardware write.

Whether the write to SMRAM/pflash hardware comes from the OS or a 3rd
party UEFI app is irrelevant, both are prevented; only code running in
SMM is permitted write access.

Furthermore, it is irrelevant whether the OS or a 3rd party UEFI app is
the one that submits a request into SMM. If the request buffer passes
validation, then SMRAM and/or pflash (as appropriate) are updated. 

Re: [SeaBIOS] Saving a few bytes across a reboot

2018-02-07 Thread Laszlo Ersek
On 02/07/18 15:57, Stefan Berger wrote:
> On 02/07/2018 09:18 AM, Laszlo Ersek wrote:
>> On 02/07/18 14:51, Stefan Berger wrote:

>>> To support SeaBIOS as well, we would have to be
>>> able to distinguish a BIOS from the UEFI on the QEMU level so that we
>>> could produce different ACPI
>> Yes and no,
>>
>>> (no SMI and different OperationRegion than
>>> 0x  for SeaBIOS),
>> "yes" with regard to the SMM difference, "no" with regard to the
>> operation region. We have an ACPI linker/loader command that makes the
>> firmware basically just allocate memory, and we have two other ACPI
>> linker/loader commands that (a) patch the allocation address into other
>> ACPI artifacts, (b) return the allocation address to QEMU (for device
>> emulation purposes), if necessary.
> 
> I thought about allowing the firmware to configure the memory region to
> use for the PPI interface. UEFI would say 0x , SeaBIOS would
> choose some other area (0xFEF4 5000). Does the ACPI patcher handle this
> case or does the address patching have to be set up while building the
> tables in QEMU? If latter, then we would have to know in QEMU whether
> it's going to be BIOS or UEFI as firmware. I have tried a lot of things
> in the recent past, but I forgot whether this type of patching is possible.

The ACPI linker/loader commands are typically added to the "linker
script" in the very functions that build the ACPI payload.

And, distinguishing the firmwares is not necessary just for this; the
point of the firmware-side allocation is that QEMU does not dictate the
address. Each firmware is expected to use its own memory allocation
service, which in turn will ensure that the runtime OS stays away from
the allocated area. So the allocation address is ultimately determined
by the firmware.

The other two commands make the firmware patch the actual allocation
address (whatever it may be) into other ACPI artifacts, and make the
firmware pass the allocation address (whatever it may be) back to QEMU.
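
To make the division of labor concrete, here is a deliberately
simplified C model of the three commands. It is a sketch under stated
assumptions, not QEMU's or any firmware's actual code: the struct
layout, the function names and the fixed blob size are all invented for
illustration, and host byte order is used for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOB_SIZE 64u

/* Simplified model of one blob that the firmware downloads from QEMU. */
struct blob {
    uint8_t  data[BLOB_SIZE];
    uint64_t guest_addr;   /* filled in by the firmware's own allocator */
};

/* "Allocate": the firmware, not QEMU, picks the address. */
static void cmd_allocate(struct blob *b, uint64_t addr_from_fw_allocator)
{
    b->guest_addr = addr_from_fw_allocator;
}

/*
 * "Add pointer": the 8-byte field at 'offset' in 'pointer' initially
 * holds an offset into 'pointee'; adding the pointee's address turns it
 * into an absolute address.
 */
static void cmd_add_pointer(struct blob *pointer, uint32_t offset,
                            const struct blob *pointee)
{
    uint64_t field;

    assert(offset <= BLOB_SIZE - sizeof field);
    memcpy(&field, pointer->data + offset, sizeof field);
    field += pointee->guest_addr;
    memcpy(pointer->data + offset, &field, sizeof field);
}

/* "Write pointer": the value the firmware reports back to QEMU. */
static uint64_t cmd_write_pointer(const struct blob *pointee,
                                  uint64_t offset_in_pointee)
{
    return pointee->guest_addr + offset_in_pointee;
}
```

Nothing in the model depends on which firmware runs it; the only
firmware-specific input is the address handed to cmd_allocate().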

>> My operating knowledge about the TPM had been that
>>
>>    Components measure stuff into PCRs, and if any untrusted agent messes
>>    with those measurements, for example by directly writing to the PCRs,
>>    then the TPM will simply not unseal its secrets, hence such tampering
>>    is self-defeating for those agents.
>>
>> While this might be correct (I hope it is correct!), the *PPI* part of
>> TPM appears entirely different. In fact I don't have the slightest idea
>> *why* PPI is lumped together with the TPM.
> 
> The physical presence interface allows *automation of TPM operations and
> changing the TPM's state* (such as clearing all keys) that are typically
> only possible via interaction with the TPM menu in the firmware. Think
> of it as some TPM operations that can only run successfully while the
> system runs the firmware. Once the firmware has given control to the
> next stage (bootloader, kernel) these operations are not possible
> anymore since the firmware has executed some TPM commands that put the
> TPM into a state so it wouldn't allow those operations anymore.

OK, but if the OS is allowed to modify this set of "queued operations",
then what protection is expected of SMM? Whether you can modify the TPM
directly, or queue random commands for it at liberty, what's the difference?
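
For readers unfamiliar with PPI, the "queued operations" lifecycle under
discussion can be sketched roughly as follows. Everything here is
hypothetical: the mailbox layout, the operation and result codes, and
the user_confirmed flag (a stand-in for whatever policy keeps the admin
in control) are illustrative, not taken from the spec or from any
firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real codes come from the PPI spec. */
enum { PPI_OP_NONE = 0, PPI_OP_CLEAR = 5 };
enum { PPI_RESULT_OK = 0, PPI_RESULT_REJECTED = 1 };

/* Hypothetical mailbox that survives a reboot. */
struct ppi_mailbox {
    uint32_t pending_op;    /* written by the OS */
    uint32_t last_op;       /* written by the firmware */
    uint32_t last_result;   /* written by the firmware */
};

/* OS side: queue one operation for the next boot. */
static void os_queue_op(struct ppi_mailbox *mb, uint32_t op)
{
    assert(mb != NULL);
    mb->pending_op = op;
}

/*
 * Firmware side, early at boot, before the TPM is locked down for the
 * next stage. Returns true if the queued operation should be executed.
 */
static bool fw_take_pending(struct ppi_mailbox *mb, bool user_confirmed)
{
    uint32_t op;

    assert(mb != NULL);
    op = mb->pending_op;
    if (op == PPI_OP_NONE)
        return false;

    mb->pending_op = PPI_OP_NONE;   /* never replay on a later boot */
    mb->last_op = op;
    mb->last_result = user_confirmed ? PPI_RESULT_OK : PPI_RESULT_REJECTED;
    return user_confirmed;
}
```

Note that os_queue_op() performs no checking at all, so in this model
any protection has to come from the firmware-side policy.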

>> Can you explain in more detail what the PPI operations are, and why they
>> need protection, from what agents exactly? What is the purported
>> lifecycle of such PPI operations?
> 
> With the clearing of the TPM one would lose all keys associated with
> the TPM. So you don't want some software module to be able to set such a
> 'code', reset the machine, and the user loses all keys on the way. The
> control has to be strongly with the admin.

Where is this barrier erected, between OS and firmware, or between
privileged and non-privileged OS user?

SMM is only relevant if the barrier is expected between OS and firmware;
i.e. you want to constrain the OS kernel to a subset of valid
operations. If the barrier is between privileged and non-privileged OS
user, then the implementation belongs in the OS kernel, since mere users
don't have direct hardware access anyway.

> Also, to prevent fumbling with the variables, UEFI seems to make the variable 
> read-only.

That seems to imply the barrier is between OS kernel and firmware.

> I am wondering whether a malicious UEFI module could be written that
> patches the ACPI tables and does what it wants when it comes to these
> early TPM operations, rather than what the admin wants.

This is a good point, and it applies to more than just ACPI. The answer
is that it doesn't matter what *any* OS level

Re: [SeaBIOS] [PATCH v2 2/3] tcgbios: Add TPM Physical Presence interface support

2018-01-16 Thread Laszlo Ersek
On 01/16/18 19:36, Kevin O'Connor wrote:
> On Tue, Jan 16, 2018 at 11:41:02AM -0500, Stefan Berger wrote:
>> Add support for TPM 1.2 and TPM 2 Physical Presence interface (PPI).
>> A shared memory structure is located at 0xfffe f000 - 0xfffe f3ff
>> that SeaBIOS initializes (unless it has already been initialized) and
>> then searches for a code it is supposed to act upon. A code typically
>> requires that one or more TPM commands are being sent.
> 
> If I'm understanding the code correctly, it no longer hardcodes
> 0xfffef000 (great!).  The commit comment should also be updated.
> 
>>
>> The underlying spec can be accessed from this page here:
>>
>> https://trustedcomputinggroup.org/tcg-physical-presence-interface-specification/
>>
>> Version 1.30 is implemented.
>>
>> Signed-off-by: Stefan Berger 
>> ---
>>  src/post.c |  4 +++
>>  src/std/acpi.h | 10 ++
>>  src/std/tcg.h  | 31 ++
>>  src/tcgbios.c  | 99 ++
>>  src/tcgbios.h  |  3 ++
>>  5 files changed, 147 insertions(+)

[...]

>> --- a/src/std/acpi.h
>> +++ b/src/std/acpi.h
>> @@ -320,4 +320,14 @@ struct tpm2_descriptor_rev2
>>  u64  log_area_start_address;
>>  } PACKED;
>>  
>> +#define QEMU_SIGNATURE 0x554d4551
>> +struct qemu_descriptor
>> +{
>> +ACPI_TABLE_HEADER_DEF
>> +u32 tpmppi_address;
>> +u8 tpm_version; /* 1 = 1.2, 2 = 2 */
>> +u8 tpmppi_version;
>> +#define TPM_PPI_VERSION_1_30   1
>> +} PACKED;
> 
> I'm confused at the purpose of this acpi table.  If I'm understanding
> it correctly, it is purely to pass information from QEMU to SeaBIOS
> (and perhaps OVMF?).  If so, I don't think this is a good way to do it
> - a regular fw_cfg setting seems simpler (and less likely to cause
> problems with OSes).

I agree; if the firmware is supposed to consume information from QEMU
for locating the register block of this platform device, please expose
the address in a new fw_cfg file.
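
A minimal sketch of the firmware side, assuming the file carries nothing
but a 4-byte little-endian base address. The payload format and the
function names are hypothetical, and how the bytes are fetched from
fw_cfg is left out on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value regardless of host byte order. */
static uint32_t le32_decode(const uint8_t *p)
{
    return (uint32_t)p[0] |
           (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 |
           (uint32_t)p[3] << 24;
}

/*
 * Parse the (hypothetical) file contents. Returns 0 on success, -1 if
 * the file is absent or does not have the expected size, in which case
 * the firmware would simply leave the device alone.
 */
static int parse_base_file(const uint8_t *file, size_t len,
                           uint32_t *base_out)
{
    assert(base_out != NULL);

    if (file == NULL || len != 4)
        return -1;

    *base_out = le32_decode(file);
    return 0;
}
```

Compared with a custom ACPI table, nothing here is visible to the OS,
which is the property Kevin asked for.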

Thanks!
Laszlo

___
SeaBIOS mailing list
SeaBIOS@seabios.org
https://mail.coreboot.org/mailman/listinfo/seabios


Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-11 Thread Laszlo Ersek
On 01/11/18 18:16, Stefan Berger wrote:

> I can only point to the standard for the address. If QEMU has an API
> where we can first try to allocate fed4  and if that fails ask for
> another address, then we can use that. But does driver initialization
> work that way that we can first let all other devices register their
> MMIO requirements and then the TPM device ask whether fed4  is
> available and then falls back to using a random address?

As far as I understand, QEMU would keep the base address generally
fixed, but it could be moved if (a) another platform device comes along
that needs a large contiguous area and it cannot be accommodated without
moving other devices around, or (b) the user wanted to move the address
on the command line for whatever reason.

So, I don't think the QEMU API that you describe exists, or that there's
a use case for it. AFAICT board code is expected to place platform
devices up-front so that the latter peacefully co-exist.

Thanks,
Laszlo



Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-11 Thread Laszlo Ersek
(I'm not trying to further argue for the idea below, just to clarify it:)

On 01/11/18 15:29, Stefan Berger wrote:
> On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
>> On 01/11/18 13:40, Igor Mammedov wrote:
>>> On Wed, 10 Jan 2018 17:45:52 +0100
>>> Laszlo Ersek <ler...@redhat.com> wrote:
>>>> (My understanding is that the guest has to populate the CRB, and then
>>>> kick the hypervisor, so at least the register used for kicking must be
>>>> in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space
>>>> (for platform devices). Thus, the register block must reside at a
>>>> QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
>>> MMIO doesn't have to be fixed nor exist at all, we could use
>>> linker write to file operation in FW for switching from guest
>>> to QEMU. That's obviously intrusive work for FW and QEMU
>>> compared to hardcoded address in both QEMU and FW but as
>>> benefit changes to QEMU and FW don't have to be tightly coupled
>>> and layout could be changed whenever need arises.
>> Marc-André wrote, "The [CRB] region is registered at the same address as
>> TIS (it's not entirely clear from the spec it is supposed to be there,
>> but my laptop tpm use the same)."
>>
>> And, the spec declares the register block at the fixed range
>> FED4_h-FED4_4FFFh.
>>
>> How about this:
>>
>> (1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
>>
>> (2) *except* make the base address of the register block a compat
>> property for the QEMU device,
>>
>> (3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose
>> the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS
>> and/or CRB
> 
> Why? Linux doesn't use this type of interface. Actually, for the TIS the
> base address has been hard coded as well.

The idea would be to hide the actual address from the OS. Let the OS go
through the ACPI methods only, and keep the ACPI constants in sync with
the device model.

> 
>>
>> (4) in the generated ACPI payload, adhere to the compat property (i.e.,
>> generate the base address values from the compat prop),
>>
>> (5) expose the base address stand-alone in a new fw_cfg file as well.
>>
>>
>> Benefits as I see it:
>>
>> - register block can move around from one QEMU release to next,
> 
> Why would we need that?

It's not a requirement that I'm presenting -- I took the requirement as
a given and attempted to satisfy it.

> fed4_ is presumably reserved for TPM device
> interfaces and shouldn't clash with anything in the future. With the PPI
> memory at _ - _00ff I am not so sure. Here we could use the
> proposed QEMU ACPI table and a hard-coded address, _ at the
> beginning. Would that not solve it? Why not?
> 
>>
>> - migration remains functional (ACPI comes from source host, but it
>>    matches the device model on the target host, due to the compat prop),
>>
>> - firmware remains dumb about TPM activations (OS calls ACPI calls
>>    virtual hardware),
> 
> Linux doesn't use the ACPI interface from what I can tell.
> 
> What are 'TPM activations'?

I coined this expression for "interacting with the TPM device". I used
this expression because the TPM ACPI spec uses the expression
"activation methods" for describing the various ways to interact with
the device (TIS, CRB, ACPI, ACPI+CRB are four methods that we've been
discussing).

So above I meant that the firmware does not participate in OS->TPM requests.

> We have a TIS interface for example that
> SeaBIOS uses to initialize the TPM1.2 / TPM2.
> 
> 
>>
>> - the ACPI-to-hardware interface is dictated by an industry spec, so we
> 
> Do you have a pointer to this spec?

I simply meant that a TIS client would have to be written in AML
(generated by QEMU). To the OS the device would be available via ACPI or
ACPI+CRB activation, but to the ACPI implementation itself, it would
look like a TIS or CRB device, with a moveable base address. This way
the OS would be separated from the base address (because the OS would
have to go through ACPI), and the firmware could reuse existent TIS
drivers with hopefully minimal customization (base address taken from
fw_cfg).

So, by the above industry spec, I simply meant the TIS interface.

Anyway, based on your description, there's a disconnect between the
Linux guest and the base address movability requirement:

- we have four activation methods: TIS+Cancel, CRB, ACPI, ACPI+CRB
- of this, Linux only supports the first two (TIS+Cancel, CRB), IIUC
- in addition, Linux hard-c

Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-11 Thread Laszlo Ersek
On 01/11/18 13:40, Igor Mammedov wrote:
> On Wed, 10 Jan 2018 17:45:52 +0100
> Laszlo Ersek <ler...@redhat.com> wrote:

>> (My understanding is that the guest has to populate the CRB, and then
>> kick the hypervisor, so at least the register used for kicking must be
>> in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space
>> (for platform devices). Thus, the register block must reside at a
>> QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
> 
> MMIO doesn't have to be fixed nor exist at all, we could use
> linker write to file operation in FW for switching from guest
> to QEMU. That's obviously intrusive work for FW and QEMU
> compared to hardcoded address in both QEMU and FW but as
> benefit changes to QEMU and FW don't have to be tightly coupled
> and layout could be changed whenever need arises.

Marc-André wrote, "The [CRB] region is registered at the same address as
TIS (it's not entirely clear from the spec it is supposed to be there,
but my laptop tpm use the same)."

And, the spec declares the register block at the fixed range
FED4_h-FED4_4FFFh.

How about this:

(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,

(2) *except* make the base address of the register block a compat
property for the QEMU device,

(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose
the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS
and/or CRB

(4) in the generated ACPI payload, adhere to the compat property (i.e.,
generate the base address values from the compat prop),

(5) expose the base address stand-alone in a new fw_cfg file as well.


Benefits as I see it:

- register block can move around from one QEMU release to next,

- migration remains functional (ACPI comes from source host, but it
  matches the device model on the target host, due to the compat prop),

- firmware remains dumb about TPM activations (OS calls ACPI calls
  virtual hardware),

- the ACPI-to-hardware interface is dictated by an industry spec, so we
  don't have to invent and document a paravirtual interface. If it ever
  becomes necessary for the firmware to directly access the TPM
  hardware (for example, to replay physical presence commands queued by
  the OS), fw can rely on the same industry spec, only the base address
  has to be updated -- which is available stand-alone from the named
  fw_cfg file.
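
Points (2), (4) and (5) boil down to "one property, two generated
views". The following C sketch shows that idea only; the struct, the
function names, the region name "TPMR" and the fw_cfg payload format are
all assumptions made for the example, and a real SSDT would of course
contain much more than a single OperationRegion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the device property that fixes the register block. */
struct tpm_props {
    uint32_t mmio_base;
    uint32_t mmio_size;
};

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Payload of a hypothetical fw_cfg file: the base address only. */
static void build_fw_cfg_payload(const struct tpm_props *p, uint8_t out[4])
{
    put_le32(out, p->mmio_base);
}

/* AML for: OperationRegion (TPMR, SystemMemory, base, size) */
static size_t build_opregion(const struct tpm_props *p, uint8_t out[17])
{
    size_t n = 0;

    assert(p->mmio_size != 0);       /* an empty window is meaningless */
    out[n++] = 0x5B;                 /* ExtOpPrefix */
    out[n++] = 0x80;                 /* OpRegionOp */
    memcpy(out + n, "TPMR", 4);      /* region name (hypothetical) */
    n += 4;
    out[n++] = 0x00;                 /* RegionSpace: SystemMemory */
    out[n++] = 0x0C;                 /* DWordPrefix */
    put_le32(out + n, p->mmio_base);
    n += 4;
    out[n++] = 0x0C;                 /* DWordPrefix */
    put_le32(out + n, p->mmio_size);
    n += 4;
    return n;
}
```

Because both builders read the same struct, the address seen by the
firmware and the address baked into the AML cannot drift apart.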

Thanks
Laszlo


Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-11 Thread Laszlo Ersek
On 01/10/18 19:45, Stefan Berger wrote:
> On 01/10/2018 11:45 AM, Laszlo Ersek wrote:
>> On 01/10/18 16:19, Marc-André Lureau wrote:
>>> Hi
>>>
>>> - Original Message -
>>>> BTW, from the "TCG PC Client Platform TPM Profile (PTP)
>>>> Specification", it seems like the FIFO (TIS) interface is hard-coded
>>>> *in the spec* at FED4_h – FED4_4FFFh. So we don't even have
>>>> to make that dynamic.
>>>>
>>>> Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap
>>>> my brain around the exact resources that the CRB interface requires.
>>>> Marc-André, can you summarize those?
>>> The device is a relatively simple MMIO-only device on the sysbus:
>>> https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a80867584c487035db9#diff-1ef22a0d46031cf2701a185aed8ae40eR282
>>>
>>>
>>> The region is registered at the same address as TIS (it's not entirely
>>> clear from the spec it is supposed to be there, but my laptop tpm use
>>> the same). And it uses a size of 0x1000, although it's also unclear to
>>> me what should be the size of the command buffer (that size can also
>>> be defined at run-time now, iirc, I should adapt the code).
>> Thank you -- so the "immediate" register block is in MMIO space, and
>> (apparently) we can hard-code its physical address too.
>>
>> My question is if we need to allocate guest RAM in addition to the
>> register block, for the command buffer(s) that will transmit the
>> requests/responses. I see the code you quote above says,
>>
>> +    /* allocate ram in bios instead? */
>> +    memory_region_add_subregion(get_system_memory(),
>> +    TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), >cmdmem);
>>
>> ... and AFAICS your commit message poses the exact same question :)
>>
>> Option 1: If we have enough room in MMIO space above the register block
>> at 0xFED4, then we could simply dump the CRB there too.
>>
>> Option 2: If not (or we want to avoid Option 1 for another reason), then
>> the linker/loader script has to make the guest fw allocate RAM, write
>> the allocation address to the TPM2 table with an ADD_POINTER command,
>> and write the address back to QEMU with a WRITE_POINTER command. Is my
>> understanding correct?
>>
>> I wonder why we'd want to bother with Option 2, since we have to place
>> the register block at a fixed MMIO address anyway.
>>
>> (My understanding is that the guest has to populate the CRB, and then
>> kick the hypervisor, so at least the register used for kicking must be
>> in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space
>> (for platform devices). Thus, the register block must reside at a
>> QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
>>
>>> My experiments so far running some Windows tests indicate that for
>>> TPM2, CRB+UEFI is required (and I managed to get an ovmf build with
>>> TPM2 support).
>> Awesome!
>>
>>> A few tests failed, it seems the "Physical Presence Interface" (PPI) is
>>> also required.
>> Required for what goal, exactly?
>>
>>> I think that ACPI interface allows to run TPM commands during reboot,
>>> by having the firmware taking care of the security aspects.
>> Ugh :/ I mentioned those features in my earlier write-up, under points
>> (2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess
>> for OVMF.
>>
>> - They would require including (at least a large part of) the
>>    Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described
>>    earlier as counter-arguments,
>>
>> - they'd require including the MemoryOverwriteControl/TcgMor.inf driver,
>>
>> - and they'd require some real difficult platform code in OVMF (e.g.
>>    PEI-phase access to non-volatile UEFI variables, which I've by now
>>    failed to upstream twice; PEI-phase access to all RAM; and more).
>>
>> My personal opinion is that we should determine what goals require what
>> TPM features, and then we should aim at a minimal set. If I understand
>> correctly, PCRs and measurements already work (although the patches are
>> not upstream yet) -- is that correct?
>>
>> Personally I think the SSDT/_DSM-based features (TCG Hardware
>> Information, TCG Memory Clear Interface, TCG Physical Presence
>> Interface) are very much out of scope for "TPM Enablement".
>>
>>> I think that's what Stefan is working on for S

Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-10 Thread Laszlo Ersek
On 01/10/18 16:19, Marc-André Lureau wrote:
> Hi
>
> - Original Message -
>>
>> BTW, from the "TCG PC Client Platform TPM Profile (PTP)
>> Specification", it seems like the FIFO (TIS) interface is hard-coded
>> *in the spec* at FED4_h – FED4_4FFFh. So we don't even have
>> to make that dynamic.
>>
>> Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap
>> my brain around the exact resources that the CRB interface requires.
>> Marc-André, can you summarize those?
>
> The device is a relatively simple MMIO-only device on the sysbus:
> https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a80867584c487035db9#diff-1ef22a0d46031cf2701a185aed8ae40eR282
>
> The region is registered at the same address as TIS (it's not entirely
> clear from the spec it is supposed to be there, but my laptop tpm use
> the same). And it uses a size of 0x1000, although it's also unclear to
> me what should be the size of the command buffer (that size can also
> be defined at run-time now, iirc, I should adapt the code).

Thank you -- so the "immediate" register block is in MMIO space, and
(apparently) we can hard-code its physical address too.

My question is if we need to allocate guest RAM in addition to the
register block, for the command buffer(s) that will transmit the
requests/responses. I see the code you quote above says,

+/* allocate ram in bios instead? */
+memory_region_add_subregion(get_system_memory(),
+TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), >cmdmem);

... and AFAICS your commit message poses the exact same question :)

Option 1: If we have enough room in MMIO space above the register block
at 0xFED4, then we could simply dump the CRB there too.

Option 2: If not (or we want to avoid Option 1 for another reason), then
the linker/loader script has to make the guest fw allocate RAM, write
the allocation address to the TPM2 table with an ADD_POINTER command,
and write the address back to QEMU with a WRITE_POINTER command. Is my
understanding correct?

I wonder why we'd want to bother with Option 2, since we have to place
the register block at a fixed MMIO address anyway.

(My understanding is that the guest has to populate the CRB, and then
kick the hypervisor, so at least the register used for kicking must be
in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space
(for platform devices). Thus, the register block must reside at a
QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
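
Option 1 amounts to simple address arithmetic, sketched below in C. The
0x1000 region size is the one Marc-André mentions; the 0x80 size of the
register block (i.e. the value of sizeof(struct crb_regs)) is an
assumption made purely for illustration, as are the struct and function
names.

```c
#include <assert.h>
#include <stdint.h>

#define CRB_REGION_SIZE 0x1000u   /* size mentioned in the thread */
#define CRB_REGS_SIZE   0x80u     /* assumed size of the register block */

struct crb_layout {
    uint64_t regs_base;   /* start of the register block */
    uint64_t cmd_base;    /* start of the command buffer */
    uint32_t cmd_size;    /* bytes left for requests/responses */
};

/* Option 1: the command buffer directly follows the registers. */
static struct crb_layout crb_layout_option1(uint64_t region_base)
{
    struct crb_layout l;

    /* Assume the region is aligned to its own size. */
    assert((region_base & (CRB_REGION_SIZE - 1)) == 0);

    l.regs_base = region_base;
    l.cmd_base  = region_base + CRB_REGS_SIZE;
    l.cmd_size  = CRB_REGION_SIZE - CRB_REGS_SIZE;
    return l;
}
```

With this layout neither ADD_POINTER nor WRITE_POINTER is needed, since
every address follows from the region base alone.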

> My experiments so far running some Windows tests indicate that for
> TPM2, CRB+UEFI is required (and I managed to get an ovmf build with
> TPM2 support).

Awesome!

> A few tests failed, it seems the "Physical Presence Interface" (PPI) is
> also required.

Required for what goal, exactly?

> I think that ACPI interface allows to run TPM commands during reboot,
> by having the firmware taking care of the security aspects.

Ugh :/ I mentioned those features in my earlier write-up, under points
(2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess
for OVMF.

- They would require including (at least a large part of) the
  Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described
  earlier as counter-arguments,

- they'd require including the MemoryOverwriteControl/TcgMor.inf driver,

- and they'd require some real difficult platform code in OVMF (e.g.
  PEI-phase access to non-volatile UEFI variables, which I've by now
  failed to upstream twice; PEI-phase access to all RAM; and more).

My personal opinion is that we should determine what goals require what
TPM features, and then we should aim at a minimal set. If I understand
correctly, PCRs and measurements already work (although the patches are
not upstream yet) -- is that correct?

Personally I think the SSDT/_DSM-based features (TCG Hardware
Information, TCG Memory Clear Interface, TCG Physical Presence
Interface) are very much out of scope for "TPM Enablement".

> I think that's what Stefan is working on for Seabios and the safe
> memory region (sorry I haven't read the whole discussion, as I am not
> working on TPM atm)

Yeah, with e.g. the "TCG Memory Clear Interface" feature pulled into the
context -- from the "Platform Reset Attack Mitigation Specification" --,
I do understand Stefan's question. Said feature is about the OS setting
a flag in NVRAM, for the firmware to act upon, at next boot. "Saving a
few bytes across a reboot" maps to that.

(And, as far as I understand this spec, it tells traditional BIOS
implementors, "do whatever you want for implementing this NVRAM thingy",
while to UEFI implementors, it says, "use exactly this and that
non-volatile UEFI variable". Given this, I don't know how much
commonality would be possible between SeaBIOS and OVMF.)

Similarly, about "TCG Physical Presence Interface" -- defined in the TCG
Physical Presence Interface Specification --, I had written, "The OS can
queue TPM operations (?) that require Physical Presence, and 

Re: [SeaBIOS] Saving a few bytes across a reboot

2018-01-10 Thread Laszlo Ersek
Stefan,

On 01/09/18 20:02, Stefan Berger wrote:

> Another twist is that Intel's EDK2 also implements this but the data
> structure layout is different and they use SMM + SMIs etc.
> 
> https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.asl#L81

As I described in my investigation linked from
, we should not
include the Tcg2Smm driver in OVMF, for TPM enablement -- at least for
the short & mid terms.

What does the Tcg2Smm driver do? In section (2f), I described that the
driver installs two tables, "TPM2" and an "SSDT".

- The TPM2 table from this driver is unneeded, since QEMU generates its
  own TPM2 table, which describes the TPM device's access method --
  TIS+Cancel (method 6).

- The SSDT from the driver is again unneeded. It provides (via the _DSM
  method) an ACPI-level API that the OS can use, for talking to the TPM
  device. An implementation detail of this ACPI method is that it raises
  an SMI, for entering the firmware at an elevated privilege level (= in
  SMM). Then, the actual TPM hardware manipulation, or even the TPM
  *software emulation*, is performed by the firmware, in SMM.

This approach is totally ill-suited for the QEMU virtualization stack.
For starters, none of the firmware code exists -- as open source anyway
-- that would actually handle such ACPI->SMM requests. Second, I'm sure
we don't want to debug TPM software emulation running in SMM guest
firmware, rather than an actual QEMU device model.

Once we have a real device model, accessed via IO ports and/or MMIO
locations, perhaps in combination with request/response buffers
allocated in guest RAM, the SMI/SMM implementation detail falls away
completely. Our TPM emulation would attain its "privileged / protected"
status simply by existing in the hypervisor (QEMU).

So here's what should be done:

- QEMU should implement the TPM device model, using TIS+Cancel (method
  6) or CRB (method 7). These are collectively called "dTPM".

- QEMU should continue generating a TPM2 ACPI table, for describing one
  of the above access methods to the OS, as appropriate for the actual
  device model.

- OVMF should include the following drivers from edk2, without changes:
  - Tcg2Pei/Tcg2Pei.inf
  - Tcg2Dxe/Tcg2Dxe.inf

- OVMF should include the following drivers from edk2,
  - either verbatim (if they work out like that),
  - or with small customizations (if the drivers themselves offer
    sufficiently flexible knobs),
  - or else as modules duplicated / rewritten under OvmfPkg,
  - or they might even turn out unnecessary:

  - Tcg2Config/Tcg2ConfigPei.inf
  - Tcg2Config/Tcg2ConfigDxe.inf

> QEMU would also be generating the ACPI for this UEFI I suppose. So now
> who needs to adapt to whom? And can EDK2 be adapted to do something
> different or should it remain as-is and SeaBIOS would have to work
> similarly as EDK2 does? I don't know much about SMM / SMIs and how they
> work unfortunately and whether it can work from the OS when ACPI raises
> an SMI. Any opinions ?

To be honest, I don't understand SeaBIOS's role here (beyond executing
the linker/loader script from QEMU). To my knowledge, SeaBIOS does not
intend to be a TPM client. As far as I understand, only
- UEFI applications,
- and then the OS (UEFI-based, or traditional BIOS-based)
are expected to function as TPM clients.

Under the approach described near the top,

- UEFI clients (such as UEFI boot loaders) are satisfied by the
  inclusion of the "Tcg2Dxe/Tcg2Dxe.inf" driver in OVMF -- because said
  driver produces the EFI_TCG2_PROTOCOL;

- and the OS (regardless of UEFI or traditional BIOS) is satisfied by
  finding the TPM hardware description in the TPM2 table of QEMU, and
  then by talking to the TPM device model (implemented in QEMU) with its
  own native driver.

So... I'm missing the point of the thread starter message -- "Saving a
few bytes across a reboot". Save them for what purpose?

BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification",
it seems like the FIFO (TIS) interface is hard-coded *in the spec* at
FED4_h – FED4_4FFFh. So we don't even have to make that dynamic.

Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my
brain around the exact resources that the CRB interface requires.
Marc-André, can you summarize those?

Thanks,
Laszlo



Re: [SeaBIOS] [PATCH RFC] x86: use volatile asm for read/write{b, w, l} implementations

2018-01-04 Thread Laszlo Ersek
On 01/04/18 15:29, Vitaly Kuznetsov wrote:
> Laszlo Ersek <ler...@redhat.com> writes:

> In fact, the only writew() that needs patching is in vp_notify(), when I
> replace it with 'asm volatile' everything works.
> 
>> * Does it make a difference if you disable EPT in the L1 KVM
>> configuration? (EPT is probably primarily controlled by the CPU features
>> exposed by L0 Hyper-V, and secondarily by the "ept" parameter of the
>> "kvm_intel" module in L1.)
>>
>> Asking about EPT because the virtio rings and descriptors are in RAM,
>> accessing which in L2 should "normally" never trap to L1/L0. However (I
>> *guess*), when those pages are accessed for the very first time in L2,
>> they likely do trap, and then the EPT setting in L1 might make a difference.
> 
> Disabling EPT helps!

OK...

> I also tried tracing L1 KVM and the difference between working and
> non-working cases seems to be:
> 
> 1) Working:
> 
> ...
>   <...>-51387 [014] 64765.695019: kvm_page_fault: address fe007000 error_code 182
>   <...>-51387 [014] 64765.695024: kvm_emulate_insn: 0:eca87: 66 89 14 30
>   <...>-51387 [014] 64765.695026: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
>   <...>-51387 [014] 64765.695026: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
>   <...>-51387 [014] 64765.695033: kvm_entry: vcpu 0
>   <...>-51387 [014] 64765.695042: kvm_exit: reason EPT_VIOLATION rip 0xeae17 info 181 306
>   <...>-51387 [014] 64765.695043: kvm_page_fault: address f0694 error_code 181
>   <...>-51387 [014] 64765.695044: kvm_entry: vcpu 0
> ...
> 
> 2) Broken:
> 
> ...
>   <...>-38071 [014] 63385.241117: kvm_page_fault: address fe007000 error_code 182
>   <...>-38071 [014] 63385.241121: kvm_emulate_insn: 0:ecffb: 66 89 06
>   <...>-38071 [014] 63385.241123: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
>   <...>-38071 [014] 63385.241124: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
>   <...>-38071 [014] 63385.241143: kvm_entry: vcpu 0
>   <...>-38071 [014] 63385.241162: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xecffe info 0 80f6
>   <...>-38071 [014] 63385.241162: kvm_entry: vcpu 0
> ...
> 
> The 'kvm_emulate_insn' difference is actually the different versions of
> 'mov' we get with the current code and with my 'asm volatile'
> version. What makes me wonder is where the 'EXTERNAL_INTERRUPT' (only
> seen in broken version) comes from.
> 

I don't think said interrupt matters. I also don't think the MOV
differences matter; after all, in both cases we end up with the identical

  vcpu_match_mmio:  gva 0xfe007000 gpa 0xfe007000 Write GPA
  kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0

sequence.
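
For reference, the two writew() flavors being compared can be sketched
like this in C. This is an approximation for illustration, not the
actual patch: the function names are invented, the asm form follows the
general pattern of MMIO accessors that spell out the store instruction,
and a plain-C fallback is included only so the sketch builds on non-x86
hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier, in the form quoted in this thread. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Form 1: ordinary C store through a volatile-qualified pointer. */
static inline void writew_c(void *addr, uint16_t val)
{
    barrier();
    *(volatile uint16_t *)addr = val;
}

/*
 * Form 2: the store is spelled out as one 16-bit MOV. The instruction
 * is fixed; the compiler still chooses the register and addressing mode.
 */
static inline void writew_asm(void *addr, uint16_t val)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("movw %0, %1"
                         : /* no outputs */
                         : "r"(val), "m"(*(volatile uint16_t *)addr)
                         : "memory");
#else
    writew_c(addr, val);   /* keeps the sketch buildable elsewhere */
#endif
}

/* Both forms must leave the same value in ordinary RAM. */
static void writew_demo(void)
{
    uint16_t a = 0, b = 0;

    writew_c(&a, 0x1234);
    writew_asm(&b, 0x1234);
    assert(a == b);
}
```

On ordinary RAM the two are indistinguishable, which is consistent with
the identical kvm_mmio records above; any difference would have to come
from how the emulated access is handled further down the stack.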


Here's another random idea:

I'll admit that I have no clue how SeaBIOS uses SMM, but I found an
earlier email from Paolo
<886757208.6870637.1484133921200.javamail.zim...@redhat.com> where he
wrote, "the main reason for it [i.e., SMM], is that it provides a safer
way to access a PCI device's memory BARs". (SeaBIOS commit 55215cd425d36
seems to give some background.)

And that kind of access is what vp_notify()/writew() does, and I see
"call32_smm" / "handle_smi" log entries in your thread starter,
intermixed with "vp notify".

Down-stream we disabled SMM in SeaBIOS because we deemed the additional
safety (see above) unnecessary for our limited BIOS service use cases
(=mostly grub), while SMM caused obscure problems:

- https://bugzilla.redhat.com/show_bug.cgi?id=1378006
- https://bugzilla.redhat.com/show_bug.cgi?id=1425516

So... can you rebuild SeaBIOS with "CONFIG_USE_SMM=n"?

(If you originally encountered the strange behavior with downstream
SeaBIOS, which already has CONFIG_USE_SMM=n, then please ignore...)

Thanks,
Laszlo



Re: [SeaBIOS] [PATCH RFC] x86: use volatile asm for read/write{b, w, l} implementations

2018-01-04 Thread Laszlo Ersek
On 01/04/18 11:24, Vitaly Kuznetsov wrote:
> Laszlo Ersek <ler...@redhat.com> writes:
> 
>> Is it possible that the current barrier() is not sufficient for the
>> intended purpose in an L2 guest?
>>
>> What happens if you drop your current patch, but replace
>>
>>   __asm__ __volatile__("": : :"memory")
>>
>> in the barrier() macro definition, with a real, heavy-weight barrier,
>> such as
>>
>>   __asm__ __volatile__("mfence": : :"memory")
>>
>> (See mb() in "arch/x86/include/asm/barrier.h" in the kernel.)
>>
> 
> Thanks for the suggestion,
> 
> unfortunately, it doesn't change anything :-(
> 
>> ... I think running in L2 could play a role here; see
>> "Documentation/memory-barriers.txt", section "VIRTUAL MACHINE GUESTS";
>> from kernel commit 6a65d26385bf ("asm-generic: implement virt_xxx memory
>> barriers", 2016-01-12).
>>
>> See also the commit message.
>>
> 
> I see, thank you.
> 
> It seems, however, that the issue here is not about barriers: first of
> all it is 100% reproducible and second, surrounding '*(volatile u32
> *)addr = val' with all sorts of barriers doesn't help. I *think* this is
> some sort of a mis-assumption about this memory which is handled with
> vmexits so both L0 and L1 hypervisors are getting involved. More
> debugging ...
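
For reference, the difference between the two barrier() variants discussed above can be sketched as follows. This is only an illustration, not SeaBIOS code: the names compiler_barrier(), full_barrier() and store_u32() are made up for the example, and the non-x86 fallback is an assumption added so that the snippet builds elsewhere.

```c
#include <stdint.h>

/* Compiler-only barrier: emits no instruction; it only stops the
 * compiler from reordering or caching memory accesses across it. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Heavy-weight barrier: same compiler effect, plus MFENCE on x86. */
static inline void full_barrier(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("mfence" : : : "memory");
#else
    __sync_synchronize();   /* assumption: generic GCC/Clang fallback */
#endif
}

/* Store through a volatile pointer, preceded by the chosen barrier. */
static inline void store_u32(void *addr, uint32_t val, int heavy)
{
    if (heavy)
        full_barrier();
    else
        compiler_barrier();
    *(volatile uint32_t *)addr = val;
}
```

Both variants produce the same store; they differ only in whether a fence instruction is executed before it, which is consistent with the report that swapping them made no difference.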

* Do you see the issue with both legacy-only (0.9.5) and modern-only
(1.0) virtio devices?

Asking about this because legacy and modern virtio devices use registers
in different address spaces (IO vs. MMIO).

* Does it make a difference if you disable EPT in the L1 KVM
configuration? (EPT is probably primarily controlled by the CPU features
exposed by L0 Hyper-V, and secondarily by the "ept" parameter of the
"kvm_intel" module in L1.)

Asking about EPT because the virtio rings and descriptors are in RAM,
accessing which in L2 should "normally" never trap to L1/L0. However (I
*guess*), when those pages are accessed for the very first time in L2,
they likely do trap, and then the EPT setting in L1 might make a difference.

* Somewhat relatedly, can you try launching QEMU in L1 with "-realtime
mlock=on"?

(Anyone please correct me if my ideas are bogus.)

Thanks
Laszlo



Re: [SeaBIOS] [PATCH RFC] x86: use volatile asm for read/write{b, w, l} implementations

2018-01-03 Thread Laszlo Ersek
On 01/03/18 14:41, Vitaly Kuznetsov wrote:
> QEMU/KVM guests running nested on top of Hyper-V fail to boot with
> virtio-blk-pci disks, the debug log ends with
>
> Booting from Hard Disk...
> call32_smm 0x000edd01 e97a0
> handle_smi cmd=b5 smbase=0x000a
> vp notify fe007000 (2) -- 0x0
> vp read   fe005000 (1) -> 0x0
> handle_smi cmd=b5 smbase=0x000a
> call32_smm done 0x000edd01 0
> Booting from :7c00
> call32_smm 0x000edd01 e97a4
> handle_smi cmd=b5 smbase=0x000a
> vp notify fe007000 (2) -- 0x0
> In resume (status=0)
> In 32bit resume
> Attempting a hard reboot
> ...
>
> I bisected the breakage to the following commit:
>
> commit f46739b1a819750c63fb5849844d99cc2ab001e8
> Author: Kevin O'Connor 
> Date:   Tue Feb 2 22:34:27 2016 -0500
>
> virtio: Convert to new PCI BAR helper functions
>
> But the commit itself appears to be correct. The problem is in how the
> writew() function compiles into vp_notify(). For example, if we drop
> the 'volatile' qualifier from the current writew() implementation,
> everything starts to work. If we disassemble these two versions (as of
> f46739b1a), the difference is:
>
>  :
>
>  With 'volatile' (current)        Without 'volatile'
>
>    0:   push   %ebx                 0:   push   %ebx
>    1:   mov    %eax,%ecx            1:   mov    %eax,%ecx
>    3:   mov    0x1518(%edx),%eax    3:   mov    0x1518(%edx),%eax
>    9:   cmpb   $0x0,0x2c(%ecx)      9:   cmpb   $0x0,0x2c(%ecx)
>    d:   je     2f                   d:   je     2e
>    f:   mov    0x151c(%edx),%edx    f:   mov    0x151c(%edx),%edx
>   15:   mov    0x28(%ecx),%ebx
>   18:   imul   %edx,%ebx           15:   imul   0x28(%ecx),%edx
>   1b:   mov    0x8(%ecx),%edx      19:   mov    0x8(%ecx),%ebx
>   1e:   add    %ebx,%edx
>   20:   cmpb   $0x0,0xe(%ecx)      1c:   cmpb   $0x0,0xe(%ecx)
>   24:   je     2a                  20:   je     28
>                                    22:   add    %ebx,%edx
>   26:   out    %ax,(%dx)           24:   out    %ax,(%dx)
>   28:   jmp    48                  26:   jmp    47
>   2a:   mov    %ax,(%edx)          28:   mov    %ax,(%ebx,%edx,1)
>   2d:   jmp    48                  2c:   jmp    47
>   2f:   lea    0x20(%ecx),%ebx     2e:   lea    0x20(%ecx),%ebx
>   32:   cltd                       31:   cltd
>   33:   push   %edx                32:   push   %edx
>   34:   push   %eax                33:   push   %eax
>   35:   mov    $0x2,%ecx           34:   mov    $0x2,%ecx
>   3a:   mov    $0x10,%edx          39:   mov    $0x10,%edx
>   3f:   mov    %ebx,%eax           3e:   mov    %ebx,%eax
>   41:   call   42                  40:   call   41
>   46:   pop    %eax                45:   pop    %eax
>   47:   pop    %edx                46:   pop    %edx
>   48:   pop    %ebx                47:   pop    %ebx
>   49:   ret                        48:   ret
>
> My eyes fail to see an obvious compiler flaw here, but the
> mov difference (at '2a' old, '28' new) is probably to blame. Making other
> subtle changes (e.g. adding local variables to the function) helps in
> some cases too. At this point I got a bit lost with my debugging, so I
> looked at how Linux does this stuff, and it seems we're not using
> '*(volatile u16) = ' there. Rewriting write/read{b,w,l} with volatile
> asm helps.
>
> Signed-off-by: Vitaly Kuznetsov 
> ---
> RFC: This is rather an ongoing debug as I'm not able to point finger
> at the real culprit yet, I'd be grateful for any help and suggestions.
> In particular, I don't quite understand why nested virtualization
> makes a difference here.
> ---
>  src/x86.h | 21 +
>  1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/src/x86.h b/src/x86.h
> index 53378e9..d45122c 100644
> --- a/src/x86.h
> +++ b/src/x86.h
> @@ -199,30 +199,27 @@ static inline void smp_wmb(void) {
>  }
>
>  static inline void writel(void *addr, u32 val) {
> -    barrier();
> -    *(volatile u32 *)addr = val;
> +    asm volatile("movl %0, %1" : : "d"(val), "m"(*(u32 *)addr) : "memory");
>  }
>  static inline void writew(void *addr, u16 val) {
> -    barrier();
> -    *(volatile u16 *)addr = val;
> +    asm volatile("movw %0, %1" : : "d"(val), "m"(*(u16 *)addr) : "memory");
>  }
>  static inline void writeb(void *addr, u8 val) {
> -    barrier();
> -    *(volatile u8 *)addr = val;
> +    asm volatile("movb %0, %1" : : "d"(val), "m"(*(u8 *)addr) : "memory");
>  }
>  static inline u32 readl(const void *addr) {
> -    u32 val = *(volatile const u32 *)addr;
> -    barrier();
> +    u32 val;
> +    asm volatile("movl %1, %0" : "=d"(val) : "m"(*(u32 *)addr) : "memory");
>      return val;
>  }
>  static inline u16 readw(const void *addr) {
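
The two styles of 16-bit MMIO write that the patch above switches between can be put side by side in a standalone form. This is a hedged sketch rather than the patch itself: writew_c() and writew_asm() are hypothetical names, the asm variant declares the memory location as an output operand ("=m") and lets the compiler pick the source register ("r"), and the non-x86 fallback is an assumption so the snippet builds on other targets.

```c
#include <stdint.h>

/* Variant 1: compiler barrier followed by a store through a volatile
 * pointer; the compiler chooses the addressing mode of the mov. */
static inline void writew_c(void *addr, uint16_t val)
{
    __asm__ __volatile__("" : : : "memory");
    *(volatile uint16_t *)addr = val;
}

/* Variant 2: the store is spelled out as a single movw in volatile
 * inline asm, with a "memory" clobber acting as the barrier. */
static inline void writew_asm(void *addr, uint16_t val)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("movw %1, %0"
                         : "=m"(*(volatile uint16_t *)addr)
                         : "r"(val)
                         : "memory");
#else
    writew_c(addr, val);    /* assumption: no x86 asm available */
#endif
}
```

On ordinary RAM both variants store the same value; any difference can only show up in which instruction encoding the hypervisor has to emulate for an MMIO address.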

Re: [SeaBIOS] [Qemu-devel] [PATCH v5 4/4] docs: update documentation considering PCIE-PCI bridge

2017-08-11 Thread Laszlo Ersek
On 08/11/17 01:31, Aleksandr Bezzubikov wrote:

> +PCIE-PCI bridge hot-plug
> +===
> +Guest OSes require extra efforts to enable PCIE-PCI bridge hot-plug.
> +Motivation - now on init any PCI Express root port which doesn't have
> +any device plugged in, has no free buses reserved to provide any of them
> +to a hot-plugged devices in future.
> +
> +To solve this problem we reserve additional buses on a firmware level.
> +Currently only SeaBIOS is supported.
> +The way of bus number to reserve delivery is special
> +Red Hat vendor-specific PCI capability, added to the root port
> +that is planned to have PCIE-PCI bridge hot-plugged in.
> +
> +Capability layout (defined in include/hw/pci/pci_bridge.h):
> +
> +uint8_t id; Standard PCI capability header field
> +uint8_t next;   Standard PCI capability header field
> +uint8_t len;Standard PCI vendor-specific capability header field
> +
> +uint8_t type;   Red Hat vendor-specific capability type
> +List of currently existing types:
> +RESOURCE_RESERVE = 1
> +
> +
> +uint32_t bus_res;   Minimum number of buses to reserve
> +
> +uint64_t io;   IO space to reserve
> +uint32_t mem   Non-prefetchable memory to reserve
> +
> +This two fields are mutually exclusive:

[*] mark this

> +uint32_t mem_pref_32;  Prefetchable memory to reserve (32-bit MMIO)
> +uint64_t mem_pref_64;  Prefetchable memory to reserve (64-bit MMIO)
> +
> +If any reservation field is -1 then this kind of reservation is not
> +needed and must be ignored by firmware.
> +
> +mem_pref_* fields mutual exclusiveness means they cannot be -1 both.

Please drop the last sentence; it is perfectly possible that a bridge
doesn't need either 32-bit or 64-bit prefetchable MMIO reservation.
"Mutually exclusive" usually means "at most one", not "exactly one".
(E.g., think of the "mutex" construct -- in the critical section being
protected by the mutex, there can be Thread 1, Thread 2, or none of them.)
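
The "at most one" reading can be expressed as a small validity check. The sketch below is an assumption-laden illustration: the field names follow the quoted documentation, but the struct name resource_reserve_cap, the packing attribute and the helper functions are hypothetical and are not taken from QEMU or SeaBIOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the documented capability layout. */
struct resource_reserve_cap {
    uint8_t  id;
    uint8_t  next;
    uint8_t  len;
    uint8_t  type;          /* RESOURCE_RESERVE = 1 */
    uint32_t bus_res;
    uint64_t io;
    uint32_t mem;
    uint32_t mem_pref_32;
    uint64_t mem_pref_64;
} __attribute__((packed));

/* "-1" (all bits set) means "no reservation of this kind". */
static bool pref32_requested(const struct resource_reserve_cap *c)
{
    return c->mem_pref_32 != UINT32_MAX;
}

static bool pref64_requested(const struct resource_reserve_cap *c)
{
    return c->mem_pref_64 != UINT64_MAX;
}

/* Mutually exclusive: at most one is requested; "neither" is valid. */
static bool pref_fields_valid(const struct resource_reserve_cap *c)
{
    return !(pref32_requested(c) && pref64_requested(c));
}
```

With this check, a bridge that sets both fields to -1 is accepted, which is exactly the case the dropped sentence would have forbidden.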

So, beyond dropping the last sentence, I suggest replacing the one
marked with [*] with the following, for clarity:

At most one of the following two fields may be set to a value
different from -1:

With this update, for this patch:

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks!
Laszlo



Re: [SeaBIOS] [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-09 Thread Laszlo Ersek
On 08/09/17 18:52, Aleksandr Bezzubikov wrote:
> 2017-08-09 13:18 GMT+03:00 Laszlo Ersek <ler...@redhat.com>:
>> On 08/08/17 21:21, Aleksandr Bezzubikov wrote:
>>> 2017-08-08 18:11 GMT+03:00 Laszlo Ersek <ler...@redhat.com>:
>>>> one comment below
>>>>
>>>> On 08/05/17 22:27, Aleksandr Bezzubikov wrote:
>>>>
>>>>> +Capability layout (defined in include/hw/pci/pci_bridge.h):
>>>>> +
>>>>> +uint8_t id; Standard PCI capability header field
>>>>> +uint8_t next;   Standard PCI capability header field
>>>>> +uint8_t len;Standard PCI vendor-specific capability header field
>>>>> +
>>>>> +uint8_t type;   Red Hat vendor-specific capability type
>>>>> +List of currently existing types:
>>>>> +QEMU_RESERVE = 1
>>>>> +
>>>>> +
>>>>> +uint32_t bus_res;   Minimum number of buses to reserve
>>>>> +
>>>>> +uint64_t io;IO space to reserve
>>>>> +uint64_t memNon-prefetchable memory to reserve
>>>>> +uint64_t mem_pref;  Prefetchable memory to reserve
>>>>
>>>> (I apologize if I missed any concrete points from the past messages
>>>> regarding this structure.)
>>>>
>>>> How is the firmware supposed to know whether the prefetchable MMIO
>>>> reservation should be made in 32-bit or 64-bit address space? If we
>>>> reserve prefetchable MMIO outside of the 32-bit address space, then
>>>> hot-plugging a device without 64-bit MMIO support could fail.
>>>>
>>>> My earlier request, to distinguish "prefetchable_32" from
>>>> "prefetchable_64" (mutually exclusively), was so that firmware would
>>>> know whether to restrict the MMIO reservation to 32-bit address
>>>> space.
>>>
>>> IIUC now (in SeaBIOS at least) we just assign this PREF registers
>>> unconditionally,
>>> so the decision about the mode can be made basing on !=0
>>> UPPER_PREF_LIMIT register.
>>> My idea was the same - we can just check if the value doesn't fit into
>>> 16-bit (PREF_LIMIT reg size, 32-bit MMIO). Do we really need separate
>>> fields for that?
>>
>> The PciBusDxe driver in edk2 tracks 32-bit and 64-bit MMIO resources
>> separately from each other, and other (independent) logic exists in it
>> that, on some conditions, allocates 64-bit MMIO BARs from 32-bit address
>> space. This is just to say that the distinction is intentional in
>> PciBusDxe.
>>
>> Furthermore, the Platform Init spec v1.6 says the following (this is
>> what OVMF will have to comply with, in the "platform hook" called by
>> PciBusDxe):
>>
>>> 12.6 PCI Hot Plug PCI Initialization Protocol
>>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
>>> ...
>>> Padding  The amount of resource padding that is required by the PCI
>>>  bus under the control of the specified HPC. Because the
>>>  caller does not know the size of this buffer, this buffer is
>>>  allocated by the callee and freed by the caller.
>>> ...
>>> The padding is returned in the form of ACPI (2.0 & 3.0) resource
>>> descriptors. The exact definition of each of the fields is the same as
>>> in the
>>> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
>>> function. See the section 10.8 for the definition of this function.
>>
>> Following that pointer:
>>
>>> 10.8 PCI HostBridge Code Definitions
>>> 10.8.2 PCI Host Bridge Resource Allocation Protocol
>>>
>>> Table 8. ACPI 2.0 & 3.0 QWORD Address Space Descriptor Usage
>>>
>>> ByteByteData  Description
>>> Offset  Length
>>> ...
>>> 0x030x01  Resource type:
>>> 0: Memory range
>>> 1: I/O range
>>> 2: Bus number range
>>> ...
>>> 0x050x01  Type-specific flags. Ignored except as defined
>>>   in Table 3-3 and Table 3-4 below.
>>>
>>> 0x060x08  Address Space Granularity. Used to differentiate
>>>   between a 32-bit memory request and a 64-bit
>>>   memory request. For a 32-bit memory request,
>>>

Re: [SeaBIOS] [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-09 Thread Laszlo Ersek
On 08/08/17 21:21, Aleksandr Bezzubikov wrote:
> 2017-08-08 18:11 GMT+03:00 Laszlo Ersek <ler...@redhat.com>:
>> one comment below
>>
>> On 08/05/17 22:27, Aleksandr Bezzubikov wrote:
>>
>>> +Capability layout (defined in include/hw/pci/pci_bridge.h):
>>> +
>>> +uint8_t id; Standard PCI capability header field
>>> +uint8_t next;   Standard PCI capability header field
>>> +uint8_t len;Standard PCI vendor-specific capability header field
>>> +
>>> +uint8_t type;   Red Hat vendor-specific capability type
>>> +List of currently existing types:
>>> +QEMU_RESERVE = 1
>>> +
>>> +
>>> +uint32_t bus_res;   Minimum number of buses to reserve
>>> +
>>> +uint64_t io;IO space to reserve
>>> +uint64_t memNon-prefetchable memory to reserve
>>> +uint64_t mem_pref;  Prefetchable memory to reserve
>>
>> (I apologize if I missed any concrete points from the past messages
>> regarding this structure.)
>>
>> How is the firmware supposed to know whether the prefetchable MMIO
>> reservation should be made in 32-bit or 64-bit address space? If we
>> reserve prefetchable MMIO outside of the 32-bit address space, then
>> hot-plugging a device without 64-bit MMIO support could fail.
>>
>> My earlier request, to distinguish "prefetchable_32" from
>> "prefetchable_64" (mutually exclusively), was so that firmware would
>> know whether to restrict the MMIO reservation to 32-bit address
>> space.
>
> IIUC now (in SeaBIOS at least) we just assign this PREF registers
> unconditionally,
> so the decision about the mode can be made basing on !=0
> UPPER_PREF_LIMIT register.
> My idea was the same - we can just check if the value doesn't fit into
> 16-bit (PREF_LIMIT reg size, 32-bit MMIO). Do we really need separate
> fields for that?

The PciBusDxe driver in edk2 tracks 32-bit and 64-bit MMIO resources
separately from each other, and other (independent) logic exists in it
that, on some conditions, allocates 64-bit MMIO BARs from 32-bit address
space. This is just to say that the distinction is intentional in
PciBusDxe.

Furthermore, the Platform Init spec v1.6 says the following (this is
what OVMF will have to comply with, in the "platform hook" called by
PciBusDxe):

> 12.6 PCI Hot Plug PCI Initialization Protocol
> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
> ...
> Padding  The amount of resource padding that is required by the PCI
>  bus under the control of the specified HPC. Because the
>  caller does not know the size of this buffer, this buffer is
>  allocated by the callee and freed by the caller.
> ...
> The padding is returned in the form of ACPI (2.0 & 3.0) resource
> descriptors. The exact definition of each of the fields is the same as
> in the
> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
> function. See the section 10.8 for the definition of this function.

Following that pointer:

> 10.8 PCI HostBridge Code Definitions
> 10.8.2 PCI Host Bridge Resource Allocation Protocol
>
> Table 8. ACPI 2.0 & 3.0 QWORD Address Space Descriptor Usage
>
> ByteByteData  Description
> Offset  Length
> ...
> 0x030x01  Resource type:
> 0: Memory range
> 1: I/O range
> 2: Bus number range
> ...
> 0x050x01  Type-specific flags. Ignored except as defined
>   in Table 3-3 and Table 3-4 below.
>
> 0x060x08  Address Space Granularity. Used to differentiate
>   between a 32-bit memory request and a 64-bit
>   memory request. For a 32-bit memory request,
>   this field should be set to 32. For a 64-bit
>   memory request, this field should be set to 64.
>   Ignored for I/O and bus resource requests.
>   Ignored during GetProposedResources().
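
To make the role of the Address Space Granularity field concrete, here is a rough sketch of how a platform hook might fill in one memory padding descriptor. The struct follows the byte offsets quoted above (resource type at 0x03, type-specific flags at 0x05, granularity at 0x06), but the field names, the helper make_mem_padding() and the use of the length field for the padding size are assumptions for illustration; this is not edk2 code.

```c
#include <stdint.h>
#include <string.h>

/* ACPI QWORD Address Space Descriptor, 46 bytes when packed.
 * Field names are illustrative only. */
struct qword_addr_space_desc {
    uint8_t  desc;                /* 0x8A: QWORD address space descriptor */
    uint16_t len;                 /* 0x2B: bytes following this field */
    uint8_t  res_type;            /* 0 = memory, 1 = I/O, 2 = bus number */
    uint8_t  gen_flags;
    uint8_t  specific_flags;      /* type-specific flags, left 0 here */
    uint64_t granularity;         /* 32 or 64 for memory requests */
    uint64_t range_min;
    uint64_t range_max;
    uint64_t translation_offset;
    uint64_t addr_len;            /* assumed: size of the request */
} __attribute__((packed));

/* Build a memory padding request restricted to 32-bit address space,
 * or allowed above 4 GiB when above_4g_ok is non-zero. */
static void make_mem_padding(struct qword_addr_space_desc *d,
                             uint64_t size, int above_4g_ok)
{
    memset(d, 0, sizeof *d);
    d->desc        = 0x8A;
    d->len         = 0x2B;
    d->res_type    = 0;
    d->granularity = above_4g_ok ? 64 : 32;
    d->addr_len    = size;
}
```

The point of the sketch is only that the 32-bit versus 64-bit choice is an explicit input to the descriptor, so the firmware needs that information from the capability.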

The "Table 3-3" and "Table 3-4" references under "Type-specific flags"
are out of date (spec bug); in reality those are:
- Table 10. I/O Resource Flag (Resource Type = 1) Usage,
- Table 11. Memory Resource Flag (Resource Type = 0) Usage.

The latter is relevant here:

> Table 11. Memory Resource Flag (Resource Type = 0) Usage
>
> Bits  Meaning
> ...
> Bit[2:1]  _MEM. Memory attributes.
>   Value and Meaning:
> 0 The memory is nonprefetchable.
> 1 Invalid.
> 2 Invalid.
>   

Re: [SeaBIOS] [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-08 Thread Laszlo Ersek
one comment below

On 08/05/17 22:27, Aleksandr Bezzubikov wrote:

> +Capability layout (defined in include/hw/pci/pci_bridge.h):
> +
> +uint8_t id; Standard PCI capability header field
> +uint8_t next;   Standard PCI capability header field
> +uint8_t len;Standard PCI vendor-specific capability header field
> +
> +uint8_t type;   Red Hat vendor-specific capability type
> +List of currently existing types:
> +QEMU_RESERVE = 1
> +
> +
> +uint32_t bus_res;   Minimum number of buses to reserve
> +
> +uint64_t io;IO space to reserve
> +uint64_t memNon-prefetchable memory to reserve
> +uint64_t mem_pref;  Prefetchable memory to reserve

(I apologize if I missed any concrete points from the past messages
regarding this structure.)

How is the firmware supposed to know whether the prefetchable MMIO
reservation should be made in 32-bit or 64-bit address space? If we
reserve prefetchable MMIO outside of the 32-bit address space, then
hot-plugging a device without 64-bit MMIO support could fail.

My earlier request, to distinguish "prefetchable_32" from
"prefetchable_64" (mutually exclusively), was so that firmware would
know whether to restrict the MMIO reservation to 32-bit address space.

This is based on an earlier email from Alex to me:

On 10/03/16 18:01, Alex Williamson wrote:
> I don't think there's such a thing as a 64-bit non-prefetchable
> aperture.  In fact, there are not separate 32 and 64 bit prefetchable
> apertures.  The apertures are:
>
> I/O base/limit - (default 16bit, may be 32bit)
> Memory base/limit - (32bit only, non-prefetchable)
> Prefetchable Memory base/limit - (default 32bit, may be 64bit)
>
> This is according to Table 3-2 in the PCI-to-PCI bridge spec rev 1.2.

I don't care much about the 16-bit vs. 32-bit IO difference (that's
entirely academic, and the Platform Init spec doesn't even provide a way
for OVMF to express such a difference). However, the optional
restriction to 32-bit matters for the prefetchable MMIO aperture.
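
As an illustration of the "default 32bit, may be 64bit" prefetchable aperture, the following sketch decodes a bridge's Prefetchable Memory Base register. The macro and function names are made up for this example; the encoding assumed here (low nibble 0x0 for 32-bit, 0x1 for 64-bit, bits 15:4 holding address bits 31:20) is the one I recall from the PCI-to-PCI bridge spec, so please double-check it against the spec before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low nibble of the Prefetchable Memory Base register: decode type. */
#define PREF_RANGE_TYPE_MASK 0x000fu
#define PREF_RANGE_TYPE_64   0x0001u

static bool pref_window_is_64bit(uint16_t pref_base_reg)
{
    return (pref_base_reg & PREF_RANGE_TYPE_MASK) == PREF_RANGE_TYPE_64;
}

/* Bits 15:4 of the 16-bit register hold address bits 31:20; the
 * upper 32 bits only count when the window is 64-bit capable. */
static uint64_t pref_window_base(uint16_t pref_base_reg, uint32_t upper32)
{
    uint64_t base = (uint64_t)(pref_base_reg & 0xfff0u) << 16;

    if (pref_window_is_64bit(pref_base_reg))
        base |= (uint64_t)upper32 << 32;
    return base;
}
```

A reservation placed above 4 GiB is only usable by devices whose prefetchable BARs are 64-bit, which is why the firmware needs to know which of the two was intended.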

Other than this, the patch looks good to me, and I'm ready to R-b.

Thanks!
Laszlo



Re: [SeaBIOS] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure

2017-08-04 Thread Laszlo Ersek
On 08/04/17 20:59, Alexander Bezzubikov wrote:
> 2017-08-01 20:28 GMT+03:00 Alexander Bezzubikov :
>> 2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum :
>>> On 31/07/2017 22:01, Alexander Bezzubikov wrote:

 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin :
>
> On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote:
>>
>> 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum :
>>>
>>> On 31/07/2017 17:00, Michael S. Tsirkin wrote:


 On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote:
>
>
> On PCI init PCI bridge devices may need some
> extra info about bus number to reserve, IO, memory and
> prefetchable memory limits. QEMU can provide this
> with special vendor-specific PCI capability.
>
> This capability is intended to be used only
> for Red Hat PCI bridges, i.e. QEMU cooperation.
>
> Signed-off-by: Aleksandr Bezzubikov 
> ---
>src/fw/dev-pci.h | 62
> 
>1 file changed, 62 insertions(+)
>create mode 100644 src/fw/dev-pci.h
>
> diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
> new file mode 100644
> index 000..fbd49ed
> --- /dev/null
> +++ b/src/fw/dev-pci.h
> @@ -0,0 +1,62 @@
> +#ifndef _PCI_CAP_H
> +#define _PCI_CAP_H
> +
> +#include "types.h"
> +
> +/*
> +
> +QEMU-specific vendor(Red Hat)-specific capability.
> +It's intended to provide some hints for firmware to init PCI
> devices.
> +
> +Its is shown below:
> +
> +Header:
> +
> +u8 id;   Standard PCI Capability Header field
> +u8 next; Standard PCI Capability Header field
> +u8 len;  Standard PCI Capability Header field
> +u8 type; Red Hat vendor-specific capability type:
> +   now only REDHAT_QEMU_CAP 1 exists
> +Data:
> +
> +u16 non_prefetchable_16; non-prefetchable memory limit
> +
> +u8 bus_res;  minimum bus number to reserve;
> + this is necessary for PCI Express Root Ports
> + to support PCIE-to-PCI bridge hotplug
> +
> +u8 io_8; IO limit in case of 8-bit limit value
> +u32 io_32;   IO limit in case of 16-bit limit value
> + io_8 and io_16 are mutually exclusive, in other words,
> + they can't be non-zero simultaneously
> +
> +u32 prefetchable_32; non-prefetchable memory limit
> + in case of 32-bit limit value
> +u64 prefetchable_64; non-prefetchable memory limit
> + in case of 64-bit limit value
> + prefetachable_32 and prefetchable_64
> are
> + mutually exclusive, in other words,
> + they can't be non-zero simultaneously
> +If any field in Data section is 0,
> +it means that such kind of reservation
> +is not needed.
>>
>>
>> I really don't like this 'mutually exclusive' fields approach because
>> IMHO it increases confusion level when understanding this capability
>> structure.
>> But - if we came to consensus on that, then IO fields should be used
>> in the same way,
>> because as I understand, this 'mutual exclusivity' serves to distinguish
>> cases
>> when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT
>> registers.
>> And this is how both IO and PREFETCHABLE works, isn't it?
>
>
> I would just use simple 64 bit registers. PCI spec has an ugly format
> with fields spread all over the place but that is because of
> compatibility concerns. It makes no sense to spend cycles just
> to be similarly messy.


 Then I suggest to use exactly one field of a maximum possible size
 for each reserving object, and get rid of mutually exclusive fields.
 Then it can be something like that (order and names can be changed):
 u8 bus;
 u16 non_pref;
 u32 io;
 u64 pref;

>>>
>>> I think Michael suggested:
>>>
>>> u64 bus_res;
>>> u64 mem_res;
>>> u64 io_res;
>>> u64 mem_pref_res;
>>>
>>> OR:
>>> u32 bus_res;
>>> u32 mem_res;
>>> u32 io_res;
>>> u64 mem_pref_res;
>>>
>>>
>>> We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's
>>> requests.
>>
>> Let's dwell on the second option (with -1 as 'ignore' sign), if no new
>> objections.
>>
> 
> BTW, talking about limit values provided in the 

Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-02 Thread Laszlo Ersek
On 08/02/17 15:47, Michael S. Tsirkin wrote:
> On Wed, Aug 02, 2017 at 12:23:46AM +0200, Laszlo Ersek wrote:
>> On 08/01/17 23:39, Michael S. Tsirkin wrote:
>>> On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote:
>>>> 2017-08-01 23:31 GMT+03:00 Laszlo Ersek <ler...@redhat.com>:
>>>>> (Whenever my comments conflict with Michael's or Marcel's, I defer to 
>>>>> them.)
>>>>>
>>>>> On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
>>>>>> Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com>
>>>>>> ---
>>>>>>  docs/pcie.txt|  46 ++
>>>>>>  docs/pcie_pci_bridge.txt | 121 
>>>>>> +++
>>>>>>  2 files changed, 147 insertions(+), 20 deletions(-)
>>>>>>  create mode 100644 docs/pcie_pci_bridge.txt
>>>>>>
>>>>>> diff --git a/docs/pcie.txt b/docs/pcie.txt
>>>>>> index 5bada24..338b50e 100644
>>>>>> --- a/docs/pcie.txt
>>>>>> +++ b/docs/pcie.txt
>>>>>> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on 
>>>>>> the Root Complex:
>>>>>>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
>>>>>> Express
>>>>>>  hierarchies.
>>>>>>
>>>>>> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
>>>>>> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
>>>>>>  hierarchies.
>>>>>>
>>>>>>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root 
>>>>>> Buses
>>>>>
>>>>> When reviewing previous patches modifying / adding this file, I
>>>>> requested that we spell out "PCI Express" every single time. I'd like to
>>>>> see the same in this patch, if possible.
>>>>
>>>> OK, I didn't know it.
>>>>
>>>>>
>>>>>> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly 
>>>>>> on the Root Complex:
>>>>>> pcie.0 bus
>>>>>> 
>>>>>> 
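>>>>>>          |                |                    |                  |
>>>>>> -   -----------   ------------------   ------------------   --------------
>>>>>> -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
>>>>>> -   -----------   ------------------   ------------------   --------------
>>>>>> +   -----------   ------------------   -------------------   --------------
>>>>>> +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
>>>>>> +   -----------   ------------------   -------------------   --------------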
>>>>>>
>>>>>>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated 
>>>>>> Endpoint use:
>>>>>>-device [,bus=pcie.0]
>>>>>>  2.1.2 To expose a new PCI Express Root Bus use:
>>>>>>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
>>>>>> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
>>>>>> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges 
>>>>>> can be connected
>>>>>
>>>>> It would be nice if we could keep the flowing text wrapped to 80 chars.
>>>>>
>>>>> Also, here you add the "PCI Express-PCI" bridge to the list of allowed
>>>>> controllers (and you keep DMI-PCI as permitted), but above DMI was
>>>>> replaced. I think these should be made consistent -- we should make up
>>>>> our minds if we continue to recommend the DMI-PCI bridge or not. If not,
>>>>> then we should eradicate all traces of it. If we want to keep it at
>>>>> least for compatibility, then it should remain as fully documented as it
>>>>> is now.
>>>>
>>>> Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge
>>>> even for compatibility and may want to use a new PCIE-PCI bridge
>>>> everywhere (of course, except some cases when users are
>>>> sure they need exactly DMI-PCI bridge for some reason)
>>>
>>> Can dmi-pci support shpc? why doesn't it? For compatibility?
>>
>> I don't know why, but the fact that it doesn't is the reason libvirt
>> settled on auto-creating a dmi-pci bridge and a pci-pci bridge under
>> that for Q35. The reasoning was (IIRC Laine's words correctly) that the
>> dmi-pci bridge cannot receive hotplugged devices, while the pci-pci
>> bridge cannot be connected to the root complex. So both were needed.
>>
>> Thanks
>> Laszlo
> 
> OK. Is it true that dmi-pci + pci-pci under it will allow hotplug
> on Q35 if we just flip the bit in _OSC?

Marcel, what say you?... :)



Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Laszlo Ersek
On 08/01/17 23:39, Michael S. Tsirkin wrote:
> On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote:
>> 2017-08-01 23:31 GMT+03:00 Laszlo Ersek <ler...@redhat.com>:
>>> (Whenever my comments conflict with Michael's or Marcel's, I defer to them.)
>>>
>>> On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
>>>> Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com>
>>>> ---
>>>>  docs/pcie.txt|  46 ++
>>>>  docs/pcie_pci_bridge.txt | 121 
>>>> +++
>>>>  2 files changed, 147 insertions(+), 20 deletions(-)
>>>>  create mode 100644 docs/pcie_pci_bridge.txt
>>>>
>>>> diff --git a/docs/pcie.txt b/docs/pcie.txt
>>>> index 5bada24..338b50e 100644
>>>> --- a/docs/pcie.txt
>>>> +++ b/docs/pcie.txt
>>>> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on 
>>>> the Root Complex:
>>>>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
>>>> Express
>>>>  hierarchies.
>>>>
>>>> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
>>>> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
>>>>  hierarchies.
>>>>
>>>>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root 
>>>> Buses
>>>
>>> When reviewing previous patches modifying / adding this file, I
>>> requested that we spell out "PCI Express" every single time. I'd like to
>>> see the same in this patch, if possible.
>>
>> OK, I didn't know it.
>>
>>>
>>>> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on 
>>>> the Root Complex:
>>>> pcie.0 bus
>>>> 
>>>> 
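>>>>          |                |                    |                  |
>>>> -   -----------   ------------------   ------------------   --------------
>>>> -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
>>>> -   -----------   ------------------   ------------------   --------------
>>>> +   -----------   ------------------   -------------------   --------------
>>>> +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
>>>> +   -----------   ------------------   -------------------   --------------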
>>>>
>>>>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint 
>>>> use:
>>>>-device [,bus=pcie.0]
>>>>  2.1.2 To expose a new PCI Express Root Bus use:
>>>>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
>>>> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
>>>> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges 
>>>> can be connected
>>>
>>> It would be nice if we could keep the flowing text wrapped to 80 chars.
>>>
>>> Also, here you add the "PCI Express-PCI" bridge to the list of allowed
>>> controllers (and you keep DMI-PCI as permitted), but above DMI was
>>> replaced. I think these should be made consistent -- we should make up
>>> our minds if we continue to recommend the DMI-PCI bridge or not. If not,
>>> then we should eradicate all traces of it. If we want to keep it at
>>> least for compatibility, then it should remain as fully documented as it
>>> is now.
>>
>> Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge
>> even for compatibility and may want to use a new PCIE-PCI bridge
>> everywhere (of course, except some cases when users are
>> sure they need exactly DMI-PCI bridge for some reason)
> 
> Can dmi-pci support shpc? why doesn't it? For compatibility?

I don't know why, but the fact that it doesn't is the reason libvirt
settled on auto-creating a dmi-pci bridge and a pci-pci bridge under
that for Q35. The reasoning was (IIRC Laine's words correctly) that the
dmi-pci bridge cannot receive hotplugged devices, while the pci-pci
bridge cannot be connected to the root complex. So both were needed.

Thanks
Laszlo



Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Laszlo Ersek
(Whenever my comments conflict with Michael's or Marcel's, I defer to them.)

On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
> Signed-off-by: Aleksandr Bezzubikov 
> ---
>  docs/pcie.txt|  46 ++
>  docs/pcie_pci_bridge.txt | 121 
> +++
>  2 files changed, 147 insertions(+), 20 deletions(-)
>  create mode 100644 docs/pcie_pci_bridge.txt
> 
> diff --git a/docs/pcie.txt b/docs/pcie.txt
> index 5bada24..338b50e 100644
> --- a/docs/pcie.txt
> +++ b/docs/pcie.txt
> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the 
> Root Complex:
>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
> Express
>  hierarchies.
>  
> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
>  hierarchies.
>  
>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses

When reviewing previous patches modifying / adding this file, I
requested that we spell out "PCI Express" every single time. I'd like to
see the same in this patch, if possible.

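> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex:
>                                pcie.0 bus
>       ----------------------------------------------------------------------------
>            |                |                    |                  |
> -     -----------   ------------------   ------------------   --------------
> -     | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
> -     -----------   ------------------   ------------------   --------------
> +     -----------   ------------------   -------------------   --------------
> +     | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
> +     -----------   ------------------   -------------------   --------------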
>  
>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use:
>-device [,bus=pcie.0]
>  2.1.2 To expose a new PCI Express Root Bus use:
>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can be connected

It would be nice if we could keep the flowing text wrapped to 80 chars.

Also, here you add the "PCI Express-PCI" bridge to the list of allowed
controllers (and you keep DMI-PCI as permitted), but above DMI was
replaced. I think these should be made consistent -- we should make up
our minds if we continue to recommend the DMI-PCI bridge or not. If not,
then we should eradicate all traces of it. If we want to keep it at
least for compatibility, then it should remain as fully documented as it
is now.

>to the pcie.1 bus:
>-device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z]  \
> -  -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
> +  -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1
>  
>  
>  2.2 PCI Express only hierarchy
> @@ -130,21 +130,25 @@ Notes:
>  Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
>  but, as mentioned in section 5, doing so means the legacy PCI
>  device in question will be incapable of hot-unplugging.
> -Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination
> +Besides that use PCIE-PCI Bridges (pcie-pci-bridge) in combination
>  with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
> +Instead of the PCIE-PCI Bridge DMI-PCI one can be used,
> +but it doens't support hot-plug, is not crossplatform and since that

s/doens't/doesn't/

s/since that/therefore it/

> +is obsolete and deprecated. Use the PCIE-PCI Bridge if you're not 
> +absolutely sure you need the DMI-PCI Bridge.
>  
> -Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
> +Prefer flat hierarchies. For most scenarios a single PCIE-PCI Bridge
>  (having 32 slots) and several PCI-PCI Bridges attached to it
>  (each supporting also 32 slots) will support hundreds of legacy devices.
> -The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge
> +The recommendation is to populate one PCI-PCI Bridge under the PCIE-PCI Bridge
>  until is full and then plug a new PCI-PCI Bridge...
>  
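>                   pcie.0 bus
>       ----------------------------------------------
>            |                            |
> -     -----------               ------------------
> -     | PCI Dev |               | DMI-PCI BRIDGE |
> -     -----------               ------------------
> +     -----------               -------------------
> +     | PCI Dev |               | PCIE-PCI BRIDGE |
> +     -----------               -------------------
>                                   |            |
>                      ------------------    ------------------
>                      | PCI-PCI Bridge |    |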

Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware

2017-07-31 Thread Laszlo Ersek
On 07/31/17 20:55, Michael S. Tsirkin wrote:
> On Mon, Jul 31, 2017 at 08:16:49PM +0200, Laszlo Ersek wrote:
>> OK. If the proposed solution with the r/o mem base/limit registers is
>> rooted in the spec (and I think it indeed must be; apparently this would
>> be the same as what we're already planning for IO disablement), then
>> that's a strong argument for PciBusDxe to accommodate this probing in
>> the platform hook.
>>
>> Thanks
>> Laszlo
> 
> Do you mean making base/limit read-only?

Yes, I do. (Perhaps writing "r/o" was too terse.)

Thanks
Laszlo




Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware

2017-07-31 Thread Laszlo Ersek
On 07/29/17 01:15, Michael S. Tsirkin wrote:
> On Thu, Jul 27, 2017 at 03:58:58PM +0200, Laszlo Ersek wrote:
>> On 07/27/17 11:39, Marcel Apfelbaum wrote:
>>> On 27/07/2017 2:28, Michael S. Tsirkin wrote:
>>>> On Thu, Jul 27, 2017 at 12:54:07AM +0300, Alexander Bezzubikov wrote:
>>>>> 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>:
>>>>>> On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote:
>>>>>>> On PCI init PCI bridges may need some
>>>>>>> extra info about bus number to reserve, IO, memory and
>>>>>>> prefetchable memory limits. QEMU can provide this
>>>>>>> with special
>>>>>>
>>>>>> with a special
>>>>>>
>>>>>>> vendor-specific PCI capability.
>>>>>>>
>>>>>>> Sizes of limits match ones from
>>>>>>> PCI Type 1 Configuration Space Header,
>>>>>>> number of buses to reserve occupies only 1 byte
>>>>>>> since it is the size of Subordinate Bus Number register.
>>>>>>>
>>>>>>> Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com>
>>>>>>> ---
>>>>>>>   hw/pci/pci_bridge.c | 27 +++
>>>>>>>   include/hw/pci/pci_bridge.h | 18 ++
>>>>>>>   2 files changed, 45 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>>>>>>> index 720119b..8ec6c2c 100644
>>>>>>> --- a/hw/pci/pci_bridge.c
>>>>>>> +++ b/hw/pci/pci_bridge.c
>>>>>>> @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const
>>>>>>> char* bus_name,
>>>>>>>   br->bus_name = bus_name;
>>>>>>>   }
>>>>>>>
>>>>>>> +
>>>>>>> +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset,
>>>>>>
>>>>>> help? should be qemu_cap_init?
>>>>>>
>>>>>>> +  uint8_t bus_reserve, uint32_t io_limit,
>>>>>>> +  uint16_t mem_limit, uint64_t
>>>>>>> pref_limit,
>>>>>>> +  Error **errp)
>>>>>>> +{
>>>>>>> +size_t cap_len = sizeof(PCIBridgeQemuCap);
>>>>>>> +PCIBridgeQemuCap cap;
>>>>>>
>>>>>> This leaks info to guest. You want to init all fields here:
>>>>>>
>>>>>> cap = {
>>>>>>   .len = 
>>>>>> };
>>>>>
>>>>> I surely can do this for len field, but as Laszlo proposed
>>>>> we can use mutually exclusive fields,
>>>>> e.g. pref_32 and pref_64, the only way I have left
>>>>> is to use ternary operator (if we surely need this
>>>>> big initializer). Keeping some if's would look better,
>>>>> I think.
>>>>>
>>>>>>
>>>>>>> +
>>>>>>> +cap.len = cap_len;
>>>>>>> +cap.bus_res = bus_reserve;
>>>>>>> +cap.io_lim = io_limit & 0xFF;
>>>>>>> +cap.io_lim_upper = io_limit >> 8 & 0xFFFF;
>>>>>>> +cap.mem_lim = mem_limit;
>>>>>>> +cap.pref_lim = pref_limit & 0xFFFF;
>>>>>>> +cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF;
>>>>>>
>>>>>> Please use pci_set_word etc or cpu_to_leXX.
>>>>>>
>>>>>
>>>>> Since now we've decided to avoid fields separation into  +
>>>>> ,
>>>>> this bitmask along with pci_set_word are no longer needed.
>>>>>
>>>>>> I think it's easiest to replace struct with a set of macros then
>>>>>> pci_set_word does the work for you.
>>>>>>
>>>>>
>>>>> I don't really want to use macros here because structure
>>>>> show us the whole capability layout and this can
>>>>> decrease documenting efforts. More than that,
>>>>> memcpy usage is very convenient here, and I wouldn't like
>>>>> to lose it.
>>>>>

Re: [SeaBIOS] [qemu PATCH for 2.10] i386: acpi: provide an XSDT instead of an RSDT

2017-07-28 Thread Laszlo Ersek
On 07/27/17 22:40, Kevin O'Connor wrote:
> On Wed, Jul 26, 2017 at 11:31:36AM +0200, Paolo Bonzini wrote:
>> The tables that QEMU provides are not ACPI 1.0 compatible since commit
>> 77af8a2b95 ("hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve
>> guest OS support.", 2017-05-03).  This is visible with Windows 2000,
>> which refuses to parse the rev3 FADT and fails to boot.
>>
>> The recommended solution in this case is to build two FADTs, v1 being
>> pointed to by the RSDT and v3 by the XSDT.  However, we leave this task
>> to the firmware.  This patch simply switches the RSDT to the XSDT, which
>> is valid for all ACPI 2.0-friendly operating systems and also leaves
>> SeaBIOS the freedom to build an RSDT that points to the compatibility
>> FADT.
> 
> Another possible solution to this issue would be for QEMU to instruct
> the firmware to build both rev1 and rev3 FADTs, but be clear which
> links are for legacy purposes only.  This could be done with a new
> ADD_LEGACY_POINTER linker loader command.  Existing firmwares should
> ignore the new ADD_LEGACY_POINTER command and new versions of SeaBIOS
> could be extended to honor it.

I confirm OVMF ignores (skips) unknown commands.

But, so I can understand better, can you please explain what the effect
of these patches would be? IIUC, some pointer updates would not be
performed in OVMF (and old SeaBIOS) that would take place in new
SeaBIOS. What pointers are these exactly (where do they live and what do
they point at)?

- RSDT[0] would point to FADTv1, RSDT[n] (n>=1) would point to the rest
of the tables, and OVMF wouldn't set (or follow) any of these pointers,
- XSDT[0] would point to FADTv3, XSDT[n] (n>=1) would point to the rest
of the tables, and both SeaBIOS and OVMF would see these pointers,
- RSDP.RSDT would point to the RSDT, and OVMF would not see (or follow)
this pointer,
- RSDP.XSDT would point to the XSDT, and both SeaBIOS and OVMF would see
this pointer.

Is this a correct interpretation? If so, I think it would work for OVMF.

First, OVMF would not patch RSDP.RSDT, nor RSDT[n] (n>=0).

Second, in the 2nd phase processing of pointers, OVMF would not follow
the RSDP.RSDT link for identifying the pointed-to RSDT for installation
(which is not really relevant, since OVMF skips the installation of the
RSDT anyway, when it recognizes it).

Third, none of the RSDT[n] links would be followed for identifying other
tables for installation; meaning neither FADTv1 nor the other (commonly
used) tables would be identified / installed. XSDT would work like now,
and a FADTv3 plus the rest of the tables would be installed from that go.

Fourth, what about the links within the FADTv1 (to the FACS and DSDT)?
AFAICS in build_fadt1(), those pointers continue to be patched with the
non-legacy ADD_POINTER command. This is not necessarily a problem if
FADTv1.FACS and FADTv3.FACS point to the exact same address (similarly
if FADTv1.DSDT and FADTv3.DSDT point to the exact same address), because
OVMF already has a kind of memoization against installing the exact same
pointed-to table twice (e.g., when FADTv3.DSDT and FADTv3.X_DSDT refer
to the same address). Still, for completeness, maybe the FADTv1.FACS and
FADTv1.DSDT pointers should also be patched with the new legacy
ADD_POINTER command, in build_fadt1().

Basically, once you split a pointer between the RSDT "tree" and the XSDT
"tree", all the pointers to ACPI data tables in that table-subtree
(recursively) should be patched accordingly (all legacy or all
non-legacy). Pointers to other things than ACPI data tables need no
special handling (as their identification / probing is already
suppressed with suitable zero prefixes).
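
To make the "existing firmware skips it, new SeaBIOS honors it" behavior
concrete, here is a minimal C sketch of a command dispatcher. It is not the
SeaBIOS or OVMF code: all demo_* names are made up for illustration, the
values 0x2 through 0x5 mirror the prototype patch quoted below, and 0x1 for
ALLOCATE is my assumption about the existing interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command values: 0x2..0x5 follow the prototype patch in this thread;
 * 0x1 for ALLOCATE is an assumption about the existing interface. */
enum demo_cmd {
    DEMO_CMD_ALLOCATE           = 0x1,
    DEMO_CMD_ADD_POINTER        = 0x2,
    DEMO_CMD_ADD_CHECKSUM       = 0x3,
    DEMO_CMD_WRITE_POINTER      = 0x4,
    DEMO_CMD_ADD_LEGACY_POINTER = 0x5  /* proposed only, hypothetical */
};

/* Tally of what a firmware did with a command script. */
struct demo_stats {
    unsigned handled;  /* commands the firmware acted on */
    unsigned skipped;  /* commands it did not recognize */
    unsigned legacy;   /* legacy pointers it patched */
};

/* Walk a command script the way a linker/loader client might.
 * knows_legacy == 0 models existing firmware, 1 models a patched SeaBIOS. */
static struct demo_stats demo_run_script(const uint32_t *cmds, size_t n,
                                         int knows_legacy)
{
    struct demo_stats st = { 0 };

    assert(cmds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (cmds[i]) {
        case DEMO_CMD_ALLOCATE:
        case DEMO_CMD_ADD_POINTER:
        case DEMO_CMD_ADD_CHECKSUM:
        case DEMO_CMD_WRITE_POINTER:
            st.handled++;
            break;
        case DEMO_CMD_ADD_LEGACY_POINTER:
            if (knows_legacy) {
                st.handled++;
                st.legacy++;
            } else {
                st.skipped++;   /* treated like any unknown command */
            }
            break;
        default:
            st.skipped++;       /* unknown command: ignore and continue */
            break;
        }
    }
    return st;
}
```

With a script that contains two ADD_LEGACY_POINTER entries, a firmware that
does not know the command skips exactly those two and processes the rest
unchanged, which is the compatibility property the proposal relies on.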

Thanks!
Laszlo

> 
> I proto-typed it (but haven't done significant testing).  Admittedly,
> it is a pretty ugly hack.
> 
> -Kevin
> 
> 
> == SeaBIOS patch ===
> 
> --- a/src/fw/romfile_loader.c
> +++ b/src/fw/romfile_loader.c
> @@ -234,6 +234,7 @@ int romfile_loader_execute(const char *name)
>  case ROMFILE_LOADER_COMMAND_ALLOCATE:
>  romfile_loader_allocate(entry, files);
>  break;
> +case ROMFILE_LOADER_COMMAND_ADD_LEGACY_POINTER:
>  case ROMFILE_LOADER_COMMAND_ADD_POINTER:
>  romfile_loader_add_pointer(entry, files);
>  break;
> diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
> index fcd4ab2..4e266e8 100644
> --- a/src/fw/romfile_loader.h
> +++ b/src/fw/romfile_loader.h
> @@ -77,6 +77,7 @@ enum {
>  ROMFILE_LOADER_COMMAND_ADD_POINTER= 0x2,
>  ROMFILE_LOADER_COMMAND_ADD_CHECKSUM   = 0x3,
>  ROMFILE_LOADER_COMMAND_WRITE_POINTER  = 0x4,
> +ROMFILE_LOADER_COMMAND_ADD_LEGACY_POINTER = 0x5,
>  };
>  
>  enum {
> 
> 
> == QEMU patch ===
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 36a6cc4..eed1a2c 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ 

Re: [SeaBIOS] [Qemu-devel] Commit 77af8a2b95b79699de650965d5228772743efe84 breaks Windows 2000 support

2017-07-27 Thread Laszlo Ersek
On 07/27/17 16:59, Kevin O'Connor wrote:
> On Wed, Jul 26, 2017 at 04:21:23PM -0400, Paolo Bonzini wrote:

>>> C - We'd be introducing "shared ownership" of the acpi tables.  Some
>>> of the tables would be produced by QEMU and some of them by
>>> SeaBIOS.  Explaining when and why to future developers would be
>>> a challenge.
>>
>> The advantage is that the same shared ownership is already present in
>> OVMF.  The RSDP/RSDT/XSDT are entirely created by the firmware in
>> OVMF. (The rev1 FADT isn't but that's just missing code; the table
>> manager in general would be ready for that).  In any case this
>> doesn't seem like something that cannot be solved by code comments.
>
> I'd argue that the shared ownership in the EDK2 code was a poor design
> choice.

The reason we can't just exclude the reference implementation of
EFI_ACPI_TABLE_PROTOCOL from OVMF whole-sale, and reimplement the ACPI
linker/loader from scratch, is that some other (independent) edk2
modules will want to use EFI_ACPI_TABLE_PROTOCOL for installing their
own (one-off) tables, such as IBFT, BGRT and so on, *in addition to*
QEMU's. Given that these ACPI tables mostly do *not* describe hardware
(but software features and/or configuration), it's hard to claim that
they should also be generated by QEMU.

Therefore the dual origin for ACPI tables looks unavoidable in UEFI,
it's just that there should be a lot more flexible "connect" from QEMU's
linker/loader to the installed ACPI tables than EFI_ACPI_TABLE_PROTOCOL.

Basically this is a fight over ownership. Each of QEMU's ACPI
linker/loader and EFI_ACPI_TABLE_PROTOCOL thinks that it fully owns the
root of the table tree. :(

> Case in point - we're only having this conversation because of its
> limitations - SeaBIOS is capable of deploying the acpi tables in the
> proposed layout without any code changes today.

Yes.

But let's not forget that SeaBIOS is capable of delegating the full
low-level construction of the table tree to QEMU because no independent
/ 3rd party BIOS-level code wants to install its own tables (again,
IBFT, BGRT, ...) This is not true of UEFI, where the guiding principle
of the standardized interfaces is to enable cooperation between
independent, binary-only modules. (So, for example, if you shove a new
PCI add-on card in your motherboard, the UEFI driver in that oprom could
install a separate ACPI table, by looking up and calling
EFI_ACPI_TABLE_PROTOCOL.)

> I'm not against changing SeaBIOS, but it's a priority for me that we
> continue to make it possible to deploy future ACPI table changes (no
> matter how quirky) in a way that does not require future SeaBIOS
> releases.

It's a good goal.

I apologize for forgetting the context, but what exactly was the
argument against:

- splitting modern ACPI generation from ancient ACPI generation (so that
we can assign separate maintainers to ancient vs. modern),

- restricting ancient ACPI generation to old machine types?

Thanks,
Laszlo



Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware

2017-07-27 Thread Laszlo Ersek
On 07/27/17 11:39, Marcel Apfelbaum wrote:
> On 27/07/2017 2:28, Michael S. Tsirkin wrote:
>> On Thu, Jul 27, 2017 at 12:54:07AM +0300, Alexander Bezzubikov wrote:
>>> 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin :
>>>> On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote:
>>>>> On PCI init PCI bridges may need some
>>>>> extra info about bus number to reserve, IO, memory and
>>>>> prefetchable memory limits. QEMU can provide this
>>>>> with special
>>>>
>>>> with a special
>>>>
>>>>> vendor-specific PCI capability.
>>>>>
>>>>> Sizes of limits match ones from
>>>>> PCI Type 1 Configuration Space Header,
>>>>> number of buses to reserve occupies only 1 byte
>>>>> since it is the size of Subordinate Bus Number register.
>>>>>
>>>>> Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com>
>>>>> ---
>>>>>   hw/pci/pci_bridge.c | 27 +++
>>>>>   include/hw/pci/pci_bridge.h | 18 ++
>>>>>   2 files changed, 45 insertions(+)
>>>>>
>>>>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>>>>> index 720119b..8ec6c2c 100644
>>>>> --- a/hw/pci/pci_bridge.c
>>>>> +++ b/hw/pci/pci_bridge.c
>>>>> @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const
>>>>> char* bus_name,
>>>>>   br->bus_name = bus_name;
>>>>>   }
>>>>>
>>>>> +
>>>>> +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset,
>>>>
>>>> help? should be qemu_cap_init?
>>>>
>>>>> +  uint8_t bus_reserve, uint32_t io_limit,
>>>>> +  uint16_t mem_limit, uint64_t
>>>>> pref_limit,
>>>>> +  Error **errp)
>>>>> +{
>>>>> +size_t cap_len = sizeof(PCIBridgeQemuCap);
>>>>> +PCIBridgeQemuCap cap;
>>>>
>>>> This leaks info to guest. You want to init all fields here:
>>>>
>>>> cap = {
>>>>   .len = 
>>>> };
>>>
>>> I surely can do this for len field, but as Laszlo proposed
>>> we can use mutually exclusive fields,
>>> e.g. pref_32 and pref_64, the only way I have left
>>> is to use ternary operator (if we surely need this
>>> big initializer). Keeping some if's would look better,
>>> I think.
>>>
>>>>
>>>>> +
>>>>> +cap.len = cap_len;
>>>>> +cap.bus_res = bus_reserve;
>>>>> +cap.io_lim = io_limit & 0xFF;
>>>>> +cap.io_lim_upper = io_limit >> 8 & 0xFFFF;
>>>>> +cap.mem_lim = mem_limit;
>>>>> +cap.pref_lim = pref_limit & 0xFFFF;
>>>>> +cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF;
>>>>
>>>> Please use pci_set_word etc or cpu_to_leXX.
>>>>
>>>
>>> Since now we've decided to avoid fields separation into  +
>>> ,
>>> this bitmask along with pci_set_word are no longer needed.
>>>
>>>> I think it's easiest to replace struct with a set of macros then
>>>> pci_set_word does the work for you.
>>>>
>>>
>>> I don't really want to use macros here because structure
>>> show us the whole capability layout and this can
>>> decrease documenting efforts. More than that,
>>> memcpy usage is very convenient here, and I wouldn't like
>>> to lose it.
>>>
>>>>
>>>>> +
>>>>> +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,
>>>>> +cap_offset, cap_len, errp);
>>>>> +if (offset < 0) {
>>>>> +return offset;
>>>>> +}
>>>>> +
>>>>> +memcpy(dev->config + offset + 2, (char *)&cap + 2, cap_len - 2);
>>>>
>>>> +2 is yacky. See how virtio does it:
>>>>
>>>>  memcpy(dev->config + offset + PCI_CAP_FLAGS, &cap->cap_len,
>>>> cap->cap_len - PCI_CAP_FLAGS);
>>>>
>>>>
>>>
>>> OK.
>>>
>>>>> +return 0;
>>>>> +}
>>>>> +
>>>>>   static const TypeInfo pci_bridge_type_info = {
>>>>>   .name = TYPE_PCI_BRIDGE,
>>>>>   .parent = TYPE_PCI_DEVICE,
>>>>> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
>>>>> index ff7cbaa..c9f642c 100644
>>>>> --- a/include/hw/pci/pci_bridge.h
>>>>> +++ b/include/hw/pci/pci_bridge.h
>>>>> @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const
>>>>> char* bus_name,
>>>>>   #define  PCI_BRIDGE_CTL_DISCARD_STATUS   0x400   /* Discard
>>>>> timer status */
>>>>>   #define  PCI_BRIDGE_CTL_DISCARD_SERR 0x800   /* Discard timer
>>>>> SERR# enable */
>>>>>
>>>>> +typedef struct PCIBridgeQemuCap {
>>>>> +uint8_t id; /* Standard PCI capability header field */
>>>>> +uint8_t next;   /* Standard PCI capability header field */
>>>>> +uint8_t len;/* Standard PCI vendor-specific capability
>>>>> header field */
>>>>> +uint8_t bus_res;
>>>>> +uint32_t pref_lim_upper;
>>>>
>>>> Big endian? Ugh.
>>>>
>>>
>>> Agreed, and this's gonna to disappear with
>>> the new layout.
>>>
>>>>> +uint16_t pref_lim;
>>>>> +uint16_t mem_lim;
>>>>
>>>> I'd say we need 64 bit for memory.
>>>>
>>>
>>> Why? Non-prefetchable MEMORY_LIMIT register is 16 bits long.
>>
>> Hmm ok, but e.g. for io there are bridges that have extra registers
>> to specify non-standard non-aligned registers.
>>
>>>>> +uint16_t io_lim_upper;
>>>>> +uint8_t 
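
As an aside on the byte-order remarks quoted above ("Please use pci_set_word
etc or cpu_to_leXX"): the sketch below shows, in plain C, what writing a
capability field by field in little-endian order looks like, as opposed to
memcpy-ing a host-endian struct. Everything named demo_* is hypothetical, and
the offsets are placeholders for illustration only; they are not the layout
that was being discussed or later adopted. In QEMU proper the existing
helpers such as pci_set_word would take the place of the demo_put_* functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte offsets inside a vendor-specific capability.
 * Placeholders only, not the layout from the patch. */
enum {
    DEMO_CAP_ID       = 0,   /* standard capability header: ID */
    DEMO_CAP_NEXT     = 1,   /* standard capability header: next pointer */
    DEMO_CAP_LEN      = 2,   /* vendor-specific capability length */
    DEMO_CAP_BUS_RES  = 3,   /* number of buses to reserve */
    DEMO_CAP_IO_LIM   = 4,   /* 32-bit I/O limit */
    DEMO_CAP_PREF_LIM = 8,   /* 64-bit prefetchable memory limit */
    DEMO_CAP_SIZE     = 16
};

/* Store a 32-bit value least-significant byte first, on any host. */
static void demo_put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store a 64-bit value least-significant byte first, on any host. */
static void demo_put_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Fill the capability body directly in a config-space byte array. */
static void demo_cap_write(uint8_t *config, size_t offset, uint8_t bus_res,
                           uint32_t io_lim, uint64_t pref_lim)
{
    uint8_t *cap;

    assert(config != NULL);
    cap = config + offset;

    /* The ID and next pointer are left untouched: whoever linked the
     * capability into the list owns those two bytes. */
    cap[DEMO_CAP_LEN] = DEMO_CAP_SIZE;
    cap[DEMO_CAP_BUS_RES] = bus_res;
    demo_put_le32(cap + DEMO_CAP_IO_LIM, io_lim);
    demo_put_le64(cap + DEMO_CAP_PREF_LIM, pref_lim);
}
```

Because every multi-byte field goes through an explicit little-endian store,
the bytes seen by the guest do not depend on host endianness, and no
uninitialized struct bytes can reach config space.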

Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware

2017-07-26 Thread Laszlo Ersek
On 07/26/17 23:54, Alexander Bezzubikov wrote:
> 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin :
>> On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote:

>>> +PCIBridgeQemuCap cap;
>>
>> This leaks info to guest. You want to init all fields here:
>>
>> cap = {
>>  .len = 
>> };
> 
> I surely can do this for len field, but as Laszlo proposed
> we can use mutually exclusive fields,
> e.g. pref_32 and pref_64, the only way I have left
> is to use ternary operator (if we surely need this
> big initializer). Keeping some if's would look better,
> I think.

I think it's fine to use "if"s in order to set up the structure
partially / gradually, but then please clear the structure up-front:


  PCIBridgeQemuCap cap = { 0 };

(In general "{ 0 }" is the best initializer ever, because it can
zero-init a variable of *any* type at all. Gcc might complain about the
inexact depth of {} nesting of course, but it's nonetheless valid C.)

Or else add a memset-to-zero.

Or else, do just

  PCIBridgeQemuCap cap = { .len = ... };

which will zero-fill every other field. ("[...] all subobjects that are
not initialized explicitly shall be initialized implicitly the same as
objects that have static storage duration").
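
The three options can be put side by side in a small, self-contained sketch.
DemoBridgeCap and the demo_* functions are hypothetical stand-ins that I made
up for illustration; they are not the structure from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the capability structure under review;
 * the field names are illustrative, not the final layout. */
typedef struct DemoBridgeCap {
    uint8_t  id;
    uint8_t  next;
    uint8_t  len;
    uint8_t  bus_res;
    uint32_t io_lim;
    uint64_t pref_lim;
} DemoBridgeCap;

/* Option 1: "{ 0 }" zero-initializes every member, then "if"s fill in
 * only the fields that apply. */
static DemoBridgeCap demo_cap_zero_then_fill(uint8_t bus_res, int have_io,
                                             uint32_t io_lim)
{
    DemoBridgeCap cap = { 0 };

    assert(sizeof cap <= UINT8_MAX);   /* len is a single byte */
    cap.len = sizeof cap;
    cap.bus_res = bus_res;
    if (have_io) {
        cap.io_lim = io_lim;
    }
    return cap;
}

/* Option 2: memset to zero up front, then assign. */
static DemoBridgeCap demo_cap_memset(uint8_t bus_res)
{
    DemoBridgeCap cap;

    memset(&cap, 0, sizeof cap);
    cap.len = sizeof cap;
    cap.bus_res = bus_res;
    return cap;
}

/* Option 3: a designated initializer; members that are not named are
 * initialized implicitly, i.e. to zero. */
static DemoBridgeCap demo_cap_designated(uint8_t bus_res)
{
    DemoBridgeCap cap = { .len = sizeof cap, .bus_res = bus_res };

    return cap;
}
```

In all three variants the members that are never assigned read back as zero,
so no stale stack contents end up in the structure.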

Thanks
Laszlo



Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init

2017-07-26 Thread Laszlo Ersek
On 07/26/17 18:22, Marcel Apfelbaum wrote:
> On 26/07/2017 18:20, Laszlo Ersek wrote:

[snip]

>> However, what does the hot-pluggability of the PCIe-PCI bridge buy us?
>> In other words, what does it buy us when we do not add the PCIe-PCI
>> bridge immediately at guest startup, as an integrated device?
>>
>> Why is it a problem to "commit" in advance? I understand that we might
>> not like the DMI-PCI bridge (due to it being legacy), but what speaks
>> against cold-plugging the PCIe-PCI bridge either as an integrated device
>> in pcie.0 (assuming that is permitted), or cold-plugging the PCIe-PCI
>> bridge in a similarly cold-plugged PCIe root port?
>>
> 
> We want to keep Q35 clean, and for most cases we don't want any
> legacy PCI stuff if not especially required.
> 
>> I mean, in the cold-plugged case, you use up two bus numbers at the
>> most, one for the root port, and another for the PCIe-PCI bridge. In the
>> hot-plugged case, you have to start with the cold-plugged root port just
>> the same (so that you can communicate the bus number reservation *at
>> all*), and then reserve (= use up in advance) the bus number, the IO
>> space, and the MMIO space(s). I don't see the difference; hot-plugging
>> the PCIe-PCI bridge (= not committing in advance) doesn't seem to save
>> any resources.
>>
> 
> It is not about resources, more about usage model.
> 
>> I guess I would see a difference if we reserved more than one bus number
>> in the hotplug case, namely in order to support recursive hotplug under
>> the PCIe-PCI bridge. But, you confirmed that we intend to keep the flat
>> hierarchy (ie the exercise is only for enabling legacy PCI endpoints,
>> not for recursive hotplug).  The PCIe-PCI bridge isn't a device that
>> does anything at all on its own, so why not just coldplug it? Its
>> resources have to be reserved in advance anyway.
>>
> 
> Even if we prefer flat hierarchies, we should allow a sane nested
> bridges configuration, so we will sometimes reserve more than one.
> 
>> So, thus far I would say "just cold-plug the PCIe-PCI bridge at startup,
>> possibly even make it an integrated device, and then you don't need to
>> reserve bus numbers (and other apertures)".
>>
>> Where am I wrong?
>>
> 
> Nothing wrong, I am just looking for feature parity Q35 vs PC.
> Users may want to continue using [nested] PCI bridges, and
> we want the Q35 machine to be used by more users in order
> to make it reliable faster, while keeping it clean by default.
> 
> We had a discussion on this matter at last year's KVM Forum
> and the hot-pluggable PCIe-PCI bridge was the general consensus.

OK. I don't want to question or go back on that consensus now; I'd just
like to point out that all that you describe (nested bridges, and
enabling legacy PCI with PCIe-PCI bridges, *on demand*) is still
possible with cold-plugging.

I.e., the default setup of Q35 does not need to include legacy PCI
bridges. It's just that the pre-launch configuration effort for a Q35
user to *reserve* resources for legacy PCI is the exact same as the
pre-launch configuration effort to *actually cold-plug* the bridge.

[snip]

>>>> The PI spec says,
>>>>
>>>>> [...] For all the root HPCs and the nonroot HPCs, call
>>>>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() to obtain the
>>>>> amount of overallocation and add that amount to the requests from the
>>>>> physical devices. Reprogram the bus numbers by taking into account the
>>>>> bus resource padding information. [...]
>>>>
>>>> However, according to my interpretation of the source code, PciBusDxe
>>>> does not consider bus number padding for non-root HPCs (which are "all"
>>>> HPCs on QEMU).
>>>>
>>>
>>> Theoretically speaking, it is possible to change the  behavior, right?
>>
>> Not just theoretically; in the past I have changed PciBusDxe -- it
>> wouldn't identify QEMU's hotplug controllers (root port, downstream port
>> etc) appropriately, and I managed to get some patches in. It's just that
>> the less we understand the current code and the more intrusive/extensive
>> the change is, the harder it is to sell the *idea*. PciBusDxe is
>> platform-independent and shipped on many a physical system too.
>>
> 
> Understood, but from your explanation it sounds like the existing
> callback sites (hooks) are enough.

That's the problem: they don't appear to, if you consider bus number
reservations. The existing callback sites seem fine regarding IO and
MMIO, but the only callback

Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init

2017-07-26 Thread Laszlo Ersek
On 07/26/17 08:48, Marcel Apfelbaum wrote:
> On 25/07/2017 18:46, Laszlo Ersek wrote:

[snip]

>> (2) Bus range reservation, and hotplugging bridges. What's the
>> motivation? Our recommendations in "docs/pcie.txt" suggest flat
>> hierarchies.
>>
> 
> It remains flat. You have one single PCIE-PCI bridge plugged
> into a PCIe Root Port, no deep nesting.
> 
> The reason is to be able to support legacy PCI devices without
> "committing" with a DMI-PCI bridge in advance. (Keep Q35 without
> legacy hw.)
> 
> The only way to support PCI devices in Q35 is to have them cold-plugged
> into the pcie.0 bus, which is good, but not enough for expanding the
> Q35 usability in order to make it eventually the default
> QEMU x86 machine (I know this is another discussion and I am in
> minority, at least for now).
> 
> The plan is:
> Start Q35 machine as usual, but one of the PCIe Root Ports includes
> hints for firmware needed to support legacy PCI devices. (IO Ports range,
> extra bus,...)
> 
> Once a pci device is needed you have 2 options:
> 1. Plug a PCIe-PCI bridge into a PCIe Root Port and the PCI device
>in the bridge.
> 2. Hotplug a PCIe-PCI bridge into a PCIe Root Port and then hotplug
>a PCI device into the bridge.

Thank you for the explanation, it makes the intent a lot clearer.

However, what does the hot-pluggability of the PCIe-PCI bridge buy us?
In other words, what does it buy us when we do not add the PCIe-PCI
bridge immediately at guest startup, as an integrated device?

Why is it a problem to "commit" in advance? I understand that we might
not like the DMI-PCI bridge (due to it being legacy), but what speaks
against cold-plugging the PCIe-PCI bridge either as an integrated device
in pcie.0 (assuming that is permitted), or cold-plugging the PCIe-PCI
bridge in a similarly cold-plugged PCIe root port?

I mean, in the cold-plugged case, you use up two bus numbers at the
most, one for the root port, and another for the PCIe-PCI bridge. In the
hot-plugged case, you have to start with the cold-plugged root port just
the same (so that you can communicate the bus number reservation *at
all*), and then reserve (= use up in advance) the bus number, the IO
space, and the MMIO space(s). I don't see the difference; hot-plugging
the PCIe-PCI bridge (= not committing in advance) doesn't seem to save
any resources.

I guess I would see a difference if we reserved more than one bus number
in the hotplug case, namely in order to support recursive hotplug under
the PCIe-PCI bridge. But, you confirmed that we intend to keep the flat
hierarchy (ie the exercise is only for enabling legacy PCI endpoints,
not for recursive hotplug).  The PCIe-PCI bridge isn't a device that
does anything at all on its own, so why not just coldplug it? Its
resources have to be reserved in advance anyway.

So, thus far I would say "just cold-plug the PCIe-PCI bridge at startup,
possibly even make it an integrated device, and then you don't need to
reserve bus numbers (and other apertures)".

Where am I wrong?

[snip]

>> (4) Whether the reservation size should be absolute or relative (raised
>> by Gerd). IIUC, Gerd suggests that the absolute aperture size should be
>> specified (as a minimum), not the increment / reservation for hotplug
>> purposes.
>>
>> The Platform Initialization Specification, v1.6, downloadable at
>> <http://www.uefi.org/specs>, writes the following under
>>
>>EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
>>
>> in whose implementation I will have to parse the values from the
>> capability structure, and return the appropriate representation to the
>> platform-independent PciBusDxe driver (i.e., the enumeration /
>> allocation agent):
>>
>>> The padding is returned in the form of ACPI (2.0 & 3.0) resource
>>> descriptors. The exact definition of each of the fields is the same as
>>> in the
>>> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
>>> function. See the section 10.8 for the definition of this function.
>>>
>>> The PCI bus driver is responsible for adding this resource request to
>>> the resource requests by the physical PCI devices. If Attributes is
>>> EfiPaddingPciBus, the padding takes effect at the PCI bus level. If
>>> Attributes is EfiPaddingPciRootBridge, the required padding takes
>>> effect at the root bridge level. For details, see the definition of
>>> EFI_HPC_PADDING_ATTRIBUTES in "Related Definitions" below.
>>
>> Emphasis on "*adding* this resource request to the resource requests by
>> the physical PCI devices".
>>
>> However... After checking some OVMF logs, it seems

Re: [SeaBIOS] [Qemu-devel] Commit 77af8a2b95b79699de650965d5228772743efe84 breaks Windows 2000 support

2017-07-26 Thread Laszlo Ersek
Digressing:

On 07/26/17 10:53, Paolo Bonzini wrote:
> On 25/07/2017 23:25, Phil Dennis-Jordan wrote:
>> Thanks for this, Paolo. Very interesting idea.
>>
>> I couldn't get things working initially, but with a few fixups on the
>> SeaBIOS side I can boot both legacy and modern OSes. See comments
>> inline below for details on changes required.
>>
>> Successfully booted (only a brief test):
>> - Windows 2000
>> - Windows XP (32 bit)
>> - Windows 7 (32 bit)
>> - Windows 10 (64 bit, SeaBIOS)
>> - Windows 10 (64 bit, OVMF)
>> - macOS 10.12 (patched OVMF)
>
> Thanks Phil!  You unwittingly tested the compatibility path on all
> these OSes, since my QEMU patch forgot to setup rsdp->length,
> rsdp->revision and the extended checksum.  However, I've now tested
> Windows XP, Linux w/SeaBIOS, Linux w/patched SeaBIOS and Linux w/OVMF.
>
> I've now found out that edk2 contains similar logic.  It uses a PCD (a
> compile-time flag essentially) to choose between ACPI >= 2.0 tables or
> ACPI 1.0-compatible tables.  In the latter case, edk2 takes care of
> producing a v1 FADT if needed (similar to this patch) and linking the
> RSDT to it; otherwise it keeps whatever FADT was provided by platform
> code and produces an XSDT.

Not exactly; the PCD controls whether the EFI_ACPI_TABLE_PROTOCOL will
expose an RSDT, an XSDT, or both (with matching contents). The FADT
always comes from the specific edk2 platform (i.e., OVMF client code),
and it is not translated in any way, regardless of the PCD value.

From "MdeModulePkg/MdeModulePkg.dec":

>   ## Indicates which ACPI versions are targeted by the ACPI tables exposed to the OS
>   #  These values are aligned with the definitions in MdePkg/Include/Protocol/AcpiSystemDescriptionTable.h
>   #   BIT 1 - EFI_ACPI_TABLE_VERSION_1_0B.
>   #   BIT 2 - EFI_ACPI_TABLE_VERSION_2_0.
>   #   BIT 3 - EFI_ACPI_TABLE_VERSION_3_0.
>   #   BIT 4 - EFI_ACPI_TABLE_VERSION_4_0.
>   #   BIT 5 - EFI_ACPI_TABLE_VERSION_5_0.
>   # @Prompt Exposed ACPI table versions.
>   gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiExposedTableVersions|0x3E|UINT32|0x0001004c

The expectation is that the specific edk2 platform overrides this PCD at
build time (if necessary), and then goes on (at boot time) to install
ACPI tables -- using EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable() -- that
actually match the PCD setting.

From the "MdeModulePkg/Universal/Acpi/AcpiTableDxe/" driver's POV (that
is, from the EFI_ACPI_TABLE_PROTOCOL implementation's POV), the platform
controls *both* the PCD and the actually installed tables like the FADT,
so EFI_ACPI_TABLE_PROTOCOL expects the platform to make these
consistent.

The tiny little problem is that the PCD is a build-time flag, but QEMU
provides the FADT (and friends) at boot time, dynamically, in a format
that is essentially opaque to OVMF. So OVMF is sticking with the default
PCD (see above), resulting in both RSDT and XSDT root tables, regardless
of the contents of the FADT and friends.
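
To make the effect of the default value concrete, here is a minimal sketch
of how the 0x3E bitmask maps to the two root tables. The bit values are
the ones listed in the .dec excerpt above; the GTE_2_0 mask and the two
helper functions are names made up for this illustration, and the mapping
(RSDT for 1.0B, XSDT for 2.0 and later) is my reading of the description
above, not a copy of the AcpiTableDxe code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as listed in the MdeModulePkg.dec excerpt. */
#define EFI_ACPI_TABLE_VERSION_1_0B (1u << 1)
#define EFI_ACPI_TABLE_VERSION_2_0  (1u << 2)
#define EFI_ACPI_TABLE_VERSION_3_0  (1u << 3)
#define EFI_ACPI_TABLE_VERSION_4_0  (1u << 4)
#define EFI_ACPI_TABLE_VERSION_5_0  (1u << 5)

/* Illustration-only mask: every version that is served through an XSDT. */
#define VERSIONS_GTE_2_0                                      \
  (EFI_ACPI_TABLE_VERSION_2_0 | EFI_ACPI_TABLE_VERSION_3_0 |  \
   EFI_ACPI_TABLE_VERSION_4_0 | EFI_ACPI_TABLE_VERSION_5_0)

/* ACPI 1.0B consumers find tables through the 32-bit RSDT. */
static bool exposes_rsdt(uint32_t exposed_versions)
{
  return (exposed_versions & EFI_ACPI_TABLE_VERSION_1_0B) != 0;
}

/* ACPI 2.0+ consumers find tables through the 64-bit XSDT. */
static bool exposes_xsdt(uint32_t exposed_versions)
{
  return (exposed_versions & VERSIONS_GTE_2_0) != 0;
}
```

With the default 0x3E both helpers return true, which matches the "both
RSDT and XSDT" outcome described above; a platform that wanted an
XSDT-only setup would clear bit 1.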

A somewhat (but not entirely) similar situation exists with the SMBIOS
tables. The tables are composed / exported by QEMU over fw_cfg, and OVMF
/ AAVMF have to set some version-like PCDs that match the content:
- PcdSmbiosDocRev
- PcdSmbiosVersion

We do some ugly hacks in OVMF to ensure that these PCDs are set "in
time", before the generic "MdeModulePkg/Universal/SmbiosDxe" --
providing EFI_SMBIOS_PROTOCOL -- starts up and consumes the PCDs.
Namely, we have "OvmfPkg/Library/SmbiosVersionLib" which sets these PCDs
based on fw_cfg, and we link this library via NULL class resolution into
"MdeModulePkg/Universal/SmbiosDxe". So the PCDs will be set up just
before EFI_SMBIOS_PROTOCOL is initialized and provided. In turn,
"OvmfPkg/SmbiosPlatformDxe", which actually calls
EFI_SMBIOS_PROTOCOL.Add() on the tables provided by QEMU, has a depex on
EFI_SMBIOS_PROTOCOL -- first, this depex ensures that
EFI_SMBIOS_PROTOCOL can be used by "OvmfPkg/SmbiosPlatformDxe", but
second, the depex *also* ensures that the PCDs will have been set
correctly by the time "OvmfPkg/SmbiosPlatformDxe" calls
EFI_SMBIOS_PROTOCOL.Add() for the first time.
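
As an illustration of what "setting these PCDs based on fw_cfg" amounts
to, the sketch below pulls the version numbers out of an SMBIOS entry
point ("anchor") blob. The field offsets follow the SMBIOS specification
for the 32-bit ("_SM_") and 64-bit ("_SM3_") entry points; the struct and
function names are hypothetical, and this is not the SmbiosVersionLib
code, only an approximation of the idea.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type; the fields mirror what PcdSmbiosVersion and
   PcdSmbiosDocRev are meant to carry. */
struct smbios_versions {
  uint16_t version;  /* (major << 8) | minor, e.g. 0x0208 for 2.8 */
  uint8_t  doc_rev;  /* only meaningful for 3.x entry points */
};

static bool parse_smbios_anchor(const uint8_t *anchor, size_t size,
                                struct smbios_versions *out)
{
  /* 64-bit entry point: "_SM3_", major at 7, minor at 8, docrev at 9. */
  if (size >= 24 && memcmp(anchor, "_SM3_", 5) == 0) {
    out->version = (uint16_t)((anchor[7] << 8) | anchor[8]);
    out->doc_rev = anchor[9];
    return true;
  }
  /* 32-bit entry point: "_SM_", "_DMI_" at 16, major at 6, minor at 7. */
  if (size >= 31 && memcmp(anchor, "_SM_", 4) == 0 &&
      memcmp(anchor + 16, "_DMI_", 5) == 0) {
    out->version = (uint16_t)((anchor[6] << 8) | anchor[7]);
    out->doc_rev = 0;
    return true;
  }
  return false;  /* unrecognized blob: leave the PCD defaults alone */
}
```

The point of doing this in a library linked into SmbiosDxe is ordering:
the values must be known before the protocol implementation reads them.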

You might ask why we don't do the same in the ACPI case (i.e., for
PcdAcpiExposedTableVersions). It's due to the following differences:

- (less importantly,) "MdeModulePkg.dec" allows platforms to pick
  "dynamic" for PcdSmbiosDocRev and PcdSmbiosVersion, not just "fixed at
  build". IOW, MdeModulePkg already expects platforms to set the SMBIOS
  version PCDs dynamically, if those platforms can ensure the setting
  occurs "early enough".

- (more importantly,) the information needed by OVMF, for setting the
  SMBIOS version PCDs in "OvmfPkg/Library/SmbiosVersionLib", is readily
  available for parsing from the separate, dedicated fw_cfg file called
  "etc/smbios/smbios-anchor". In fact, OVMF doesn't use this file for
  anything else than grabbing the versions for the PCDs. The actual
  "anchor" table (the smbios entry point) is produced by the
  EFI_SMBIOS_PROTOCOL 

Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init

2017-07-25 Thread Laszlo Ersek
On 07/23/17 00:11, Aleksandr Bezzubikov wrote:
> Now PCI bridges get a bus range number on a system init, basing on
> currently plugged devices. That's why when one wants to hotplug
> another bridge, it needs his child bus, which the parent is unable to
> provide (speaking about virtual device). The suggested workaround is
> to have vendor-specific capability in Red Hat PCI bridges that
> contains number of additional bus to reserve on BIOS PCI init. So this
> capability is intended only for pure QEMU->SeaBIOS usage.
>
> Considering all aforesaid, this series is directly connected with
> QEMU RFC series (v2) "Generic PCIE-PCI Bridge".
>
> Although the new PCI capability is supposed to contain various limits
> along with bus number to reserve, now only its full layout is
> proposed, but only bus_reserve field is used in QEMU and BIOS. Limits
> usage is still a subject for implementation as now the main goal of
> this series to provide necessary support from the  firmware side to
> PCIE-PCI bridge hotplug.
>
> Changes v1->v2:
> 1. New #define for Red Hat vendor added (addresses Konrad's comment).
> 2. Refactored pci_find_capability function (addresses Marcel's
>    comment).
> 3. Capability reworked:
>   - data type added;
>   - reserve space in a structure for IO, memory and
>     prefetchable memory limits.
>
>
> Aleksandr Bezzubikov (4):
>   pci: refactor pci_find_capapibilty to get bdf as the first argument
> instead of the whole pci_device
>   pci: add RedHat vendor ID
>   pci: add QEMU-specific PCI capability structure
>   pci: enable RedHat PCI bridges to reserve additional buses on PCI
> init
>
>  src/fw/pciinit.c| 18 ++
>  src/hw/pci_cap.h| 23 +++
>  src/hw/pci_ids.h|  2 ++
>  src/hw/pcidevice.c  | 12 ++--
>  src/hw/pcidevice.h  |  2 +-
>  src/hw/virtio-pci.c |  4 ++--
>  6 files changed, 48 insertions(+), 13 deletions(-)
>  create mode 100644 src/hw/pci_cap.h
>
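
For readers who have not looked at the lookup side: finding a
vendor-specific capability like the one proposed in the cover letter
means walking the standard PCI capability list. A minimal sketch follows,
assuming a 256-byte snapshot of a function's config space instead of
SeaBIOS's config-space accessors. The register offsets and the
vendor-specific capability ID (0x09) are the standard PCI ones;
find_capability() is a hypothetical helper and is not the
pci_find_capability() that this series refactors.

```c
#include <stdint.h>

#define PCI_STATUS           0x06  /* 16-bit status register */
#define PCI_STATUS_CAP_LIST  0x10  /* "capabilities list present" bit */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability */
#define PCI_CAP_ID_VNDR      0x09  /* vendor-specific capability ID */

/* Return the config-space offset of the first capability with the given
   ID, or 0 if the function has no such capability. */
static uint8_t find_capability(const uint8_t cfg[256], uint8_t cap_id)
{
  uint16_t status = (uint16_t)(cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8));
  uint8_t pos;
  int guard = 48;  /* bound the walk in case the list is malformed */

  if (!(status & PCI_STATUS_CAP_LIST))
    return 0;

  pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
  while (pos >= 0x40 && guard-- > 0) {
    if (cfg[pos] == cap_id)
      return pos;
    pos = cfg[pos + 1] & 0xFC;  /* byte 1 of each capability: next pointer */
  }
  return 0;
}
```

A real consumer would additionally check the vendor ID of the bridge and
the capability's own length and type bytes before trusting the contents.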

Coming back from PTO, it's hard for me to follow up on all the comments
that have been made across the v1 and v2 of this RFC series, so I'll
just provide a brain dump here:

(1) Mentioned by Michael: documentation. That's the most important part.
I haven't seen the QEMU patches, so perhaps they already include
documentation. If not, please start this work with adding a detailed
description to QEMU's docs/ or docs/specs/.

There are a number of preexistent documents that might be related, just
search docs/ for filenames with "pci" in them.


(2) Bus range reservation, and hotplugging bridges. What's the
motivation? Our recommendations in "docs/pcie.txt" suggest flat
hierarchies.

If this use case is really necessary, I think it should be covered in
"docs/pcie.txt". In particular it has a consequence for PXB as well
(search "pcie.txt" for "bus_nr") -- if users employ extra root buses,
then the bus number partitions that they specify must account for any
bridges that they plan to hot-plug (and for the bus range reservations
on the cold-plugged bridges behind those extra root buses).
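
To spell out the arithmetic behind that consequence, here is a rough
sketch of the check a user (or a management tool) would have to make for
each extra root bus. It assumes a flat hierarchy, where every
cold-plugged bridge or port consumes exactly one secondary bus number,
and it assumes that the partition of a root bus runs from its bus_nr up
to the bus_nr of the next extra root bus. The function and parameter
names are made up for the illustration.

```c
#include <stdbool.h>

/*
 * bus_nr:       first bus number of this extra root bus
 * next_bus_nr:  bus_nr of the next extra root bus, or 256 for the last one
 * coldplugged:  bridges / ports present behind this root bus at boot
 * reserved:     sum of the extra buses reserved on those bridges for
 *               bridges that will be hot-plugged later
 */
static bool bus_partition_fits(unsigned bus_nr, unsigned next_bus_nr,
                               unsigned coldplugged, unsigned reserved)
{
  /* One number for the root bus itself, one secondary bus per
     cold-plugged bridge, plus all reservations. */
  unsigned needed = 1 + coldplugged + reserved;

  return bus_nr < next_bus_nr && needed <= next_bus_nr - bus_nr;
}
```

For example, a root bus at bus_nr 40 followed by one at 50 has ten bus
numbers; four cold-plugged ports with five reserved buses in total just
fit, six reserved buses do not.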


(3) Regarding the contents and the format of the capability structure, I
wrote up my thoughts earlier in

  https://bugzilla.redhat.com/show_bug.cgi?id=1434747#c8

Let me quote it here for ease of commenting:

> (In reply to Gerd Hoffmann from comment #7)
> > So, now that the generic ports are there we can go on figure how to
> > handle this best.  I still think the best way to communicate window
> > size hints would be to use a vendor specific pci capability (instead
> > of setting the desired size on reset).  The information will always
> > be available then and we don't run into initialization order issues.
>
> This seems good to me -- I can't promise 100% without actually trying,
> but I think I should be able to parse the capability list in config
> space for this hint, in the GetResourcePadding() callback.
>
> I propose that we try to handle this issue "holistically", together
> with bug 1434740. We need a method that provides controls for both IO
> and MMIO:
>
> - For IO, we need a mechanism that can prevent *both* firmware *and*
>   Linux from reserving IO for PCI Express ports. I think Marcel's
>   approach in bug 1344299 is sufficient, i.e., set the IO base/limit
>   registers of the bridge to 0 for disabling IO support. And, if not
>   disabled, just go with the default 4KB IO reservation (for both PCI
>   Express ports and legacy PCI bridges, as the latter is documented in
>   the guidelines).
>
> - For MMIO, the vendor specific capability structure should work
>   something like this:
> - if the capability is missing, reserve 2MB, 32-bit,
>   non-prefetchable,
>
> - otherwise, the capability structure should consist of 3 fields
>   (reservation sizes):
> - uint32_t non_prefetchable_32,
> - uint32_t prefetchable_32,
> - uint64_t prefetchable_64,
>
> - of prefetchable_32 and prefetchable_64, at most one may be
>   nonzero (they 

Re: [SeaBIOS] allocation zone extensions for the firmware linker/loader

2017-06-06 Thread Laszlo Ersek
On 06/05/17 18:02, Michael S. Tsirkin wrote:
> On Sat, Jun 03, 2017 at 09:36:23AM +0200, Laszlo Ersek wrote:
>> On 06/02/17 17:45, Laszlo Ersek wrote:
>>
>>> The patches can cause linker/loader breakage when old firmware is booted
>>> on new QEMU. However, that's no problem (it's nothing new), the next
>>> release of QEMU should bundle the new firmware binaries as always.
>>
>> Dave made a good point (which I should have realized myself, really!),
>> namely if you launch old fw on old qemu, then migrate the guest to a new
>> qemu and then reboot the guest on the target host, within the migrated
>> VM, things will break.
>>
>> So that makes this approach dead in the water.
>>
>> Possible mitigations I could think of:
>> - Make it machine type dependent. Complicated (we don't usually bind
>> ACPI generation to machine types) and wouldn't help existing devices.
>> - Let the firmware negotiate these extensions. Very complicated (new
>> fw-cfg files are needed for negotiation) and wouldn't help existing devices.
> 
> This last option *shouldn't* be complicated. If it is something's wrong.
> 
> Maybe we made a mistake when we added etc/smi/*features*.
> 
> It's not too late to move these to etc/*features* for new
> machine types if we want to and if you can do the firmware
> work. Then you'd just take out a bit and be done with it.
> 
> I don't insist on doing the ACPI thing now but I do think
> infrastructure for negotiating extensions should be there.

Different drivers in the firmware would need to negotiate different
questions / features with QEMU independently of each other. The "thing"
in OVMF that negotiates (and uses) the SMI broadcast is very-very
different and separate from the "thing" in OVMF that handles the ACPI
linker/loader script.

As one example, the first "thing" mentioned above is not even built into
ArmVirtQemu (only into OVMF, i.e. x86), while the second "thing" is
built into both aarch64 and x86 firmware.

So, I think we couldn't share the same fw_cfg files (if they needed
write access & lock-down too, i.e. actual negotiation from the firmware)
between wildly unrelated features.

The virtio feature negotiation is different because each device gets its
own negotiation, and that maps very well to UEFI concepts too.

BTW, do we have a specific concern relating to the number of fw_cfg
files? That count can now be raised from machine type to machine type,
but Paolo didn't seem to like raising the current value (or maybe I
misunderstood him):

http://mid.mail-archive.com/2e6dec37-8b69-979b-c856-406233273066@redhat.com

... Also, above I wrote, with regard to feature negotiation, that it
"wouldn't help existing devices". By that I mean this:

Consider the NOACPI content hint as an example. If the firmware doesn't
negotiate it (before selecting and downloading the ACPI payload), then
QEMU cannot generate the NOACPI content hint. In turn QEMU must keep the
OVMF SDT Header Probe suppressor (those paddings and AML additions) enabled.

But, for the QEMU developers it means that the suppressor code has to be
kept around forever, for compatibility with old machine types. And if
you do that, then why add a negotiable NOACPI hint at all? That would
just further complicate device code (because now you have to generate
two different AML payloads), where the old one (the one with the
explicit suppressor) would work just fine "forever".

Thanks,
Laszlo

___
SeaBIOS mailing list
SeaBIOS@seabios.org
https://mail.coreboot.org/mailman/listinfo/seabios


Re: [SeaBIOS] allocation zone extensions for the firmware linker/loader

2017-06-03 Thread Laszlo Ersek
On 06/02/17 17:45, Laszlo Ersek wrote:

> The patches can cause linker/loader breakage when old firmware is booted
> on new QEMU. However, that's no problem (it's nothing new), the next
> release of QEMU should bundle the new firmware binaries as always.

Dave made a good point (which I should have realized myself, really!),
namely if you launch old fw on old qemu, then migrate the guest to a new
qemu and then reboot the guest on the target host, within the migrated
VM, things will break.

So that makes this approach dead in the water.

Possible mitigations I could think of:
- Make it machine type dependent. Complicated (we don't usually bind
ACPI generation to machine types) and wouldn't help existing devices.
- Let the firmware negotiate these extensions. Very complicated (new
fw-cfg files are needed for negotiation) and wouldn't help existing devices.

So I guess I'll do what Igor and Gerd suggested: record in advance
whether any pointer field narrower than 8 bytes points into a given
blob, and if so, forbid allocating that blob from 64-bit address space.
This should solve Ard's needs purely within the firmware.
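
A minimal sketch of that idea, assuming the firmware makes one pass over
the ADD_POINTER commands before it allocates anything. The structures
are simplified stand-ins for the real linker/loader command and blob
tracking types, and all names (including the 56-byte file name size) are
assumptions for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNAME_SIZE 56  /* stand-in for the loader's file name size */

struct blob {
  char name[FNAME_SIZE];
  bool narrow_pointee;  /* a pointer narrower than 8 bytes targets it */
};

struct add_pointer_cmd {
  char    pointee_file[FNAME_SIZE];
  uint8_t pointer_size;  /* 1, 2, 4 or 8 */
};

/* Pre-scan: flag every blob that is the target of a narrow pointer. */
static void mark_narrow_pointees(struct blob *blobs, size_t nblobs,
                                 const struct add_pointer_cmd *cmds,
                                 size_t ncmds)
{
  for (size_t c = 0; c < ncmds; c++) {
    if (cmds[c].pointer_size >= 8)
      continue;
    for (size_t b = 0; b < nblobs; b++)
      if (strcmp(blobs[b].name, cmds[c].pointee_file) == 0)
        blobs[b].narrow_pointee = true;
  }
}

/* Highest address the blob may be allocated at. */
static uint64_t max_alloc_address(const struct blob *blob)
{
  return blob->narrow_pointee ? UINT32_MAX : UINT64_MAX;
}
```

The attraction is that no QEMU change and no negotiation is needed: the
information is already present in the script, so old and new QEMU behave
the same from the firmware's point of view.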

Regarding the NOACPI hint, I guess I'm dropping that. I only meant
NOACPI for addressing Igor's long-standing dislike for the "ACPI SDT
header probe suppression" in VMGENID (and future similar devices). But,
there's no actual *technical* need to eliminate that (unlike the
technical need for 64-bit blob allocations which should really be
solved), so I guess it's OK to postpone NOACPI indefinitely.

Self-nack for this set of sets.

Thanks for the feedback,
Laszlo



Re: [SeaBIOS] allocation zone extensions for the firmware linker/loader

2017-06-02 Thread Laszlo Ersek
On 06/02/17 18:30, Michael S. Tsirkin wrote:
> On Fri, Jun 02, 2017 at 05:45:21PM +0200, Laszlo Ersek wrote:
>> Hi,
>>
>> this message is cross-posted to three lists (qemu, seabios, edk2). I'll
>> follow up with three patch series, one series for each project. I'll
>> cross-post all of the patches as well, but I'll add the project name in
>> the "bag of tags" in the subject lines.
>>
>> The QEMU series introduces two extensions to the ALLOCATE firmware
>> linker/loader command.
>>
>> One extension is a new allocation zone, with value 3, for allowing the
>> firmware to allocate the fw_cfg blobs in 64-bit address space.
> 
> Seems to make sense. I guess it's safe to do this if no
> pointers to this table are 32 bit, right?

That's right. For example, the TCPA patch (6 of 7) in the QEMU series
does this, because the ACPI_BUILD_TPMLOG_FILE is only referenced by a
64-bit pointer.

> Is there a chance we'll ever be able to use this on PC
> assuming the need to support 32 bit guests?

Well, sticking with the TCPA example, if an ACPI table defines *only* an
8-byte pointer to some memory area, that seems to preclude support for
32-bit guests already, generally speaking, no?

But otherwise I agree, requiring support for 32-bit guests makes this
allocation zone a lot less useful.

>> The other extension is a repurposing of the most significant bit (bit 7)
>> in the zone field. This bit becomes orthogonal to the rest of the zone
>> field. If the bit is set, it means that QEMU promises the firmware that
>> the blob referenced by the ALLOCATE command contains no ACPI tables at all.
> 
> This one is a bit strange in that it does not seem to be about
> allocations - it seems to be about content.

Sure, I only stuffed it in the Zone field because that's where I found a
free bit :)
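
In code, "orthogonal" means the firmware splits the byte into two
independent parts before looking at either. The sketch below uses the
numeric values from the accompanying patches (zones 1, 2, 3 and the 0x80
content bit); the type and function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

enum { ZONE_HIGH = 0x1, ZONE_FSEG = 0x2, ZONE_64BIT = 0x3 };
#define CONTENT_NOACPI 0x80u  /* bit 7: blob contains no ACPI tables */

struct zone_request {
  uint8_t zone;     /* one of the ZONE_* values */
  bool    no_acpi;  /* content hint, independent of the zone */
};

static struct zone_request decode_zone(uint8_t raw)
{
  struct zone_request r;

  r.no_acpi = (raw & CONTENT_NOACPI) != 0;
  r.zone    = (uint8_t)(raw & ~CONTENT_NOACPI);
  return r;
}
```

Firmware that has no use for the hint (SeaBIOS, for instance) only needs
the masking step and can ignore no_acpi altogether.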

> 
> I'd like to better understand what makes ACPI special.
> 
> I see two other things that make acpi special, but I'd like to make sure
> 1. I think that RSDT from qemu is more or less ignored by OVMF, it
>    builds it from tables supplied. Thus pointers from RSDT only serve to
>    find beginning of tables - they are not really patched in. So ACPI
>    tables are special in that their actual addresses are unused. As a
>    result they can be moved at will after linker runs.

Sort of. OVMF performs two passes on the linker/loader script. The first
pass is fully identical to what SeaBIOS does, so the RSDT too is
allocated (as part of its containing fw_cfg blob, of course) and its
fields are patched like anything else.

In the second pass, the (now relocated) pointers are checked again,
based on the ADD_POINTER commands. Wherever the (now relocated) pointers
point, we probe for ACPI table headers. If the target passes the probe
(i.e., it looks like an ACPI table), we call
EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable() with it. This installs a
separate copy of that table, maintaining RSDT and XSDT internally.
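
The probe can be approximated as follows. This is a simplified sketch
and not the OVMF source: it checks only the common ACPI table header
(Length field at offset 4, byte-wise checksum over the whole table), and
it does not cover tables without that header layout, such as the FACS.
The function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define ACPI_SDT_HEADER_SIZE 36u  /* size of the common ACPI table header */

/* 8-bit sum of n bytes; a valid ACPI table sums to zero. */
static uint8_t sum8(const uint8_t *p, size_t n)
{
  uint8_t s = 0;

  while (n--)
    s = (uint8_t)(s + *p++);
  return s;
}

/* Return the table length if the bytes at p look like an ACPI table that
   fits in the remaining part of the blob, else 0. */
static uint32_t probe_acpi_table(const uint8_t *p, size_t remaining)
{
  uint32_t length;

  if (remaining < ACPI_SDT_HEADER_SIZE)
    return 0;

  /* Little-endian Length field at offset 4. */
  length = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
           ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
  if (length < ACPI_SDT_HEADER_SIZE || length > remaining)
    return 0;
  if (sum8(p, length) != 0)
    return 0;
  return length;
}
```

Note that an all-zero area fails this check because its Length field is
smaller than the header, which is why a zero-filled prefix can serve as
a deterministic way to make a location fail the probe.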

Now, the RSD PTR table has a pointer to the RSDT, so normally we'd
invoke InstallAcpiTable() with the RSDT as well, in the second pass. To
prevent that, we have a quirk that recognizes RSDT and XSDT, and skips them.

So you can say that the RSDT from QEMU is *ultimately* ignored in OVMF,
but for the first pass to complete (which is identical to SeaBIOS's only
pass), OVMF uses the RSDT (as embedded in its containing fw_cfg blob) too.

There are more details about the second pass. If a (relocated) pointer
points to some stuff that doesn't look like an ACPI table header, then
we don't install that thing with InstallAcpiTable(). Instead, we think
that at that location the containing fw_cfg blob contains an
opregion-like area, so its actual absolute address is relevant (remember
we are past the first pass, which completed all the relocations). So in
this case, the containing fw_cfg blob is marked as "non-releasable", and
we'll keep it forever (in AcpiNVS memory). Otherwise, if at the end of
processing a blob is marked as releasable (the default), because all
pointers into it pointed only at ACPI table headers, we free the blob.
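
The bookkeeping behind "releasable" is small enough to sketch. Assume
each blob starts out releasable and that the second pass reports, for
every ADD_POINTER target, whether the probe succeeded; the names below
are hypothetical and the real driver tracks more state than this.

```c
#include <stdbool.h>
#include <stddef.h>

struct tracked_blob {
  bool releasable;  /* set to true when the blob is allocated */
};

/* Called once per ADD_POINTER target during the second pass. */
static void note_pointer_target(struct tracked_blob *blob,
                                bool looks_like_acpi_table)
{
  /* Anything that is not a table is assumed to be opregion-like: its
     absolute address matters, so the blob must stay where it is. */
  if (!looks_like_acpi_table)
    blob->releasable = false;
}

/* Number of blobs that can be freed after the whole script succeeded. */
static size_t count_releasable(const struct tracked_blob *blobs, size_t n)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if (blobs[i].releasable)
      count++;
  return count;
}
```

The flag is sticky on purpose: a single non-table target is enough to
pin the blob, no matter how many tables were also found in it.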

There is one more detail about the second pass. There can be several
pointers that point to the same address (= same offset of the same
pointed-to fw_cfg blob). In such cases, only the first encounter is
honored with an InstallAcpiTable() call, further surfacings of the same
target address are skipped.
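
A sketch of that "first encounter only" rule, assuming a small fixed-size
set of already-handled target addresses. The capacity and all names are
assumptions for the example; unlike this sketch, a real implementation
would report a full set as an error rather than silently skipping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEEN_MAX 128  /* hypothetical capacity */

struct seen_set {
  uintptr_t addr[SEEN_MAX];
  size_t    count;
};

/* Return true if target has not been seen before (the caller should
   install the table), false if it is a repeat or the set is full. */
static bool first_encounter(struct seen_set *seen, uintptr_t target)
{
  for (size_t i = 0; i < seen->count; i++)
    if (seen->addr[i] == target)
      return false;
  if (seen->count == SEEN_MAX)
    return false;
  seen->addr[seen->count++] = target;
  return true;
}
```

A linear search is adequate here because the number of distinct tables
in a guest is small.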

Now, the heuristic that determines whether a pointed-to location is an
ACPI table header is not fool-proof. It recognizes all valid headers
(so no false negatives), I think, but it could also
mis-recognize random garbage in an opregion-like blob as an ACPI table
header (so there's a chance for false positives). In order to suppress
these false positives deterministically, there are two methods:

- prefix all such areas in the pointed-to b

[SeaBIOS] [edk2 PATCH 2/3] OvmfPkg/AcpiPlatformDxe: support NOACPI content hint in ALLOCATE command

2017-06-02 Thread Laszlo Ersek
This driver currently relies on a 2nd pass processing of the ADD_POINTER
commands to identify potential ACPI tables in the pointed-to blobs.

In order to tell apart ACPI tables from other operation region-like areas
within pointed-to blobs, we employ a heuristic called "ACPI SDT header
probe" at the target locations of the ADD_POINTER commands. While all ACPI
tables generated by QEMU satisfy this check (i.e., there are no false
negatives), blob content that is *not* an ACPI table has a very slight
chance to pass the test as well (i.e., there is a small chance for false
positives).

In order to suppress this small chance, in QEMU we've historically
formatted opregion-like areas in blobs with a fixed size zero prefix (see
e.g. "docs/specs/vmgenid.txt"), which guarantees that the probe in
OvmfPkg/AcpiPlatformDxe will fail. However, this "suppressor prefix" has
had to be taken into account explicitly in generated AML code -- the
prefix size has had to be added to the patched integer object in AML, at
runtime --, leading to awkwardness.

QEMU is introducing a new hint for the ALLOCATE command, as the most
significant bit of the UINT8 "Zone" field, for disabling the ACPI SDT
header probe in OvmfPkg/AcpiPlatformDxe, for all the pointers that point
into the blob downloaded with the ALLOCATE command. When the bit is set,
the blob is guaranteed to contain no ACPI tables. When the bit is clear,
the behavior is left unchanged.

In ProcessCmdAllocate(), save the hint for later.

In Process2ndPassCmdAddPointer(), consult the saved hint. If QEMU reported
the blob as containing no ACPI table data, then omit the ACPI SDT header
probing and mark the pointed-to blob as unreleasable.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Jordan Justen <jordan.l.jus...@intel.com>
Cc: Leif Lindholm <leif.lindh...@linaro.org>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 OvmfPkg/AcpiPlatformDxe/QemuLoader.h|  9 +-
 OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 29 +++-
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/OvmfPkg/AcpiPlatformDxe/QemuLoader.h 
b/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
index 437776d86d9a..fa558540e62b 100644
--- a/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
+++ b/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
@@ -34,19 +34,26 @@ typedef enum {
 typedef enum {
   QemuLoaderAllocHigh = 1,
   QemuLoaderAllocFSeg
 } QEMU_LOADER_ALLOC_ZONE;
 
+typedef enum {
+  QemuLoaderAllocContentMixed  = 0x00,
+  QemuLoaderAllocContentNoAcpi = 0x80,
+} QEMU_LOADER_ALLOC_CONTENT;
+
 #pragma pack (1)
 //
 // QemuLoaderCmdAllocate: download the fw_cfg file named File, to a buffer
 // allocated in the zone specified by Zone, aligned at a multiple of Alignment.
 //
 typedef struct {
   UINT8  File[QEMU_LOADER_FNAME_SIZE]; // NUL-terminated
   UINT32 Alignment;// power of two
-  UINT8  Zone; // QEMU_LOADER_ALLOC_ZONE values
+  UINT8  Zone; // One QEMU_LOADER_ALLOC_ZONE value
+   // OR-ed together with one
+   // QEMU_LOADER_ALLOC_CONTENT value
 } QEMU_LOADER_ALLOCATE;
 
 //
 // QemuLoaderCmdAddPointer: the bytes at
 // [PointerOffset..PointerOffset+PointerSize) in the file PointerFile contain a
diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c 
b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
index 4a7b051288bc..23d543ffe361 100644
--- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
+++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
@@ -36,10 +36,12 @@ typedef struct {
 // key.
   UINTN   Size; // The number of bytes in this blob.
   UINT8   *Base;// Pointer to the blob data.
   BOOLEAN Releasable;   // TRUE iff the blob should be released
 // at the end of processing.
+  BOOLEAN AcpiTablesExcluded;   // TRUE iff QEMU guarantees that the
+// blob contains no ACPI tables
 } BLOB;
 
 
 /**
   Compare a standalone key against a user structure containing an embedded key.
@@ -167,10 +169,12 @@ ProcessCmdAllocate (
   )
 {
   FIRMWARE_CONFIG_ITEM FwCfgItem;
   UINTNFwCfgSize;
   EFI_STATUS   Status;
+  UINT32   Zone;
+  BOOLEAN  AcpiTablesExcluded;
   UINTNNumPages;
   EFI_PHYSICAL_ADDRESS Address;
   BLOB *Blob;
 
   if (Allocate->File[QEMU_LOADER_FNAME_SIZE - 1] != '\0') {
@@ -189,10 +193,18 @@ ProcessCmdAllocate (
 DEBUG ((EFI_D_ERROR, "%a: Qem

[SeaBIOS] [edk2 PATCH 0/3] OvmfPkg/AcpiPlatformDxe: NOACPI hint and 64-bit zone in fw_cfg blob alloc

2017-06-02 Thread Laszlo Ersek
Please see the parent blurb
<http://mid.mail-archive.com/c76b36de-ebf9-c662-d454-0a95b43901e8@redhat.com>
for a high level description.

Repo:   https://github.com/lersek/edk2.git
Branch: zone_hints

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Jordan Justen <jordan.l.jus...@intel.com>
Cc: Leif Lindholm <leif.lindh...@linaro.org>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>

Thanks
Laszlo

Laszlo Ersek (3):
  OvmfPkg/AcpiPlatformDxe: rename BLOB.HostsOnlyTableData to
BLOB.Releasable
  OvmfPkg/AcpiPlatformDxe: support NOACPI content hint in ALLOCATE
command
  OvmfPkg/AcpiPlatformDxe: support 64-bit zone in ALLOCATE command

 OvmfPkg/AcpiPlatformDxe/QemuLoader.h| 12 -
 OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 53 +++-
 2 files changed, 50 insertions(+), 15 deletions(-)

-- 
2.9.3




[SeaBIOS] [edk2 PATCH 3/3] OvmfPkg/AcpiPlatformDxe: support 64-bit zone in ALLOCATE command

2017-06-02 Thread Laszlo Ersek
The QemuLoaderAlloc64Bit (3) Zone value permits the guest firmware to
allocate the blob being downloaded anywhere in the 64-bit address space.
Set the maximum Address value in ProcessCmdAllocate() accordingly.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Jordan Justen <jordan.l.jus...@intel.com>
Cc: Leif Lindholm <leif.lindh...@linaro.org>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 OvmfPkg/AcpiPlatformDxe/QemuLoader.h| 3 ++-
 OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/OvmfPkg/AcpiPlatformDxe/QemuLoader.h 
b/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
index fa558540e62b..1daa918ff9b7 100644
--- a/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
+++ b/OvmfPkg/AcpiPlatformDxe/QemuLoader.h
@@ -31,11 +31,12 @@ typedef enum {
   QemuLoaderCmdWritePointer,
 } QEMU_LOADER_COMMAND_TYPE;
 
 typedef enum {
   QemuLoaderAllocHigh = 1,
-  QemuLoaderAllocFSeg
+  QemuLoaderAllocFSeg,
+  QemuLoaderAlloc64Bit,
 } QEMU_LOADER_ALLOC_ZONE;
 
 typedef enum {
   QemuLoaderAllocContentMixed  = 0x00,
   QemuLoaderAllocContentNoAcpi = 0x80,
diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c 
b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
index 23d543ffe361..0b0b3f590f2b 100644
--- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
+++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
@@ -202,11 +202,11 @@ ProcessCmdAllocate (
   } else {
 AcpiTablesExcluded = FALSE;
   }
 
   NumPages = EFI_SIZE_TO_PAGES (FwCfgSize);
-  Address = 0x;
+  Address = (Zone == QemuLoaderAlloc64Bit) ? MAX_UINT64 : MAX_UINT32;
   Status = gBS->AllocatePages (AllocateMaxAddress, EfiACPIMemoryNVS, NumPages,
   );
   if (EFI_ERROR (Status)) {
 return Status;
   }
-- 
2.9.3




[SeaBIOS] [edk2 PATCH 1/3] OvmfPkg/AcpiPlatformDxe: rename BLOB.HostsOnlyTableData to BLOB.Releasable

2017-06-02 Thread Laszlo Ersek
The "BLOB.HostsOnlyTableData" field tracks whether the allocated &
downloaded fw_cfg blob should be released in the end. The current name
"HostsOnlyTableData" reflects only the original determinant for this,
namely whether the blob hosts ACPI table data only -- because in that case
EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable() would create deep copies of all
referenced parts of the blob.

However, in commit 9965cbd424f2 ("OvmfPkg/AcpiPlatformDxe: implement the
QEMU_LOADER_WRITE_POINTER command", 2017-02-08) we flipped the field to
FALSE in ProcessCmdWritePointer() too, because the blob must also not be
released if we send its allocation address back to QEMU. Therefore we
should more generally call the field "Releasable".

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Jordan Justen <jordan.l.jus...@intel.com>
Cc: Leif Lindholm <leif.lindh...@linaro.org>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 22 ++--
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c 
b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
index 1bc5fe297a96..4a7b051288bc 100644
--- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
+++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
@@ -34,13 +34,12 @@ typedef struct {
   UINT8   File[QEMU_LOADER_FNAME_SIZE]; // NUL-terminated name of the fw_cfg
 // blob. This is the ordering / search
 // key.
   UINTN   Size; // The number of bytes in this blob.
   UINT8   *Base;// Pointer to the blob data.
-  BOOLEAN HostsOnlyTableData;   // TRUE iff the blob has been found to
-// only contain data that is directly
-// part of ACPI tables.
+  BOOLEAN Releasable;   // TRUE iff the blob should be released
+// at the end of processing.
 } BLOB;
 
 
 /**
   Compare a standalone key against a user structure containing an embedded key.
@@ -206,11 +205,11 @@ ProcessCmdAllocate (
 goto FreePages;
   }
   CopyMem (Blob->File, Allocate->File, QEMU_LOADER_FNAME_SIZE);
   Blob->Size = FwCfgSize;
   Blob->Base = (VOID *)(UINTN)Address;
-  Blob->HostsOnlyTableData = TRUE;
+  Blob->Releasable = TRUE;
 
   Status = OrderedCollectionInsert (Tracker, NULL, Blob);
   if (Status == RETURN_ALREADY_STARTED) {
 DEBUG ((EFI_D_ERROR, "%a: duplicated file \"%a\"\n", __FUNCTION__,
   Allocate->File));
@@ -505,11 +504,11 @@ ProcessCmdWritePointer (
   //
   // Because QEMU has now learned PointeeBlob->Base, we must mark PointeeBlob
   // as unreleasable, for the case when the whole linker/loader script is
   // handled successfully.
   //
-  PointeeBlob->HostsOnlyTableData = FALSE;
+  PointeeBlob->Releasable = FALSE;
 
   DEBUG ((DEBUG_VERBOSE, "%a: PointerFile=\"%a\" PointeeFile=\"%a\" "
 "PointerOffset=0x%x PointeeOffset=0x%x PointerSize=%d\n", __FUNCTION__,
 WritePointer->PointerFile, WritePointer->PointeeFile,
 WritePointer->PointerOffset, WritePointer->PointeeOffset,
@@ -612,12 +611,13 @@ UndoCmdWritePointer (
  before, or an ACPI table different from RSDT
  and XSDT has been installed (reflected by
  InstalledKey and NumInstalled), or RSDT or
  XSDT has been identified but not installed, or
  the fw_cfg blob pointed-into by AddPointer has
- been marked as hosting something else than
- just direct ACPI table contents.
+ been marked as non-releasable due to hosting
+ something else than just direct ACPI table
+ contents.
 
   @returnError codes returned by
  AcpiProtocol->InstallAcpiTable().
 **/
 STATIC
@@ -740,11 +740,11 @@ Process2ndPassCmdAddPointer (
 }
   }
 
   if (TableSize == 0) {
 DEBUG ((EFI_D_VERBOSE, "not found; marking fw_cfg blob as opaque\n"));
-Blob2->HostsOnlyTableData = FALSE;
+Blob2->Releasable = FALSE;
 return EFI_SUCCESS;
   }
 
   if (*NumInstalled == INSTALLED_TABLES_MAX) {
 DEBUG ((EFI_D_ERROR, "%a: can't install more than %d tab

[SeaBIOS] [seabios PATCH 2/2] romfile_loader: alloc: cope with the UEFI-oriented 64BIT zone hint

2017-06-02 Thread Laszlo Ersek
ROMFILE_LOADER_ALLOC_ZONE_64BIT permits the guest firmware to allocate the
blob being downloaded anywhere in the 64-bit address space. In SeaBIOS, we
can simply alias this zone request to ROMFILE_LOADER_ALLOC_ZONE_HIGH
(i.e., allocate the blob in 32-bit address space.)

Cc: "Kevin O'Connor" <ke...@koconnor.net>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 src/fw/romfile_loader.h | 7 ---
 src/fw/romfile_loader.c | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
index d90c3db24331..9828d4ad1094 100644
--- a/src/fw/romfile_loader.h
+++ b/src/fw/romfile_loader.h
@@ -11,11 +11,11 @@ struct romfile_loader_entry_s {
 u32 command;
 union {
 /*
  * COMMAND_ALLOCATE - allocate a table from @alloc.file
  * subject to @alloc.align alignment (must be power of 2)
- * and @alloc.zone (can be HIGH or FSEG) requirements.
+ * and @alloc.zone (see ROMFILE_LOADER_ALLOC_ZONE_*) requirements.
  * The most significant bit (bit 7) of @alloc.zone is used as a content
  * hint for UEFI guest firmware, see ROMFILE_LOADER_ALLOC_CONTENT_*.
  *
  * Must appear exactly once for each file, and before
  * this file is referenced by any other command.
@@ -80,12 +80,13 @@ enum {
 ROMFILE_LOADER_COMMAND_ADD_CHECKSUM  = 0x3,
 ROMFILE_LOADER_COMMAND_WRITE_POINTER = 0x4,
 };
 
 enum {
-ROMFILE_LOADER_ALLOC_ZONE_HIGH = 0x1,
-ROMFILE_LOADER_ALLOC_ZONE_FSEG = 0x2,
+ROMFILE_LOADER_ALLOC_ZONE_HIGH  = 0x1,
+ROMFILE_LOADER_ALLOC_ZONE_FSEG  = 0x2,
+ROMFILE_LOADER_ALLOC_ZONE_64BIT = 0x3,
 };
 
 enum {
 ROMFILE_LOADER_ALLOC_CONTENT_MIXED  = 0x00,
 ROMFILE_LOADER_ALLOC_CONTENT_NOACPI = 0x80,
diff --git a/src/fw/romfile_loader.c b/src/fw/romfile_loader.c
index 6a457902a36a..c0c476b58990 100644
--- a/src/fw/romfile_loader.c
+++ b/src/fw/romfile_loader.c
@@ -68,10 +68,11 @@ static void romfile_loader_allocate(struct romfile_loader_entry_s *entry,
 
 zone_req = entry->alloc.zone;
 zone_req &= ~(unsigned)ROMFILE_LOADER_ALLOC_CONTENT_NOACPI;
 switch (zone_req) {
 case ROMFILE_LOADER_ALLOC_ZONE_HIGH:
+case ROMFILE_LOADER_ALLOC_ZONE_64BIT:
 zone = &ZoneHigh;
 break;
 case ROMFILE_LOADER_ALLOC_ZONE_FSEG:
 zone = &ZoneFSeg;
 break;
-- 
2.9.3


___
SeaBIOS mailing list
SeaBIOS@seabios.org
https://mail.coreboot.org/mailman/listinfo/seabios


[SeaBIOS] [seabios PATCH 0/2] romfile_loader: cope with the UEFI-oriented allocation extensions

2017-06-02 Thread Laszlo Ersek
Please see the parent blurb
<http://mid.mail-archive.com/c76b36de-ebf9-c662-d454-0a95b43901e8@redhat.com>
for a high level description.

Cc: "Kevin O'Connor" <ke...@koconnor.net>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>

Thanks
Laszlo

Laszlo Ersek (2):
  romfile_loader: alloc: cope with the UEFI-oriented NOACPI content hint
  romfile_loader: alloc: cope with the UEFI-oriented 64BIT zone hint

 src/fw/romfile_loader.h | 14 +++---
 src/fw/romfile_loader.c |  6 +-
 2 files changed, 16 insertions(+), 4 deletions(-)

-- 
2.9.3




[SeaBIOS] [seabios PATCH 1/2] romfile_loader: alloc: cope with the UEFI-oriented NOACPI content hint

2017-06-02 Thread Laszlo Ersek
OvmfPkg/AcpiPlatformDxe, which implements the client for QEMU's
linker/loader in the OVMF and ArmVirtQemu virtual UEFI firmwares,
currently relies on a 2nd pass processing of the ADD_POINTER commands, to
identify potential ACPI tables in the pointed-to blobs. The reason for
this is that ACPI tables must be individually passed to
EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable() for installation.

In order to tell apart ACPI tables from other operation region-like areas
within pointed-to blobs, OvmfPkg/AcpiPlatformDxe employs a heuristic
called "ACPI SDT header probe" at the target locations of the ADD_POINTER
commands. While all ACPI tables generated by QEMU satisfy this check
(i.e., there are no false negatives), blob content that is *not* an ACPI
table has a very slight chance to pass the test as well (i.e., there is a
small chance for false positives).

In order to suppress this small chance, in QEMU we've historically
formatted opregion-like areas in blobs with a fixed size zero prefix (see
e.g. "docs/specs/vmgenid.txt"), which guarantees that the probe in
OvmfPkg/AcpiPlatformDxe will fail. However, this "suppressor prefix" has
had to be taken into account explicitly in generated AML code -- the
prefix size has had to be added to the patched integer object in AML, at
runtime --, leading to awkwardness.
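
To make the probe and the zero prefix more concrete, here is a rough sketch. It assumes the probe validates the Length field of the 36-byte ACPI SDT header and the table checksum; the actual check in OvmfPkg/AcpiPlatformDxe may differ in detail, and `sketch_looks_like_sdt` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SDT_HEADER_SIZE 36u  /* size of the common ACPI SDT header */

/*
 * Hypothetical probe: return 1 if the bytes at p could be an ACPI table,
 * 0 otherwise. "remaining" is the number of bytes left in the blob.
 */
static int sketch_looks_like_sdt(const uint8_t *p, size_t remaining)
{
    uint32_t length;
    uint8_t sum = 0;
    size_t i;

    if (remaining < SKETCH_SDT_HEADER_SIZE)
        return 0;

    /* Length is a little-endian 32-bit field at offset 4. */
    length = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
             ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
    if (length < SKETCH_SDT_HEADER_SIZE || length > remaining)
        return 0;

    /* All bytes of the table, checksum included, must sum to zero. */
    for (i = 0; i < length; i++)
        sum += p[i];
    return sum == 0;
}
```

Under these assumptions an all-zero prefix can never pass, because its Length field reads as 0, which is smaller than the header itself.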

QEMU is introducing a new hint for the ALLOC command, as the most
significant bit of the uint8_t "zone" field, for disabling the ACPI SDT
header probe in OvmfPkg/AcpiPlatformDxe, for all the pointers that point
into the blob downloaded with the ALLOC command. When the bit is set, the
blob is guaranteed to contain no ACPI tables. When the bit is clear, the
behavior is left unchanged.

For SeaBIOS, this bit is irrelevant, thus mask it out.
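
A small sketch of how the "zone" byte splits into its two independent parts. The struct and function names are hypothetical; only the 0x80 bit value comes from the patch.

```c
#include <stdint.h>

#define SKETCH_CONTENT_NOACPI 0x80u  /* bit 7 of the zone byte */

struct sketch_alloc_req {
    unsigned zone;     /* zone request with the hint bit cleared */
    int      no_acpi;  /* nonzero if the blob is promised to hold no ACPI tables */
};

/* Hypothetical decoder for the raw uint8_t "zone" field of ALLOCATE. */
static struct sketch_alloc_req sketch_decode_zone(uint8_t raw)
{
    struct sketch_alloc_req req;

    req.zone    = raw & ~SKETCH_CONTENT_NOACPI;
    req.no_acpi = (raw & SKETCH_CONTENT_NOACPI) != 0;
    return req;
}
```

A firmware that has no use for the hint, such as SeaBIOS, would simply ignore `no_acpi` and act on `zone` alone.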

Cc: "Kevin O'Connor" <ke...@koconnor.net>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 src/fw/romfile_loader.h | 7 +++++++
 src/fw/romfile_loader.c | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
index fcd4ab236b61..d90c3db24331 100644
--- a/src/fw/romfile_loader.h
+++ b/src/fw/romfile_loader.h
@@ -12,10 +12,12 @@ struct romfile_loader_entry_s {
 union {
 /*
  * COMMAND_ALLOCATE - allocate a table from @alloc.file
  * subject to @alloc.align alignment (must be power of 2)
  * and @alloc.zone (can be HIGH or FSEG) requirements.
+ * The most significant bit (bit 7) of @alloc.zone is used as a content
+ * hint for UEFI guest firmware, see ROMFILE_LOADER_ALLOC_CONTENT_*.
  *
  * Must appear exactly once for each file, and before
  * this file is referenced by any other command.
  */
 struct {
@@ -82,10 +84,15 @@ enum {
 enum {
 ROMFILE_LOADER_ALLOC_ZONE_HIGH = 0x1,
 ROMFILE_LOADER_ALLOC_ZONE_FSEG = 0x2,
 };
 
+enum {
+ROMFILE_LOADER_ALLOC_CONTENT_MIXED  = 0x00,
+ROMFILE_LOADER_ALLOC_CONTENT_NOACPI = 0x80,
+};
+
 int romfile_loader_execute(const char *name);
 
 void romfile_fw_cfg_resume(void);
 
 #endif
diff --git a/src/fw/romfile_loader.c b/src/fw/romfile_loader.c
index 18476e2075e3..6a457902a36a 100644
--- a/src/fw/romfile_loader.c
+++ b/src/fw/romfile_loader.c
@@ -55,19 +55,22 @@ void romfile_fw_cfg_resume(void)
 
 static void romfile_loader_allocate(struct romfile_loader_entry_s *entry,
 struct romfile_loader_files *files)
 {
 struct zone_s *zone;
+unsigned zone_req;
 struct romfile_loader_file *file = &files->files[files->nfiles];
 void *data;
 int ret;
 unsigned alloc_align = le32_to_cpu(entry->alloc.align);
 
 if (alloc_align & (alloc_align - 1))
 goto err;
 
-switch (entry->alloc.zone) {
+zone_req = entry->alloc.zone;
+zone_req &= ~(unsigned)ROMFILE_LOADER_ALLOC_CONTENT_NOACPI;
+switch (zone_req) {
 case ROMFILE_LOADER_ALLOC_ZONE_HIGH:
 zone = &ZoneHigh;
 break;
 case ROMFILE_LOADER_ALLOC_ZONE_FSEG:
 zone = &ZoneFSeg;
-- 
2.9.3





[SeaBIOS] [qemu PATCH 5/7] hw/acpi/vmgenid: ask the fw to alloc VMGENID_GUID_FW_CFG_FILE as NOACPI

2017-06-02 Thread Laszlo Ersek
The "etc/vmgenid_guid" fw_cfg blob is guaranteed not to contain ACPI
tables, so turning off the ACPI SDT header probe in OVMF is the right
thing to do.

SeaBIOS needs a patch for recognizing (and masking out) the
BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI bit, but its behavior will not
change.

By setting the BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI bit, we can
eliminate the "OVMF SDT Header probe suppressor" -- we can shift the GUID
to offset zero from offset 40 decimal (VMGENID_GUID_OFFSET).

Regarding the allocation zone, we cannot relax that to 64-bit, because the
"VGIA" object, into which the address of "etc/vmgenid_guid" is patched, is
only a DWORD.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---

Notes:
I tested this change extensively,
- by repeating the steps written up in
  <http://mid.mail-archive.com/c052d05e-71a5-1a6a-f34f-17d14167c2f6@redhat.com>,
- by doing the same after S3 suspend/resume,
- by verifying firmware logs,
- by directly checking ACPI content and dmesg in a Linux guest.

I know Ben has a pending patch titled "[PATCH] tests: Add unit tests for
the VM Generation ID feature", that one should be adapted as well
(replace VMGENID_GUID_OFFSET with plain 0). I'd be happy to do that.

 docs/specs/vmgenid.txt| 49 ---
 include/hw/acpi/vmgenid.h |  3 ---
 hw/acpi/vmgenid.c | 21 
 3 files changed, 25 insertions(+), 48 deletions(-)

diff --git a/docs/specs/vmgenid.txt b/docs/specs/vmgenid.txt
index aa9f5186767c..fb9f372edbc8 100644
--- a/docs/specs/vmgenid.txt
+++ b/docs/specs/vmgenid.txt
@@ -63,76 +63,74 @@ Xen) put it in the main descriptor table (Differentiated System Description
 Table or DSDT).  For ease of debugging and implementation, we have decided to
 put it in its own Secondary System Description Table, or SSDT.
 
 The following is a dump of the contents from a running system:
 
-# iasl -p ./SSDT -d /sys/firmware/acpi/tables/SSDT
+# acpidump -n SSDT -b
+# iasl -d ssdt.dat
 
 Intel ACPI Component Architecture
-ASL+ Optimizing Compiler version 20150717-64
-Copyright (c) 2000 - 2015 Intel Corporation
+ASL+ Optimizing Compiler version 20160527-64
+Copyright (c) 2000 - 2016 Intel Corporation
 
-Reading ACPI table from file /sys/firmware/acpi/tables/SSDT - Length
-0198 (0xC6)
+Input file ssdt.dat, Length 0xC6 (198) bytes
 ACPI: SSDT 0x C6 (v01 BOCHS  VMGENID  0001 BXPC
 0001)
-Acpi table [SSDT] successfully installed and loaded
 Pass 1 parse of [SSDT]
 Pass 2 parse of [SSDT]
 Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
 
 Parsing completed
 Disassembly completed
-ASL Output:./SSDT.dsl - 1631 bytes
-# cat SSDT.dsl
+ASL Output:ssdt.dsl - 1559 bytes
+# cat ssdt.dsl
 /*
  * Intel ACPI Component Architecture
- * AML/ASL+ Disassembler version 20150717-64
- * Copyright (c) 2000 - 2015 Intel Corporation
+ * AML/ASL+ Disassembler version 20160527-64
+ * Copyright (c) 2000 - 2016 Intel Corporation
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of /sys/firmware/acpi/tables/SSDT, Sun Feb  5 00:19:37 2017
+ * Disassembly of ssdt.dat, Fri Jun  2 15:29:10 2017
  *
  * Original Table Header:
  * Signature"SSDT"
- * Length   0x00CA (202)
+ * Length   0x00C6 (198)
  * Revision 0x01
- * Checksum 0x4B
+ * Checksum 0x38
  * OEM ID   "BOCHS "
  * OEM Table ID "VMGENID"
  * OEM Revision 0x0001 (1)
  * Compiler ID  "BXPC"
  * Compiler Version 0x0001 (1)
  */
-DefinitionBlock ("/sys/firmware/acpi/tables/SSDT.aml", "SSDT", 1, "BOCHS ",
-"VMGENID", 0x0001)
+DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x0001)
 {
-Name (VGIA, 0x07FFF000)
+Name (VGIA, 0xB000)
 Scope (\_SB)
 {
 Device (VGEN)
 {
 Name (_HID, "QEMUVGID")  // _HID: Hardware ID
 Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
 Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
 Method (_STA, 0, NotSerialized)  // _STA: Status
 {
 Local0 = 0x0F
-If ((VGIA == Zero))
+If (VGIA == Zero)
 {
 Local0 = Zero
  

[SeaBIOS] [qemu PATCH 6/7] hw/i386/acpi-build: ask the fw to alloc ACPI_BUILD_TPMLOG_FILE with 64bit/NOACPI

2017-06-02 Thread Laszlo Ersek
The "etc/tpm/log" fw_cfg blob is guaranteed not to contain ACPI tables, so
turning off the ACPI SDT header probe in OVMF is the right thing to do.

In addition, the address of the blob is patched into the
"TCPA.log_area_start_address" field, which has type "uint64_t". Therefore
we can change the allocation zone to 64-bit as well.
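
The rule of thumb behind this choice can be sketched as a tiny helper: the zone may only be relaxed to 64-bit when the field that receives the blob address is 8 bytes wide. The names below are hypothetical and not part of QEMU.

```c
#include <stddef.h>

enum sketch_zone {
    SKETCH_ZONE_HIGH  = 0x1,  /* 32-bit address space */
    SKETCH_ZONE_FSEG  = 0x2,
    SKETCH_ZONE_64BIT = 0x3,  /* anywhere in 64-bit address space */
};

/*
 * Hypothetical helper: widest zone that is safe for a blob whose address
 * is patched into a pointer field of pointer_size bytes.
 */
static enum sketch_zone sketch_widest_zone(size_t pointer_size)
{
    return pointer_size >= 8 ? SKETCH_ZONE_64BIT : SKETCH_ZONE_HIGH;
}
```

A uint64_t field such as the one above would yield the 64-bit zone, while a DWORD-sized AML object would keep the blob in 32-bit address space.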

SeaBIOS needs a patch for recognizing (and masking out) the
BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI bit, and also for handling
BIOS_LINKER_LOADER_ALLOC_ZONE_64BIT the same as
BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH, but its allocation behavior will not
change.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---

Notes:
I don't know how to test this device, so I didn't. Help from the
device's maintainer would be highly appreciated. Thanks.

 hw/i386/acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 3c4c28c6c2ca..1ec008ec5003 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2285,12 +2285,12 @@ build_tpm_tcpa(GArray *table_data, BIOSLinker *linker, GArray *tcpalog)
 tcpa->platform_class = cpu_to_le16(TPM_TCPA_ACPI_CLASS_CLIENT);
 tcpa->log_area_minimum_length = cpu_to_le32(TPM_LOG_AREA_MINIMUM_SIZE);
 acpi_data_push(tcpalog, le32_to_cpu(tcpa->log_area_minimum_length));
 
 bios_linker_loader_alloc(linker, ACPI_BUILD_TPMLOG_FILE, tcpalog, 1,
- BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH,
- BIOS_LINKER_LOADER_ALLOC_CONTENT_MIXED);
+ BIOS_LINKER_LOADER_ALLOC_ZONE_64BIT,
+ BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI);
 
 /* log area start address to be filled by Guest linker */
 bios_linker_loader_add_pointer(linker,
 ACPI_BUILD_TABLE_FILE, log_addr_offset, log_addr_size,
 ACPI_BUILD_TPMLOG_FILE, 0);
-- 
2.9.3





[SeaBIOS] [qemu PATCH 4/7] hw/acpi/nvdimm: ask the firmware to allocate NVDIMM_DSM_MEM_FILE as NOACPI

2017-06-02 Thread Laszlo Ersek
The "etc/acpi/nvdimm-mem" fw_cfg blob is guaranteed not to contain ACPI
tables, so turning off the ACPI SDT header probe in OVMF is the right
thing to do.

SeaBIOS needs a patch for recognizing (and masking out) the
BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI bit, but its behavior will not
change.

Regarding the allocation zone, we cannot relax that to 64-bit, because the
"MEMA" object (NVDIMM_ACPI_MEM_ADDR), into which the address of
"etc/acpi/nvdimm-mem" is patched, is only a DWORD.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---

Notes:
I don't know how to test this device, so I didn't. Help from the
device's maintainer would be highly appreciated. Thanks.

 hw/acpi/nvdimm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 81bd0214fb3e..34b9a0f39a02 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -1263,11 +1263,11 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
 
 bios_linker_loader_alloc(linker,
  NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
  sizeof(NvdimmDsmIn),
  BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH,
- BIOS_LINKER_LOADER_ALLOC_CONTENT_MIXED);
+ BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI);
 bios_linker_loader_add_pointer(linker,
 ACPI_BUILD_TABLE_FILE, mem_addr_offset, sizeof(uint32_t),
 NVDIMM_DSM_MEM_FILE, 0);
 build_header(linker, table_data,
 (void *)(table_data->data + nvdimm_ssdt),
-- 
2.9.3





[SeaBIOS] [qemu PATCH 0/7] bios-linker-loader: introduce the NOACPI hint and the 64-bit zone for ALLOCATE

2017-06-02 Thread Laszlo Ersek
Please see the parent blurb
<http://mid.mail-archive.com/c76b36de-ebf9-c662-d454-0a95b43901e8@redhat.com>
for a high level description.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>

Thanks
Laszlo

Laszlo Ersek (7):
  hw/acpi/bios-linker-loader: expose allocation zone as an enum
  hw/acpi/bios-linker-loader: introduce "no ACPI tables" content hint
for ALLOC
  hw/acpi/bios-linker-loader: introduce
BIOS_LINKER_LOADER_ALLOC_ZONE_64BIT
  hw/acpi/nvdimm: ask the firmware to allocate NVDIMM_DSM_MEM_FILE as
NOACPI
  hw/acpi/vmgenid: ask the fw to alloc VMGENID_GUID_FW_CFG_FILE as
NOACPI
  hw/i386/acpi-build: ask the fw to alloc ACPI_BUILD_TPMLOG_FILE with
64bit/NOACPI
  hw/arm/virt-acpi-build: make the fw alloc blobs with ACPI tables as
64bit

 docs/specs/vmgenid.txt   | 49 
 include/hw/acpi/bios-linker-loader.h | 22 +++-
 include/hw/acpi/vmgenid.h|  3 ---
 hw/acpi/bios-linker-loader.c | 18 ++---
 hw/acpi/nvdimm.c |  4 ++-
 hw/acpi/vmgenid.c| 25 ++
 hw/arm/virt-acpi-build.c |  6 +++--
 hw/i386/acpi-build.c |  9 ---
 8 files changed, 70 insertions(+), 66 deletions(-)

-- 
2.9.3




[SeaBIOS] [qemu PATCH 1/7] hw/acpi/bios-linker-loader: expose allocation zone as an enum

2017-06-02 Thread Laszlo Ersek
In a later patch, we'll introduce another allocation zone (which won't fit
in the "alloc_fseg" bool). For now, just move the enum constants from
"bios-linker-loader.c" to "bios-linker-loader.h", and update the
bios_linker_loader_alloc() function prototype so that callers can directly
pass in the enumeration constants.

This is all the more justified because at the bios_linker_loader_alloc()
call sites, the true/false arguments passed in to the current "alloc_fseg"
boolean parameter are always accompanied by a textual comment that spells
out the actual zone. So this patch improves clarity in itself.
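
As a rough illustration of the interface change, the helper below shows how a call site's old boolean argument would translate to the new enumeration. It is a hypothetical migration aid, not code from this patch; the enum values mirror the two zones that exist at this point in the series.

```c
#include <stdbool.h>

enum sketch_alloc_zone {
    SKETCH_ALLOC_ZONE_HIGH = 0x1,  /* 32-bit memory */
    SKETCH_ALLOC_ZONE_FSEG = 0x2,  /* FSEG zone, e.g. for the RSDP */
};

/* Hypothetical mapping from the old "alloc_fseg" bool to the zone enum. */
static enum sketch_alloc_zone sketch_zone_from_bool(bool alloc_fseg)
{
    return alloc_fseg ? SKETCH_ALLOC_ZONE_FSEG : SKETCH_ALLOC_ZONE_HIGH;
}
```

A boolean can only ever distinguish two zones, which is why a third zone needs the enum in the prototype.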

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 include/hw/acpi/bios-linker-loader.h | 10 +-
 hw/acpi/bios-linker-loader.c | 14 --
 hw/acpi/nvdimm.c |  3 ++-
 hw/acpi/vmgenid.c|  5 +++--
 hw/arm/virt-acpi-build.c |  4 ++--
 hw/i386/acpi-build.c |  6 +++---
 6 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/include/hw/acpi/bios-linker-loader.h b/include/hw/acpi/bios-linker-loader.h
index efe17b0b9cb0..8d55f1fab32b 100644
--- a/include/hw/acpi/bios-linker-loader.h
+++ b/include/hw/acpi/bios-linker-loader.h
@@ -5,17 +5,25 @@
 typedef struct BIOSLinker {
 GArray *cmd_blob;
 GArray *file_list;
 } BIOSLinker;
 
+typedef enum BIOSLinkerLoaderAllocZone {
+/* request blob allocation in 32-bit memory */
+BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH = 0x1,
+
+/* request blob allocation in FSEG zone (useful for the RSDP ACPI table) */
+BIOS_LINKER_LOADER_ALLOC_ZONE_FSEG = 0x2,
+} BIOSLinkerLoaderAllocZone;
+
 BIOSLinker *bios_linker_loader_init(void);
 
 void bios_linker_loader_alloc(BIOSLinker *linker,
   const char *file_name,
   GArray *file_blob,
   uint32_t alloc_align,
-  bool alloc_fseg);
+  BIOSLinkerLoaderAllocZone zone);
 
 void bios_linker_loader_add_checksum(BIOSLinker *linker, const char *file,
  unsigned start_offset, unsigned size,
  unsigned checksum_offset);
 
diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
index 046183a0f142..9754d98e7345 100644
--- a/hw/acpi/bios-linker-loader.c
+++ b/hw/acpi/bios-linker-loader.c
@@ -38,11 +38,11 @@ struct BiosLinkerLoaderEntry {
 uint32_t command;
 union {
 /*
  * COMMAND_ALLOCATE - allocate a table from @alloc.file
  * subject to @alloc.align alignment (must be power of 2)
- * and @alloc.zone (can be HIGH or FSEG) requirements.
+ * and @alloc.zone (see BIOSLinkerLoaderAllocZone) requirements.
  *
  * Must appear exactly once for each file, and before
  * this file is referenced by any other command.
  */
 struct {
@@ -104,15 +104,10 @@ enum {
 BIOS_LINKER_LOADER_COMMAND_ADD_POINTER   = 0x2,
 BIOS_LINKER_LOADER_COMMAND_ADD_CHECKSUM  = 0x3,
 BIOS_LINKER_LOADER_COMMAND_WRITE_POINTER = 0x4,
 };
 
-enum {
-BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH = 0x1,
-BIOS_LINKER_LOADER_ALLOC_ZONE_FSEG = 0x2,
-};
-
 /*
  * BiosLinkerFileEntry:
  *
  * An internal type used for book-keeping file entries
  */
@@ -173,19 +168,19 @@ bios_linker_find_file(const BIOSLinker *linker, const char *name)
  *
  * @linker: linker object instance
  * @file_name: name of the file blob to be loaded
  * @file_blob: pointer to blob corresponding to @file_name
  * @alloc_align: required minimal alignment in bytes. Must be a power of 2.
- * @alloc_fseg: request allocation in FSEG zone (useful for the RSDP ACPI table)
+ * @zone: request allocation in this zone
  *
  * Note: this command must precede any other linker command using this file.
  */
 void bios_linker_loader_alloc(BIOSLinker *linker,
   const char *file_name,
   GArray *file_blob,
   uint32_t alloc_align,
-  bool alloc_fseg)
+  BIOSLinkerLoaderAllocZone zone)
 {
 BiosLinkerLoaderEntry entry;
 BiosLinkerFileEntry file = { g_strdup(file_name), file_blob};
 
 assert(!(alloc_align & (alloc_align - 1)));
@@ -195,12 +190,11 @@ void bios_linker_loader_alloc(BIOSLinker *linker,
 
 memset(&entry, 0, sizeof entry);
 strncpy(entry.alloc.file, file_name, sizeof entry.al

[SeaBIOS] [qemu PATCH 2/7] hw/acpi/bios-linker-loader: introduce "no ACPI tables" content hint for ALLOC

2017-06-02 Thread Laszlo Ersek
OvmfPkg/AcpiPlatformDxe, which implements the client for QEMU's
linker/loader in the OVMF and ArmVirtQemu virtual UEFI firmwares,
currently relies on a 2nd pass processing of the ADD_POINTER commands, to
identify potential ACPI tables in the pointed-to blobs. The reason for
this is that ACPI tables must be individually passed to
EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable() for installation.

In order to tell apart ACPI tables from other operation region-like areas
within pointed-to blobs, OvmfPkg/AcpiPlatformDxe employs a heuristic
called "ACPI SDT header probe" at the target locations of the ADD_POINTER
commands. While all ACPI tables generated by QEMU satisfy this check
(i.e., there are no false negatives), blob content that is *not* an ACPI
table has a very slight chance to pass the test as well (i.e., there is a
small chance for false positives).

In order to suppress this small chance, we've historically formatted
opregion-like areas in blobs with a fixed size zero prefix (see e.g.
"docs/specs/vmgenid.txt"), which guarantees that the probe in
OvmfPkg/AcpiPlatformDxe will fail. However, this "suppressor prefix" has
had to be taken into account explicitly in generated AML code -- the
prefix size has had to be added to the patched integer object in AML, at
runtime --, leading to awkwardness.

Introduce a new hint for the ALLOC command, as the most significant bit of
the uint8_t "zone" field, for disabling the ACPI SDT header probe in
OvmfPkg/AcpiPlatformDxe, for all the pointers that point into the blob
downloaded with the ALLOC command. When the bit is set, the blob is
guaranteed to contain no ACPI tables. When the bit is clear, the behavior
is left unchanged.

In this initial patch, all bios_linker_loader_alloc() invocations are left
with intact behavior.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 include/hw/acpi/bios-linker-loader.h | 11 ++-
 hw/acpi/bios-linker-loader.c |  8 ++--
 hw/acpi/nvdimm.c |  3 ++-
 hw/acpi/vmgenid.c|  3 ++-
 hw/arm/virt-acpi-build.c |  6 --
 hw/i386/acpi-build.c |  9 ++---
 6 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/include/hw/acpi/bios-linker-loader.h b/include/hw/acpi/bios-linker-loader.h
index 8d55f1fab32b..5202fd14977d 100644
--- a/include/hw/acpi/bios-linker-loader.h
+++ b/include/hw/acpi/bios-linker-loader.h
@@ -13,17 +13,26 @@ typedef enum BIOSLinkerLoaderAllocZone {
 
 /* request blob allocation in FSEG zone (useful for the RSDP ACPI table) */
 BIOS_LINKER_LOADER_ALLOC_ZONE_FSEG = 0x2,
 } BIOSLinkerLoaderAllocZone;
 
+typedef enum BIOSLinkerLoaderAllocContent {
+/* the blob may or may not contain ACPI tables */
+BIOS_LINKER_LOADER_ALLOC_CONTENT_MIXED = 0x00,
+
+/* the blob is guaranteed not to contain ACPI tables */
+BIOS_LINKER_LOADER_ALLOC_CONTENT_NOACPI = 0x80,
+} BIOSLinkerLoaderAllocContent;
+
 BIOSLinker *bios_linker_loader_init(void);
 
 void bios_linker_loader_alloc(BIOSLinker *linker,
   const char *file_name,
   GArray *file_blob,
   uint32_t alloc_align,
-  BIOSLinkerLoaderAllocZone zone);
+  BIOSLinkerLoaderAllocZone zone,
+  BIOSLinkerLoaderAllocContent content);
 
 void bios_linker_loader_add_checksum(BIOSLinker *linker, const char *file,
  unsigned start_offset, unsigned size,
  unsigned checksum_offset);
 
diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
index 9754d98e7345..4ad9260fe72d 100644
--- a/hw/acpi/bios-linker-loader.c
+++ b/hw/acpi/bios-linker-loader.c
@@ -39,10 +39,12 @@ struct BiosLinkerLoaderEntry {
 union {
 /*
  * COMMAND_ALLOCATE - allocate a table from @alloc.file
  * subject to @alloc.align alignment (must be power of 2)
  * and @alloc.zone (see BIOSLinkerLoaderAllocZone) requirements.
+ * The most significant bit (bit 7) of @alloc.zone is used as a content
+ * hint for UEFI guest firmware, see BIOSLinkerLoaderAllocContent.
  *
  * Must appear exactly once for each file, and before
  * this file is referenced by any other command.
  */
 struct {
@@ -169,18 +171,20 @@ bios_linker_find_file(const BIOSLinker *linker, const char *name)
  * @linker: linker object instance
  * @f

[SeaBIOS] [qemu PATCH 7/7] hw/arm/virt-acpi-build: make the fw alloc blobs with ACPI tables as 64bit

2017-06-02 Thread Laszlo Ersek
Thanks to commit cb51ac2ffe36 ("hw/arm/virt: generate 64-bit addressable
ACPI objects", 2017-04-10), all pointer fields in the ACPI tables in the
"etc/acpi/rsdp" (ACPI_BUILD_RSDP_FILE) and "etc/acpi/tables"
(ACPI_BUILD_TABLE_FILE) fw_cfg blobs are 64-bit wide.

Therefore we can allow the guest firmware to allocate these blobs from
64-bit address space.

Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Ben Warren <b...@skyportsystems.com>
Cc: Dongjiu Geng <gengdong...@huawei.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: Shannon Zhao <zhaoshengl...@huawei.com>
Cc: Stefan Berger <stef...@linux.vnet.ibm.com>
Cc: Xiao Guangrong <guangrong.x...@linux.intel.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---

Notes:
I verified this change with firmware logs and a Linux guest's dmesg.

 hw/arm/virt-acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1c20b851a611..8648d89decb7 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -370,11 +370,11 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
 unsigned xsdt_pa_size = sizeof(rsdp->xsdt_physical_address);
 unsigned xsdt_pa_offset =
 (char *)&rsdp->xsdt_physical_address - rsdp_table->data;
 
 bios_linker_loader_alloc(linker, ACPI_BUILD_RSDP_FILE, rsdp_table, 16,
- BIOS_LINKER_LOADER_ALLOC_ZONE_FSEG,
+ BIOS_LINKER_LOADER_ALLOC_ZONE_64BIT,
  BIOS_LINKER_LOADER_ALLOC_CONTENT_MIXED);
 
 memcpy(&rsdp->signature, "RSD PTR ", sizeof(rsdp->signature));
 memcpy(rsdp->oem_id, ACPI_BUILD_APPNAME6, sizeof(rsdp->oem_id));
 rsdp->length = cpu_to_le32(sizeof(*rsdp));
@@ -750,11 +750,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 table_offsets = g_array_new(false, true /* clear */,
 sizeof(uint32_t));
 
 bios_linker_loader_alloc(tables->linker,
  ACPI_BUILD_TABLE_FILE, tables_blob,
- 64, BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH,
+ 64, BIOS_LINKER_LOADER_ALLOC_ZONE_64BIT,
  BIOS_LINKER_LOADER_ALLOC_CONTENT_MIXED);
 
 /* DSDT is pointed to by FADT */
 dsdt = tables_blob->len;
 build_dsdt(tables_blob, tables->linker, vms);
-- 
2.9.3




[SeaBIOS] allocation zone extensions for the firmware linker/loader

2017-06-02 Thread Laszlo Ersek
Hi,

this message is cross-posted to three lists (qemu, seabios, edk2). I'll
follow up with three patch series, one series for each project. I'll
cross-post all of the patches as well, but I'll add the project name in
the "bag of tags" in the subject lines.

The QEMU series introduces two extensions to the ALLOCATE firmware
linker/loader command.

One extension is a new allocation zone, with value 3, for allowing the
firmware to allocate the fw_cfg blobs in 64-bit address space.

The other extension is a repurposing of the most significant bit (bit 7)
in the zone field. This bit becomes orthogonal to the rest of the zone
field. If the bit is set, it means that QEMU promises the firmware that
the blob referenced by the ALLOCATE command contains no ACPI tables at all.
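
Put together, the two extensions share the single "zone" byte of the ALLOCATE command. A minimal sketch of the encoding, assuming the two parts are simply OR-ed together; the function name is hypothetical and the constants restate the values described above.

```c
#include <stdint.h>

enum {
    SKETCH_ZONE_HIGH  = 0x1,
    SKETCH_ZONE_FSEG  = 0x2,
    SKETCH_ZONE_64BIT = 0x3,   /* new allocation zone */
};

enum {
    SKETCH_CONTENT_MIXED  = 0x00,
    SKETCH_CONTENT_NOACPI = 0x80,  /* bit 7: blob contains no ACPI tables */
};

/* Hypothetical encoder: combine zone and content hint into one byte. */
static uint8_t sketch_encode_zone(unsigned zone, unsigned content)
{
    return (uint8_t)(zone | content);
}
```

For instance, a blob that may live anywhere in 64-bit address space and holds no ACPI tables would be encoded as 0x83 under this scheme.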

After introducing these, the QEMU series puts them to use, covering all
of the currently generated ALLOCATE commands, as appropriate. Among the
benefits we can mention
- the removal of the OVMF ACPI SDT Header Probe suppressor from VMGENID
(and from any similar future devices),
- and the fact that the "virt" machine type (and maybe other machine
types) of the arm/aarch64 target will no longer require RAM under 4GB
for ACPI to work.

Both of these extensions are irrelevant for SeaBIOS, therefore the
SeaBIOS patches simply mask out bit 7 (for ignoring the "no ACPI
content" hint), and fall back to the HIGH zone (= 32-bit address space)
when the 64-bit zone is permitted.

In other words, SeaBIOS needs some patches to recognize the new zone
values, but beyond that, the behavior is unchanged.

Both extensions are important for virtual UEFI firmware (OVMF in x86
guests and ArmVirtQemu in aarch64 guests). The edk2 patches add support
to OvmfPkg/AcpiPlatformDxe for the extensions. Please see the commit
messages for details (all the extensions are explained in detail in the
relevant patches for all of the projects).

The patches can cause linker/loader breakage when old firmware is booted
on new QEMU. However, that's no problem (it's nothing new), the next
release of QEMU should bundle the new firmware binaries as always.

New firmware will continue running on old QEMU without issues.

(In case you have sent me emails about this in the last few tens of
hours, please know that I'm not ignoring them, I just haven't seen /
read them. Reading emails every five minutes makes focused work
impossible, so when I'm busy, I tend to read email once per day.)

Thanks
Laszlo



Re: [SeaBIOS] [PATCH v3] config: Add function to check if fw_cfg exists

2017-03-29 Thread Laszlo Ersek
On 03/28/17 23:03, Petr Berky wrote:
> It was found qemu_get_present_cpus_count may return impossible
> number of cpus because of not checking if fw_cfg exists before
> using it. That may lead to undefined behavior of emulator,
> in particular Bochs that freezes.
> 
> Signed-off-by: Petr Berky <petr.be...@email.cz>
> ---
>  src/fw/paravirt.c | 12 +++-
>  src/fw/paravirt.h |  1 +
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 707502d..5b23d78 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -32,9 +32,16 @@ u32 RamSize;
>  u64 RamSizeOver4G;
>  // Type of emulator platform.
>  int PlatformRunningOn VARFSEG;
> +// cfg enabled
> +int cfg_enabled = 0;
>  // cfg_dma enabled
>  int cfg_dma_enabled = 0;
>  
> +inline int qemu_cfg_enabled(void)
> +{
> +return cfg_enabled;
> +}
> +
>  inline int qemu_cfg_dma_enabled(void)
>  {
>  return cfg_dma_enabled;
> @@ -392,7 +399,9 @@ u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> -qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS, sizeof(smp_count));
> +if (qemu_cfg_enabled()) {
> +qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS, sizeof(smp_count));
> +}
>  u16 cmos_cpu_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
>  if (smp_count < cmos_cpu_count) {
>  smp_count = cmos_cpu_count;
> @@ -571,6 +580,7 @@ void qemu_cfg_init(void)
>  return;
>  
>  dprintf(1, "Found QEMU fw_cfg\n");
> +cfg_enabled = 1;
>  
>  // Detect DMA interface.
>  u32 id;
> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index 16f3d9a..a14d83e 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -49,6 +49,7 @@ static inline int runningOnKVM(void) {
>  // QEMU_CFG_DMA ID bit
>  #define QEMU_CFG_VERSION_DMA 2
>  
> +int qemu_cfg_enabled(void);
>  int qemu_cfg_dma_enabled(void);
>  void qemu_preinit(void);
>  void qemu_platform_setup(void);
> 

Reviewed-by: Laszlo Ersek <ler...@redhat.com>



Re: [SeaBIOS] [PATCH v2] config: Add function to check if fw_cfg exists

2017-03-14 Thread Laszlo Ersek
On 03/15/17 00:09, Petr Berky wrote:
> From b06589c683a7defb4853a3b810bd7e6a12abe2d6 Mon Sep 17 00:00:00 2001
> From: Petr Berky <petr.be...@email.cz>
> Date: Tue, 14 Mar 2017 23:32:15 +0100
> Subject: [PATCH v2] config: Add function to check if fw_cfg exists
> 
> It was found qemu_get_present_cpus_count may return impossible
> number of cpus because of not checking if fw_cfg exists before
> using it. That  may lead to undefined behavior of emulator,
> in particular Bochs that freezes.
> 
> Signed-off-by: Petr Berky <petr.be...@email.cz>
> ---
>  src/fw/paravirt.c | 12 +++-
>  src/fw/paravirt.h |  1 +
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 707502d..dfc69d4 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -32,9 +32,16 @@ u32 RamSize;
>  u64 RamSizeOver4G;
>  // Type of emulator platform.
>  int PlatformRunningOn VARFSEG;
> +// cfg enabled
> +int cfg_enabled = 0;
>  // cfg_dma enabled
>  int cfg_dma_enabled = 0;
> 
> +inline int qemu_cfg_enabled(void)
> +{
> +return cfg_enabled;
> +}
> +
>  inline int qemu_cfg_dma_enabled(void)
>  {
>  return cfg_dma_enabled;
> @@ -392,7 +399,9 @@ u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> -qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS, sizeof(smp_count));
> +if (qemu_cfg_enabled()) {
> +qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS,
> sizeof(smp_count));
> +}
>  u16 cmos_cpu_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
>  if (smp_count < cmos_cpu_count) {
>  smp_count = cmos_cpu_count;
> @@ -570,6 +579,7 @@ void qemu_cfg_init(void)
>  if (inb(PORT_QEMU_CFG_DATA) != sig[i])
>  return;
> 
> +cfg_enabled = 1;
>  dprintf(1, "Found QEMU fw_cfg\n");
> 
>  // Detect DMA interface.

If we wanted to parallel the DMA check 100%, we'd set the variable under
the debug message, not above it, but even I am not that pedantic. :)

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Igor, can you check if this is safe for S3 resume too? I think it is,
but I had better ask you.

Thanks
Laszlo


> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index 16f3d9a..a14d83e 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -49,6 +49,7 @@ static inline int runningOnKVM(void) {
>  // QEMU_CFG_DMA ID bit
>  #define QEMU_CFG_VERSION_DMA 2
> 
> +int qemu_cfg_enabled(void);
>  int qemu_cfg_dma_enabled(void);
>  void qemu_preinit(void);
>  void qemu_platform_setup(void);




Re: [SeaBIOS] [PATCH] config: Add function to check if fw_cfg exists

2017-03-14 Thread Laszlo Ersek
On 03/14/17 21:33, petr.be...@email.cz wrote:
> From 405de6e571a2bf332452a17ae98f7b3a0613365e Mon Sep 17 00:00:00 2001
> From: Petr Berky 
> Date: Tue, 14 Mar 2017 20:30:52 +0100
> Subject: [PATCH] config: Add function to check if fw_cfg exists
> 
> It was found qemu_get_present_cpus_count may return impossible
> number of cpus because of not checking if fw_cfg exists before
> using it. That  may lead to undefined behavior of emulator,
> in particular Bochs that freezes.
> 
> Signed-off-by: Petr Berky 
> ---
>  src/fw/paravirt.c | 28 +---
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 707502d..b2cfc23 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -220,6 +220,21 @@ qemu_cfg_select(u16 f)
>  outw(f, PORT_QEMU_CFG_CTL);
>  }
>  
> +static int
> +qemu_cfg_check_signature(void)
> +{
> +int i;
> +char *sig = "QEMU";
> +
> +qemu_cfg_select(QEMU_CFG_SIGNATURE);
> +for (i = 0; i < 4; i++) {
> +if (inb(PORT_QEMU_CFG_DATA) != sig[i]) {
> +return -1;
> +}
> +}
> +return 0;
> +}
> +
>  static void
>  qemu_cfg_dma_transfer(void *address, u32 length, u32 control)
>  {
> @@ -392,7 +407,9 @@ u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> -qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS, sizeof(smp_count));
> +if (qemu_cfg_check_signature() == 0) {
> +qemu_cfg_read_entry(&smp_count, QEMU_CFG_NB_CPUS, sizeof(smp_count));
> +}
>  u16 cmos_cpu_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
>  if (smp_count < cmos_cpu_count) {
>  smp_count = cmos_cpu_count;
> @@ -563,12 +580,9 @@ void qemu_cfg_init(void)
>  return;
>  
>  // Detect fw_cfg interface.
> -qemu_cfg_select(QEMU_CFG_SIGNATURE);
> -char *sig = "QEMU";
> -int i;
> -for (i = 0; i < 4; i++)
> -if (inb(PORT_QEMU_CFG_DATA) != sig[i])
> -return;
> +if (qemu_cfg_check_signature() != 0) {
> +return;
> +}
>  
>  dprintf(1, "Found QEMU fw_cfg\n");
>  
> 

"src/fw/paravirt.c" already has an extern function called
qemu_cfg_dma_enabled(), which is based on the static global variable
"cfg_dma_enabled", which is set in qemu_cfg_init().

The above is about the DMA interface for fw_cfg. I think it would be a
"natural" extension to add a similar global variable and helper function
(called qemu_cfg_enabled()) for the basic fw_cfg presence as well.

Then qemu_cfg_check_signature() would not be necessary in this patch.

Just an idea of course.
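
For illustration, the pattern I have in mind can be sketched as below.
This is a standalone mock, not SeaBIOS code: the "fake_*" variables,
mock_cfg_init() and present_cpus_count() are hypothetical stand-ins for
the port I/O, qemu_cfg_init() and qemu_get_present_cpus_count(); only the
flag plus accessor pairing mirrors what "src/fw/paravirt.c" already does
for DMA.

```c
#include <stdint.h>

/* Presence flag, set once by the init routine (mirrors cfg_dma_enabled). */
static int cfg_enabled;

/* Hypothetical stand-ins for the fw_cfg and CMOS hardware reads. */
static uint16_t fake_fw_cfg_cpus;
static uint8_t fake_cmos_count;

int qemu_cfg_enabled(void)
{
    return cfg_enabled;
}

/* Mock probe: the real code reads four bytes from the fw_cfg data port. */
void mock_cfg_init(const char *sig_read_back)
{
    const char *sig = "QEMU";
    for (int i = 0; i < 4; i++)
        if (sig_read_back[i] != sig[i])
            return;             /* no fw_cfg: leave the flag cleared */
    cfg_enabled = 1;
}

/* Only consult fw_cfg when it was detected; otherwise rely on CMOS. */
uint16_t present_cpus_count(void)
{
    uint16_t smp_count = 0;
    if (qemu_cfg_enabled())
        smp_count = fake_fw_cfg_cpus;
    uint16_t cmos_cpu_count = (uint16_t)(fake_cmos_count + 1);
    if (smp_count < cmos_cpu_count)
        smp_count = cmos_cpu_count;
    return smp_count;
}
```

The signature is checked exactly once, and every later caller just tests
the flag, so no second probe of the fw_cfg ports is needed.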

Laszlo



Re: [SeaBIOS] [PATCH 0/9] add support for generic lun enumeration

2017-03-02 Thread Laszlo Ersek
On 03/02/17 20:48, Roman Kagan wrote:
> On Thu, Mar 02, 2017 at 06:20:29PM +0100, Laszlo Ersek wrote:
>> On 03/01/17 11:45, Roman Kagan wrote:
>>> A number of SCSI drivers currently only see luns #0 in their targets.
>>>
>>> This may be a problem when drives have to be assigned bigger lun
>>> numbers, e.g. because the storage controllers don't provide enough
> >>> target numbers to accommodate all drives.
>>> (In particular, I'm about to submit a driver for Hyper-V VMBus SCSI
>>> controller which is limited to 2 targets only).
>>
>> How do you run SeaBIOS in Hyper-V guests?
> 
> We run it in QEMU with Hyper-V VMBus paravirtual storage controller.
> It's not in upstream QEMU yet, we're hammering it out and about to
> submit it soonish.  (I spoke about this project at the KVM Forum last
> summer).

Yes. I just wanted to make sure it was the same thing.

Thanks
Laszlo



Re: [SeaBIOS] [PATCH 0/9] add support for generic lun enumeration

2017-03-02 Thread Laszlo Ersek
On 03/01/17 11:45, Roman Kagan wrote:
> A number of SCSI drivers currently only see luns #0 in their targets.
> 
> This may be a problem when drives have to be assigned bigger lun
> numbers, e.g. because the storage controllers don't provide enough
> target numbers to accommodate all drives.
> (In particular, I'm about to submit a driver for Hyper-V VMBus SCSI
> controller which is limited to 2 targets only).

How do you run SeaBIOS in Hyper-V guests?

Thanks
Laszlo

> 
> This series adds generic SCSI lun enumeration (either via REPORT LUNS
> command or sequentially trying every lun), and makes the respective
> drivers use it.
> 
> Note that the series has only been minimally tested against a recent QEMU.
> 
> Roman Kagan (9):
>   blockcmd: accept only disks and CD-ROMs
>   blockcmd: generic SCSI luns enumeration
>   virtio-scsi: enumerate luns with REPORT LUNS
>   esp-scsi: enumerate luns with REPORT LUNS
>   usb-uas: enumerate luns with REPORT LUNS
>   pvscsi: fix the comment about lun enumeration
>   mpt-scsi: try to enumerate luns with REPORT LUNS
>   lsi-scsi: reset in case of a serious problem
>   lsi-scsi: try to enumerate luns with REPORT LUNS
> 
>  src/hw/blockcmd.h|  4 +++
>  src/hw/blockcmd.c| 96 
> 
>  src/hw/esp-scsi.c| 35 +--
>  src/hw/lsi-scsi.c| 39 +++--
>  src/hw/mpt-scsi.c| 40 ++
>  src/hw/pvscsi.c  |  2 +-
>  src/hw/usb-uas.c | 45 +++-
>  src/hw/virtio-scsi.c | 38 ++---
>  8 files changed, 235 insertions(+), 64 deletions(-)
> 




Re: [SeaBIOS] [PATCH v7 4/5] QEMU fw_cfg: Add functions for accessing files by key

2017-02-21 Thread Laszlo Ersek
On 02/21/17 04:56, b...@skyportsystems.com wrote:
> From: Ben Warren <b...@skyportsystems.com>
> 
> Due to memory constraints, when resuming from S3 the fw_cfg "files" API
> isn't available.  This adds a simple API to get a file 'key', and to
> write to the file using the key as a reference.
> 
> Signed-off-by: Ben Warren <b...@skyportsystems.com>
> Reviewed-by: Igor Mammedov <imamm...@redhat.com>
> Reviewed-by: Laszlo Ersek <ler...@redhat.com>
> ---
>  src/fw/paravirt.c | 41 ++---
>  src/fw/paravirt.h |  2 ++
>  2 files changed, 32 insertions(+), 11 deletions(-)

Yep, looks good, my R-b stands.

Thanks,
Laszlo

> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 4618647..707502d 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -329,6 +329,22 @@ qemu_cfg_read_file(struct romfile_s *file, void *dst, 
> u32 maxlen)
>  return file->size;
>  }
>  
> +// Bare-bones function for writing a file knowing only its unique
> +// identifying key (select)
> +int
> +qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len)
> +{
> +if (offset == 0) {
> +/* Do it in one transfer */
> +qemu_cfg_write_entry(src, key, len);
> +} else {
> +qemu_cfg_select(key);
> +qemu_cfg_skip(offset);
> +qemu_cfg_write(src, len);
> +}
> +return len;
> +}
> +
>  int
>  qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 len)
>  {
> @@ -339,17 +355,8 @@ qemu_cfg_write_file(void *src, struct romfile_s *file, 
> u32 offset, u32 len)
>  warn_internalerror();
>  return -1;
>  }
> -struct qemu_romfile_s *qfile;
> -qfile = container_of(file, struct qemu_romfile_s, file);
> -if (offset == 0) {
> -/* Do it in one transfer */
> -qemu_cfg_write_entry(src, qfile->select, len);
> -} else {
> -qemu_cfg_select(qfile->select);
> -qemu_cfg_skip(offset);
> -qemu_cfg_write(src, len);
> -}
> -return len;
> +return qemu_cfg_write_file_simple(src, qemu_get_romfile_key(file),
> +  offset, len);
>  }
>  
>  static void
> @@ -370,6 +377,18 @@ qemu_romfile_add(char *name, int select, int skip, int 
> size)
>  }
>  
>  u16
> +qemu_get_romfile_key(struct romfile_s *file)
> +{
> +struct qemu_romfile_s *qfile;
> +if (file->copy != qemu_cfg_read_file) {
> +warn_internalerror();
> +return 0;
> +}
> +qfile = container_of(file, struct qemu_romfile_s, file);
> +return qfile->select;
> +}
> +
> +u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index fb220d8..16f3d9a 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -56,5 +56,7 @@ void qemu_cfg_init(void);
>  
>  u16 qemu_get_present_cpus_count(void);
>  int qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 
> len);
> +int qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len);
> +u16 qemu_get_romfile_key(struct romfile_s *file);
>  
>  #endif
> 




Re: [SeaBIOS] [PATCH v6 5/5] QEMU fw_cfg: Write fw_cfg back on S3 resume

2017-02-20 Thread Laszlo Ersek
On 02/20/17 21:14, b...@skyportsystems.com wrote:
> From: Ben Warren <b...@skyportsystems.com>
> 
> Any pointers to BIOS-allocated memory that were written back to QEMU
> fw_cfg files are replayed when resuming from S3 sleep.
> 
> Signed-off-by: Ben Warren <b...@skyportsystems.com>
> ---
>  src/fw/romfile_loader.c | 33 +
>  src/fw/romfile_loader.h |  2 ++
>  src/resume.c|  4 
>  3 files changed, 39 insertions(+)
> 
> diff --git a/src/fw/romfile_loader.c b/src/fw/romfile_loader.c
> index 30e7b58..14bc908 100644
> --- a/src/fw/romfile_loader.c
> +++ b/src/fw/romfile_loader.c
> @@ -4,6 +4,7 @@
>  #include "string.h" // strcmp
>  #include "romfile.h" // struct romfile_s
>  #include "malloc.h" // Zone*, _malloc
> +#include "list.h" // struct hlist_node
>  #include "output.h" // warn_*
>  #include "paravirt.h" // qemu_cfg_write_file
>  
> @@ -16,6 +17,16 @@ struct romfile_loader_files {
>  struct romfile_loader_file files[];
>  };
>  
> +// Data structures for storing "write pointer" entries for possible replay
> +struct romfile_wr_pointer_entry {
> +u64 pointer;
> +u32 offset;
> +u16 key;
> +u8 ptr_size;
> +struct hlist_node node;
> +};
> +static struct hlist_head romfile_pointer_list;
> +
>  static struct romfile_loader_file *
>  romfile_loader_find(const char *name,
>  struct romfile_loader_files *files)
> @@ -29,6 +40,19 @@ romfile_loader_find(const char *name,
>  return NULL;
>  }
>  
> +// Replay "write pointer" entries back to QEMU
> +void romfile_fw_cfg_resume(void)
> +{
> +if (!CONFIG_QEMU)
> +return;
> +
> +struct romfile_wr_pointer_entry *entry;
> +hlist_for_each_entry(entry, &romfile_pointer_list, node) {
> +qemu_cfg_write_file_simple(&entry->pointer, entry->key,
> +   entry->offset, entry->ptr_size);
> +}
> +}
> +
>  static void romfile_loader_allocate(struct romfile_loader_entry_s *entry,
>  struct romfile_loader_files *files)
>  {
> @@ -163,6 +187,15 @@ static void romfile_loader_write_pointer(struct 
> romfile_loader_entry_s *entry,
>  entry->wr_pointer.size) != 
> entry->wr_pointer.size) {
>  goto err;
>  }
> +
> +/* Store the info so it can replayed later if necessary */
> +struct romfile_wr_pointer_entry *store = malloc_high(sizeof(*store));
> +store->pointer = pointer;
> +store->key = qemu_get_romfile_key(dest_file);

I suggested to remove the error checking here, because
qemu_get_romfile_key() couldn't fail.

However, after Kevin's suggestion for qemu_get_romfile_key(), i.e., to
verify file->copy, that function can still fail.

... But, by the time we get here, we've already used the selector key
implicitly in the call to qemu_cfg_write_file(). And,
qemu_cfg_write_file() does the file->copy check, returns errors
appropriately, and we do check its retval.

So I agree that checking qemu_get_romfile_key()'s retval in addition
would buy us nothing; it's safe like this.

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks
Laszlo



> +store->offset = dst_offset;
> +store->ptr_size = entry->wr_pointer.size;
> +hlist_add_head(&store->node, &romfile_pointer_list);
> +
>  return;
>   err:
>  warn_internalerror();
> diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
> index 4dc50ab..fcd4ab2 100644
> --- a/src/fw/romfile_loader.h
> +++ b/src/fw/romfile_loader.h
> @@ -86,4 +86,6 @@ enum {
>  
>  int romfile_loader_execute(const char *name);
>  
> +void romfile_fw_cfg_resume(void);
> +
>  #endif
> diff --git a/src/resume.c b/src/resume.c
> index e67cfce..99fa34f 100644
> --- a/src/resume.c
> +++ b/src/resume.c
> @@ -17,6 +17,7 @@
>  #include "string.h" // memset
>  #include "util.h" // dma_setup
>  #include "tcgbios.h" // tpm_s3_resume
> +#include "fw/romfile_loader.h" // romfile_fw_cfg_resume
>  
>  // Handler for post calls that look like a resume.
>  void VISIBLE16
> @@ -105,6 +106,9 @@ s3_resume(void)
>  tpm_s3_resume();
>  s3_resume_vga();
>  
> +/* Replay any fw_cfg entries that go back to the host */
> +romfile_fw_cfg_resume();
> +
>  make_bios_readonly();
>  
>  // Invoke the resume vector.
> 




Re: [SeaBIOS] [PATCH v6 4/5] QEMU fw_cfg: Add functions for accessing files by key

2017-02-20 Thread Laszlo Ersek
On 02/20/17 21:14, b...@skyportsystems.com wrote:
> From: Ben Warren <b...@skyportsystems.com>
> 
> Due to memory constraints, when resuming from S3 the fw_cfg "files" API
> isn't available.  This adds a simple API to get a file 'key', and to
> write to the file using the key as a reference.
> 
> Signed-off-by: Ben Warren <b...@skyportsystems.com>
> ---
>  src/fw/paravirt.c | 32 ++--
>  src/fw/paravirt.h |  2 ++
>  2 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 4618647..225b08b 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -329,6 +329,17 @@ qemu_cfg_read_file(struct romfile_s *file, void *dst, 
> u32 maxlen)
>  return file->size;
>  }
>  
> +// Bare-bones function for writing a file knowing only its unique
> +// identifying key (select)
> +int
> +qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len)
> +{
> +qemu_cfg_select(key);
> +qemu_cfg_skip(offset);
> +qemu_cfg_write(src, len);
> +return len;
> +}
> +
>  int
>  qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 len)
>  {
> @@ -339,15 +350,12 @@ qemu_cfg_write_file(void *src, struct romfile_s *file, 
> u32 offset, u32 len)
>  warn_internalerror();
>  return -1;
>  }
> -struct qemu_romfile_s *qfile;
> -qfile = container_of(file, struct qemu_romfile_s, file);
> +u16 key = qemu_get_romfile_key(file);
>  if (offset == 0) {
>  /* Do it in one transfer */
> -qemu_cfg_write_entry(src, qfile->select, len);
> +qemu_cfg_write_entry(src, key, len);
>  } else {
> -qemu_cfg_select(qfile->select);
> -qemu_cfg_skip(offset);
> -qemu_cfg_write(src, len);
> +qemu_cfg_write_file_simple(src, key, offset, len);
>  }
>  return len;
>  }

One of the ideas that I mentioned here was to move not just the second
branch of the "if" to qemu_cfg_write_file_simple(), but the entire "if"
-- both branches. Because, qemu_cfg_write_entry() looks suitable for S3
too, and if that kind of optimization makes sense for normal boot, then
it makes sense for S3 resume as well.

Anyway, this is not a functional problem, I won't obsess about it.
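
For reference, the variant with the entire "if" moved into the simple
helper would look roughly like the sketch below. The mock_* functions
and the "last_path" recorder are hypothetical stand-ins for
qemu_cfg_write_entry(), qemu_cfg_select(), qemu_cfg_skip() and
qemu_cfg_write(), so that the snippet is self-contained; it is not meant
as the final code.

```c
#include <stdint.h>

/* Records which transfer strategy the helper picked (illustration only). */
enum write_path { PATH_NONE, PATH_ONE_SHOT, PATH_SELECT_SKIP_WRITE };
static enum write_path last_path = PATH_NONE;

/* Hypothetical stand-ins for the fw_cfg primitives. */
static void mock_write_entry(const void *src, uint16_t key, uint32_t len)
{
    (void)src; (void)key; (void)len;
    last_path = PATH_ONE_SHOT;
}
static void mock_select(uint16_t key) { (void)key; }
static void mock_skip(uint32_t offset) { (void)offset; }
static void mock_write(const void *src, uint32_t len)
{
    (void)src; (void)len;
    last_path = PATH_SELECT_SKIP_WRITE;
}

/* Both branches live here, so normal boot and S3 resume share them. */
int write_file_simple(const void *src, uint16_t key, uint32_t offset,
                      uint32_t len)
{
    if (offset == 0) {
        /* Select and write in one transfer. */
        mock_write_entry(src, key, len);
    } else {
        mock_select(key);
        mock_skip(offset);
        mock_write(src, len);
    }
    return (int)len;
}
```

The file-based writer then only needs its initial checks, followed by a
single call to the helper with the key looked up from the file.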

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks
Laszlo


> @@ -370,6 +378,18 @@ qemu_romfile_add(char *name, int select, int skip, int 
> size)
>  }
>  
>  u16
> +qemu_get_romfile_key(struct romfile_s *file)
> +{
> +struct qemu_romfile_s *qfile;
> +if (file->copy != qemu_cfg_read_file) {
> +warn_internalerror();
> +return 0;
> +}
> +qfile = container_of(file, struct qemu_romfile_s, file);
> +return qfile->select;
> +}
> +
> +u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index fb220d8..16f3d9a 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -56,5 +56,7 @@ void qemu_cfg_init(void);
>  
>  u16 qemu_get_present_cpus_count(void);
>  int qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 
> len);
> +int qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len);
> +u16 qemu_get_romfile_key(struct romfile_s *file);
>  
>  #endif
> 




Re: [SeaBIOS] [PATCH v5 5/5] QEMU fw_cfg: Write fw_cfg back on S3 resume

2017-02-20 Thread Laszlo Ersek
On 02/18/17 07:21, b...@skyportsystems.com wrote:
> From: Ben Warren 
> 
> Any pointers to BIOS-allocated memory that were written back to QEMU
> fw_cfg files are replayed when resuming from S3 sleep.
> 
> Signed-off-by: Ben Warren 
> ---
>  src/fw/romfile_loader.c | 35 +++
>  src/fw/romfile_loader.h |  2 ++
>  src/resume.c|  4 
>  3 files changed, 41 insertions(+)
> 
> diff --git a/src/fw/romfile_loader.c b/src/fw/romfile_loader.c
> index 30e7b58..33aaec4 100644
> --- a/src/fw/romfile_loader.c
> +++ b/src/fw/romfile_loader.c
> @@ -4,6 +4,7 @@
>  #include "string.h" // strcmp
>  #include "romfile.h" // struct romfile_s
>  #include "malloc.h" // Zone*, _malloc
> +#include "list.h" // struct hlist_node
>  #include "output.h" // warn_*
>  #include "paravirt.h" // qemu_cfg_write_file
>  
> @@ -16,6 +17,16 @@ struct romfile_loader_files {
>  struct romfile_loader_file files[];
>  };
>  
> +// Data structures for storing "write pointer" entries for possible replay
> +struct romfile_wr_pointer_entry {
> +u64 pointer;
> +u32 offset;
> +u16 key;
> +u8 ptr_size;
> +struct hlist_node node;
> +};
> +static struct hlist_head romfile_pointer_list;
> +
>  static struct romfile_loader_file *
>  romfile_loader_find(const char *name,
>  struct romfile_loader_files *files)
> @@ -29,6 +40,16 @@ romfile_loader_find(const char *name,
>  return NULL;
>  }
>  
> +// Replay "write pointer" entries back to QEMU
> +void romfile_fw_cfg_resume(void)
> +{
> +struct romfile_wr_pointer_entry *entry;
> +hlist_for_each_entry(entry, &romfile_pointer_list, node) {
> +qemu_cfg_write_file_simple(&entry->pointer, entry->key,
> +   entry->offset, entry->ptr_size);
> +}
> +}
> +
>  static void romfile_loader_allocate(struct romfile_loader_entry_s *entry,
>  struct romfile_loader_files *files)
>  {
> @@ -163,6 +184,20 @@ static void romfile_loader_write_pointer(struct 
> romfile_loader_entry_s *entry,
>  entry->wr_pointer.size) != 
> entry->wr_pointer.size) {
>  goto err;
>  }
> +
> +/* Store the info so it can replayed later if necessary */
> +struct romfile_wr_pointer_entry *store = _malloc(,
> + sizeof(*store), 4);

I don't know enough of the SeaBIOS memory allocation system to know if
this is safe. I assume this will place the allocation in reserved memory.

> +struct hlist_node **pprev = &romfile_pointer_list.first;
> +store->pointer = pointer;
> +store->key = qemu_get_romfile_key(dest_file);
> +if (store->key == 0) {
> +goto err;
> +}

Based on my comment on container_of() in patch #4,
qemu_get_romfile_key() shouldn't be able to fail, so this check should
be unnecessary.

> +store->offset = dst_offset;
> +store->ptr_size = entry->wr_pointer.size;
> +hlist_add(&store->node, pprev);

I think the code can be simplified a bit, by calling hlist_add_head()
instead (you can drop the local "pprev" variable then):

  hlist_add_head(&store->node, &romfile_pointer_list);
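
To show why the two forms are equivalent, here is a simplified,
self-contained re-implementation of the two list helpers. It is written
from memory in the usual Linux-style hlist shape and is only an
assumption about what "list.h" looks like; please check the real header
before relying on the details.

```c
#include <stddef.h>

struct hlist_node {
    struct hlist_node *next, **pprev;
};

struct hlist_head {
    struct hlist_node *first;
};

/* Insert "n" at the position that "pprev" points to. */
static void hlist_add(struct hlist_node *n, struct hlist_node **pprev)
{
    struct hlist_node *curr = *pprev;
    n->pprev = pprev;
    n->next = curr;
    if (curr)
        curr->pprev = &n->next;
    *pprev = n;
}

/* Insert "n" at the front: same as hlist_add() with pprev = &h->first. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    hlist_add(n, &h->first);
}
```

In other words, the local "pprev" variable only spells out the address
of the list head's first pointer, which hlist_add_head() computes
itself.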

> +
>  return;
>   err:
>  warn_internalerror();
> diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
> index 4dc50ab..fcd4ab2 100644
> --- a/src/fw/romfile_loader.h
> +++ b/src/fw/romfile_loader.h
> @@ -86,4 +86,6 @@ enum {
>  
>  int romfile_loader_execute(const char *name);
>  
> +void romfile_fw_cfg_resume(void);
> +
>  #endif
> diff --git a/src/resume.c b/src/resume.c
> index e67cfce..99fa34f 100644
> --- a/src/resume.c
> +++ b/src/resume.c
> @@ -17,6 +17,7 @@
>  #include "string.h" // memset
>  #include "util.h" // dma_setup
>  #include "tcgbios.h" // tpm_s3_resume
> +#include "fw/romfile_loader.h" // romfile_fw_cfg_resume
>  
>  // Handler for post calls that look like a resume.
>  void VISIBLE16
> @@ -105,6 +106,9 @@ s3_resume(void)
>  tpm_s3_resume();
>  s3_resume_vga();
>  
> +/* Replay any fw_cfg entries that go back to the host */
> +romfile_fw_cfg_resume();
> +

The functionality in "romfile_loader.c" is conditional on CONFIG_QEMU:

  qemu_platform_setup()[src/fw/paravirt.c]
// returns immediately if !CONFIG_QEMU
romfile_loader_execute()   [src/fw/romfile_loader.c]

So we shouldn't make the replay unconditional either.

Although "romfile_pointer_list" will be empty, if
romfile_loader_execute() never runs, we should also save on code size
(-> build time optimization) if CONFIG_QEMU is false. Please add the check

if (!CONFIG_QEMU)
return;

to the top of romfile_fw_cfg_resume(), similarly to the example in
smp_resume() [src/fw/smp.c], which is also called from s3_resume().
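
The guard can be sketched in isolation like this. CONFIG_QEMU is
modelled here as a plain 0/1 macro (defaulting to 0 for the demo), and
resume_replay() / replay_one() are hypothetical names standing in for
romfile_fw_cfg_resume() and the per-entry write; the point is only that
the test is on a build-time constant.

```c
/* Assumption: CONFIG_QEMU is a build-time constant that is 0 or 1. */
#ifndef CONFIG_QEMU
#define CONFIG_QEMU 0
#endif

/* Counts replayed entries, for demonstration purposes only. */
static int replayed;

static void replay_one(void)
{
    replayed++;
}

void resume_replay(int nentries)
{
    /* Constant condition: with CONFIG_QEMU == 0 the compiler can drop
     * everything below, which is the code size saving mentioned above. */
    if (!CONFIG_QEMU)
        return;

    for (int i = 0; i < nentries; i++)
        replay_one();
}
```

Because the condition is known at build time, the loop and everything it
references can be eliminated when QEMU support is configured out.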

(Comments from others are most welcome, of course; this is just how I
see things.)

Thanks!
Laszlo

>  make_bios_readonly();
>  
>  // Invoke the resume vector.
> 



Re: [SeaBIOS] [PATCH v5 4/5] QEMU fw_cfg: Add functions for accessing files by key

2017-02-20 Thread Laszlo Ersek
On 02/18/17 07:21, b...@skyportsystems.com wrote:
> From: Ben Warren 
> 
> When resuming from S3, only fw_cfg file keys are known.
> 
> Signed-off-by: Ben Warren 
> ---
>  src/fw/paravirt.c | 23 +++
>  src/fw/paravirt.h |  2 ++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 4618647..e513dd5 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -352,6 +352,17 @@ qemu_cfg_write_file(void *src, struct romfile_s *file, 
> u32 offset, u32 len)
>  return len;
>  }
>  
> +// Bare-bones function for writing a file knowing only its unique
> +// identifying key (select)
> +int
> +qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len)
> +{
> +qemu_cfg_select(key);
> +qemu_cfg_skip(offset);
> +qemu_cfg_write(src, len);
> +return len;
> +}

Does anything counter-indicate the customization seen in
qemu_cfg_write_file() as well, that is, call qemu_cfg_write_entry() if
offset is zero?

If not, then I think you could even split out that part of
qemu_cfg_write_file() as qemu_cfg_write_file_simple(), and call it from
qemu_cfg_write_file().

> +
>  static void
>  qemu_romfile_add(char *name, int select, int skip, int size)
>  {
> @@ -370,6 +381,18 @@ qemu_romfile_add(char *name, int select, int skip, int 
> size)
>  }
>  
>  u16
> +qemu_get_romfile_key(struct romfile_s *file)
> +{
> +struct qemu_romfile_s *qfile;
> +qfile = container_of(file, struct qemu_romfile_s, file);

If the input pointer "file" was valid, then container_of() cannot
produce a NULL pointer. So I suggest to drop the code that depends on that.
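
A small standalone sketch of the argument follows. The struct layouts
and get_select() are simplified, hypothetical stand-ins for "struct
romfile_s", "struct qemu_romfile_s" and qemu_get_romfile_key(); the
container_of() macro is the usual offsetof-based definition.

```c
#include <stddef.h>

/* Classic definition: step back from the member to the enclosing object. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct romfile {
    int size;
};

struct qemu_romfile {
    int select;
    struct romfile file;    /* embedded member */
};

/* No NULL check: a valid "file" always yields a valid container. */
static int get_select(struct romfile *file)
{
    struct qemu_romfile *qfile;
    qfile = container_of(file, struct qemu_romfile, file);
    return qfile->select;
}
```

container_of() is pure pointer arithmetic on its input, so it cannot
signal failure. Whether "file" really is embedded in the expected outer
structure has to be established by other means.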

> +if (!qfile) {
> +warn_internalerror();
> +return 0;
> +}
> +return qfile->select;
> +}

This could be reused in qemu_cfg_write_file() too, so that that function
would remain:
- initial checks
- qemu_cfg_write_file_simple( ..., qemu_get_romfile_key(), ...)

I think patch #1 is fine as is, so I'd keep this refactoring in patch #4.

Thanks
Laszlo

> +
> +u16
>  qemu_get_present_cpus_count(void)
>  {
>  u16 smp_count = 0;
> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index fb220d8..16f3d9a 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -56,5 +56,7 @@ void qemu_cfg_init(void);
>  
>  u16 qemu_get_present_cpus_count(void);
>  int qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 
> len);
> +int qemu_cfg_write_file_simple(void *src, u16 key, u32 offset, u32 len);
> +u16 qemu_get_romfile_key(struct romfile_s *file);
>  
>  #endif
> 




Re: [SeaBIOS] [PATCH v5 3/5] QEMU fw_cfg: Add command to write back address of file

2017-02-20 Thread Laszlo Ersek
On 02/18/17 07:21, b...@skyportsystems.com wrote:
> From: Ben Warren <b...@skyportsystems.com>
> 
> This command is similar to ADD_POINTER, but instead of patching
> memory, it writes the pointer back to QEMU over the DMA interface.
> 
> Signed-off-by: Ben Warren <b...@skyportsystems.com>
> ---
>  src/fw/romfile_loader.c | 45 +
>  src/fw/romfile_loader.h | 23 ---
>  2 files changed, 65 insertions(+), 3 deletions(-)

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks
Laszlo

> diff --git a/src/fw/romfile_loader.c b/src/fw/romfile_loader.c
> index 7737453..30e7b58 100644
> --- a/src/fw/romfile_loader.c
> +++ b/src/fw/romfile_loader.c
> @@ -5,6 +5,7 @@
>  #include "romfile.h" // struct romfile_s
>  #include "malloc.h" // Zone*, _malloc
>  #include "output.h" // warn_*
> +#include "paravirt.h" // qemu_cfg_write_file
>  
>  struct romfile_loader_file {
>  struct romfile_s *file;
> @@ -127,6 +128,46 @@ err:
>  warn_internalerror();
>  }
>  
> +static void romfile_loader_write_pointer(struct romfile_loader_entry_s 
> *entry,
> + struct romfile_loader_files *files)
> +{
> +struct romfile_s *dest_file;
> +struct romfile_loader_file *src_file;
> +unsigned dst_offset = le32_to_cpu(entry->wr_pointer.dst_offset);
> +unsigned src_offset = le32_to_cpu(entry->wr_pointer.src_offset);
> +u64 pointer = 0;
> +
> +/* Writing back to a file that may not be loaded in RAM */
> +dest_file = romfile_find(entry->wr_pointer.dest_file);
> +src_file = romfile_loader_find(entry->wr_pointer.src_file, files);
> +
> +if (!dest_file || !src_file || !src_file->data ||
> +dst_offset + entry->wr_pointer.size < dst_offset ||
> +dst_offset + entry->wr_pointer.size > dest_file->size ||
> +src_offset >= src_file->file->size ||
> +entry->wr_pointer.size < 1 || entry->wr_pointer.size > 8 ||
> +entry->wr_pointer.size & (entry->wr_pointer.size - 1)) {
> +goto err;
> +}
> +
> +pointer = (unsigned long)src_file->data + src_offset;
> +/* Make sure the pointer fits within wr_pointer.size */
> +if ((entry->wr_pointer.size != sizeof(u64)) &&
> +((pointer >> (entry->wr_pointer.size * 8)) > 0)) {
> +goto err;
> +}
> +pointer = cpu_to_le64(pointer);
> +
> +/* Only supported on QEMU */
> +if (qemu_cfg_write_file(&pointer, dest_file, dst_offset,
> +entry->wr_pointer.size) != 
> entry->wr_pointer.size) {
> +goto err;
> +}
> +return;
> + err:
> +warn_internalerror();
> +}
> +
>  int romfile_loader_execute(const char *name)
>  {
>  struct romfile_loader_entry_s *entry;
> @@ -161,6 +202,10 @@ int romfile_loader_execute(const char *name)
>  break;
>  case ROMFILE_LOADER_COMMAND_ADD_CHECKSUM:
>  romfile_loader_add_checksum(entry, files);
> +break;
> +case ROMFILE_LOADER_COMMAND_WRITE_POINTER:
> +romfile_loader_write_pointer(entry, files);
> +break;
>  default:
>  /* Skip commands that we don't recognize. */
>  break;
> diff --git a/src/fw/romfile_loader.h b/src/fw/romfile_loader.h
> index bce3719..4dc50ab 100644
> --- a/src/fw/romfile_loader.h
> +++ b/src/fw/romfile_loader.h
> @@ -51,15 +51,32 @@ struct romfile_loader_entry_s {
>  u32 length;
>  } cksum;
>  
> +/*
> + * COMMAND_WRITE_POINTER - Write back to a host file via DMA,
> + * @wr_pointer.dest_file at offset @wr_pointer.dst_offset, a pointer
> + * to the table originating from @wr_pointer.src_file at offset
> + * @wr_pointer.src_offset.
> + * 1,2,4 or 8 byte unsigned addition is used depending on
> + * @wr_pointer.size.
> + */
> +struct {
> +char dest_file[ROMFILE_LOADER_FILESZ];
> +char src_file[ROMFILE_LOADER_FILESZ];
> +u32 dst_offset;
> +u32 src_offset;
> +u8 size;
> +} wr_pointer;
> +
>  /* padding */
>  char pad[124];
>  };
>  };
>  
>  enum {
> -ROMFILE_LOADER_COMMAND_ALLOCATE = 0x1,
> -ROMFILE_LOADER_COMMAND_ADD_POINTER  = 0x2,
> -ROMFILE_LOADER_COMMAND_ADD_CHECKSUM = 0x3,
> +ROMFILE_LOADER_COMMAND_ALLOCATE  = 0x1,
> +ROMFILE_LOADER_COMMAND_ADD_POINTER   = 0x2,
> +ROMFILE_LOADER_COMMAND_ADD_CHECKSUM  = 0x3,
> +ROMFILE_LOADER_COMMAND_WRITE_POINTER = 0x4,
>  };
>  
>  enum {
> 




Re: [SeaBIOS] [PATCH v5 1/5] QEMU DMA: Add DMA write capability

2017-02-20 Thread Laszlo Ersek
On 02/18/17 07:21, b...@skyportsystems.com wrote:
> From: Ben Warren <b...@skyportsystems.com>
> 
> This allows BIOS to write data back to QEMU using the DMA interface and
> provides a higher-level abstraction to write to a fw_cfg file
> 
> Signed-off-by: Ben Warren <b...@skyportsystems.com>
> ---
>  src/fw/paravirt.c | 49 +
>  src/fw/paravirt.h |  3 +++
>  2 files changed, 52 insertions(+)

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks
Laszlo

> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
> index 6de70f6..4618647 100644
> --- a/src/fw/paravirt.c
> +++ b/src/fw/paravirt.c
> @@ -253,6 +253,20 @@ qemu_cfg_read(void *buf, int len)
>  }
>  
>  static void
> +qemu_cfg_write(void *buf, int len)
> +{
> +    if (len == 0) {
> +        return;
> +    }
> +
> +    if (qemu_cfg_dma_enabled()) {
> +        qemu_cfg_dma_transfer(buf, len, QEMU_CFG_DMA_CTL_WRITE);
> +    } else {
> +        warn_internalerror();
> +    }
> +}
> +
> +static void
>  qemu_cfg_skip(int len)
>  {
>      if (len == 0) {
> @@ -280,6 +294,18 @@ qemu_cfg_read_entry(void *buf, int e, int len)
>      }
>  }
>  
> +static void
> +qemu_cfg_write_entry(void *buf, int e, int len)
> +{
> +    if (qemu_cfg_dma_enabled()) {
> +        u32 control = (e << 16) | QEMU_CFG_DMA_CTL_SELECT
> +                        | QEMU_CFG_DMA_CTL_WRITE;
> +        qemu_cfg_dma_transfer(buf, len, control);
> +    } else {
> +        warn_internalerror();
> +    }
> +}
> +
>  struct qemu_romfile_s {
>      struct romfile_s file;
>      int select, skip;
> @@ -303,6 +329,29 @@ qemu_cfg_read_file(struct romfile_s *file, void *dst, u32 maxlen)
>      return file->size;
>  }
>  
> +int
> +qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 len)
> +{
> +    if ((offset + len) > file->size)
> +        return -1;
> +
> +    if (!qemu_cfg_dma_enabled() || (file->copy != qemu_cfg_read_file)) {
> +        warn_internalerror();
> +        return -1;
> +    }
> +    struct qemu_romfile_s *qfile;
> +    qfile = container_of(file, struct qemu_romfile_s, file);
> +    if (offset == 0) {
> +        /* Do it in one transfer */
> +        qemu_cfg_write_entry(src, qfile->select, len);
> +    } else {
> +        qemu_cfg_select(qfile->select);
> +        qemu_cfg_skip(offset);
> +        qemu_cfg_write(src, len);
> +    }
> +    return len;
> +}
> +
>  static void
>  qemu_romfile_add(char *name, int select, int skip, int size)
>  {
> diff --git a/src/fw/paravirt.h b/src/fw/paravirt.h
> index d8eb7c4..fb220d8 100644
> --- a/src/fw/paravirt.h
> +++ b/src/fw/paravirt.h
> @@ -3,6 +3,7 @@
>  
>  #include "config.h" // CONFIG_*
>  #include "biosvar.h" // GET_GLOBAL
> +#include "romfile.h" // struct romfile_s
>  
>  // Types of paravirtualized platforms.
>  #define PF_QEMU (1<<0)
> @@ -43,6 +44,7 @@ static inline int runningOnKVM(void) {
>  #define QEMU_CFG_DMA_CTL_READ    0x02
>  #define QEMU_CFG_DMA_CTL_SKIP    0x04
>  #define QEMU_CFG_DMA_CTL_SELECT  0x08
> +#define QEMU_CFG_DMA_CTL_WRITE   0x10
>  
>  // QEMU_CFG_DMA ID bit
>  #define QEMU_CFG_VERSION_DMA    2
> @@ -53,5 +55,6 @@ void qemu_platform_setup(void);
>  void qemu_cfg_init(void);
>  
>  u16 qemu_get_present_cpus_count(void);
> +int qemu_cfg_write_file(void *src, struct romfile_s *file, u32 offset, u32 len);
>  
>  #endif
> 
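
As a rough illustration of the two pieces of arithmetic in the patch above, the sketch below builds the control word used for a combined select-and-write transfer and checks that a write range stays inside a file. It is a standalone approximation, not the SeaBIOS code: the helper names dma_ctl_select_write() and write_range_ok() are made up here, the CTL_* values are copied from the paravirt.h hunk, and the range check is deliberately written in a wraparound-safe form, which differs from the (offset + len) comparison in the patch.

```c
#include <stdint.h>

/* Control bits as listed in the paravirt.h hunk (WRITE is the new one). */
#define CTL_SELECT 0x08u
#define CTL_WRITE  0x10u

/*
 * Hypothetical helper: control word that selects fw_cfg entry `key`
 * (placed in the upper 16 bits) and requests a write in the same transfer.
 */
uint32_t dma_ctl_select_write(uint16_t key)
{
    return ((uint32_t)key << 16) | CTL_SELECT | CTL_WRITE;
}

/*
 * Hypothetical helper: returns 1 if the byte range [offset, offset + len)
 * lies inside a file of `size` bytes. Comparing against size - offset
 * avoids 32-bit wraparound when offset + len would overflow.
 */
int write_range_ok(uint32_t offset, uint32_t len, uint32_t size)
{
    return offset <= size && len <= size - offset;
}
```

For example, selecting entry 0x19 for writing gives the control word 0x00190018, and a request with offset 0xFFFFFFFF and length 2 is rejected rather than wrapping around to a small sum.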




Re: [SeaBIOS] varlow/extrastack vs code

2017-02-17 Thread Laszlo Ersek
On 02/17/17 14:10, Paolo Bonzini wrote:
> 
> 
> On 15/02/2017 16:52, Kevin O'Connor wrote:
>> On Wed, Feb 15, 2017 at 11:01:03AM +0100, Laszlo Ersek wrote:
>>> On 02/14/17 20:14, Kevin O'Connor wrote:
>>>> On Tue, Feb 14, 2017 at 07:52:01PM +0100, Laszlo Ersek wrote:
>>>>> If item (1) is fixed in QEMU, then the above "root cause" goes away, and
>>>>> the workaround in SeaBIOS can be conditionalized. Am I wrong?
>>>>
>>>> I'm not sure.  If I recall correctly, there are different resets on
>>>> the x86 - some only reset the cpu and some do a "full machine reset".
>>>> SeaBIOS attempts a variety of different reset mechanisms to reboot and
>>>> I'm not sure which are supposed to do the full reset.  If seabios does
>>>> a "reset cpu" mechanism before a "reset machine" mechanism, then qemu
>>>> resetting the pam may not help.
>>>
>>> To my knowledge, QEMU implements only one kind of system reset, with
>>> qemu_system_reset_request(). It is supposed to
>>> - reset all VCPUs (it puts all APs back into "wait for
>>>   INIT-SIPI-SIPI"),
>>> - reset all chipset registers,
>>> - reset all devices,
>>> - not touch guest RAM.
>>
>> Thanks Laszlo - I appreciate your very detailed response.
>>
>> Way back in the day, the 286 had no way to return to real mode from
>> protected mode.  So, the BIOS had this funky method of detecting fake
>> reboots to use as a mode switch.  If there is only one kind of QEMU
>> reset then I think resetting the pam register in it would break this
>> type of resume in SeaBIOS.  :-/
>>
>> So, I'm not sure what the right approach is wrt the PAM registers.
> 
> My memories are fuzzy, but I remember discussing whether the PAM
> registers should be preserved or not by S3, and I think the answer was
> that on real hardware they are preserved by S3.
> 
> Now, S3 is mostly the same as a reset from the firmware's point of view.
>  The conclusion then would be that a reset should _not_ be touching the
> PAM registers.

Aha! Indeed, you may have stated something like the PAM regs being wired
to the same "power rail" (or whatever the heck) as main memory.

Let me dig... Yes (it was a cross-posted thread between SeaBIOS and
qemu, in 2013):

msgid: 5136469f.7080...@redhat.com
https://www.mail-archive.com/seabios@seabios.org/msg04521.html
https://www.mail-archive.com/qemu-devel@nongnu.org/msg159003.html

(Incredible that these issues just keep popping up!)

Thanks!
Laszlo



