Re: [edk2] [RFC 0/3] Enable runtime serial debug for ArmVirtQemu

2019-03-13 Thread Brian J. Johnson

On 3/12/19 12:28 PM, Andrew Fish via edk2-devel wrote:




On Mar 12, 2019, at 10:05 AM, Laszlo Ersek  wrote:

Hello Heyi,

On 03/12/19 07:56, Heyi Guo wrote:

Hi Laszlo,

I'm thinking about a proposal as below:

1. Reuse the framework of
MdePkg/Library/DxeRuntimeDebugLibSerialPort/DxeRuntimeDebugLibSerialPort.inf

2. The boot time behavior of this module is not changed
3. Switch to status code protocol for runtime debug
4. Use an overridden SerialPortLib instance for
StatusCodeHandlerRuntimeDxe; the instance must support runtime access.

Then we get the following benefits:
1. The boot time firmware log and the OS log go to one port, and runtime
drivers have no extra dependency for emitting serial messages at boot
time.
2. By implementing different instances of SerialPortLib for
StatusCodeHandlerRuntimeDxe, we have enough flexibility to direct
runtime serial messages to a platform specific serial port.


This looks a bit over-engineered to me. Ultimately our goal is:
- for DEBUGs to reach "a" serial port at runtime -- namely one that
differs from the "normal" serial port.

Given that you'd have to implement a "special" SerialPortLib instance
just for StatusCodeHandlerRuntimeDxe anyway, can we perhaps:

(1) Undo -- dependent on a build flag? -- commit ebfca258f5d7; i.e.,
stick universally with BaseDebugLibSerialPort, for DebugLib,

(2) add a new implementation of "FdtPL011SerialPortLib.inf" with both
(a) runtime behavior and (b) handling of a special serial port?

In other words, push down the "runtime?" distinction from DebugLib (see
commit 7f029e1b31ee) to SerialPortLib. The new SerialPortLib instance
would be used only with DXE_RUNTIME_DRIVERs.

The lib instance would prepare everything in advance for accessing the
"special" serial port at runtime (which could require allocating runtime
memory in advance). Once ExitBootServices() is called, a callback should
switch over the behavior of the lib instance, without allocating further
memory. (The switched-over behavior would not have to be functional
until the virtual address map is set.)



Laszlo,

Runtime does not quite work like that. As soon as Set Virtual Address Map 
fires, it is only legal to call EFI runtime services from the virtual mapping. The last 
stage of the Set Virtual Address Map call is to reapply the fixups in the 
runtime drivers for the virtual address space. That means any absolute address 
reference in the code now points to the virtual address. Also, the Set Virtual 
Address event told the runtime driver to convert all its pointers to point to 
the new virtual address space.

The blackout and the fact that you need to hide the hardware from the OS make 
doing things at EFI runtime hard.

I agree with you that since our logging is really just a serial stream we don't 
require Report Status Code. Report Status Code exists so that the old PC POST 
Cards could still be supported, see: 
https://en.wikipedia.org/wiki/Power-on_self-test#Error_reporting. There is also 
a Report Status Code layer that allows redirecting the stream to multiple 
agents, so for example you could send data to port 80 (POST Card) and write out 
the same values in the serial stream. I find looking up Report Status Code 
values very tedious and not really helpful for debugging vs. DEBUG prints.

Thanks,

Andrew Fish


Andrew, wasn't Report Status Code also intended to reduce code size? 
PeiDxeDebugLibReportStatusCode packages up DEBUG() arguments into a 
structure and passes it to the status code handler.  It avoids having to 
link a PrintLib and SerialPortLib instance into essentially every 
module.  At least that was the intent... IIRC at some point a few years 
ago folks realized that VA_LIST wasn't portable across compilers, so now 
PeiDxeDebugLibReportStatusCode converts VA_LIST into BASE_LIST, which 
requires parsing the format string much like PrintLib does.  No idea if 
that actually results in a space savings over BaseDebugLibSerialPort, 
especially for really simple SerialPortLib instances.


+1 for the difficulty of decoding Report Status Code codes... there has 
to be a better way to do that than manually parsing through 
PiStatusCode.h and friends.


Brian Johnson




I've always found status code reporting cumbersome in comparison to plain
DEBUG messages. My understanding is that status code reporting targets
devices that are even dumber than serial ports. But, we do target a
second serial port. So if we can keep the switchover internal to the
SerialPortLib class, at least for runtime DXE drivers, then I think we
should do that.

Thanks,
Laszlo


On 2019/3/1 23:27, Laszlo Ersek wrote:

+Peter, for the last few paragraphs

On 02/28/19 13:10, Ard Biesheuvel wrote:

On Thu, 28 Feb 2019 at 09:06, Heyi Guo  wrote:

Serial port output is useful when debugging UEFI runtime services in
OS runtime. The patches are trying to provide a handy method to enable
runtime serial port debug for ArmVirtQemu.

Cc: Jian J Wang 
Cc: Hao Wu 
Cc: Laszlo Ersek 
Cc: Ard Biesheuvel 

Re: [edk2] [PATCH 06/10] OvmfPkg/Sec: Disable optimizations for TemporaryRamMigration

2019-02-19 Thread Brian J. Johnson

On 2/18/19 3:32 AM, Ard Biesheuvel wrote:

On Mon, 18 Feb 2019 at 10:08, Jordan Justen  wrote:


On 2019-02-17 23:53:01, Ard Biesheuvel wrote:

On Mon, 18 Feb 2019 at 05:12, Jordan Justen  wrote:




This needs an explanation of why optimization needs to be disabled.


I'm not sure this is required. The reason I added these patches is to
hopefully prevent the compiler from removing the frame pointer. We
adjust the frame pointer in the code, and that is a little sketchy if
the frame pointer isn't being used.

Unfortunately, it can reasonably be argued that the
TemporaryRamSupport PPI definition ultimately makes it unsafe to write
the migration code in C.

I tried reverting both the EmulatorPkg and OvmfPkg patches for
disabling the optimizations, and with my setup there was no impact. I
think there is a good chance that we'd be pretty safe to just drop
these two patches and wait to see if someone encounters a situation
that requires them.

Ok, so based on this explanation, do you think I should add info to
the commit message and keep the patches, or just drop them?



I think 'little sketchy' is an understatement here (as is
setjmp/longjmp in general), but it is the reality we have to deal with
when writing startup code in C. Looking at the code, I agree that the
fact that [re]bp is assigned directly implies that we should not
permit it to be used as a general purpose register, especially when
you throw LTO into the mix, which could produce all kinds of
surprising results when it decides to inline functions being called
from here.

For GCC/Clang, I don't think it is correct to assume that changing the
optimization level will result in -fno-omit-frame-pointer being set,
so I'd prefer setting that option directly, either via the pragma, or
for the whole file.

For MSVC, I have no idea how to tweak the compiler to force it to emit
frame pointers.



https://docs.microsoft.com/en-us/cpp/build/reference/oy-frame-pointer-omission?view=vs-2017

Brian






Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jordan Justen 
Cc: Laszlo Ersek 
Cc: Ard Biesheuvel 
Cc: Anthony Perard 
Cc: Julien Grall 
---
  OvmfPkg/Sec/SecMain.c | 12 ++++++++++++
  1 file changed, 12 insertions(+)

diff --git a/OvmfPkg/Sec/SecMain.c b/OvmfPkg/Sec/SecMain.c
index 46ac739862..86c22a2ac9 100644
--- a/OvmfPkg/Sec/SecMain.c
+++ b/OvmfPkg/Sec/SecMain.c
@@ -873,6 +873,13 @@ SecStartupPhase2(
CpuDeadLoop ();
  }

+#ifdef __GNUC__
+#pragma GCC push_options
+#pragma GCC optimize ("O0")
+#else
+#pragma optimize ("", off)
+#endif
+
  EFI_STATUS
  EFIAPI
  TemporaryRamMigration (
@@ -946,3 +953,8 @@ TemporaryRamMigration (
return EFI_SUCCESS;
  }

+#ifdef __GNUC__
+#pragma GCC pop_options
+#else
+#pragma optimize ("", on)
+#endif


I can't tell from the context if this is the end of the file, but if
it is not, aren't you turning on optimization here for non-GCC even if
it was not enabled on the command line to begin with?

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel




--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



Re: [edk2] Drop CSM support in OvmfPkg?

2019-01-25 Thread Brian J. Johnson

On 1/24/19 5:30 AM, Laszlo Ersek wrote:

On 01/24/19 10:31, David Woodhouse wrote:

On Thu, 2019-01-24 at 01:48 +, Ni, Ray wrote:

David,
I think we got an agreement here to move the CSM components into OvmfPkg.
I prefer we firstly clone the required CSM components into OvmfPkg right now.
Finally I can remove the IntelFrameworkModulePkg/IntelFrameworkPkg in one patch.
(I say "finally" because the OVMF CSM dependency is not the only case that 
prevents removing
the two framework packages.)

Would you like to do the clone? Or if you are busy, I can do that.


I keep asking this question, and I don't believe I've seen an
answer. Apologies if I've missed it.


I think you haven't. And, I'm curious too. :)

Thanks
Laszlo


Is this code genuinely not going to continue to exist anywhere else in
the Intel ecosystem, any more?

No TianoCore-based images from this point forth are ever going to even
have the option of supporting CSM?

Unless some third party also chooses to fork the CSM support code and
keep it on for themselves?



In fall of 2017, Intel declared their intention to end legacy BIOS 
support on their platforms by 2020.


http://www.uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf

I believe they have stuck to this story at subsequent UEFI plugfests.

--

Brian



   "Occasionally get out of the office, the lab, the computer room,
smell the flowers and take a look at the physical world around you."
   -- Rob Cook


Re: [edk2] [Patch 1/3] UefiCpuPkg/RegisterCpuFeaturesLib: Avoid AP calls PeiService.

2018-12-20 Thread Brian J. Johnson
Agreed.  We've seen issues on real platforms with timed-out spinlocks in 
DXE causing calls to GetPerformanceCounter and DebugAssert.  (DXE has 
the same code, with the same issues.)


Note that it's possible to set PcdSpinLockTimeout=0 to work around the 
issue on a particular platform, or in a particular module.  But if you 
have to do that for every module which uses APs, and hence could contend 
for a spinlock, it kind of defeats the point.  We're better off 
removing the timeout code.


Thanks,
Brian

On 12/19/18 8:08 PM, Yao, Jiewen wrote:

Yes, I agree, if we don't have any real case.



-Original Message-
From: Ni, Ruiyu
Sent: Thursday, December 20, 2018 10:07 AM
To: Dong, Eric ; Yao, Jiewen
; edk2-devel@lists.01.org
Cc: Laszlo Ersek 
Subject: RE: [edk2] [Patch 1/3] UefiCpuPkg/RegisterCpuFeaturesLib: Avoid
AP calls PeiService.

Can you just change the AcquireSpinLock() behavior to remove the Timeout
PCD consumption?

I haven't seen a real case that the timed acquisition of spin lock is needed.


Thanks/Ray


-Original Message-
From: Dong, Eric 
Sent: Thursday, December 20, 2018 9:23 AM
To: Yao, Jiewen ; edk2-devel@lists.01.org
Cc: Ni, Ruiyu ; Laszlo Ersek 
Subject: RE: [edk2] [Patch 1/3] UefiCpuPkg/RegisterCpuFeaturesLib: Avoid
AP calls PeiService.


Agreed. Maybe it's time to add a new API like
AcquireSpinLockWithoutTimeOut?

Thanks,
Eric

-Original Message-
From: Yao, Jiewen
Sent: Thursday, December 20, 2018 9:19 AM
To: Dong, Eric ; edk2-devel@lists.01.org
Cc: Ni, Ruiyu ; Laszlo Ersek 
Subject: RE: [edk2] [Patch 1/3] UefiCpuPkg/RegisterCpuFeaturesLib:
Avoid AP calls PeiService.

Hi
If we think below code is generic, can we have an API for that?

+  //
+  // Wait for the AP to release the MSR spin lock.
+  //
+  while (!AcquireSpinLockOrFail (>ConsoleLogLock)) {
+CpuPause ();
+  }





-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf
Of Eric Dong
Sent: Thursday, December 20, 2018 9:16 AM
To: edk2-devel@lists.01.org
Cc: Ni, Ruiyu ; Laszlo Ersek 
Subject: [edk2] [Patch 1/3] UefiCpuPkg/RegisterCpuFeaturesLib: Avoid
AP calls PeiService.

In the AcquireSpinLock function, it calls GetPerformanceCounter, which
finally calls a PeiServices service. This patch avoids calling the
AcquireSpinLock function.

BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1411

Cc: Ruiyu Ni 
Cc: Laszlo Ersek 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong 
---
  UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c | 7 ++++++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git
a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
index 624ddee055..a64326239f 100644
--- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
+++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
@@ -832,7 +832,12 @@ ProgramProcessorRegister (
  RegisterTableEntry = [Index];

  DEBUG_CODE_BEGIN ();
-  AcquireSpinLock (>ConsoleLogLock);
+  //
+  // Wait for the AP to release the MSR spin lock.
+  //
+  while (!AcquireSpinLockOrFail (>ConsoleLogLock)) {
+CpuPause ();
+  }
ThreadIndex = ApLocation->Package * CpuStatus->MaxCoreCount *
CpuStatus->MaxThreadCount +
ApLocation->Core * CpuStatus->MaxThreadCount +
ApLocation->Thread;
--
2.15.0.windows.1





--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.john...@hpe.com



Re: [edk2] [Patch] UefiCpuPkg/MpLib: Fix PEI Services Table pointer on AP

2018-12-06 Thread Brian J. Johnson




--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



Re: [edk2] [edk2-announce] Research Request

2018-11-27 Thread Brian J. Johnson

On 11/27/18 6:53 AM, Laszlo Ersek wrote:

On 11/26/18 22:43, Jeremiah Cox via edk2-devel wrote:

Feedback on GitHub as follows…



1. No Lock-In - What automated data export is available?
We want to be able to leave and take all our data with us. "Data" here
includes: review comments, pull requests / patches (including metadata),
old (rejected) pull requests and metadata, issue tracker entries and
comments (if issue tracker included). This archiving should be
automated, not something we do by hand.

Untested, but might these all be easily satisfied by subscribing a mailing list 
to GitHub notifications?
https://help.github.com/articles/about-notifications/#watching-notifications  
https://help.github.com/articles/about-email-notifications/  

No, they are insufficient.

Following the last link above ("about-email-notifications"), one finds
several other links; and one of those is:

https://help.github.com/articles/about-notifications/

This article says,

 GitHub sends participating notifications when you're directly
 involved in activities or conversations within a repository or a
 team you're a member of. You'll receive a notification when:

 [...]

 - You open, comment on, or close an issue or pull request.

 [...]

This is demonstrably false. I'm a member of the TianoCore organization,
I have commented on, and closed (rejected):

   https://github.com/tianocore/edk2/pull/133

and I *never*  received an email notification about my *own*  comment /
action. I only received the initial email, about the pull request being
opened (attached for reference).


Try going to the "Settings" item under the menu in the top-right corner, 
and clicking on the "Notifications" tab on the left.  Under "Email 
notification preferences" there should be a checkbox for "Include your 
own updates".  That may do what you need.


--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



Re: [edk2] ** NOTICE ** edk2-devel mailing list configuration changes

2018-10-30 Thread Brian J. Johnson
I'm using Thunderbird 60.2.1 on Linux.  If I respond to Leif's message 
(the one quoted below) with the "reply" button, it's addressed only to 
him.  If I use "reply all", it goes to him, the list, and all other 
recipients of the original message.  (That's what I did on this 
response.)  If I use "reply to mailing list", it goes only to the list.


So exactly what I'd expect.  No issues here.

Brian

On 10/30/18 3:49 AM, Leif Lindholm wrote:

Hi Mike,

That resolves the issue at my end, thanks! But it would be good to
know how it works for others (does Intel have a default mail client
config, and could someone else verify the behaviour is how you would
normally expect?).

Regards,

Leif

On Mon, Oct 29, 2018 at 10:40:06PM +, Kinney, Michael D wrote:

Hi Leif,

I can put the reply_goes_to_list option back to "Poster".

In that configuration, a user that has a DMARC policy of
reject will still have their from address munged.

But I noticed that the edk2-devel mailing list is not
present when anyone does a Reply-all to an email with
a munged from address.  That implied to me that everyone
would need to check if the edk2-devel mailing list has been
removed from a Reply-all and add it back manually.  This
also seems like a non-ideal configuration option.

However, the behavior I am seeing could be due to some
of my client settings.

So I will put the reply_goes_to_list option back to
"Poster".

Mike


-Original Message-
From: Leif Lindholm [mailto:leif.lindh...@linaro.org]
Sent: Monday, October 29, 2018 2:10 PM
To: Kinney, Michael D 
Cc: EDK II Development ;
Cetola, Stephano 
Subject: Re: [edk2] ** NOTICE ** edk2-devel mailing list
configuration changes

Hi Mike,

I could hypothesise about which email client you may be using :)

But let me instead mention that the two email clients I have (mutt and
gmail web interface) behave identically - neither adds the original
sender to cc when the list server forces a reply-to header.

Regards,

Leif

On Mon, Oct 29, 2018 at 08:49:09PM +, Kinney, Michael D wrote:

Leif,

Very strange.  When I do the same on that email, it
shows Paul on the To address line.

Mike


-Original Message-
From: Leif Lindholm [mailto:leif.lindh...@linaro.org]
Sent: Monday, October 29, 2018 1:40 PM
To: Kinney, Michael D 
Cc: EDK II Development ;
Cetola, Stephano 
Subject: Re: [edk2] ** NOTICE ** edk2-devel mailing list
configuration changes

Hi Mike,

When I try to "reply-to", the email from Paul A Lohr, sent 10 minutes
after your one below, he does not show up in either "to" or "cc".

OK, I missed the excitement during the plugfest. I'll go back and see
what I can find there.

Regards,

Leif

On Mon, Oct 29, 2018 at 08:23:43PM +, Kinney, Michael D wrote:

Leif,

I have enabled a different configuration setting
that should be better.

Please try some emails and let me know if there
are any impacts.

The reason for these changes is the DMARC related
issue that occurred on 10-19-2018 that required a
number of users to be disabled.  The goal of these
changes is to enable those users to be re-activated.


Thanks,

Mike


-Original Message-
From: Leif Lindholm [mailto:leif.lindh...@linaro.org]
Sent: Monday, October 29, 2018 12:54 PM
To: EDK II Development 
Cc: Kinney, Michael D ;
Cetola, Stephano 
Subject: Re: [edk2] ** NOTICE ** edk2-devel mailing list
configuration changes

Hi Mike,

On Mon, Oct 29, 2018 at 06:42:44PM +, Kinney, Michael D wrote:

Some configuration changes have been made to the edk2-devel mailing
list to handle posts from a domain with a DMARC Reject/Quarantine
policy enabled. If this is detected then the from address is now
munged.

One side effect of this setting is that the behavior of Reply has
changed.  Instead of being a reply to the poster of the message, the
Reply address is the edk2-devel mailing list.


The behaviour looks somewhat broken, since as far as I can tell,
replies no longer include the person you're replying to.
(This doesn't happen when replying specifically to _you_, because you
cc yourself on everything).


If you wish to send a private reply to only the poster of the message,
you may have to perform some manual steps.

Please let me know if you have any concerns about these changes or if
these configuration changes cause any other side effects.


Can we make sure the person being replied to is at least on cc?
Otherwise, we've just broken the workflow for anyone filtering on
whether they are on "to" or "cc".

Why was this change necessary?

Regards,

Leif





--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.john...@hpe.com



Re: [edk2] PciSegmentInfoLib instances

2018-06-01 Thread Brian J. Johnson

On 05/28/2018 03:36 PM, Bill Paul wrote:

Of all the gin joints in all the towns in all the world, Ni, Ruiyu had to walk
into mine at 19:55 on Sunday 27 May 2018 and say:


No. There is no such instance.

My understanding:
Segment is just to separate the PCI devices to different groups.
Each group of devices use the continuous BUS/IO/MMIO resource.
Each group has a BASE PCIE address that can be used to access PCIE
configuration in MMIO way.


This makes it sound like an either/or design choice that a hardware designer
can make simply for some kind of convenience, and I don't think that's the
case.

Segments typically indicate completely separate host/PCI interfaces. For
example, I've seen older Intel boards with both 32-bit/33MHz slots and
64-bit/66MHz slots. This was not done with a bridge: each set of slots was tied
to a completely separate host PCI bridge and hence each was a separate
segment. This was required in order to support legacy 32-bit/33MHz devices
without forcing the 64-bit/66MHz devices down to 33MHz as well.

With PCIe, on platforms other than Intel, each root complex would also be a
separate segment. Each root complex would have its own bus/dev/func namespace,
its own configuration space access method, and its own portion of the physical
address space into which to map BARs. This means that you could have two or
more different devices with the same bus/dev/func identifier tuple, meaning
they are not unique on a platform-wide basis.

At the hardware level, PCIe may be implemented similarly on Intel too, but
they hide some of the details from you. The major difference is that even in
cases where you may have multiple PCIe channels, they all share the same
bus/dev/func namespace so that you can pretend the bus/dev/func tuples are
unique platform-wide. The case where you would need to advertise multiple
segments arises where there's some technical roadblock that prevents
implementing this illusion of a single namespace in a completely transparent
way.

In the case of the 32-bit/64-bit hybrid design I mentioned above, scanning the
bus starting from bus0/dev0/func0 would only allow you to automatically
discover the 32-bit devices because there was no bridge between the 32-bit and
64-bit spaces. The hardware allows you to issue configuration accesses to both
spaces using the same 0xcf8/0xcfc registers, but in order to autodiscover the
64-bit devices, you needed to know ahead of time to also scan starting at
bus1/dev0/func0. But the only way to know to do that was to check the
advertised segments in the ACPI device table and honor their starting bus
numbers.
  


FWIW, the scalable X64 machines from SGI/HPE implement multiple 
segments, generally one per socket.  That's the only way for us to 
represent a machine of the size we're interested in.  There are multiple 
config space ranges which aren't necessarily located at contiguous 
addresses, only one of which (on the legacy socket) is accessible via 
the legacy 0xcf8/0xcfc mechanism.  So you need to parse the ACPI tables 
to discover all the I/O.


We really wish all code used segments properly... we have had to convert 
a lot of 3rd party code to use PciSegmentLib rather than the old PciLib. 
 Thankfully OSes are in pretty good shape these days (was not always 
the case in the past.)



So with the above understanding, even a platform which has a single segment
can be implemented as a multiple-segment platform.


I would speculate this might only be true on Intel. :) Intel is the only
platform that creates the illusion of a single bus/dev/func namespace for
multiple PCI "hoses," and it only does that for backward compatibility
purposes (i.e. to make Windows happy). Without that gimmick, each segment
would be a separate tree rooted at bus0/dev0/func0, and there wouldn't be much
point to doing that if you only had a single root complex.

-Bill
  

Thanks/Ray


-Original Message-
From: edk2-devel  On Behalf Of Laszlo
Ersek
Sent: Wednesday, May 23, 2018 3:38 PM
To: Ni, Ruiyu 
Cc: edk2-devel-01 
Subject: [edk2] PciSegmentInfoLib instances

Hi Ray,

do you know of any open source, non-Null, PciSegmentInfoLib instance?
(Possibly outside of edk2?)

More precisely, it's not the PciSegmentInfoLib instance itself that's of
particular interest, but the hardware and the platform support code that
offer multiple PCIe segments.

Thanks
Laszlo





--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



Re: [edk2] [PATCH 3/7] HACK: HobLib: workaround infinite loop

2018-03-05 Thread Brian J. Johnson

On 03/05/2018 12:22 PM, Laszlo Ersek wrote:

PEIMs generally "execute in place" (XIP), i.e. they run from flash, not
RAM. In this status they use "temporary RAM" (e.g. CPU caches configured
like RAM) for stack & heap; whatever HOBs they produce are stored in
"temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"
(basically it programs the memory controller and publishes the RAM
ranges). In turn the PEI core "migrates" PEIMs from temporary to
permanent RAM, including the HOB list.

Before the temporary RAM migration (when still executing in place from
flash), PEIMs cannot have writeable global variables. For example,
dynamic PCDs are also maintained in a HOB (the PCD HOB).

A PEIM normally cannot (and shouldn't) tell whether it is dispatched
before or after permanent RAM is published. If needed, a PEIM can
advertise that it depends on permanent RAM (for example because it needs
a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in
its DEPEX.

Finally, it seems like a PEIM can also express, "I'm fine with being
dispatched from both flash (XIP) vs. permanent RAM, just the PEI core
tell me whichever it is". Apparently, if the PEIM is dispatched from
flash (before permanent RAM is available), its call to
RegisterForShadow() returns EFI_SUCCESS (and then its entry point
function will be invoked for a 2nd time, after the temp RAM migration).
And when a PEIM is dispatched from RAM (either for the very first time,
or for the second time, after being dispatched from flash first), the
same call returns EFI_ALREADY_STARTED.

Honestly, I'm unsure what this is good for (both in general, and
specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for
doing the measurements (which makes sense); I just wonder what exactly
it does so that it cannot simply specify
"gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.


I haven't looked at this particular PEIM.  But one case where 
registering for shadowing is useful is for improving performance when 
running from permanent RAM, where writable global variables are 
available.  For instance, when running from flash, a ReportStatusCode 
PEIM may need to go through a slow process to locate an output hardware 
device on every PPI call.  This may involve traversing the HOB list, 
consulting other PPIs, even probing hardware addresses.  But once it's 
shadowed to RAM, it can locate the device once, then cache its address 
in a global.  Not to mention that the code itself is far, far faster 
when run from RAM vs. flash.  (That's probably a key difference between 
a real machine and a VM.)


Also, I've personally written a PEIM which saves a bunch of internal 
state in a HOB, since that's the main writable storage when running from 
flash.  That state includes pointers to other data (in flash.)  Once the 
data is all shadowed to RAM, it updates the HOB to point to the data in 
RAM.  That way it's a lot faster to access the data.


I also have other PEIMs which are constrained (via DEPEX) to run *only* 
from RAM, since they have larger memory requirements than can be 
satisfied from temporary cache-as-RAM.  That's certainly a valid 
technique as well.


RegisterForShadow() is a useful tool for making the most of the 
restricted PEI environment.  And having it standardized like this is, as 
Andrew said, a lot better than the hacks people had to use beforehand.


Thanks,
--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.john...@hpe.com



Re: [edk2] [Patch] MdeModulePkg PeiCore: Add error message to describe PEIM load failure

2018-02-15 Thread Brian J. Johnson

On 02/08/2018 04:31 AM, Laszlo Ersek wrote:

On 02/08/18 11:24, Gao, Liming wrote:

Laszlo:
   An 11-digit (hex) address is big enough to cover most real addresses. It can
align and save on the message length.

Good point, it can go up to 2^44 - 1, covering 16 TB of RAM.


Tiny little 16-banger.  Buy a real computer.  :)  :)

https://www.hpe.com/us/en/product-catalog/servers/mission-critical-x86-servers/pip.hpe-superdome-flex-server.1010323140.html

(With many smilies)
Thanks,
--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.john...@hpe.com



Re: [edk2] [RFC v4 4/6] UefiCpuPkg/CpuExceptionHandlerLib: Add helper to valid memory addresses

2018-01-03 Thread Brian J. Johnson

On 12/28/2017 10:39 PM, Paulo Alcantara wrote:

+  //
+  // Check if paging is disabled
+  //
+  if ((Cr0 & BIT31) == 0) {
+//
+// If CR4.PAE bit is set, then the linear (or physical) address supports
+// only up to 36 bits.
+//
+if (((Cr4 & BIT5) != 0 && (UINT64)LinearAddress > 0xFFFFFFFFFULL) ||
+LinearAddress > 0xFFFFFFFF) {
+  return FALSE;
+}
+
+return TRUE;
+  }


Paulo,

The logic there doesn't look quite right:  if LinearAddress is between 
2^32 and 2^36-1, this code will always return FALSE, even if CR4.PAE is 
set.  Shouldn't it be:


   if ((UINT64)LinearAddress > 0xFFFFFFFFFULL ||
   ((Cr4 & BIT5) == 0 && LinearAddress > 0xFFFFFFFF)) {
 return FALSE;
   }

(I haven't examined all the code in detail, I just happened to notice 
this issue.)


This bug should get fixed before pushing this series.  I also have some 
more general design questions, which shouldn't hold up pushing the 
series, but I think merit some discussion:


This is great code for validating addresses in general, especially when 
guard pages are in use for NULL pointers, stack overflow, etc.  Thanks 
for adding it!  But for [er]sp and [er]bp validation, don't you really 
just want to know if the address is in the expected stack range?  Maybe 
the code which sets up the stack could communicate the valid range to 
CpuExceptionHandlerLib somehow.  It could use special debug register 
values like 
SourceLevelDebugPkg/Library/PeCoffExtraActionLibDebug/PeCoffExtraActionLib.c 
does.  Or perhaps it could use dynamic PCDs (although I don't know that 
it's a good idea to go looking up PCDs in an exception handler.)  Or 
maybe there's a more straightforward way.  It would have to take AP 
stacks into account, and probably SetJump/LongJump as well.  That may or 
may not be simpler than the current code.


More generally, I'd like to see some sort of platform-specific callout 
to further validate addresses.  Not all mapped addresses, or addresses 
up to the architectural limit, are safe to access.  For instance, reads 
to SMRAM outside of SMM will cause exceptions.  Also, we wouldn't want 
to go backtracing through MMIO or MMCFG space:  reads there could 
potentially have side effects on the hardware.


The rules can also vary at different points in boot.  For example, 
before memory is initialized, Intel Xeon processors generally execute 
32-bit code in cache-as-RAM mode, where the caches are jury-rigged to 
operate as temporary storage while the memory setup code is running.  In 
CAR mode, only a few address ranges can be accessed without causing 
machine checks:  the cache-as-RAM range containing the stack, heap, and 
HOB list, the architectural firmware range below 4G, and a few specific 
MMCFG and MMIO ranges.


So I'd like to suggest that you define an AddressValidationLib library 
class, which provides a routine which takes an address (or an address 
range?) and an indication of the intended use (memory read, memory 
write, execute/disassemble code, stack dump, IO, ...), and returns a 
value specifying if the access is:

- safe (IsLinearAddressValid() should return TRUE)
- unsafe (IsLinearAddressValid() should return FALSE)
- unknown (IsLinearAddressValid() should perform its other tests)

You can supply a NULL instance which always returns "unknown" for 
platforms which don't want to perform their own validation.
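As a sketch only, the proposed library class might look like the following; 
every type and function name here is hypothetical, since no such 
AddressValidationLib exists in edk2 today:

```c
#include <stdint.h>

/* Hypothetical AddressValidationLib interface; all names are
   illustrative sketches of the proposal, not an existing library class. */
typedef enum {
  AddressAccessRead,        /* memory read                */
  AddressAccessWrite,       /* memory write               */
  AddressAccessExecute,     /* execute / disassemble code */
  AddressAccessStackDump    /* stack dump                 */
} ADDRESS_ACCESS_TYPE;

typedef enum {
  AddressSafe,     /* IsLinearAddressValid() should return TRUE     */
  AddressUnsafe,   /* IsLinearAddressValid() should return FALSE    */
  AddressUnknown   /* fall back to the generic architectural checks */
} ADDRESS_VALIDATION_RESULT;

/* NULL instance: a platform that performs no validation of its own
   always answers "unknown", so the generic checks still run. */
ADDRESS_VALIDATION_RESULT
ValidateAddressRange (
  uint64_t             Address,
  uint64_t             Length,
  ADDRESS_ACCESS_TYPE  Access
  )
{
  (void)Address; (void)Length; (void)Access;
  return AddressUnknown;
}
```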


Thanks,
--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise
brian.john...@hpe.com



Re: [edk2] [MdeModulePkg/TerminalDxe] Why do we delay 2s for ESC being pressed?

2017-11-28 Thread Brian J. Johnson

On 11/24/2017 01:21 AM, Heyi Guo wrote:

Hi Brian,


在 11/9/2017 12:00 AM, Brian J. Johnson 写道:

On 11/08/2017 07:34 AM, Heyi Guo wrote:



On 11/08/2017 05:07 PM, Gerd Hoffmann wrote:

On Wed, Nov 08, 2017 at 04:44:37PM +0800, Heyi Guo wrote:


在 11/8/2017 4:34 PM, Ni, Ruiyu 写道:

No.
Even a terminal tool can recognize F10, it still needs to 
translate it into "ESC [ V"

and send the three bytes to firmware.

Got it. But the 2 seconds timeout is not for this situation, right? If
terminal tool could translate and send the key sequence, I think it 
can
complete 3 bytes transfer in a very short time, isn't it? E.g. 9600 
baud / 8

= 1200 Bytes/s (ignore control bits).

So 2 seconds timeout is still for user to enter the sequence "ESC [ V"
manually?

No.  A lot of software has this kind of delay because it is recommended
in some classic unix documentation to avoid mis-interpreting incomplete
terminal control sequences coming from slow terminals.

Where a "slow terminal" which actually would need such a long delay 
is a

physical terminal from the 70ies of the last century, or a virtual
terminal hooked up over a *really* slow network connection.

Reducing the delay from 2 seconds to roughly 0.2 seconds should be
pretty safe, things are not that slow any more these days :)

That will be great if we can make such change :)



You'd be surprised how much delay you can get with a few layers of 
firewalls, VPNs, and cross-country (or intercontinental) WAN links. 
That's the advantage of serial:  you can tunnel it anywhere.


Here's a quick workaround:  if you start typing other text after the 
ESC, the terminal driver will see the new keystrokes and resolve the 
ESC immediately, without the delay.  For instance, at the shell 
prompt, type something, press ESC to clear the line, then just start 
typing new text without waiting for the timeout. The line will be 
cleared and the new text will appear correctly, without delay.


On setup screens, I usually hit escape to return to the previous 
screen, then hit one of the arrow keys to cause the ESC to be 
processed without the timeout.  That works pretty well in practice.


I'd think a PCD to control this delay would be appropriate, though.


Can we really use a PCD in TerminalDxe? I remember some experts in the 
community said that TerminalDxe is a pure UEFI driver; it can't use PCD 
since PCD is not defined in UEFI spec.


Thanks,

Gary (Heyi Guo)



Gary,

I'm not sure what the rules are for PCDs.  I'm just saying that if 
there's a good way to make the delay configurable for each platform, I 
wouldn't be against it.  2 seconds is a long time in some contexts.


--
Brian J. Johnson
Enterprise X86 Lab
Hewlett Packard Enterprise
brian.john...@hpe.com



Re: [edk2] [RFC v2 1/3] UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support

2017-11-16 Thread Brian J. Johnson
+  DumpStackTrace (SystemContext, &UnwondStacksCount);
+
+  //
+  // Dump image module names
+  //
+  DumpImageModuleNames (SystemContext);
+
//
-  // Dump module image base and module entry point by RIP
+  // Dump stack contents
//
-  DumpModuleImageInfo (SystemContext.SystemContextX64->Rip);
+  DumpStackContents (SystemContext.SystemContextX64->Rsp, UnwondStacksCount);
  }




--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.john...@hpe.com



Re: [edk2] [RFC 0/1] Stack trace support in X64 exception handling

2017-11-14 Thread Brian J. Johnson

On 11/14/2017 11:23 AM, Andrew Fish wrote:


On Nov 14, 2017, at 8:33 AM, Brian J. Johnson <brian.john...@hpe.com 
<mailto:brian.john...@hpe.com>> wrote:


On 11/14/2017 09:37 AM, Paulo Alcantara wrote:

Hi Fan,
On 14/11/2017 12:03, Fan Jeff wrote:

Paul,

I like this feature very much. Actually, I did some POC one year ago 
but I did not finalize it.


In my POC, I could use EBP to track the stack frame on the IA32 arch.

But for x64, I tried to use the –keepexceptiontable flag to extract the 
stack frame from the debug section of the image.


It may work on the MSFT toolchain, but it did not work well for the GCC 
toolchain.


I think Eric could help to verify MSFT for your patch. If it works 
well, that will be great!


Say again, I like this feature!!! :-)
Cool! Your help would be really appreciated! If we get this working 
for X64 in both toolchains, it should be easy to port to IA-32 
as well.

Thank you very much for being willing to help with that.
Paulo


Great feature!  You do need some sort of sanity check on the RIP and 
RBP values, though, so if the stack gets corrupted or the RIP is 
nonsense from following a bad pointer, you don't start dereferencing 
garbage addresses and trigger an exception loop.




Brian,

This was a long time ago and my memory might be fuzzy, but I think we 
talked to some debugger folks about unwinding the stack, and they 
mentioned it was common for the C runtime to have a return address or 
frame pointer with a zero value so the unwind logic knows when to stop. 
This is in addition to generic sanity checking.


We got an extra push $0 added to the stack switch to help with stack 
unwind.

https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseLib/X64/SwitchStack.S

It might be a good idea to have a PCD for the max number of stack frames 
to display, as a fallback for the error check. For X64 you may also have 
to add a check for a non-canonical address, as that will GP fault.
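The non-canonical-address point can be checked cheaply before dereferencing. 
A generic sketch, assuming 48-bit virtual addressing (the helper name is 
illustrative, not part of the proposed patch):

```c
#include <stdint.h>

/*
 * An x86-64 linear address is canonical (with 48-bit virtual
 * addressing) when bits 63:48 are copies of bit 47. Dereferencing a
 * non-canonical address raises #GP, so the unwinder should check
 * each candidate RBP/RIP first.
 */
static int
IsCanonicalAddress (
  uint64_t  Address
  )
{
  /* Shift bit 47 into the sign position, then sign-extend back down. */
  int64_t Sext = (int64_t)(Address << 16) >> 16;
  return (uint64_t)Sext == Address;
}
```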




Good idea.

Regarding sanity checks:  I've had good luck validating code locations 
(EIP values) by using a modified PeCoffExtraActionLib to track the top 
and bottom of the range where images have been loaded.  (I've actually 
used two ranges:  one for code executed from firmware space, and one for 
code executed from RAM.)


I'm not sure offhand if there's a platform-independent way to validate 
stack pointer values.  For most PC-like systems, just ensuring that it's 
larger than 1 or 2M (to avoid NULL pointers and the legacy spaces) and 
less than about 3G (or the low memory size, if that's known) may be 
enough to avoid an exception loop.


Brian


Thanks,

Andrew Fish


For at least some versions of Microsoft's IA32 compiler, it's possible 
to compile using EBP as a stack frame base pointer (like gcc) by using 
the "/Oy-" switch.  The proposed unwind code should work in that case. 
The X64 compiler doesn't support this switch, though.


AFAIK the only way to unwind the stack with Microsoft's X64 compilers 
is to parse the unwind info in the .pdata and .xdata sections. 
 Genfw.exe usually strips those sections, but the 
"--keepexceptiontable" flag will preserve them, as Jeff pointed out. 
 I've looked hard for open source code to decode them, but haven't 
found any, even though the format is well documented.  And I haven't 
gotten around to writing it myself.  I'd love it if someone could 
contribute the code!


Another possibility is to use the branch history MSRs available on 
some x86-family processors.  Recent Intel processors can use them as a 
stack, as opposed to a circular list, so they can record a backtrace 
directly. (I'm not familiar with AMD processors' capabilities.)  You 
can enable call stack recording like this:


 #define LBR_ON_FLAG   0x0001
 #define IA32_DEBUGCTL 0x1D9
 #define CALL_STACK_SET_FLAG 0x3C4
 #define CALL_STACK_CLR_FLAG 0xFC7
 #define MSR_LBR_SELECT 0x1C8

 //
 // Enable branch recording
 //
 LbControl = AsmReadMsr64 ((UINT32)IA32_DEBUGCTL);
 LbControl |= LBR_ON_FLAG;
 AsmWriteMsr64 ((UINT32)IA32_DEBUGCTL, LbControl);

 //
 // Configure for call stack
 //
 LbSelect = AsmReadMsr64 ((UINT32)MSR_LBR_SELECT);
 LbSelect &= CALL_STACK_CLR_FLAG;
 LbSelect |= CALL_STACK_SET_FLAG;
 AsmWriteMsr64((UINT32)MSR_LBR_SELECT, LbSelect);

The EIP/RIP values are logged in MSR_SANDY_BRIDGE_LASTBRANCH_n_FROM_IP 
and MSR_SANDY_BRIDGE_LASTBRANCH_n_TO_IP, and the current depth is 
tracked in MSR_LASTBRANCH_TOS.  This works quite well.  Gen10 (Sky 
Lake) processors support 32 LASTBRANCH_n MSR pairs, which is 
sufficient in almost all cases.


Different processor generations have different branch recording 
capabilities, and different numbers of LASTBRANCH_n MSRs; see Intel's 
manuals for details.
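Reading the recorded call stack back out follows the TOS MSR. The sketch 
below stubs the MSR read so the walk logic can be shown off-target; the MSR 
indices are the Skylake-era values from Intel's manuals and should be 
treated as assumptions for other processor generations:

```c
#include <stdint.h>

#define MSR_LASTBRANCH_TOS        0x1C9   /* assumed; see Intel SDM      */
#define MSR_LASTBRANCH_0_FROM_IP  0x680   /* Skylake-era base, assumed   */
#define LASTBRANCH_DEPTH          32      /* Sky Lake supports 32 pairs  */

/* Stubbed MSR read so the walk logic can be exercised off-target;
   on real hardware this would be AsmReadMsr64(). */
static uint64_t Msrs[0x700];
static uint64_t ReadMsr (uint32_t Index) { return Msrs[Index]; }

/* Walk the LBR call stack from the newest entry (TOS) backwards,
   copying up to Max FROM_IP values into Out; returns the count copied. */
static int
ReadLbrCallStack (
  uint64_t  *Out,
  int       Max
  )
{
  int  Tos = (int)(ReadMsr (MSR_LASTBRANCH_TOS) & (LASTBRANCH_DEPTH - 1));
  int  Count;

  for (Count = 0; Count < Max && Count < LASTBRANCH_DEPTH; Count++) {
    int      Slot = (Tos - Count + LASTBRANCH_DEPTH) % LASTBRANCH_DEPTH;
    uint64_t From = ReadMsr (MSR_LASTBRANCH_0_FROM_IP + Slot);
    if (From == 0) {
      break;    /* unused slot: end of the recorded stack */
    }
    Out[Count] = From;
  }
  return Count;
}
```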


Thanks,
Brian



Thanks!

Jeff

*From:* Paulo Alcantara <mailto:pca...@zytor.com>
*Sent:* 2017-11-14 21:23
*To:* edk2-devel@lists.01.org <mailto:edk2-devel@lists.01.org> 
<mailto:edk2-devel@lists.01.org>
*Cc:* Rick Bramley <mailto:richard

Re: [edk2] [RFC 1/1] UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support

2017-11-14 Thread Brian J. Johnson
mation. ");

+    return;
+  }
+
+  //
+  // Get PDB file name from initial PE/COFF image
+  //
+  GetPdbFileName (ImageBase, NULL, &PdbFileName);
+
+  //
+  // Print out back trace
+  //
+  InternalPrintMessage ("\nBack trace:\n");
+
+  for (;;) {
+    //
+    // Print stack frame in the following format:
+    //
+    // #<Count> <Rip> @ <ImageBase>+<Offset> (<Rbp>) in <PdbFileName>
+    //
+    InternalPrintMessage (
+  "%d 0x%016lx @ 0x%016lx+0x%x (0x%016lx) in %a\n",
+  *UnwondStacksCount,
+  Rip,
+  ImageBase,
+  Rip - ImageBase - 1,
+  Rbp,
+  PdbFileName
+  );
+
+    //
+    // Set RIP with return address from current stack frame
+    //
+    Rip = *(UINT64 *)((UINTN)Rbp + 8);
+
+    //
+    // Check if RIP is within another PE/COFF image base address
+    //
+    if (Rip < ImageBase) {
+  //
+  // Search for the respective PE/COFF image based on RIP
+  //
+  ImageBase = PeCoffSearchImageBase (Rip);
+  if (ImageBase == 0) {
+    //
+    // Stop stack trace
+    //
+    break;
+  }
+
+  //
+  // Get PDB file name
+  //
+  GetPdbFileName (ImageBase, NULL, &PdbFileName);
+    }
+
+    //
+    // Unwind the stack
+    //
+    Rbp = *(UINT64 *)(UINTN)Rbp;
+
+    //
+    // Increment count of unwond stacks
+    //
+    (*UnwondStacksCount)++;
+  }
+}
+
+/**
   Display CPU information.

   @param ExceptionType  Exception type.
@@ -254,9 +578,25 @@ DumpImageAndCpuContent (
   IN EFI_SYSTEM_CONTEXT   SystemContext
   )
{
+  INTN UnwondStacksCount;
+
+  //
+  // Dump CPU context
+  //
   DumpCpuContext (ExceptionType, SystemContext);
+
+  //
+  // Dump stack trace
+  //
+  DumpStackTrace (SystemContext, &UnwondStacksCount);
+
+  //
+  // Dump image module names
+  //
+  DumpImageModuleNames (SystemContext);
+
   //
-  // Dump module image base and module entry point by RIP
+  // Dump stack contents
   //
-  DumpModuleImageInfo (SystemContext.SystemContextX64->Rip);
+  DumpStackContents (SystemContext.SystemContextX64->Rsp, 
UnwondStacksCount);

}
--
2.11.0




--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise
brian.john...@hpe.com



Re: [edk2] 答复: [RFC 0/1] Stack trace support in X64 exception handling

2017-11-14 Thread Brian J. Johnson

On 11/14/2017 09:37 AM, Paulo Alcantara wrote:

Hi Fan,

On 14/11/2017 12:03, Fan Jeff wrote:

Paul,

I like this feature very much. Actually, I did some POC one year ago 
but I did not finalize it.


In my POC, I could use EBP to track the stack frame on the IA32 arch.

But for x64, I tried to use the –keepexceptiontable flag to extract the 
stack frame from the debug section of the image.


It may work on the MSFT toolchain, but it did not work well for the GCC toolchain.

I think Eric could help to verify MSFT for your patch. If it works 
well, that will be great!


Say again, I like this feature!!! :-)


Cool! Your help would be really appreciated! If we get this working for 
X64 in both toolchains, it should be easy to port to IA-32 as well.


Thank you very much for being willing to help with that.

Paulo


Great feature!  You do need some sort of sanity check on the RIP and RBP 
values, though, so if the stack gets corrupted or the RIP is nonsense 
from following a bad pointer, you don't start dereferencing garbage 
addresses and trigger an exception loop.


For at least some versions of Microsoft's IA32 compiler, it's possible 
to compile using EBP as a stack frame base pointer (like gcc) by using 
the "/Oy-" switch.  The proposed unwind code should work in that case. 
The X64 compiler doesn't support this switch, though.


AFAIK the only way to unwind the stack with Microsoft's X64 compilers is 
to parse the unwind info in the .pdata and .xdata sections.  Genfw.exe 
usually strips those sections, but the "--keepexceptiontable" flag will 
preserve them, as Jeff pointed out.  I've looked hard for open source 
code to decode them, but haven't found any, even though the format is 
well documented.  And I haven't gotten around to writing it myself.  I'd 
love it if someone could contribute the code!


Another possibility is to use the branch history MSRs available on some 
x86-family processors.  Recent Intel processors can use them as a stack, 
as opposed to a circular list, so they can record a backtrace directly. 
(I'm not familiar with AMD processors' capabilities.)  You can enable 
call stack recording like this:


  #define LBR_ON_FLAG   0x0001
  #define IA32_DEBUGCTL 0x1D9
  #define CALL_STACK_SET_FLAG 0x3C4
  #define CALL_STACK_CLR_FLAG 0xFC7
  #define MSR_LBR_SELECT 0x1C8

  //
  // Enable branch recording
  //
  LbControl = AsmReadMsr64 ((UINT32)IA32_DEBUGCTL);
  LbControl |= LBR_ON_FLAG;
  AsmWriteMsr64 ((UINT32)IA32_DEBUGCTL, LbControl);

  //
  // Configure for call stack
  //
  LbSelect = AsmReadMsr64 ((UINT32)MSR_LBR_SELECT);
  LbSelect &= CALL_STACK_CLR_FLAG;
  LbSelect |= CALL_STACK_SET_FLAG;
  AsmWriteMsr64((UINT32)MSR_LBR_SELECT, LbSelect);

The EIP/RIP values are logged in MSR_SANDY_BRIDGE_LASTBRANCH_n_FROM_IP 
and MSR_SANDY_BRIDGE_LASTBRANCH_n_TO_IP, and the current depth is 
tracked in MSR_LASTBRANCH_TOS.  This works quite well.  Gen10 (Sky Lake) 
processors support 32 LASTBRANCH_n MSR pairs, which is sufficient in 
almost all cases.


Different processor generations have different branch recording 
capabilities, and different numbers of LASTBRANCH_n MSRs; see Intel's 
manuals for details.


Thanks,
Brian





Thanks!

Jeff

*From:* Paulo Alcantara 
*Sent:* 2017-11-14 21:23
*To:* edk2-devel@lists.01.org 
*Cc:* Rick Bramley ; Laszlo Ersek 
; Andrew Fish ; Eric 
Dong 
*Subject:* Re: [edk2] [RFC 0/1] Stack trace support in X64 exception 
handling


Hi,

On 14/11/2017 10:47, Paulo Alcantara wrote:

Hi,

This series adds stack trace support during a X64 CPU exception.

Informations like back trace, stack contents and image module names
(that were part of the call stack) will be dumped out.

We already have such support in ARM/AArch64 (IIRC) exception handling
(thanks to Ard), and then I thought we'd also deserve it in X64 and
IA-32 platforms.

What do you think guys?

BTW, I've tested this only with OVMF (X64 only), using:
    - gcc-6.3.0, GCC5, NOOPT

Any other tests would be really appreciated.


I've attached a file to show you how the trace would look like.

Thanks!
Paulo



Thanks!
Paulo

Repo:   https://github.com/pcacjr/edk2.git
Branch: stacktrace_x64

Cc: Rick Bramley 
Cc: Andrew Fish 
Cc: Eric Dong 
Cc: Laszlo Ersek 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Paulo Alcantara 
---

Paulo Alcantara (1):
    UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support

   
UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c 
| 344 +++-

   1 file changed, 342 insertions(+), 2 deletions(-)







--

  

Re: [edk2] [MdeModulePkg/TerminalDxe] Why do we delay 2s for ESC being pressed?

2017-11-08 Thread Brian J. Johnson

On 11/08/2017 07:34 AM, Heyi Guo wrote:



On 11/08/2017 05:07 PM, Gerd Hoffmann wrote:

On Wed, Nov 08, 2017 at 04:44:37PM +0800, Heyi Guo wrote:


在 11/8/2017 4:34 PM, Ni, Ruiyu 写道:

No.
Even a terminal tool can recognize F10, it still needs to translate 
it into "ESC [ V"

and send the three bytes to firmware.

Got it. But the 2 seconds timeout is not for this situation, right? If
terminal tool could translate and send the key sequence, I think it can
complete 3 bytes transfer in a very short time, isn't it? E.g. 9600 
baud / 8

= 1200 Bytes/s (ignore control bits).

So 2 seconds timeout is still for user to enter the sequence "ESC [ V"
manually?

No.  A lot of software has this kind of delay because it is recommended
in some classic unix documentation to avoid mis-interpreting incomplete
terminal control sequences coming from slow terminals.

Where a "slow terminal" which actually would need such a long delay is a
physical terminal from the 70ies of the last century, or a virtual
terminal hooked up over a *really* slow network connection.

Reducing the delay from 2 seconds to roughly 0.2 seconds should be
pretty safe, things are not that slow any more these days :)

That will be great if we can make such change :)



You'd be surprised how much delay you can get with a few layers of 
firewalls, VPNs, and cross-country (or intercontinental) WAN links. 
That's the advantage of serial:  you can tunnel it anywhere.


Here's a quick workaround:  if you start typing other text after the 
ESC, the terminal driver will see the new keystrokes and resolve the ESC 
immediately, without the delay.  For instance, at the shell prompt, type 
something, press ESC to clear the line, then just start typing new text 
without waiting for the timeout.  The line will be cleared and the new 
text will appear correctly, without delay.


On setup screens, I usually hit escape to return to the previous screen, 
then hit one of the arrow keys to cause the ESC to be processed without 
the timeout.  That works pretty well in practice.


I'd think a PCD to control this delay would be appropriate, though.
--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.

2017-10-26 Thread Brian J. Johnson

On 10/25/2017 08:13 PM, Dong, Eric wrote:

Laszlo,



-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
Laszlo Ersek
Sent: Wednesday, October 25, 2017 11:08 PM
To: Dong, Eric ; edk2-devel@lists.01.org
Cc: Ni, Ruiyu ; Paolo Bonzini 
Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
AP initialization logic.

Hi Eric,

On 10/25/17 07:42, Dong, Eric wrote:

Hi Laszlo,

I think I am clear about your raised issues for the OVMF platform. In this 
case, I think your platform is not suited for this code change.  How about 
I make the change below, based on the new code:


-  if (CpuMpData->CpuCount == 0) {
 TimedWaitForApFinish (
   CpuMpData,
   PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
   PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
   );
-  }

After this change, OVMF can still set
PcdCpuApInitTimeOutInMicroSeconds to MAX_UINT32 to keep the old

solution.

For a real platform, it can set a small value to follow the new
solution. With this new change, it just waits one more
PcdCpuApInitTimeOutInMicroSeconds timeout, which should have no

performance

impact (this time may also be spent in the later while
(CpuMpData->MpCpuExchangeInfo->NumApsExecuting != 0) loop) or
little performance impact (if not covered by the later loop).

The suggested change will work for OVMF, thanks!

(Just please don't forget to un-indent the TimedWaitForApFinish() call that
will become unconditional again.)

--o--

Do you think Brian's concern should be investigated as well (perhaps in a
separate Bugzilla entry)?


As Jeff has mentioned, only OVMF can know the existing AP count before
sending the INIT-SIPI-SIPI, so we don't know how many bits are needed for this bitmap.
Even if we could create the bitmap, we still wouldn't know when all APs have
been found (if the begin-map bit values equal the finish-map bit values, we
still can't conclude that all APs have been found, because other APs may not
have started yet).



Right, physical platforms don't usually know the number of CPUs up 
front, so they still need a timeout.  That's unavoidable.  But we've 
seen cases where interrupts aren't getting delivered reliably, so not 
all the expected CPUs start up (based on the core counts in the physical 
sockets, as known by the developing engineer, not the firmware.)  To 
debug this failure, it's very useful to have a list of exactly which 
CPUs did start.


Similarly, we've seen cases where a CPU starts, but crashes due to h/w 
or s/w bugs before signaling the BSP that it has finished.  With only a 
counter to indicate how many CPUs are still running, it's impossible to 
tell which CPUs failed.  That makes debugging much more difficult.


The bitmaps would need to be sized according to the maximum number of 
CPUs supported by the platform (or the maximum APIC ID, depending on how 
they are indexed), like other data structures in the MpCpu code.


Bitmaps aren't the only possible implementation, of course.  My point 
is that it's useful to have a list of which CPUs started executing, and 
which finished.
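A minimal sketch of such started/finished bitmaps (hypothetical, not the 
MpInitLib code; GCC atomic builtins stand in for the `lock bts` an AP would 
really use, and the 256-CPU cap is an example sized from 
PcdCpuMaxLogicalProcessorNumber):

```c
#include <stdint.h>

#define MAX_CPUS  256   /* example; size from PcdCpuMaxLogicalProcessorNumber */

/* Each AP atomically sets its bit when it starts executing and again
   when it finishes, so the BSP can later report exactly which CPUs
   failed to check in. */
static volatile uint64_t StartedMap[MAX_CPUS / 64];
static volatile uint64_t FinishedMap[MAX_CPUS / 64];

static void
MarkCpu (
  volatile uint64_t  *Map,
  uint32_t           ApicId
  )
{
  __atomic_fetch_or (&Map[ApicId / 64], 1ULL << (ApicId % 64),
                     __ATOMIC_SEQ_CST);
}

/* Return the APIC ID of the first CPU that started but never
   finished, or -1 if every started CPU also finished. */
static int
FirstHungCpu (void)
{
  for (uint32_t i = 0; i < MAX_CPUS / 64; i++) {
    uint64_t Hung = StartedMap[i] & ~FinishedMap[i];
    if (Hung != 0) {
      return (int)(i * 64 + __builtin_ctzll (Hung));
    }
  }
  return -1;
}
```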



Thanks,
Eric


Because, I believe, the scheduling pattern that I described earlier could be
possible on physical platforms as well, at least in theory:


(2) After at least one AP has started running, the logic expects
"NumApsExecuting" to monotonically grow for a while, and then to
monotonically decrease, back to zero. For example, if we have 1 BSP
and
7 APs, the BSP logic more or less expects the following values in
"NumApsExecuting":

1; 2; 3; 4; 5; 6; 7;
6; 5; 4; 3; 2; 1; 0


While this may be a valid expectation for physical processors (which
actually run in parallel, in the physical world), in a virtual machine, it is 
not

guaranteed.

Dependent on hypervisor scheduling artifacts, it is possible that,
say, three APs start up *and finish* before the remaining four APs
start up *at all*. The INIT-SIPI-SIPI marks all APs for execution /
scheduling in the hypervisor, yes, but the actual code execution may
commence a lot later. For example, the BSP may witness the following

series of values in "NumApsExecuting":


1; 2; 3;
2; 1; 0;
1; 2; 3; 4;
3; 2; 1; 0

and the BSP could think that there are 3 APs only, when it sees the
first 0 value.


I don't know if such a pattern is "likely", "unlikely", or "extremely unlikely" 
on
physical platforms. I think the pattern is "theoretically possible" at least.

I suggest that we go ahead with the small patch for the OVMF use case first,
and then open a new BZ if Brian thinks there's a real safety problem with the
code.



We also noticed this theoretical problem when we implemented this change, but
we think it rarely happens on physical platforms. We think the value of this
change far outweighs not making it because of this theoretical
problem.


I strongly disagree.  Any chance for it to happen on physical platforms 
is too 

Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.

2017-10-24 Thread Brian J. Johnson

On 10/24/2017 12:40 PM, Laszlo Ersek wrote:

Hi Eric,

On 10/24/17 17:23, Dong, Eric wrote:

Laszlo,


-Original Message-
From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Tuesday, October 24, 2017 6:16 PM
To: Dong, Eric ; edk2-devel@lists.01.org
Cc: Ni, Ruiyu ; Paolo Bonzini 
Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
AP initialization logic.

CC Paolo

On 10/23/17 09:22, Eric Dong wrote:



diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
index 976af1f..bdfe0d3 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
@@ -40,4 +40,5 @@ EnableExecuteDisableLocation  equ  LockLocation + 30h
  Cr3Location                   equ  LockLocation + 34h
  InitFlagLocation              equ  LockLocation + 38h
  CpuInfoLocation               equ  LockLocation + 3Ch
+NumApsExecutingLocation       equ  LockLocation + 40h

diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 1b9c6a6..2b6c27d 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -86,6 +86,12 @@ Flat32Start:   ; protected 
mode

entry point


  movesi, ebx

+; Increment the number of APs executing here as early as possible
+; This is decremented in C code when AP is finished executing
+movedi, esi
+addedi, NumApsExecutingLocation
+lock inc   dword [edi]
+
  mov edi, esi
  add edi, EnableExecuteDisableLocation
  cmp byte [edi], 0
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index db923c9..48f930b 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -662,6 +662,7 @@ ApWakeupFunction (
  // AP finished executing C code
  //
  InterlockedIncrement ((UINT32 *) &CpuMpData->FinishedCount);
+    InterlockedDecrement ((UINT32 *) &CpuMpData->MpCpuExchangeInfo->NumApsExecuting);

  //
  // Place AP is specified loop mode @@ -765,6 +766,7 @@
FillExchangeInfoData (

ExchangeInfo->CFunction   = (UINTN) ApWakeupFunction;
ExchangeInfo->ApIndex = 0;
+  ExchangeInfo->NumApsExecuting = 0;
ExchangeInfo->InitFlag= (UINTN) CpuMpData->InitFlag;
ExchangeInfo->CpuInfo = (CPU_INFO_IN_HOB *) (UINTN)

CpuMpData->CpuInfoInHob;

ExchangeInfo->CpuMpData   = CpuMpData;
@@ -934,13 +936,19 @@ WakeUpAP (
  }
  if (CpuMpData->InitFlag == ApInitConfig) {
//
-  // Wait for all potential APs waken up in one specified period
+  // Wait for one potential AP waken up in one specified period
//
-  TimedWaitForApFinish (
-CpuMpData,
-PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
-PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
-);
+  if (CpuMpData->CpuCount == 0) {
+TimedWaitForApFinish (
+  CpuMpData,
+  PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
+  PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
+  );
+  }


I don't understand this change. The new comment says,

   Wait for *one* potential AP waken up in one specified period

However, the second parameter of TimedWaitForApFinish(), namely
"FinishedApLimit", gets the same value as before:

   PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1

It means that all of the (possible) APs are waited-for, just the same
as before.


[[Eric]] This patch changes the AP-count collection logic. The original
solution always waits a specific time to let all APs start up, until either
the input CPU number (PcdGet32(PcdCpuMaxLogicalProcessorNumber) - 1)
has been found or a specific time (PcdGet32
(PcdCpuApInitTimeOutInMicroSeconds)) has elapsed. The BSP then stops
waiting and uses CpuMpData->CpuCount as the found AP count.

The new logic also waits a specific time, but this time is smaller than
the original one. It just waits for the first AP (any AP) to begin its
initialization (doing CpuMpData->MpCpuExchangeInfo->NumApsExecuting++ means
it has begun initialization). When an AP finishes initialization, it
does CpuMpData->MpCpuExchangeInfo->NumApsExecuting--. So after the BSP
waits a specific time at first, it just needs to check whether
CpuMpData->MpCpuExchangeInfo->NumApsExecuting == 0 to know whether all
APs have finished initialization. Here we still use the original PCD
(PcdCpuApInitTimeOutInMicroSeconds) for the new time value.

When an AP does its initialization, it also does
CpuMpData->CpuCount++. So here I check whether CpuMpData->CpuCount != 0
to know whether APs have already begun initialization. If yes, I don't
need to do the timeout waiting anymore, just check
CpuMpData->MpCpuExchangeInfo->NumApsExecuting to know whether all APs
have 

Re: [edk2] Improvements to build system etc. for edk2-platforms devel-MinnowBoard3?

2017-02-23 Thread Brian J. Johnson
Sorry if I'm bikeshedding...  NUMBER_OF_PROCESSORS isn't a good default 
for those of us who build on servers with hundreds of threads available. 
 The OS, disks, and build.exe/build.py become bottlenecks.  Maybe we 
could put a cap (say, 20) on the default thread limit, so it uses 
NUMBER_OF_PROCESSORS or 20 threads, whichever is less.


Or just set a small, fixed number of threads by default and document 
better how to change it, as others suggested.
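The capped default could be computed along these lines (a sketch only; the 
cap of 20 is the example value from above, and `DefaultBuildThreads` is a 
hypothetical helper, not an existing build-tool setting):

```c
#include <stdlib.h>

/* Default build-thread count: use the NUMBER_OF_PROCESSORS Windows
   environment variable, clamped to the range [1, 20] so that builds
   on very large servers don't swamp the OS and disks. */
static int
DefaultBuildThreads (void)
{
  const char  *Env = getenv ("NUMBER_OF_PROCESSORS");
  int         N    = (Env != NULL) ? atoi (Env) : 1;

  if (N < 1) {
    N = 1;    /* unset or malformed: fall back to a single thread */
  }
  if (N > 20) {
    N = 20;   /* cap so build.exe/build.py don't become the bottleneck */
  }
  return N;
}
```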


Brian

On 02/22/2017 08:49 PM, Wei, David wrote:

Yes, as Brian said, at this stage,  patches are welcomed, including the python 
script.  You can also file bugs on https://bugzilla.tianocore.org/ for issue 
track and discussion.  Please remember to CC stakeholders.

Specifically for the multi-thread building, I think maybe the 
NUMBER_OF_PROCESSORS Windows environment variable could be used as the 
proper thread number.

Thanks,
David  Wei


-Original Message-
From: Richardson, Brian
Sent: Thursday, February 23, 2017 3:10 AM
To: Rebecca Cran <rebe...@bluestop.org>; Gao, Liming <liming@intel.com>; 
edk2-devel@lists.01.org
Cc: Lu, ShifeiX A <shifeix.a...@intel.com>; Zimmer, Vincent <vincent.zim...@intel.com>; 
Andrew Fish <af...@apple.com>; Wei, David <david@intel.com>
Subject: RE: [edk2] Improvements to build system etc. for edk2-platforms 
devel-MinnowBoard3?

We're ok with help fixing issues (yay open source), so thanks for the help. 
Patches are welcome at this time. But we do track them in Bugzilla, so opening 
an issue there is the first step to a solution.

For the processor thread setting, note that we have historically disabled 
processor threading by default because we don't know the build system 
configuration. While we at Intel prefer everyone own an Intel(R) Core(TM) i7 or 
Intel(R) Xeon processor, we know that's not the case ... so keeping it disabled 
has been seen as a safe option. We should definitely do a better job of 
documenting that setting change, but I think we need to consider generically 
changing that setting by default in target.txt (which requires a Bugzilla 
entry).

The .bat/.sh files are required to trigger post-build tools, which are OS 
dependent. Even if the build was triggered by a Python script, we still need to 
do some verification to make sure there are no functional differences when we 
build in a Windows environment versus Linux (work in progress).

Thanks ... br
---
Brian Richardson, Senior Technical Marketing Engineer, Intel Software
brian.richard...@intel.com -- @intel_Brian (Twitter & WeChat)
https://software.intel.com/en-us/meet-the-developers/evangelists/team/brian-richardson

-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Rebecca 
Cran
Sent: Wednesday, February 22, 2017 11:53 AM
To: Richardson, Brian <brian.richard...@intel.com>; Gao, Liming 
<liming@intel.com>; edk2-devel@lists.01.org
Cc: Lu, ShifeiX A <shifeix.a...@intel.com>; Zimmer, Vincent <vincent.zim...@intel.com>; 
Andrew Fish <af...@apple.com>; Wei, David <david@intel.com>
Subject: Re: [edk2] Improvements to build system etc. for edk2-platforms 
devel-MinnowBoard3?

On 2/22/2017 9:34 AM, Richardson, Brian wrote:

Thanks for the input. For future reference, you can use the TianoCore
Bugzilla to report issues on any EDK II feature/platform.
https://github.com/tianocore/tianocore.github.io/wiki/Reporting-Issues

I agree the readme.md should be present, and use markup instead of plain text 
to work better in github. You can open an issue on this in Bugzilla.

Normally, we ask folks to change the number of processor threads based on their 
system configuration. We don't set a larger thread count by default, but it 
might be good to set it to '5' by default (assuming a dual-core processor with 
hyperthreading) instead of '1' (assuming a single-core system w/o threading). I 
don't know if this will cause any compatibility issues on older systems, but 
it's worth a check.

At this time, the MinnowBoard 3 build is only validated on Windows. That's why 
there is no equivalent .sh file for BuildBIOS yet; it will be added once the 
Linux build is verified and checked in.


I'm more about _fixing_ issues I find rather than reporting them!  Are you 
saying that patches wouldn't be welcome just now?  Is there a reason why you 
don't want to make full use of the CPU while building?  And I understand that 
MinnowBoard 3 only builds under Windows at the moment, but if more of it were 
built using Python (which is already listed as a prerequisite in the ReadMe 
file), the porting might be simpler.

--
Rebecca
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel




--
Brian J. Johnson
Enterpris

Re: [edk2] [patch 2/8] FatPkg\EnhancedFatDxe: Initialize variable after declaration

2016-12-20 Thread Brian J. Johnson

On 12/08/2016 05:11 PM, Andrew Fish wrote:

On Dec 8, 2016, at 9:27 AM, Kurt Kennett <kurt.kenn...@microsoft.com> wrote:

This seems kind of silly.
Why isn't this just const data?  This adds code and memory accesses that are 
worthless and happen on every call to the function.


K2,

I think this is an attempt to conform to the coding standard?


I just went looking through the C Coding Standards document 
(https://github.com/tianocore-docs/Docs/raw/master/Specifications/CCS_2_1_Draft.pdf) 
and couldn't actually find this requirement.  Which really surprises 
me... I must not be using the right search terms.


Could someone point out the specific section covering this requirement? 
I have somebody asking me about it internally here.


Sorry to be both dense and pedantic,
--

    Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 2/3] MdeModulePkg/TerminalDxe: Optimize TtyTerm cursor motion

2016-10-27 Thread Brian J. Johnson

On 10/26/2016 10:09 PM, Kinney, Michael D wrote:

Tian Feng,

Unfortunately, this patch that was pushed to edk2/master today
breaks on IA32 VS2015x86 builds with a signed/unsigned mismatch
on 3 lines.  I think the right fix might be:

diff --git a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c 
b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
index e9b5ed0..9625f4d 100644
--- a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
+++ b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
@@ -790,13 +790,13 @@ TerminalConOutSetCursorPosition (
   // it isn't necessary.
   //
   if (TerminalDevice->TerminalType == TTYTERMTYPE &&
-  Mode->CursorRow == Row) {
-if (Mode->CursorColumn > Column) {
+  (UINTN)Mode->CursorRow == Row) {
+if ((UINTN)Mode->CursorColumn > Column) {
   mCursorBackwardString[FW_BACK_OFFSET + 0] = (CHAR16) ('0' + 
((Mode->CursorColumn - Column) / 10));
   mCursorBackwardString[FW_BACK_OFFSET + 1] = (CHAR16) ('0' + 
((Mode->CursorColumn - Column) % 10));
   String = mCursorBackwardString;
 }
-else if (Column > Mode->CursorColumn) {
+else if (Column > (UINTN)Mode->CursorColumn) {
   mCursorForwardString[FW_BACK_OFFSET + 0] = (CHAR16) ('0' + ((Column - 
Mode->CursorColumn) / 10));
   mCursorForwardString[FW_BACK_OFFSET + 1] = (CHAR16) ('0' + ((Column - 
Mode->CursorColumn) % 10));
   String = mCursorForwardString;

Please see if you can reproduce this issue.

Thanks,

Mike


The changes look fine to me.  Unfortunately, I don't have that compiler 
to test with.


If it's needed,
Reviewed-by: Brian Johnson <bjohn...@sgi.com>




-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Brian J. 
Johnson
Sent: Friday, October 7, 2016 7:54 AM
To: edk2-devel@lists.01.org
Cc: Brian J. Johnson <bjohn...@sgi.com>; Tian, Feng <feng.t...@intel.com>; 
Zeng, Star
<star.z...@intel.com>
Subject: [edk2] [PATCH 2/3] MdeModulePkg/TerminalDxe: Optimize TtyTerm cursor 
motion

For TtyTerm terminals, output a shorter escape sequence when possible
to move the cursor within the current line, and don't print any escape
sequence if the cursor is already at the correct position.  This
removes extra cursor motion activity at the EFI shell prompt,
improving performance.  It also makes it possible in many cases to
successfully use a terminal window which is taller than the driver's
mode setting (eg. 80x25.)

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Brian Johnson <bjohn...@sgi.com>
Cc: Feng Tian <feng.t...@intel.com>
Cc: Star Zeng <star.z...@intel.com>
---
 .../Universal/Console/TerminalDxe/Terminal.h   |  2 ++
 .../Universal/Console/TerminalDxe/TerminalConOut.c | 36 +++---
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
b/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
index 269d2ae..3ee3969 100644
--- a/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
+++ b/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
@@ -2,6 +2,7 @@
   Header file for Terminal driver.

 Copyright (c) 2006 - 2014, Intel Corporation. All rights reserved.
+Copyright (C) 2016 Silicon Graphics, Inc. All rights reserved.
 This program and the accompanying materials
 are licensed and made available under the terms and conditions of the BSD 
License
 which accompanies this distribution.  The full text of the license may be 
found at
@@ -157,6 +158,7 @@ typedef union {
 #define BACKGROUND_CONTROL_OFFSET 11
 #define ROW_OFFSET                2
 #define COLUMN_OFFSET             5
+#define FW_BACK_OFFSET            2

 typedef struct {
   UINT16  Unicode;
diff --git a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
index b11e83f..e9b5ed0 100644
--- a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
+++ b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
@@ -83,6 +83,8 @@ CHAR16 mSetModeString[]= { ESC, '[', '=', '3', 
'h', 0 };
 CHAR16 mSetAttributeString[]   = { ESC, '[', '0', 'm', ESC, '[', '4', '0', 
'm',
ESC, '[', '4', '0', 'm', 0 };
 CHAR16 mClearScreenString[]= { ESC, '[', '2', 'J', 0 };
 CHAR16 mSetCursorPositionString[]  = { ESC, '[', '0', '0', ';', '0', '0', 'H', 
0 };
+CHAR16 mCursorForwardString[]  = { ESC, '[', '0', '0', 'C', 0 };
+CHAR16 mCursorBackwardString[] = { ESC, '[', '0', '0', 'D', 0 };

 //
 // Body of the ConOut functions
@@ -755,6 +757,7 @@ TerminalConOutSetCursorPosition (
   UINTN   MaxRow;
   EFI_STATUS  Status;
   TERMINAL_DEV*TerminalDevice;
+  CHAR16  *String;

   TerminalDevice = TERMINAL_CON_OUT_DEV_FROM_THIS (This);

@@ -782,13 +785,36 @@ TerminalConOutSetCursorPosition

Re: [edk2] [PATCH 0/3] MdeModulePkg/TerminalDxe: TtyTerm improvements

2016-10-26 Thread Brian J. Johnson

On 10/18/2016 10:34 AM, Brian J. Johnson wrote:

On 10/14/2016 03:37 PM, Laszlo Ersek wrote:

On 10/14/16 21:39, Brian J. Johnson wrote:

On 10/12/2016 03:17 AM, Ryan Harkin wrote:

On 7 October 2016 at 16:59, Leif Lindholm <leif.lindh...@linaro.org>
wrote:

Roy can now be found at Roy Franz <roy.fr...@hpe.com> (cc:d).

On Fri, Oct 07, 2016 at 05:56:26PM +0200, Laszlo Ersek wrote:

Roy, Ryan,

On 10/07/16 16:53, Brian J. Johnson wrote:

This patch series implements some improvements to the TtyTerm
terminal
type in the TerminalDxe driver.  It fixes an end case with cursor
position tracking, and uses that to optimize cursor motion escape
sequences.  It also adds support for the page up, page down, insert,
home, and end keys on some additional common terminal emulators.

The result is improved performance, especially at the shell prompt,
and better compatibility with common terminal emulators.  In
particular, as a side effect of the optimized cursor motion,
terminal
windows which are taller than the current mode setting (eg. 25
lines)
work much better than before.

Most of these fixes have been in production in some form on SGI's
servers for years.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Brian Johnson <bjohn...@sgi.com>
Cc: Feng Tian <feng.t...@intel.com>
Cc: Star Zeng <star.z...@intel.com>

Brian J. Johnson (3):
  MdeModulePkg/TerminalDxe: Improve TtyTerm cursor position tracking
  MdeModulePkg/TerminalDxe: Optimize TtyTerm cursor motion
  MdeModulePkg/TerminalDxe: Handle more keys with TtyTerm

 .../Universal/Console/TerminalDxe/Terminal.h   |  2 +
 .../Universal/Console/TerminalDxe/TerminalConIn.c  | 24 +++--
 .../Universal/Console/TerminalDxe/TerminalConOut.c | 61
--
 3 files changed, 79 insertions(+), 8 deletions(-)



can you please provide feedback (testing or otherwise) on this
series?



Well, they "work" for me and I'd be happy with them being submitted.

Tested-by: Ryan Harkin <ryan.har...@linaro.org>

The only curious effect I can see is the Print(L"xxx"); lines that
expect the \n to be missing will no longer "work".  For example, I
carry a patch by Daniil Egranov titled
"IntelFrameworkModulePkg/BdsDxe: Show boot timeout message" and it no
longer displays the countdown on the same line each time, it prints
each message on a new line.

However, I don't see that as a blocking point, Daniil's patch could be
changed easily and there are other advantages to this series that make
it worthwhile, IMO, eg, Shell commands with lots of output (like
"help" or "dir fs0:") no longer create an awful mess on the serial
console.



So, is this just waiting for a maintainer's review?


That's my understanding, yes.


Feng or Star, could you please review these changes?


Ping?
--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 2/5] MdePkg: Include Shell/ShellDynamicCommand/ShellParameters definitions

2016-10-18 Thread Brian J. Johnson
Yes, git automatically detects copying and moving files, as long as both 
halves of the action (remove + add) are in the same commit.  There's a 
nice explanation here:


https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository#_git_mv
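This is easy to demonstrate in a throwaway repository (file names below are illustrative stand-ins for the ShellBase.h/Shell.h move discussed in the thread):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

echo "typedef int SHELL_STATUS;" > ShellBase.h
git add ShellBase.h
git commit -qm "add ShellBase.h"

# 'git mv' is just a staged delete+add; git re-detects the rename from
# content similarity when history is viewed.
git mv ShellBase.h Shell.h
git commit -qm "move ShellBase.h to Shell.h"

# --follow walks the history across the rename, so both commits appear.
git log --follow --oneline -- Shell.h | wc -l
```

The final command prints 2, showing that the pre-move commit is still reachable through the renamed file.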

Brian

On 10/18/2016 12:59 AM, Kinney, Michael D wrote:

Use 'git mv' command.

Mike


-Original Message-
From: Ni, Ruiyu
Sent: Tuesday, October 18, 2016 1:57 PM
To: Carsey, Jaben ; edk2-devel@lists.01.org
Cc: Kinney, Michael D ; Yao, Jiewen

Subject: RE: [edk2] [PATCH 2/5] MdePkg: Include
Shell/ShellDynamicCommand/ShellParameters definitions

I don't want to lose the file history either.
I thought git can auto-detect the copy/move of files. I will investigate 
further.

Thanks/Ray


-Original Message-
From: Carsey, Jaben
Sent: Tuesday, October 18, 2016 2:02 AM
To: Ni, Ruiyu ; edk2-devel@lists.01.org
Cc: Kinney, Michael D ; Yao, Jiewen
; Carsey, Jaben 
Subject: RE: [edk2] [PATCH 2/5] MdePkg: Include
Shell/ShellDynamicCommand/ShellParameters definitions

Can we do a GIT move operation and then merge the content from
ShellBase.h please?  I do not want to lose the file history information.

When we move by "delete" then "add" we lose everything for no reason.

-Jaben


-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
Ruiyu Ni
Sent: Friday, October 14, 2016 2:44 AM
To: edk2-devel@lists.01.org
Cc: Carsey, Jaben ; Kinney, Michael D
; Yao, Jiewen 
Subject: [edk2] [PATCH 2/5] MdePkg: Include
Shell/ShellDynamicCommand/ShellParameters definitions
Importance: High

Copy Shell/ShellDynamicCommand/ShellParameters definitions from
ShellPkg to MdePkg.
Content of ShellBase.h is moved to Protocol/Shell.h.

The following patches will update ShellPkg to reference all protocol
definition to MdePkg.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ruiyu Ni 
Cc: Jaben Carsey 
Cc: Jiewen Yao 
Cc: Michael D Kinney 
---
 MdePkg/Include/Protocol/Shell.h   | 1268
+
 MdePkg/Include/Protocol/ShellDynamicCommand.h |   85 ++
 MdePkg/Include/Protocol/ShellParameters.h |   60 ++
 MdePkg/MdePkg.dec |   15 +
 4 files changed, 1428 insertions(+)
 create mode 100644 MdePkg/Include/Protocol/Shell.h
 create mode 100644 MdePkg/Include/Protocol/ShellDynamicCommand.h
 create mode 100644 MdePkg/Include/Protocol/ShellParameters.h

diff --git a/MdePkg/Include/Protocol/Shell.h
b/MdePkg/Include/Protocol/Shell.h
new file mode 100644
index 000..cc1fbdc
--- /dev/null
+++ b/MdePkg/Include/Protocol/Shell.h
@@ -0,0 +1,1268 @@
+/** @file
+  EFI Shell protocol as defined in the UEFI Shell 2.0 specification including
errata.
+
+  (C) Copyright 2014 Hewlett-Packard Development Company, L.P.
+  Copyright (c) 2006 - 2016, Intel Corporation. All rights reserved.
+  This program and the accompanying materials
+  are licensed and made available under the terms and conditions of the

BSD

License
+  which accompanies this distribution.  The full text of the license may be
found at
+  http://opensource.org/licenses/bsd-license.php
+
+  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS"
BASIS,
+  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER
EXPRESS OR IMPLIED.
+
+**/
+
+#ifndef __EFI_SHELL_PROTOCOL__
+#define __EFI_SHELL_PROTOCOL__
+
+#include 
+
+#define EFI_SHELL_PROTOCOL_GUID \
+  { \
+  0x6302d008, 0x7f9b, 0x4f30, { 0x87, 0xac, 0x60, 0xc9, 0xfe, 0xf5, 0xda,

0x4e

} \
+  }
+typedef VOID *SHELL_FILE_HANDLE;
+
+typedef enum {
+  ///
+  /// The operation completed successfully.
+  ///
+  SHELL_SUCCESS   = 0,
+
+  ///
+  /// The image failed to load.
+  ///
+  SHELL_LOAD_ERROR= 1,
+
+  ///
+  /// The parameter was incorrect.
+  ///
+  SHELL_INVALID_PARAMETER = 2,
+
+  ///
+  /// The operation is not supported.
+  ///
+  SHELL_UNSUPPORTED   = 3,
+
+  ///
+  /// The buffer was not the proper size for the request.
+  ///
+  SHELL_BAD_BUFFER_SIZE   = 4,
+
+  ///
+  /// The buffer was not large enough to hold the requested data.
+  /// The required buffer size is returned in the appropriate
+  /// parameter when this error occurs.
+  ///
+  SHELL_BUFFER_TOO_SMALL  = 5,
+
+  ///
+  /// There is no data pending upon return.
+  ///
+  SHELL_NOT_READY = 6,
+
+  ///
+  /// The physical device reported an error while attempting the
+  /// operation.
+  ///
+  SHELL_DEVICE_ERROR  = 7,
+
+  ///
+  /// The device cannot be written to.
+  ///
+  SHELL_WRITE_PROTECTED   = 8,
+
+  ///
+  /// The resource has run out.
+  ///
+  SHELL_OUT_OF_RESOURCES  

Re: [edk2] [PATCH 0/3] MdeModulePkg/TerminalDxe: TtyTerm improvements

2016-10-14 Thread Brian J. Johnson

On 10/12/2016 03:17 AM, Ryan Harkin wrote:

On 7 October 2016 at 16:59, Leif Lindholm <leif.lindh...@linaro.org> wrote:

Roy can now be found at Roy Franz <roy.fr...@hpe.com> (cc:d).

On Fri, Oct 07, 2016 at 05:56:26PM +0200, Laszlo Ersek wrote:

Roy, Ryan,

On 10/07/16 16:53, Brian J. Johnson wrote:

This patch series implements some improvements to the TtyTerm terminal
type in the TerminalDxe driver.  It fixes an end case with cursor
position tracking, and uses that to optimize cursor motion escape
sequences.  It also adds support for the page up, page down, insert,
home, and end keys on some additional common terminal emulators.

The result is improved performance, especially at the shell prompt,
and better compatibility with common terminal emulators.  In
particular, as a side effect of the optimized cursor motion, terminal
windows which are taller than the current mode setting (eg. 25 lines)
work much better than before.

Most of these fixes have been in production in some form on SGI's
servers for years.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Brian Johnson <bjohn...@sgi.com>
Cc: Feng Tian <feng.t...@intel.com>
Cc: Star Zeng <star.z...@intel.com>

Brian J. Johnson (3):
  MdeModulePkg/TerminalDxe: Improve TtyTerm cursor position tracking
  MdeModulePkg/TerminalDxe: Optimize TtyTerm cursor motion
  MdeModulePkg/TerminalDxe: Handle more keys with TtyTerm

 .../Universal/Console/TerminalDxe/Terminal.h   |  2 +
 .../Universal/Console/TerminalDxe/TerminalConIn.c  | 24 +++--
 .../Universal/Console/TerminalDxe/TerminalConOut.c | 61 --
 3 files changed, 79 insertions(+), 8 deletions(-)



can you please provide feedback (testing or otherwise) on this series?



Well, they "work" for me and I'd be happy with them being submitted.

Tested-by: Ryan Harkin <ryan.har...@linaro.org>

The only curious effect I can see is the Print(L"xxx"); lines that
expect the \n to be missing will no longer "work".  For example, I
carry a patch by Daniil Egranov titled
"IntelFrameworkModulePkg/BdsDxe: Show boot timeout message" and it no
longer displays the countdown on the same line each time, it prints
each message on a new line.

However, I don't see that as a blocking point, Daniil's patch could be
changed easily and there are other advantages to this series that make
it worthwhile, IMO, eg, Shell commands with lots of output (like
"help" or "dir fs0:") no longer create an awful mess on the serial
console.



So, is this just waiting for a maintainer's review?
--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] [PATCH 2/3] MdeModulePkg/TerminalDxe: Optimize TtyTerm cursor motion

2016-10-07 Thread Brian J. Johnson
For TtyTerm terminals, output a shorter escape sequence when possible
to move the cursor within the current line, and don't print any escape
sequence if the cursor is already at the correct position.  This
removes extra cursor motion activity at the EFI shell prompt,
improving performance.  It also makes it possible in many cases to
successfully use a terminal window which is taller than the driver's
mode setting (eg. 80x25.)

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Brian Johnson 
Cc: Feng Tian 
Cc: Star Zeng 
---
 .../Universal/Console/TerminalDxe/Terminal.h   |  2 ++
 .../Universal/Console/TerminalDxe/TerminalConOut.c | 36 +++---
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h 
b/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
index 269d2ae..3ee3969 100644
--- a/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
+++ b/MdeModulePkg/Universal/Console/TerminalDxe/Terminal.h
@@ -2,6 +2,7 @@
   Header file for Terminal driver.
 
 Copyright (c) 2006 - 2014, Intel Corporation. All rights reserved.
+Copyright (C) 2016 Silicon Graphics, Inc. All rights reserved.
 This program and the accompanying materials
 are licensed and made available under the terms and conditions of the BSD 
License
 which accompanies this distribution.  The full text of the license may be 
found at
@@ -157,6 +158,7 @@ typedef union {
 #define BACKGROUND_CONTROL_OFFSET 11
 #define ROW_OFFSET                2
 #define COLUMN_OFFSET             5
+#define FW_BACK_OFFSET            2
 
 typedef struct {
   UINT16  Unicode;
diff --git a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c 
b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
index b11e83f..e9b5ed0 100644
--- a/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
+++ b/MdeModulePkg/Universal/Console/TerminalDxe/TerminalConOut.c
@@ -83,6 +83,8 @@ CHAR16 mSetModeString[]= { ESC, '[', '=', '3', 
'h', 0 };
 CHAR16 mSetAttributeString[]   = { ESC, '[', '0', 'm', ESC, '[', '4', '0', 
'm', ESC, '[', '4', '0', 'm', 0 };
 CHAR16 mClearScreenString[]= { ESC, '[', '2', 'J', 0 };
 CHAR16 mSetCursorPositionString[]  = { ESC, '[', '0', '0', ';', '0', '0', 'H', 
0 };
+CHAR16 mCursorForwardString[]  = { ESC, '[', '0', '0', 'C', 0 };
+CHAR16 mCursorBackwardString[] = { ESC, '[', '0', '0', 'D', 0 };
 
 //
 // Body of the ConOut functions
@@ -755,6 +757,7 @@ TerminalConOutSetCursorPosition (
   UINTN   MaxRow;
   EFI_STATUS  Status;
   TERMINAL_DEV*TerminalDevice;
+  CHAR16  *String;
 
   TerminalDevice = TERMINAL_CON_OUT_DEV_FROM_THIS (This);
 
@@ -782,13 +785,36 @@ TerminalConOutSetCursorPosition (
   //
   // control sequence to move the cursor
   //
-  mSetCursorPositionString[ROW_OFFSET + 0]= (CHAR16) ('0' + ((Row + 1) / 
10));
-  mSetCursorPositionString[ROW_OFFSET + 1]= (CHAR16) ('0' + ((Row + 1) % 
10));
-  mSetCursorPositionString[COLUMN_OFFSET + 0] = (CHAR16) ('0' + ((Column + 1) 
/ 10));
-  mSetCursorPositionString[COLUMN_OFFSET + 1] = (CHAR16) ('0' + ((Column + 1) 
% 10));
+  // Optimize cursor motion control sequences for TtyTerm.  Move
+  // within the current line if possible, and don't output anything if
+  // it isn't necessary.
+  //
+  if (TerminalDevice->TerminalType == TTYTERMTYPE &&
+  Mode->CursorRow == Row) {
+if (Mode->CursorColumn > Column) {
+  mCursorBackwardString[FW_BACK_OFFSET + 0] = (CHAR16) ('0' + 
((Mode->CursorColumn - Column) / 10));
+  mCursorBackwardString[FW_BACK_OFFSET + 1] = (CHAR16) ('0' + 
((Mode->CursorColumn - Column) % 10));
+  String = mCursorBackwardString;
+}
+else if (Column > Mode->CursorColumn) {
+  mCursorForwardString[FW_BACK_OFFSET + 0] = (CHAR16) ('0' + ((Column - 
Mode->CursorColumn) / 10));
+  mCursorForwardString[FW_BACK_OFFSET + 1] = (CHAR16) ('0' + ((Column - 
Mode->CursorColumn) % 10));
+  String = mCursorForwardString;
+}
+else {
+  String = L"";  // No cursor motion necessary
+}
+  }
+  else {
+mSetCursorPositionString[ROW_OFFSET + 0]= (CHAR16) ('0' + ((Row + 1) / 
10));
+mSetCursorPositionString[ROW_OFFSET + 1]= (CHAR16) ('0' + ((Row + 1) % 
10));
+mSetCursorPositionString[COLUMN_OFFSET + 0] = (CHAR16) ('0' + ((Column + 
1) / 10));
+mSetCursorPositionString[COLUMN_OFFSET + 1] = (CHAR16) ('0' + ((Column + 
1) % 10));
+String = mSetCursorPositionString;
+  }
 
   TerminalDevice->OutputEscChar   = TRUE;
-  Status = This->OutputString (This, mSetCursorPositionString);
+  Status = This->OutputString (This, String);
   TerminalDevice->OutputEscChar = FALSE;
 
   if (EFI_ERROR (Status)) {
-- 
2.7.4

___
edk2-devel mailing list
edk2-devel@lists.01.org

Re: [edk2] What is the right way to print a UINTN?

2016-09-27 Thread Brian J. Johnson

On 09/27/2016 11:47 AM, Andrew Fish wrote:



On Sep 27, 2016, at 9:03 AM, Cohen, Eugene <eug...@hp.com> wrote:


Printing UINTN with %x *or* with %d are equally bugs.

For X64 / AARCH64 / IA64 builds, they are actual bugs (that happen to
work most of the time).


Feel free to file a Bugzilla on the extensive usage of this in
edk2 [ducking and running]. :)


I'm envisioning having to create a slide in the future for UEFI
training about the proper use of UINTNs and describing "If you think
it may exceed 2^32-1 then upcast to UINT64, otherwise don't worry
about it" and it makes me squirm.


It makes me squirm too. I think the slide should recommend the
casting
that I proposed. ;) "There is no conversion specifier dedicated to
UINTN; the portable way to print it is to cast it to UINT64, then print
it with %Lx."


This is reasonable although I expect to get asked why a lot of the
other code doesn't adhere to this recommendation.



I think this is a historical artifact. The older version of %x in
the EDK (and early edk2) implied UINTN. We hit an issue with C
integer math resulting in an int and that seemed to bork some
toolchains. That is when things changed from UINTN to int. I guess
the cleanup was practical vs. pedantic.


Thanks for the historical context, Andrew.  It's interesting to hear,
if very unfortunate.

I've written code in the past which uses a #defined value for the
UINTN format character as a way to work around this issue without
casting everything to 64 bits.  Something like:

// Format string for a naturally-sized unsigned integer
#if defined (MDE_CPU_IA32)
#define UINTN_FMT  "0x%08x"
#elif defined (MDE_CPU_X64)
#define UINTN_FMT  "0x%016lx"
#elif ...
  ...
#endif

UINTN Val;
Val = Foo ();
DEBUG((DEBUG_INFO, "Value is " UINTN_FMT "\n", Val));


I guess it's a matter of opinion if that's preferable to adding casts;
in my particular situation, I had to print values with that particular
format string in a lot of places, so it was convenient to #define it
once.



Thanks,

Andrew Fish


Thanks,

Eugene
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel




--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH] MdeModulePkg/PciBusDxe: make OPROM BAR degradation X64 specific

2016-09-15 Thread Brian J. Johnson

On 09/15/2016 10:48 AM, Andrew Fish wrote:



On Sep 15, 2016, at 8:28 AM, Leif Lindholm  wrote:

On Thu, Sep 15, 2016 at 03:54:56PM +0100, Ard Biesheuvel wrote:

The 'universal' PCI bus driver in MdeModulePkg contains a quirk to
degrade 64-bit PCI MMIO BARs to 32-bit in the presence of an option
ROM on the same PCI controller.

This quirk is highly specific to not just the X64 architecture in general,
but to the PC platform in particular, given that only X64 platforms that
require legacy PC BIOS compatibility require it. However, making the
quirk dependent on the presence of the legacy BIOS protocol met with
resistance, due to the fact that it introduces a dependency on the
IntelFrameworkModulePkg package.

So instead, make the quirk dependent on whether we are compiling for the
X64 architecture, by putting the #ifdef MDE_CPU_X64/#endif preprocessor
directives around it.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel 


I'm happy with the change, but it may be worth putting a comment in


Do we have to use an #ifdef? Seems like it should be a PCD. Is this for legacy 
BIOS Option ROM support, or for existing EFI ROMs?

Thanks,

Andrew Fish


I was going to say the same thing... this sounds like exactly the sort 
of thing PCDs are meant for.  And as Leif's comment says, not all X64 
platforms have legacy support or want BAR degradation (ours don't.)


Brian Johnson




the code to the extent that:
---
In order to reliably support legacy Option ROMs, which may not be
64-bit safe, X64 needs to ensure PCI MMIO BARs are forced down to use
only the lower 32 bits.
Other architectures, or indeed X64 without legacy option ROM support,
can safely leave these in normal operation mode (and may require so in
order to function).
---

For what it's worth:
Reviewed-by: Leif Lindholm 


---

Since nobody proposed any alternatives to the solution I proposed,
let's just make the quirk disappear on architectures that have no
use for it.

Note that forcing non-X64 architectures to install a driver that
produces the IncompatiblePciDevice protocol only to make the PCI
bus driver behave normally (and which, incidentally, can only be
produced once per platform, making it difficult to provide a default
implementation that can be widely reused) is not acceptable IMO.

If anything, this should require an IncompatiblePlatform protocol
that informs the PCI bus driver that a special quirk is required,
but I am aware that this is difficult to accomplish without breaking
backward compatibility. Hence this compromise.

MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c 
b/MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c
index b0632d53b82b..b4771a822d65 100644
--- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c
+++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c
@@ -1052,6 +1052,7 @@ DegradeResource (
  IN PCI_RESOURCE_NODE *PMem64Node
  )
{
+#ifdef MDE_CPU_X64
  PCI_IO_DEVICE*PciIoDevice;
  LIST_ENTRY   *ChildDeviceLink;
  LIST_ENTRY   *ChildNodeLink;
@@ -1101,6 +1102,7 @@ DegradeResource (
}
ChildDeviceLink = ChildDeviceLink->ForwardLink;
  }
+#endif

  //
  // If firmware is in 32-bit mode,
--
2.7.4


___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel




--

  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH] PcAtChipsetPkg AcpiTimerLib: Get more accurate TSC Frequency

2016-08-11 Thread Brian J. Johnson

On 08/10/2016 09:37 PM, Star Zeng wrote:

Minimize the code overhead between the two TSC reads by adding
new internal API to calculate TSC Frequency instead of reusing
MicroSecondDelay ().


Why not just use the MSR_PLATFORM_INFO (0xce) register bits 15..8, on 
CPUs where it's available, to read the TSC frequency directly?

--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [Patch] UefiCpuPkg/MtrrLib: Remove MTRRs display from MtrrSetAllMtrrs()

2016-07-13 Thread Brian J. Johnson

On 07/13/2016 11:19 AM, Laszlo Ersek wrote:

On 07/13/16 17:46, Kinney, Michael D wrote:

Laszlo,

I agree that the DEBUG() messages for this are very valuable to debug
MTRR cache settings.

Another option is to add logic to detect if the calling CPU is the BSP or
not and only invoke DEBUG() macros if the caller is the BSP.  That would
not require any changes to the MtrrLib APIs or calling code.


I thought of that, but it would (I believe) introduce a dependency on
the MP protocol / PPI (for the WhoAmI() member), and that seems too much
for a base library like MtrrLib.

I guess two new library instances could be created: one for PEIM and
another for DXE phase client modules. The MtrrLib functions could check
if the MP PPI / protocol were available. Negative answer: assume we are
running on the BSP, and debug-log away. Positive answer: call WhoAmI(),
and act accordingly.

This would be usable -- I think -- for general thread-safety concerns,
not just debugging, but it's quite a bit of churn.

... Although, I'm not even fully sure whether (WhoAmI() == 0) can be
taken as "BSP". Given SwitchBSP() especially.

Do you have better ideas for internally determining whether the
processor is an AP or a BSP? Something with the APIC IDs perhaps?

Thanks
Laszlo



There's the BSP flag (bit 8) in the MSR_IA32_APIC_BASE register.  It 
doesn't seem to be exposed nicely by BaseXApicX2ApicLib, but it's easy 
enough to read directly.
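A minimal sketch of that check (names are ours; the MSR read itself, AsmReadMsr64 (0x1B) in edk2, is factored out so the bit test stays pure):

```c
#include <stdint.h>
#include <assert.h>

/* Test the BSP flag in a raw MSR_IA32_APIC_BASE value.
   Bit 8 is set only on the bootstrap processor. */
static int
IsBspFromApicBase (uint64_t ApicBaseMsr)
{
  return (int)((ApicBaseMsr >> 8) & 1);
}
```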

--

        Brian J. Johnson





Re: [edk2] PCIe hotplug into downstream port

2016-06-30 Thread Brian J. Johnson

On 06/30/2016 11:47 AM, Laszlo Ersek wrote:

On 06/30/16 18:39, Marcel Apfelbaum wrote:

On 06/30/2016 07:21 PM, Marcel Apfelbaum wrote:

On 06/30/2016 04:07 PM, Laszlo Ersek wrote:

Hi,

does anyone on this list have experience with $SUBJECT, using physical
UEFI firmware derived from edk2?

The problem we're seeing with OVMF is the following: PCIe hotplug works
with PCIe root ports, but it doesn't work with PCIe downstream ports.
The error messages reported by both Linux and Windows guests indicate
"lack of resources" for the device being plugged.

In my understanding:

* Each PCIe downstream port qualifies as a separate bus / bridge.

* Bridges need IO and MMIO resources for themselves. When a device is
behind a bridge at boot, the device's IO and MMIO BARs are carved out
of the bridge's IO and MMIO apertures. For IO, the standard bridge
aperture size is 4096 (byte-sized) ports.

* If no device behind a bridge needs a specific type of resource (for
example, no devices behind a bridge have IO BARs), then that kind of
aperture is not necessary for the bridge either.

A special case of this is when there are no devices behind a bridge
at all. Example: PCIe downstream port empty at boot. In this case,
the bridge needs no resources at all, at boot.

* Different firmwares seem to follow different policies in the last
case (-> bridge empty at boot). SeaBIOS for example allocates the
0x1000 IO ports nonetheless. The PCI bus driver in edk2 does not: it
allocates no apertures for bridges that have no devices behind them
at boot.

This appears to be a trade-off: if the bridge remains empty for the
entire lifetime of the OS, then any IO ports that have been allocated
in advance will be in vain (wasted). However, if the aperture is not
allocated at boot, then hotplug under the bridge may not work --
resources for the device being hot-plugged cannot be allocated from
nonexistent bridge apertures.



Hi Laszlo,
I fully agree with your assessment.

  From the firmware point of view it seems like a waste to allocate an
IO/MEM range for an unused PCI/PCIe bridge indeed. Especially IO.

But what about MEM?
Let's look at this special case: PCIe downstream ports on 64-bit systems.

1. The PCIe spec requires devices to work correctly even without enabling
their IO BARs.
2. 64-bit memory space is pretty large; decent CPUs support 46-bit
addresses, so even if the system uses terabytes of RAM and has dozens of
switches, we still have enough unused MEM space.

My point is we can skip assigning IO for PCIe ports but assign a 4K MEM
range in 64-bit memory for them without being too greedy.


I meant 2M MEM range, sorry.  4k is the min range for IO, of course.




One thing to watch out for is prefetchable vs. non-prefetchable MEM 
ranges.  On Intel chipsets, the high (64-bit) MMIO range is always 
prefetchable, which mainly makes it useful for frame buffers, 
accelerator cards with RAM mapped to MMIO in the host, and the like. 
Only the low (32-bit) MMIO space is non-prefetchable, and suitable for 
registers and things which need precise ordering of loads and stores.


So you can demote a prefetchable BAR into non-prefetchable MMIO space if 
necessary, but not the other way around.  Which is a shame, because the 
32-bit MMIO space is much more limited in size than the 64-bit space.



I've just been looking at the values reserved by SeaBIOS [src/fw/pciinit.c]:

#define PCI_BRIDGE_MEM_MIN(1<<21)  // 2M == hugepage size
#define PCI_BRIDGE_IO_MIN  0x1000  // mandated by pci bridge spec

2MB sounds perfectly fine for MEM, especially given that OVMF supports a
64-bit MMIO aperture (32GB in size by default).
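As a quick sanity check on those numbers (our arithmetic, assuming OVMF's default 32 GiB 64-bit aperture and a fixed 2 MiB reservation per empty bridge):

```c
#include <stdint.h>
#include <assert.h>

/* How many empty hot-plug bridges fit if each one reserves a
   fixed MEM window out of the 64-bit aperture. */
static uint64_t
MaxReservableBridges (uint64_t ApertureBytes, uint64_t PerBridgeBytes)
{
  return ApertureBytes / PerBridgeBytes;
}
```

That comes out to 16384 bridges' worth of headroom, so the 2 MiB reservation is nowhere near a constraint.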

We discussed the IO size earlier -- I think 4K is overkill (despite
being required by the standard). We considered half K, 1K, 2K earlier...
I think I'll take a stab at half K first.


It depends on what your hardware (physical or virtual) supports.  A 
standards-compliant bridge may not be able to map I/O ports to targets 
on less than a 4k boundary.  Yes, 4k is ridiculous overkill, but it may 
be your only choice, depending on the details of the h/w and/or VMM.
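The 4 KiB floor follows from the Type 1 header layout: a bridge's I/O Base/Limit registers only encode address bits 15:12. A decode sketch for the 16-bit I/O addressing case (function names are ours):

```c
#include <stdint.h>
#include <assert.h>

/* Decode a PCI-to-PCI bridge I/O window.  Only the upper nibble of
   each register is programmable (address bits 15:12), which forces
   4 KiB alignment and granularity on the window. */
static uint32_t
BridgeIoWindowBase (uint8_t IoBaseReg)
{
  return (uint32_t)(IoBaseReg & 0xF0) << 8;
}

static uint32_t
BridgeIoWindowLimit (uint8_t IoLimitReg)
{
  return ((uint32_t)(IoLimitReg & 0xF0) << 8) | 0xFFF;
}
```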


We've had to add special cases in BIOS for certain cards we support 
which request IO space, but don't actually use it.  We can safely skip 
allocating IO ports to those cards, saving significant space in the 
overall IO port map.  We've worked with the card vendors on cleaning up 
their PCIe advertisements and eliminating the use of IO ports, but it 
takes time.

--

Brian



  "My software never has bugs.  It just develops random features."
   -- Quoted by Khan Tran


Re: [edk2] PCIe hotplug into downstream port

2016-06-30 Thread Brian J. Johnson

On 06/30/2016 09:35 AM, Laszlo Ersek wrote:

On 06/30/16 16:24, Brian J. Johnson wrote:

On 06/30/2016 08:07 AM, Laszlo Ersek wrote:



- Does PCIe hotplug into downstream ports work with any (phys) firmware
forked from edk2? Brian, Samer, do you guys have experience with this?


Yes, we support it on our UV server line.  Physically, hotplug is only
possible in expansion chassis designed for that purpose.  (That's not a
problem with virtual machines, of course.)  But we've had to do quite a
bit of work on the host bridge driver to allow scaling to the sizes
of systems we support (hundreds of sockets and PCIe slots, dozens of TB
of RAM), including reserving resources for hotplug only in the specific
cases where it's possible.


Thanks a lot for your answer! Could you perhaps check for me if your
platform firmware implements EFI_PCI_HOT_PLUG_INIT_PROTOCOL, highlighted
by Andrew?



Yes, we do.  We have a customized implementation of this protocol which 
provides padding values optimized for the expansion chassis we support.

--

Brian J. Johnson





Re: [edk2] PCIe hotplug into downstream port

2016-06-30 Thread Brian J. Johnson

On 06/30/2016 08:07 AM, Laszlo Ersek wrote:

Hi,

does anyone on this list have experience with $SUBJECT, using physical
UEFI firmware derived from edk2?

The problem we're seeing with OVMF is the following: PCIe hotplug works
with PCIe root ports, but it doesn't work with PCIe downstream ports.
The error messages reported by both Linux and Windows guests indicate
"lack of resources" for the device being plugged.

In my understanding:

* Each PCIe downstream port qualifies as a separate bus / bridge.

* Bridges need IO and MMIO resources for themselves. When a device is
   behind a bridge at boot, the device's IO and MMIO BARs are carved out
   of the bridge's IO and MMIO apertures. For IO, the standard bridge
   aperture size is 4096 (byte-sized) ports.

* If no device behind a bridge needs a specific type of resource (for
   example, no devices behind a bridge have IO BARs), then that kind of
   aperture is not necessary for the bridge either.

   A special case of this is when there are no devices behind a bridge
   at all. Example: PCIe downstream port empty at boot. In this case,
   the bridge needs no resources at all, at boot.

* Different firmwares seem to follow different policies in the last
   case (-> bridge empty at boot). SeaBIOS for example allocates the
   0x1000 IO ports nonetheless. The PCI bus driver in edk2 does not: it
   allocates no apertures for bridges that have no devices behind them
   at boot.

   This appears to be a trade-off: if the bridge remains empty for the
   entire lifetime of the OS, then any IO ports that have been allocated
   in advance will be in vain (wasted). However, if the aperture is not
   allocated at boot, then hotplug under the bridge may not work --
   resources for the device being hot-plugged cannot be allocated from
   nonexistent bridge apertures.

And that's what we are experiencing with OVMF.


Laszlo,

I believe that's an accurate summary of PCIe behavior.



My questions are:

- Is the above behavior of the edk2 PCI bus driver intentional?


I'm not familiar with the history of the code...  But any PCI bridge 
which wants to support hotplug needs resources allocated to it at boot 
time.  As you say, that's a trade-off between wasting resources 
(especially I/O ports) and allowing hotplug.  The authors of the code 
must have opted for avoiding waste vs. allowing hotplug.




- Does PCIe hotplug into downstream ports work with any (phys) firmware
   forked from edk2? Brian, Samer, do you guys have experience with this?


Yes, we support it on our UV server line.  Physically, hotplug is only 
possible in expansion chassis designed for that purpose.  (That's not a 
problem with virtual machines, of course.)  But we've had to do quite a 
bit of work on the host bridge driver to allow scaling to the sizes 
of systems we support (hundreds of sockets and PCIe slots, dozens of TB 
of RAM), including reserving resources for hotplug only in the specific 
cases where it's possible.




- What could be the difference between root ports and downstream ports?
   (Hotplug into root ports seems to work fine.) Are OSes entitled to
   allocate any unused address space (MMIO and IO) right when a device is
   hot-plugged into a root port?


I'm not sure what the OS rules for this are
--

Brian



   "The national budget must be balanced.  The public debt must be
reduced; the arrogance of the authorities must be moderated and
controlled.  Payments to foreign governments must be reduced, if
the nation doesn't want to go bankrupt.  People must again learn
to work, instead of living on public assistance."
  -- Marcus Tullius Cicero, 55 BC


Re: [edk2] StdLib usage for drivers?

2016-06-23 Thread Brian J. Johnson

On 06/22/2016 02:38 PM, Marvin Häuser wrote:

Dear EDK2 developers,

For an experimental project, I'm currently attempting to write a
library wrapper for the disassembler library 'Capstone' in a similar
manner to CryptoPkg's OpensslLib. ...


Marvin,

If you only need IA32/X64, the Udis86 library 
(http://udis86.sourceforge.net/) may be a good alternative.  I was able 
to write a similar UEFI library wrapper for it quite easily.  I think 
all it needed were integer types, varargs, memset, snprintf, and 
vsnprintf, along with a few other small fixes (like 64-bit shifts and 
replacing "%s" with "%a" in format strings.)
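That format-string fix-up can be mechanical. An illustrative sketch (not the actual wrapper code) rewriting CRT-style specifiers for edk2's AsciiPrint(), where "%a" prints CHAR8* ASCII strings and "%s" expects CHAR16*:

```c
#include <string.h>
#include <assert.h>

/* Rewrite "%s" to "%a" in place.  A real version would also need
   to skip literal "%%" escapes; this sketch ignores that case. */
static void
FixAsciiFormatSpecifiers (char *Fmt)
{
  char  *P = Fmt;

  while ((P = strstr (P, "%s")) != NULL) {
    P[1] = 'a';
    P += 2;
  }
}
```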

--

        Brian J. Johnson





Re: [edk2] [PATCH v2 3/3] MdeModulePkg/TerminalDxe: Set polling rate by serial IO mode

2016-03-29 Thread Brian J. Johnson
, and reduce the interval by 1/3,
+  // to make polling rate fast enough to avoid serial input truncation.
+  return DivU64x64Remainder (
+           FifoDepth * Mode->DataBits * 1000 * 2,
+           Mode->BaudRate * 3,
+           NULL
+           );
+}

6. Same question. Can you use the original equation #1?
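For reference, the quoted expression reduces to the following host-side arithmetic (a sketch; the unit of the result depends on scaling done by the caller, which is not visible in the quoted hunk):

```c
#include <stdint.h>
#include <assert.h>

/* Mirror of DivU64x64Remainder (FifoDepth * DataBits * 1000 * 2,
   BaudRate * 3, NULL).  The "* 2 / 3" pair implements the
   "reduce the interval by 1/3" step mentioned in the comment. */
static uint64_t
KeyboardTimerInterval (uint64_t FifoDepth, uint64_t DataBits, uint64_t BaudRate)
{
  return (FifoDepth * DataBits * 1000 * 2) / (BaudRate * 3);
}
```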


+
+
+/**
+  Update period of polling timer event.
+
+  @param  TerminalDeviceThe terminal device to update.
+**/
+VOID
+UpdatePollingRate (
+  IN TERMINAL_DEV *TerminalDevice
+  )
+{
+  UINT64  NewInterval;
+  EFI_STATUS  Status;
+
+  NewInterval = GetKeyboardTimerInterval (TerminalDevice->SerialIo->Mode);
+
+  if (TerminalDevice->KeyboardTimerInterval == NewInterval) {
+return;
+  }
+
+  Status = gBS->SetTimer (
+  TerminalDevice->TimerEvent,
+  TimerPeriodic,
+  NewInterval
+  );
+  ASSERT_EFI_ERROR (Status);
+
+  TerminalDevice->KeyboardTimerInterval = NewInterval;
+}
+
+
+/**
Timer handler to poll the key from serial.

@param  EventIndicates the event that
invoke this function.
@@ -560,6 +625,9 @@ TerminalConInTimerHandler (
TerminalDevice->SerialInTimeOut = SerialInTimeOut;
  }
}
+
+  UpdatePollingRate (TerminalDevice);

7. All the parameters (baud rate, data bits, stop bits, etc.) are stored
in the UART device path.
Any change to these parameters triggers the UART device path
reinstallation, which ultimately triggers the Terminal driver's restart.
So the polling rate doesn't need to be updated here.


How about FIFO depth change?

Thanks for your review :)

Heyi




+
//
// Check whether serial buffer is empty.
// Skip the key transfer loop only if the SerialIo protocol
instance
--
2.7.0




--

Brian J. Johnson





Re: [edk2] [patch] MdeModulePkg: Make HII configuration settings available to OS runtime

2016-02-23 Thread Brian J. Johnson

On 02/22/2016 09:00 PM, Dong, Eric wrote:


Hi All,

Thanks for your comments. I've added my explanation below:

Hooking only the ReadyToBoot event is not enough. Different drivers may
hook this event, and some may update string packages in their callback
functions. The order in which these callbacks are invoked is random, so
hooking only this event may miss some changes.




Eric,

Could you enable your HII exporting callbacks in a ReadyToBoot event?
Then you wouldn't needlessly pay the export cost each time a HII element
is added or changed as drivers are loaded.


Brian,

Do you mean we first export the HII data at the ReadyToBoot event, and
after that also hook all HII Database change actions (including add
package / remove package / update package / HiiSetString)?
If we only export the HII data at the ReadyToBoot event, that is not
enough, because the HII data may be changed later (through add package /
remove package)?

Thanks,
Eric


Yes.  Good summary.

Thanks,
Brian




This feature is described in UEFI spec 2.6 chapter 31.2.11.1. It requires
exporting all HiiDatabase data and the current configuration data.


I don't see any language in that section that makes it required. It seems to 
just describe how to do it.



Agreed.  And section 31.2.1.10 says:

=
When it is desired that the forms data be used in the presence of an
O/S, this specification describes
a means by which to support this capability. By being able to
encapsulate the data and export it
through standard means such that an O/S agent (e.g. forms
browser/processor) can retrieve it, O/S-
present usage models can be made available for further value-add
implementations.
=

The wording implies that exporting is not always desired, and that
O/S-present usage of forms data is a value-add feature, which a platform
may choose to support or not.


We add this feature because we strongly encourage the export of config data
to support manageability of platform config both for pre-OS and OS runtime
operations.




Generally the UEFI spec and the edk2 don't try to force platform policy. For
example, exposing some configuration information could be considered a
security vulnerability on some platforms, so it should not be forced on a
platform.


Thanks,

Andrew Fish


Also, we collected performance data from our reference platform (details
below). The boot time impact is small and we can ignore it. This feature
costs an extra 869 KB, but we think the size cost is acceptable because we
currently use a 16 MB BIOS flash.


HiiDatabase driver data size cost (this feature is added in the HiiDatabase driver):
Without this feature:
Used data size: 151 KB
With this feature:
Used data size: 1020 KB
PS: This extra size cost mainly depends on how many HII packages are in your
platform.

Boot Time cost:
Without this feature:
Boot Time cost: 1st: 3.353s  2nd: 3.035s
With this feature:
Boot Time cost: 1st: 3.125s  2nd: 3.126s

Based on the above analysis, we provide this design and prefer to always
enable this feature instead of using a PCD to control it.

Thanks,
Eric



My particular concern involves a simulator (far slower than hardware)
for an extremely large, complex server platform with a large amount of
HII data.  On this platform, we end up disabling the largest but least
used setup screens when running on the simulator, simply to avoid the
time overhead of processing the HII data.  This saves many minutes of
execution time.


-Original Message-
From: El-Haj-Mahmoud, Samer [mailto:samer.el-haj-mahm...@hpe.com]
Sent: Wednesday, February 17, 2016 2:14 AM
To: Andrew Fish; Brian J. Johnson
Cc: Bi, Dandan; edk2-devel@lists.01.org; Dong, Eric; Gao, Liming
Subject: RE: [edk2] [patch] MdeModulePkg: Make HII configuration settings 
available to OS runtime

+1

I also would add there may be some HII strings that are hidden from user 
interfaces, and reflect settings for field service or
troubleshooting, and that a mass export to the OS may expose these settings to 
OS runtime code and possibly applications.



-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew 
Fish
Sent: Tuesday, February 16, 2016 10:37 AM
To: Brian J. Johnson <bjohn...@sgi.com>
Cc: Dandan Bi <dandan...@intel.com>; edk2-devel@lists.01.org; Eric Dong 
<eric.d...@intel.com>; Liming Gao

<liming@intel.com>

Subject: Re: [edk2] [patch] MdeModulePkg: Make HII configuration settings 
available to OS runtime



On Feb 16, 2016, at 8:33 AM, Brian J. Johnson <bjohn...@sgi.com> wrote:

On 02/16/2016 12:58 AM, Dandan Bi wrote:

This feature is aimed at allowing the OS to make use of the HII database
at runtime. In this case, the contents of the HII Database are exported
to a buffer. The pointer to the buffer is placed in the EFI System
Configuration Table, where it can be retrieved by an OS application.

Export the configuration data and contents 

Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic PciHostBridgeDxe driver.

2016-02-22 Thread Brian J. Johnson

On 02/22/2016 11:44 AM, Laszlo Ersek wrote:

On 02/22/16 18:21, Brian J. Johnson wrote:


Here's another example of a bare metal machine with multiple PCI roots,
although they do not share resources (SGI UV1000, edited for brevity):


[snip an incredible amount of devices]

Does this supercomputer fit in a van? :)



It's just a little 6-socket system we use as a build server.  You should 
see the list on a 256-socket, 4-rack, 64TB machine.  :)


Most of those "devices" are built in to the CPU sockets.  Intel likes to 
expose their configuration registers, including useful things like power 
and temperature monitors, via PCI config space.  Xeons produce rather 
long lspci output.



...
I think (with my admittedly limited PCI "expertise", quotes justified)
that the above driver and library class are a really good abstraction.
(I can praise it; it's not my design. :))

For OVMF we need a few tweaks in the driver code / assumptions (about
non-overlapping MMIO and ioport apertures), but otherwise it looks very
promising to us.

Thanks
Laszlo


Thanks for the summary, I appreciate it.



Thanks,
Brian




  >
  > Regards,
  > Ray
  >
  >
  >> -Original Message-
  >> From: Marcel Apfelbaum [mailto:marcel.apfelb...@gmail.com]
  >> Sent: Monday, February 8, 2016 6:56 PM
  >> To: Ni, Ruiyu <ruiyu...@intel.com>; Laszlo Ersek <ler...@redhat.com>
  >> Cc: Justen, Jordan L <jordan.l.jus...@intel.com>;
edk2-de...@ml01.01.org;
  >> Tian, Feng <feng.t...@intel.com>; Fan, Jeff <jeff@intel.com>
  >> Subject: Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic
  >> PciHostBridgeDxe driver.
  >>
  >> Hi,
  >>
  >> I am sorry for the noise, I am re-sending this mail from an e-mail
address
  >> subscribed to the list.
  >>
  >> Thanks,
  >> Marcel
  >>
  >> On 02/08/2016 12:41 PM, Marcel Apfelbaum wrote:
  >>> On 02/06/2016 09:09 AM, Ni, Ruiyu wrote:
  >>>> Marcel,
  >>>> Please see my reply embedded below.
  >>>>
  >>>> On 2016-02-02 19:07, Laszlo Ersek wrote:
  >>>>> On 02/01/16 16:07, Marcel Apfelbaum wrote:
  >>>>>> On 01/26/2016 07:17 AM, Ni, Ruiyu wrote:
  >>>>>>> Laszlo,
  >>>>>>> I now understand your problem.
  >>>>>>> Can you tell me why OVMF needs multiple root bridges support?
  >>>>>>> My understanding to OVMF is it's a firmware which can be used
in a
  >>>>>>> guest VM
  >>>>>>> environment to boot OS.
  >>>>>>> Multiple root bridges requirement currently mainly comes from
  >> high-end
  >>>>>>> servers.
  >>>>>>> Do you mean that the VM guest needs to be like a high-end
server?
  >>>>>>> This may help me to think about the possible solution to your
problem.
  >>>>>> Hi Ray,
  >>>>>>
  >>>>>> Laszlo's explanation is very good, this is not exactly about
high-end VMs,
  >>>>>> we need the extra root bridges to match assigned devices to their
  >>>>>> corresponding NUMA node.
  >>>>>>
  >>>>>> Regarding the OVMF issue, the main problem is that the extra root
  >>>>>> bridges are created dynamically
  >>>>>> for the VMs (command line parameter) and their resources are
  >> computed on
  >>>>>> the fly.
  >>>>>>
  >>>>>> Not directly related to the above, the optimal way to allocate
resources
  >>>>>> for PCI root bridges
  >>>>>> sharing the same PCI domain is to sort devices MEM/IO ranges
from the
  >>>>>> biggest to smallest
  >>>>>> and use this order during allocation.
  >>>>>>
  >>>>>> After the resources allocation is finished we can build the CRS
for each
  >>>>>> PCI root bridge
  >>>>>> and pass it back to firmware/OS.
  >>>>>>
  >>>>>> While for "real" machines we can hard-code the root bridge
resources in
  >>>>>> some ROM and have it
  >>>>>> extracted early in the boot process, for the VM world this would
not be
  >>>>>> possible. Also
  >>>>>> any effort to divide the resources range before the resource
allocation
  >>>>>> would be odd and far from optimal.
  >>
  >> Hi Ray,
  >> Thank you for your response,
  >>
  >>>> Real machine uses hard-code resources for root bridges. But when

Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic PciHostBridgeDxe driver.

2016-02-22 Thread Brian J. Johnson
CRS
for each
 >>>>>> PCI root bridge
 >>>>>> and pass it back to firmware/OS.
 >>>>>>
 >>>>>> While for "real" machines we can hard-code the root bridge
resources in
 >>>>>> some ROM and have it
 >>>>>> extracted early in the boot process, for the VM world this would
not be
 >>>>>> possible. Also
 >>>>>> any effort to divide the resources range before the resource
allocation
 >>>>>> would be odd and far from optimal.
 >>
 >> Hi Ray,
 >> Thank you for your response,
 >>
 >>>> Real machine uses hard-code resources for root bridges. But when the
 >> resource
 >>>> cannot meet certain root bridges' requirement, firmware can save
the real
 >> resource
 >>>> requirement per root bridges to NV storage and divide the
resources to
 >> each root
 >>>> bridge in next boot according to the NV settings.
 >>>> The MMIO/IO routine in the real machine I mentioned above needs to be
 >> fixed
 >>>> in a very earlier phase before the PciHostBridgeDxe driver runs.
That's to
 >> say if
 >>>> [2G, 2.8G) is configured to route to root bridge #1, only [2G,
2.8G) is
 >> allowed to
 >>>> assigned to root bride #1.  And the routine cannot be changed
unless a
 >> platform
 >>>> reset is performed.
 >>
 >> I understand.
 >>
 >>>>
 >>>> Based on your description, it sounds like all the root bridges in
OVMF share
 >> the
 >>>> same range of resource and any MMIO/IO in the range can be route
to any
 >> root
 >>>> bridge. For example, every root bridge can use [2G, 3G) MMIO.
 >>>
 >>> Exactly. This is true for "snooping" host-bridges which do not have
their own
 >>> configuration registers (or MMConfig region). They are sniffing
host-bridge
 >> 0
 >>> for configuration cycles and if they are meant for a device on a bus
number
 >>> owned by them, they will forward the transaction to their primary
root bus.
 >>>
 >>> Until in
 >>>> allocation phase, root bridge #1 is assigned to [2G, 2.8G), #2 is
assigned
 >>>> to [2.8G, 2.9G), #3 is assigned to [2.9G, 3G).
 >>
 >> Correct, but the regions do not have to be disjoint in the above
scenario.
 >> root bridge #1 can have [2G,2.4G) and [2.8,3G) while root bridge #2
can have
 >> [2.4,2.8).
 >>
 >> This is so the firmware can distribute the resources in an optimal
way. An
 >> example can be:
 >> - root bridge #1 has a PCI device A with a huge BAR and a PCI
device B
 >> with a little BAR.
 >> - root bridge #2 has a PCI device C with a medium BAR.
 >> The best way to distribute resources over [2G, 3G) is A BAR, C BAR,
and only
 >> then B BAR.
 >>
 >>>> So it seems that we need a way to tell PciHostBridgeDxe driver
from the
 >> PciHostBridgeLib
 >>>> that all resources are sharable among all root bridges.
 >>
 >> This is exactly what we need, indeed.
 >>
 >>>>
 >>>> The real platform case is the allocation per root bridge and OVMF
case is
 >> the allocation
 >>>> per PCI domain.
 >>
 >> Indeed, bare metal servers use different PCI domain per host bridge,
but I've
 >> actually seen
 >> real servers that have multiple root bridges sharing the same PCI
domain, 0.
 >>
 >>
 >>>> Is my understanding correct?
 >>
 >> It is, and thank you for taking your time to understand the issue,
 >> Marcel
 >>
 >>>>
 >>> [...]





--

Brian J. Johnson





Re: [edk2] [patch] MdeModulePkg: Make HII configuration settings available to OS runtime

2016-02-18 Thread Brian J. Johnson

On 02/17/2016 10:25 PM, Andrew Fish wrote:



On Feb 17, 2016, at 6:50 PM, Dong, Eric <eric.d...@intel.com> wrote:

Hi All,

Thanks for your comments. I've added my explanation below:

Hooking only the ReadyToBoot event is not enough. Different drivers may hook
this event, and some may update string packages in their callback functions.
The order in which these callbacks are invoked is random, so hooking only
this event may miss some changes.



Eric,

Could you enable your HII exporting callbacks in a ReadyToBoot event? 
Then you wouldn't needlessly pay the export cost each time a HII element 
is added or changed as drivers are loaded.



This feature is described in UEFI spec 2.6 chapter 31.2.11.1. It requires
exporting all HiiDatabase data and the current configuration data.


I don't see any language in that section that makes it required. It seems to 
just describe how to do it.



Agreed.  And section 31.2.1.10 says:

=
When it is desired that the forms data be used in the presence of an 
O/S, this specification describes
a means by which to support this capability. By being able to 
encapsulate the data and export it
through standard means such that an O/S agent (e.g. forms 
browser/processor) can retrieve it, O/S-
present usage models can be made available for further value-add 
implementations.

=

The wording implies that exporting is not always desired, and that 
O/S-present usage of forms data is a value-add feature, which a platform 
may choose to support or not.



We add this feature because we strongly encourage the export of config data to 
support manageability of platform config both for pre-OS and OS runtime 
operations.



Generally the UEFI spec and the edk2 don't try to force platform policy. For 
example exposing some configuration information could be considered a security 
vulnerability on some platforms, so it should not be forced on a platform.

Thanks,

Andrew Fish


Also, we collected performance data from our reference platform (details
below). The boot time impact is small and we can ignore it. This feature
costs an extra 869 KB, but we think the size cost is acceptable because we
currently use a 16 MB BIOS flash.

HiiDatabase driver data size cost (this feature is added in the HiiDatabase driver):
Without this feature:
Used data size: 151 KB
With this feature:
Used data size: 1020 KB
PS: This extra size cost mainly depends on how many HII packages are in your
platform.

Boot Time cost:
Without this feature:
Boot Time cost: 1st: 3.353s  2nd: 3.035s
With this feature:
Boot Time cost: 1st: 3.125s  2nd: 3.126s

Based on the above analysis, we provide this design and prefer to always
enable this feature instead of using a PCD to control it.

Thanks,
Eric



My particular concern involves a simulator (far slower than hardware) 
for an extremely large, complex server platform with a large amount of 
HII data.  On this platform, we end up disabling the largest but least 
used setup screens when running on the simulator, simply to avoid the 
time overhead of processing the HII data.  This saves many minutes of 
execution time.



-Original Message-
From: El-Haj-Mahmoud, Samer [mailto:samer.el-haj-mahm...@hpe.com]
Sent: Wednesday, February 17, 2016 2:14 AM
To: Andrew Fish; Brian J. Johnson
Cc: Bi, Dandan; edk2-devel@lists.01.org; Dong, Eric; Gao, Liming
Subject: RE: [edk2] [patch] MdeModulePkg: Make HII configuration settings 
available to OS runtime

+1

I also would add there may be some HII strings that are hidden from user 
interfaces, and reflect settings for field service or
troubleshooting, and that a mass export to the OS may expose these settings to 
OS runtime code and possibly applications.



-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew 
Fish
Sent: Tuesday, February 16, 2016 10:37 AM
To: Brian J. Johnson <bjohn...@sgi.com>
Cc: Dandan Bi <dandan...@intel.com>; edk2-devel@lists.01.org; Eric Dong 
<eric.d...@intel.com>; Liming Gao <liming@intel.com>
Subject: Re: [edk2] [patch] MdeModulePkg: Make HII configuration settings 
available to OS runtime



On Feb 16, 2016, at 8:33 AM, Brian J. Johnson <bjohn...@sgi.com> wrote:

On 02/16/2016 12:58 AM, Dandan Bi wrote:

This feature is aimed at allowing the OS to make use of the HII database
at runtime. In this case, the contents of the HII Database are exported
to a buffer. The pointer to the buffer is placed in the EFI System
Configuration Table, where it can be retrieved by an OS application.

Export the configuration data and contents of the HiiDatabase when a
driver adds, updates, or removes a package.
For strings and images, the contents of the HiiDatabase may also need
to be updated on NewString/SetString and NewImage/SetImage.

Cc: Liming Gao <liming@intel.com>
Cc: Eric Dong <eric.d...@intel.com>
Contributed-under: TianoCore Con

Re: [edk2] [patch] MdeModulePkg: Make HII configuration settings available to OS runtime

2016-02-16 Thread Brian J. Johnson

On 02/16/2016 12:58 AM, Dandan Bi wrote:

This feature is aimed at allowing the OS to make use of the HII database
during runtime. In this case, the contents of the HII database
are exported to a buffer. The pointer to the buffer is placed
in the EFI System Configuration Table, where it can be retrieved
by an OS application.

Export the configuration data and contents of the HiiDatabase when a
driver adds, updates, or removes a package.
For strings and images, the contents of the HiiDatabase may also need
to be updated on NewString/SetString and NewImage/SetImage.

Cc: Liming Gao <liming@intel.com>
Cc: Eric Dong <eric.d...@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Dandan Bi <dandan...@intel.com>
...


Please make this behavior selectable via a PCD.  HII operations are very 
expensive, especially on simulators.  I don't want to pay the export 
time every time a package is added or string is changed.  Also, 
platforms should be able to decide if they want to offer this data to 
the OS.


Why not just export the data once, using a "ready to boot" event hook?

Thanks,
--

        Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V4 10/13] ArmPlatformPkg: Use SerialDxe in MdeModulePkg instead of EmbeddedPkg

2016-01-18 Thread Brian J. Johnson

On 01/18/2016 05:55 AM, Laszlo Ersek wrote:

On 01/18/16 12:41, Ryan Harkin wrote:

>On 18 January 2016 at 11:33, Laszlo Ersek <ler...@redhat.com> wrote:

>>On 01/18/16 11:24, Zeng, Star wrote:

>>>[...]
>>>

>>>>>
>>>>>The above analysis is very clear, thanks. I am a little concerned about
>>>>>whether the code changes below follow the comments in the code.
>>>>>
>>>>>In Terminal.c:
>>>>>  //
>>>>>  // Set the timeout value of serial buffer for
>>>>>  // keystroke response performance issue
>>>>>  //
>>>>>In TerminalConIn.c:
>>>>>//
>>>>>//  if current timeout value for serial device is not identical with
>>>>>//  the value saved in TERMINAL_DEV structure, then recalculate the
>>>>>//  timeout value again and set serial attribute according to this
>>>>>value.
>>>>>//

>>>
>>>Any comments about above concern?

>>
>>Not really.
>>
>>I don't know what the purpose of the Timeout calculation is (what is the
>>"keystroke response performance issue"?), but in any case, it is a good
>>sign that TerminalDxe sets *some* Timeout explicitly.
>>
>>Plus, it has worked reliably until now, so we shouldn't change the
>>Timeout argument.
>>

>
>I think "reliably" is a bit generous;-)   The most common complaint I
>get about EDK2 is that it can't handle more than a FIFO's worth of
>pasted characters on the serial port.

I tend to think this comes, at least partly, from the fact that UEFI
doesn't do interrupt-driven drivers. There's timer based polling (in
drivers), and there are busy loops (in applications, I guess).

I think you'll see the same behavior with network packet reception.

I said "reliably" beause in my environment I've had practically zero
issues with the serial terminal (beyond the one, very lacking, textual
resolution list -- but I patched that for myself permanently).

Thanks
Laszlo


Can't you just enable hardware flow control (RTS/CTS) on the UART?
--

Brian J. Johnson


  Email:   bjohn...@sgi.com   U.S. Mail: SGI
  Office:  (651) 683-75212750 Blue Water Road
  VNET:6-233-7521Eagan, MN  55121-1400
  Fax: (651) 683-7696


Re: [edk2] Alter OVMF/UEFI iSCSI settings outside of the OVMF UI

2015-11-16 Thread Brian J. Johnson

On 11/13/2015 01:16 PM, Andrew Fish wrote:



On Nov 13, 2015, at 4:39 AM, P.A.M. van Dam (Pascal) 
<pas...@jarmine01.kleefdraakjes.nest> wrote:

Good morning all,

What I would like to do is pre-configure the NVRAM of the virtual machines 
(guests) with the
iSCSI config, i.e. have the initiator, iSCSI host, and target preconfigured in 
NVRAM before the guest
gets started. This way, RHEL/CentOS Kickstart should be able to pick up the 
iSCSI config and use
the UEFI-discovered and enabled boot drive in the configuration.

My questions:

1) Is this possible. And yes, what tool can I use?

2) If not, where can I find the documentation about how the NVRAM is built,
  so I can try to create such a tool myself.



The NVRAM format is not standard, as it is abstracted by the UEFI variable services 
that exist at runtime so the OS can use them too. So the easy answer is to run some 
EFI/OS code that calls gRT->SetVariable().

For the edk2 the format is defined here:
https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Include/Guid/VariableFormat.h

This is the DXE Driver that reads and writes the variables.
https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Universal/Variable/RuntimeDxe
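At OS runtime on Linux, those same variable services are reachable through efivarfs. A hedged sketch of the write format follows; the variable name and GUID are made-up examples, and the final copy needs root on a real UEFI system, so only the payload construction is actually run here:

```shell
# efivarfs expects a 4-byte little-endian attributes word in front of the
# variable data.  0x00000007 = NON_VOLATILE | BOOTSERVICE_ACCESS |
# RUNTIME_ACCESS.
payload=/tmp/iscsi-var.bin
printf '\007\000\000\000' >  "$payload"   # attributes header (4 bytes)
printf 'target=example'   >> "$payload"   # variable data (illustrative)

wc -c < "$payload"                        # 4-byte header + 14 data bytes

# On a real UEFI system (as root; name and GUID are hypothetical):
# cp "$payload" /sys/firmware/efi/efivars/IscsiCfg-12345678-1234-1234-1234-123456789abc
```

Whether the guest's iSCSI driver actually consumes such a variable depends on the vendor/GUID layout it expects, which is why dumping the existing variables from a hand-configured guest first is the safer route.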

Thanks,

Andrew Fish


But I will point out that Intel has an open-source utility for modifying 
the varstore embedded in a BIOS image, called Firmware Configuration 
Editor (Intel FCE). It works on various server boards they produce.


https://firmware.intel.com/sites/default/files/2015-WW39-FCE.30.zip

I've never tried it with OVMF, YMMV.
--

        Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.


Re: [edk2] [PATCH v4 18/41] OvmfPkg: select LocalApicLib instance with x2apic support if SMM_REQUIRE

2015-11-04 Thread Brian J. Johnson

On 11/04/2015 02:08 PM, Laszlo Ersek wrote:

On 11/04/15 17:55, Kinney, Michael D wrote:

Laszlo,

BaseXApicX2ApicLib is intended to be used by platforms that support >= 256 
CPUs.

If the current system configuration is < 256 CPUs, then the platform will 
typically stay in APIC mode.  If >= 256 CPUs are detected, then X2 APIC mode can 
be enabled using the following API.

VOID
EFIAPI
SetApicMode (
  IN UINTN  ApicMode
  );

So just adding BaseXApicX2ApicLib to the DSC does not enable X2 APIC mode.  You 
have to add logic to enable X2 APIC mode.

I see that QEMU is limited to 255 VCPUs, so the use of BaseXApicLib makes sense.  
Are OVMF configurations supported with >= 256 VCPUs?


I don't think so, but 64-bit Linux guest kernels enable x2apic mode
anyway, apparently regardless of the VCPU count and topology.

This is normally not a problem, but with SMM support, the APIC library
gets used (within SMM) *after* Linux configures x2apic mode. This causes
the "basic" library instance to blow up. Therefore I flipped the library
class resolution to the one instance that supports x2apic mode, but only
for the SMM build of OVMF.

So, the question Paolo and I have is *not* whether we must use the
x2apic-capable library in the SMM build -- that is a fact, forced by the
X64 Linux guest kernel's behavior.

Neither is our question "how could we enable x2apic mode"? About that,
we don't care at the moment. :)

Instead, our question is if we can use BaseXApicX2ApicLib even in the
*non*-SMM build, without fear of regressions. In other words, if
BaseXApicX2ApicLib can safely replace BaseXApicLib in a build where
BaseXApicLib would be otherwise fully sufficient.

Because, if the answer is "yes, it is compatible", then we shouldn't
complicate the library class resolution in the OVMF DSC files, making it
dependent on the build type (i.e., SMM -> BaseXApicX2ApicLib, non-SMM ->
BaseXApicLib). Rather than that, we should indiscriminately use
BaseXApicX2ApicLib.

So... is BaseXApicX2ApicLib compatible with BaseXApicLib in the domain
where the latter would work sufficiently?

Thanks!
Laszlo


All I can say is that BaseXApicX2ApicLib works great on non-virtual 
hardware, in either legacy or X2Apic mode.


Brian






Thanks,

Mike


-Original Message-
From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Wednesday, November 04, 2015 2:41 AM
To: Paolo Bonzini; edk2-de...@ml01.01.org; Kinney, Michael D; Fan, Jeff;
Justen, Jordan L
Subject: Re: [edk2] [PATCH v4 18/41] OvmfPkg: select LocalApicLib instance
with x2apic support if SMM_REQUIRE

On 11/04/15 09:48, Paolo Bonzini wrote:



On 03/11/2015 22:00, Laszlo Ersek wrote:

+
+!if $(SMM_REQUIRE) == TRUE
+  LocalApicLib|UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.inf
+!else
   LocalApicLib|UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.inf
+!endif
+


Can we enable BaseXApicX2ApicLib unconditionally?


I think I am technically capable of that :), but should we? We haven't
used BaseXApicX2ApicLib thus far in OVMF, so I can't predict at all if
it could regress stuff or not. If it causes problems with the
SMM_REQUIRE build, so be it, that's new stuff, but I wanted to avoid
such a global change for the traditional build.

I'm not against it, I just don't have experience with BaseXApicX2ApicLib.

Mike, Jeff, Jordan -- what do you think? Why do separate BaseXApicLib
and BaseXApicX2ApicLib instances exist? And why has OvmfPkg always used
the former? If it has only been for simplicity's sake, and
BaseXApicX2ApicLib is otherwise widely used internally at Intel, then I
think Paolo is right, and we should just keep the OVMF DSCs simple.

Thanks
Laszlo






--

  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.


Re: [edk2] [PATCH v2 11/41] OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a DXE_RUNTIME_DRIVER

2015-10-13 Thread Brian J. Johnson

On 10/13/2015 08:26 AM, Laszlo Ersek wrote:


First of all, if the edk2 reference code (in the SMM core and in
PiSmmCpuDxeSmm) depends on such behavior justifiedly, then I think we
have a bug in the PI specification. Namely, version 1.4 thereof does not
seem to require that EFI_SMM_CONTROL2_PROTOCOL.Trigger() raise an SMI on
*all* processors. (See volume 4, section 5.4.)

In particular, the EFI_SMM_CONTROL2_PROTOCOL.Trigger() specification
makes many references to SMI handling and dispatching. But I cannot make
a mental connection between an SMI "broadcast", and "handling and
dispatching" in edk2, since all "handling and dispatching" code in edk2
that is exposed via protocols, is non-reentrant with regard to multiple
VCPUs, to my knowledge.

One could argue that whatever code handles the SMI on the BSP is
responsible for bringing the APs into SMM as well, before doing any
sensitive work. I'm not sure.


Traditionally, SMI handling has been global.  If the h/w didn't 
broadcast the SMI to all CPUs, the SMI handler did so itself.  The BSP 
would wait for all APs to "check in" to SMM, then it would do whatever 
work the SMI required, and signal the APs to resume.  That ensured that 
the OS wasn't active on the machine while the BSP was handling the SMI, 
which was required for certain uses of SMI.


However, this (obviously) doesn't scale well, so Intel has been moving 
towards signaling SMI to only a single processor, and avoiding the 
machine-wide rendezvous when it isn't necessary.  BIOS implementations 
may be lagging behind.




Second, if writing to ioport 0xb2 should *automatically* raise an SMI on
all processors, then the QEMU code could be incorrect. However I could
never derive such an "imperative" from the ICH9 spec.


I too am having a hard time finding a clear statement of whether or not 
ioport 0xb2 should *automatically* raise an SMI on all processors. 
Maybe it's platform-specific?

--

            Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.


Re: [edk2] [PATCH v2 11/41] OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a DXE_RUNTIME_DRIVER

2015-10-13 Thread Brian J. Johnson

On 10/13/2015 11:49 AM, Laszlo Ersek wrote:

On 10/13/15 18:35, Brian J. Johnson wrote:

On 10/13/2015 08:26 AM, Laszlo Ersek wrote:


First of all, if the edk2 reference code (in the SMM core and in
PiSmmCpuDxeSmm) depends on such behavior justifiably, then I think we
have a bug in the PI specification. Namely, version 1.4 thereof does not
seem to require that EFI_SMM_CONTROL2_PROTOCOL.Trigger() raise an SMI on
*all* processors. (See volume 4, section 5.4.)

In particular, the EFI_SMM_CONTROL2_PROTOCOL.Trigger() specification
makes many references to SMI handling and dispatching. But I cannot make
a mental connection between an SMI "broadcast", and "handling and
dispatching" in edk2, since all "handling and dispatching" code in edk2
that is exposed via protocols, is non-reentrant with regard to multiple
VCPUs, to my knowledge.

One could argue that whatever code handles the SMI on the BSP is
responsible for bringing the APs into SMM as well, before doing any
sensitive work. I'm not sure.


Traditionally, SMI handling has been global.  If the h/w didn't
broadcast the SMI to all CPUs, the SMI handler did so itself.  The BSP
would wait for all APs to "check in" to SMM, then it would do whatever
work the SMI required, and signal the APs to resume.  That ensured that
the OS wasn't active on the machine while the BSP was handling the SMI,
which was required for certain uses of SMI.

However, this (obviously) doesn't scale well, so Intel has been moving
towards signaling SMI to only a single processor, and avoiding the
machine-wide rendezvous when it isn't necessary.  BIOS implementations
may be lagging behind.


But... when is it necessary? Paolo implied it might not be necessary for
us because MTRR changes are not relevant on our virtual platform --
otherwise all CPUs would have to agree on the MTRR settings --, but
isn't there a security aspect to it as well?


Sometimes the hardware requires it.  For instance, there have been 
situations where a platform needed a periodic SMI to allow the BIOS to 
kick a buggy I/O controller.  The "kick" was not atomic, and normal OS 
driver activity could interfere with it.  So BIOS used a system-wide SMI 
rendezvous to make sure that the OS wasn't running while it tweaked the 
hardware.


Security can be an aspect of it too.  For instance, modern platforms 
support write protecting the system flash, with a bypass for operations 
initiated from SMM.  If flash needs to be updated (eg. by a BIOS flash 
utility, or the EFI variable driver) the code doing the write uses a SMI 
to pass the request to a driver in SMM.  The SMM driver validates the 
request, unlocks the flash, and performs the update.  However, it would 
be a security hole if the OS was allowed to continue executing while the 
flash was unlocked.  So the SMM code triggers a global rendezvous to 
make sure no unknown code is running.


But there are plenty of situations where a global rendezvous isn't 
necessary.  For example, logging and clearing various non-fatal hardware 
errors (think corrected memory errors) can be done by a single thread 
while the rest of the system continues running.  Avoiding the rendezvous 
in those cases saves a *ton* of time in the SMI handler, reducing the 
system-wide performance impact.  As long as the SMI handler is quick 
enough, the OS doesn't mind losing a few cycles on one CPU.


(Or something like that.  SMM isn't my specialty, so I may be off on the 
details.  But that's the general idea.)

--

            Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.


Re: [edk2] UEFI requirements for 32-bit Windows 8.1?

2015-09-03 Thread Brian J. Johnson

On 09/03/2015 05:08 AM, Laszlo Ersek wrote:

Hi,

64-bit Windows 8.1 boots on QEMU + OVMF just fine. (The "pc" (i440fx)
machine type of QEMU has "always" worked, and we recently fixed "q35" too.)

However, 32-bit Windows 8.1 (ie. the installer of it) crashes with a
BSoD on the 32-bit build of OVMF *immediately*. This happens regardless
of the QEMU machine type. The error message I'm getting is:

http://people.redhat.com/~lersek/windows-on-ovmf32/win8-ovmf32.png

According to <https://msdn.microsoft.com/en-us/library/cc704588.aspx>,
the error code 0xc185 means "STATUS_IO_DEVICE_ERROR".

I also tried with Windows 10:

http://people.redhat.com/~lersek/windows-on-ovmf32/win10-ovmf32.png

Here I get 0xc00d, "STATUS_INVALID_PARAMETER".

The Windows ISOs I tried with were:
- en_windows_8.1_pro_n_vl_with_update_x86_dvd_6051127.iso
- en_windows_10_enterprise_2015_ltsb_n_x86_dvd_6848317.iso

Can someone please help me debug this? The difference between x64 and
x86 is "inexplicable".


I've worked through some firmware issues on older MS releases, but never 
Windows 8 or 10.  So this advice may be out of date.  Do you know if 
Windows got through the boot loader and is starting the kernel?  If so, 
you can turn on extra debug messages to show the drivers as they are 
loading.  That can give you some good clues.  If that's not enough, you 
can enable remote debugging and use MS's debuggers (eg. WinDbg) and 
symbol tables to get an idea of the call chain which is failing.  It's 
been a long time since I've done this, so I'm rusty on the specifics... 
searching on msdn.microsoft.com should get you going.


Historically, Windows has been extremely picky about ACPI tables, much 
more so than Linux.  Boot issues often have to do with ACPI details.  It 
has also had some quirks re. what it expects in the EFI memory map, 
although those have mostly related to really large systems (eg. PCIe 
segment layout.)


I see you CC'd some folks at Microsoft.  Hopefully they will be able to 
give you more specific advice.

--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.


Re: [edk2] UEFI requirements for 32-bit Windows 8.1?

2015-09-03 Thread Brian J. Johnson

On 09/03/2015 12:59 PM, Bill Paul wrote:

Of all the gin joints in all the towns in all the world, Brian J. Johnson had
to walk into mine at 08:51:20 on Thursday 03 September 2015 and say:


On 09/03/2015 05:08 AM, Laszlo Ersek wrote:

Hi,

64-bit Windows 8.1 boots on QEMU + OVMF just fine. (The "pc" (i440fx)
machine type of QEMU has "always" worked, and we recently fixed "q35"
too.)

However, 32-bit Windows 8.1 (ie. the installer of it) crashes with a
BSoD on the 32-bit build of OVMF *immediately*. This happens regardless
of the QEMU machine type. The error message I'm getting is:

http://people.redhat.com/~lersek/windows-on-ovmf32/win8-ovmf32.png

According to <https://msdn.microsoft.com/en-us/library/cc704588.aspx>,
the error code 0xc185 means "STATUS_IO_DEVICE_ERROR".

I also tried with Windows 10:

http://people.redhat.com/~lersek/windows-on-ovmf32/win10-ovmf32.png

Here I get 0xc00d, "STATUS_INVALID_PARAMETER".

The Windows ISOs I tried with were:
- en_windows_8.1_pro_n_vl_with_update_x86_dvd_6051127.iso
- en_windows_10_enterprise_2015_ltsb_n_x86_dvd_6848317.iso

Can someone please help me debug this? The difference between x64 and
x86 is "inexplicable".


I've worked through some firmware issues on older MS releases, but never
Windows 8 or 10.  So this advice may be out of date.  Do you know if
Windows got through the boot loader and is starting the kernel?  If so,
you can turn on extra debug messages to show the drivers as they are
loading.  That can give you some good clues.  If that's not enough, you
can enable remote debugging and use MS's debuggers (eg. WinDbg) and
symbol tables to get an idea of the call chain which is failing.  It's
been a long time since I've done this, so I'm rusty on the specifics...
searching on msdn.microsoft.com should get you going.

Historically, Windows has been extremely picky about ACPI tables, much
more so than Linux.


No: historically hardware vendors have been insufficiently picky about
creating their ACPI tables, leading to what I have not-so-affectionately named
the "If It Works With Windows, It Must Be Okay" syndrome.


It cuts all ways:  for vendors which do focus on Linux (such as SGI), 
Windows tends to get broken.  (And it's a lot harder to debug a broken 
Windows boot, as Laszlo is finding out.)  Both OSes have been guilty of 
making assumptions which violate the relevant specs (UEFI, ACPI, etc.) 
And yes, h/w vendors have had plenty of firmware bugs of their own.


Sigh... the world is messy.
--

Brian J. Johnson



  My statements are my own, are not authorized by SGI, and do not
  necessarily represent SGI’s positions.