On 23 February 2016 at 13:58, Laszlo Ersek <[email protected]> wrote:
> On 02/23/16 11:42, Yao, Jiewen wrote:
>> Laszlo
>>
>> I do not understand why you only set 4G page table for X64 PEI.
>

Actually, that question is not even relevant. The reality is simply
that the value of PcdDxeIplSwitchToLongMode is not sufficient to
decide whether PEI will have access to all of memory, and we need some
other way to convey the PEI memory limit. Note that, if OVMF
implements it that way, there may be other platforms in the wild that
do the same, so 'fixing' OVMF does not help us at all.

So we are back to a PCD that sets a memory limit for PEI. I suppose
having a simple feature PCD should be sufficient here? Any suggestions
for names?




> Two reasons.
>
> * First, OVMF's PEI has minimal memory needs. The permanent PEI RAM we
> install it is normally capped at 64MB. If the guest has large RAM, then
> we install more PEI RAM, but only so that the DXE IPL can build page
> tables for the entirety of the guest RAM. (In other words, in such cases
> the permanent PEI RAM size is dictated by / dominated by the DXE IPL's
> page table allocation.)
>
> * Second, OVMF is special in that it doesn't need to configure DRAM
> hardware before it becomes usable. Therefore only the reset vector and
> the SEC phase run from flash. SEC decompresses both PEI and DXE firmware
> volumes into normal RAM, and PEI runs from RAM.
>
> In order to make this possible, originally OVMF's earliest page tables
> were pre-built by the build process -- those page tables were statically
> embedded into the read-only flash image, and SEC would only point CR3 at
> them. These page tables mapped RAM up to 4GB.
>
> Later this broke on some KVM versions: even if we pre-set the Accessed
> and Dirty bits in the statically pre-built page table entries, the
> processor wanted to rewrite those PTEs (in read-only flash) when
> accessing the mapped pages themselves. Since the PTEs existed in
> read-only memory, this broke. (I don't recall the exact symptoms, but it
> didn't work.)
>
> So Jordan rewrote the creation of the earliest page tables, for the X64
> SEC/PEI case: it is now being done in the reset vector code. The reset
> vector code and SEC continue to run from flash (located just below 4GB).
> The reset vector creates the page tables (mapping memory up to 4GB)
> dynamically, located in real RAM, at 8MB. (The size of the page tables
> is 24KB.)
>
> The SEC phase is entered with paging enabled (using the tables created
> above), and those same tables are used in PEI as well. (Until the DXE
> IPL maps the full memory.)
>
>> But if X64 PEI only fills 4G page table by default, you can setup page
>> fault handler to dynamic fix the address.
>>
>> That is the way how we handle full memory access in memory constrain
>> environment, like SMM.
>
> I'm certain this is technically solvable, but let's make it clear that
> the burden to make OVMF cope with such a change is not on us.
>
> OvmfPkg lives in the upstream edk2 repository. In any open source
> project, in-tree client code that depends on central infrastructure is
> not left behind when the infrastructure changes, and the burden of
> adapting the client code to the infrastructure is also not on the
> maintainers of the client code. Whoever is updating the infrastructure
> is responsible for bringing forward all dependent client code, with a
> bisectable series.
>
>> Also, this platform can choose not report above 4G memory as tested for
>> DxeCore
>
> OVMF's PlatformPei already reports >=4GB memory as untested.
>
>> and report big enough ACPI/Reserved memory in
>> MemoryTypeInformation table.
>
> How big is big enough?
>
>> Then DxeCore will allocate BIN below 4G and all ACPI/Reserved memory
>> allocation will happen below 4G.
>
> Honestly, I find this way more complicated than a new PCD. A PCD is
> explicit about its purpose.
>
> Whereas your suggestion, if I understand it correctly, would require us
> to measure actual memory usage for the Memory Type Information HOB, and
> keep it up-to-date as development progresses, new drivers are pulled in,
> old drivers are modified, and so on.
>
> Also, the same OVMF binary is expected to run on QEMU when S3 support is
> disabled (on the QEMU command line). In this case, some of the
> foundational S3-related protocols are not installed, which prevents the
> rest of the S3-related drivers from working. This is intentional, and it
> impacts the amount of ACPI / Reserved memory needed.
>
>> We do not want to introduce a new PCD, because it is unnecessary burden
>> to maintain a new one, especially we already have solution to handle
>> most cases.
>
> The way I see it, support for a new platform family is being introduced.
> A platform family that invalidates some of the key assumptions held (and
> hard-coded!) in a number of core modules. I don't think a preexistent,
> well-working platform with a non-negligible user base should be broken
> for the newcomer's sake.
>
> PCDs exist for platform configuration. That's what the name says:
> Platform Configuration Database. If a new platform is being accommodated
> in the core modules, I think it is quite natural to reckon with some
> additional customization and maintenance.
>
> For example, I would think the following library instances present some
> additional maintenance burden too:
>
> - MdePkg/Library/BaseCpuLib/AArch64
> - MdePkg/Library/BaseLib/AArch64
> - MdePkg/Library/BaseSynchronizationLib/AArch64
>
> I don't debate your claim that a new PCD presents new maintenance burden
> -- it does. I do question your claim that the burden is unnecessary.
>
> I've put insane amounts of work into OVMF's S3. If someone submits
> bisectable, regression-conscious patches for OVMF that adapt OVMF to the
> proposed repurposing of PcdDxeIplSwitchToLongMode, I will do my best to
> test and review those patches. If no such patches will be submitted, I
> will continue NACKing the repurposing of PcdDxeIplSwitchToLongMode.
>
> (Obviously, I'm also open to arguments that prove me wrong.)
>
> Thanks
> Laszlo
>
>>
>>
>>
>> Thank you
>>
>> Yao Jiewen
>>
>>
>>
>> *From:*Laszlo Ersek [mailto:[email protected]]
>> *Sent:* Tuesday, February 23, 2016 5:04 PM
>> *To:* Yao, Jiewen; Ard Biesheuvel
>> *Cc:* Ni, Ruiyu; [email protected]; Tian, Feng; Zeng, Star;
>> [email protected]; [email protected]; Fan, Jeff; Gao, Liming
>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe:
>> only allocate below 4 GB if needed
>>
>>
>>
>> On 02/23/16 05:31, Yao, Jiewen wrote:
>>> I discussed with Module owner. Below is summary for MdeModulePkg:
>>>
>>> 1) PCD Name: We suggest to use PcdDxeIplSwitchToLongMode as
>>> indicator that PEI is 32 bit and DXE is 64 bit for X86 platform. It
>>> might be not clear stated in PCD description, but it is the intent
>>> when this is introduced. We suggest to define this to be FALSE for
>>> ARM, so that our core module can use it. (Confirmed with Liming Gao)
>>>
>>> 2) AcpiTableDxe: As Star Zeng mentioned, we need make sure FACS is
>>> allocate below 4G when PcdDxeIplSwitchToLongMode is set, because it is
>>> used by S3Resume.
>>
>> This is not good enough. PEI might be restricted to the first 4GB of
>> memory *even though* it is built for X64.
>>
>> Consider a platform with the following characteristics:
>> - PEI is X64
>> - DXE is X64
>> - hence PcdDxeIplSwitchToLongMode is FALSE
>> - PEI has page tables only for the first 4GB of RAM
>> - S3 is supported
>>
>> If AcpiTableDxe keyed off of PcdDxeIplSwitchToLongMode *solely*, in
>> order to see whether the FACS needs to be allocated under 4GB, then said
>> platform would break.
>>
>> The most frequently used build of OVMF is exactly like written above.
>>
>>>
>>> 3) UefiBootManagerLib: As Ruiyu Ni mentioned, we need make allocate
>>> L"PerfDataMemAddr" below 4G for when PcdDxeIplSwitchToLongMode is set,
>>> because it is used by S3Resume.
>>
>> My counter-argument is the same as above, I think.
>>
>>>
>>>
>>>
>>> 4) BootGraphicsResourceTableDxe: There is no need to add 4G
>>> limitation, we can remove that directly. (Confirmed with Chao Zhang)
>>>
>>> 5) FirmwarePerformanceDataTableDxe: We can check
>>> PcdDxeIplSwitchToLongMode to allocate Below4G memory for
>>> mAcpiS3PerformanceTable which is used in S3.
>>
>> Same counter-argument again.
>>
>>> And we can remove 4G
>>> limitation for mAcpiBootPerformanceTable directly. (Confirmed with
>>> Chao
>>> Zhang)
>>>
>>> 6) PiDxeS3BootScriptLib: We can check PcdDxeIplSwitchToLongMode for
>>> S3TableBase and NewS3TableBase which is used in S3. (Confirmed with
>>> Star
>>> Zeng)
>>
>> Same counter-argument.
>>
>>>
>>> 7) BootScriptExecutorDxe: We can check PcdDxeIplSwitchToLongMode for
>>> FfsBuffer and BootScriptExecutorBuffer which is used in S3. (Confirmed
>>> with Star Zeng)
>>
>> Same counter-argument.
>>
>>>
>>> 8) CapsuleRuntimeDxe: Already check PcdDxeIplSwitchToLongMode.No
>>> update needed. (Confirmed with Star Zeng)
>>
>> The only reason this doesn't break under OVMF is because OVMF doesn't
>> implement capsule updates.
>>
>> In summary, PcdDxeIplSwitchToLongMode==TRUE implies, but is not
>> equivalent to, the requirement to allocate under 4GB for S3 purposes. A
>> platform may have *both* S3 *and* a 64-bit PEI phase without access to
>>>= 4GB memory.
>>
>> That's why I proposed a brand new PCD. PcdDxeIplSwitchToLongMode would
>> imply that PCD, but the new PCD wouldn't necessarily imply
>> PcdDxeIplSwitchToLongMode.
>>
>>> BTW: We are trying not use IntelFrameworkModulePkg. If there is
>>> replacement in MdeModulePkg, we suggest use the one in MdeModulePkg,
>>> for example, BDS or ACPI.
>>
>> Absolutely, but migrating a platform from IntelFrameworkModulePkg BDS to
>> MdeModulePkg BDS is not trivial, due to lack of documentation and
>> examples; plus some functionality that is publicly available from
>> GenericBdsLib (and is used by PlatformBdsLib instances) needs to be
>> exported from the new boot manager library.
>>
>> Thanks
>> Laszlo
>>
>>>
>>>
>>>
>>> Thank you
>>>
>>> Yao Jiewen
>>>
>>>
>>>
>>>
>>>
>>> *From:*Ard Biesheuvel [mailto:[email protected]]
>>> *Sent:* Monday, February 22, 2016 8:43 PM
>>> *To:* Yao, Jiewen
>>> *Cc:* Ni, Ruiyu; Laszlo Ersek; [email protected] 
>>> <mailto:[email protected]>; Tian, Feng;
>>> Zeng, Star; [email protected] <mailto:[email protected]>;
>> [email protected] <mailto:[email protected]>; Fan,
>>> Jeff; Gao, Liming
>>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe:
>>> only allocate below 4 GB if needed
>>>
>>>
>>>
>>> On 22 February 2016 at 13:40, Yao, Jiewen wrote:
>>>> I did a search on current MdeModulePkg. I found there are more
>>>> modules allocating Below4G memory.
>>>>
>>>> Besides BDS, we have BootGraphicsResourceTableDxe,
>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe, 
>>>> PiDxeS3BootScriptLib, CapsuleRuntimeDxe.
>>>>
>>>>
>>>>
>>>> BootGraphicsResourceTableDxe – I am not clear.
>>>>
>>>> CapsuleRuntimeDxe – Below4G is designed for PEI capsule. Or we do not
>>>> need such capability.
>>>>
>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe,
>>>> PiDxeS3BootScriptLib
>>>> - data need to be access in PEI phase.
>>>>
>>>>
>>>>
>>>> I think we might need more clean up for 32bit PEI.
>>>>
>>>
>>> Agreed.
>>>
>>>> I am still thinking if we can reuse old Pcd – we can clarify in PCD
>>>> description – it is only for 32bit PEI and 64bit DXE.
>>>>
>>>> Or a new PCD, and name to Pcd32BitPei? Please allow me collect more
>>>> feedback on that.
>>>>
>>>
>>> OK
>>>
>>>> In order to separate 32bit PEI issue from ACPI issue, I think we can
>>>> check in ACPI patch at first, if you want.
>>>>
>>>
>>> Should I wait for the maintainers of MdeModulePkg to give their
>>> Reviewed-by ?
>>>
>>
>
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to