On 02/23/16 14:27, Ard Biesheuvel wrote: > On 23 February 2016 at 13:58, Laszlo Ersek <[email protected]> wrote: >> On 02/23/16 11:42, Yao, Jiewen wrote: >>> Laszlo >>> >>> I do not understand why you only set 4G page table for X64 PEI. >> > > Actually, that question is not even relevant. The reality is simply > that the value of PcdDxeIplSwitchToLongMode is not sufficient to > decide whether PEI will have access to all of memory, and we need some > other way to convey the PEI memory limit. Note that, if OVMF > implements it that way, there may be other platforms in the wild that > do the same, so 'fixing' OVMF does not help us at all.
Thanks for pointing this out. > So we are back to a PCD that sets a memory limit for PEI. I suppose > having a simple feature PCD should be sufficient here? Any suggestions > for names? PcdFavor64BitPeiMemory? Thanks Laszlo > > > > >> Two reasons. >> >> * First, OVMF's PEI has minimal memory needs. The permanent PEI RAM we >> install it is normally capped at 64MB. If the guest has large RAM, then >> we install more PEI RAM, but only so that the DXE IPL can build page >> tables for the entirety of the guest RAM. (In other words, in such cases >> the permanent PEI RAM size is dictated by / dominated by the DXE IPL's >> page table allocation.) >> >> * Second, OVMF is special in that it doesn't need to configure DRAM >> hardware before it becomes usable. Therefore only the reset vector and >> the SEC phase run from flash. SEC decompresses both PEI and DXE firmware >> volumes into normal RAM, and PEI runs from RAM. >> >> In order to make this possible, originally OVMF's earliest page tables >> were pre-built by the build process -- those page tables were statically >> embedded into the read-only flash image, and SEC would only point CR3 at >> them. These page tables mapped RAM up to 4GB. >> >> Later this broke on some KVM versions: even if we pre-set the Accessed >> and Dirty bits in the statically pre-built page table entries, the >> processor wanted to rewrite those PTEs (in read-only flash) when >> accessing the mapped pages themselves. Since the PTEs existed in >> read-only memory, this broke. (I don't recall the exact symptoms, but it >> didn't work.) >> >> So Jordan rewrote the creation of the earliest page tables, for the X64 >> SEC/PEI case: it is now being done in the reset vector code. The reset >> vector code and SEC continue to run from flash (located just below 4GB). >> The reset vector creates the page tables (mapping memory up to 4GB) >> dynamically, located in real RAM, at 8MB. (The size of the page tables >> is 24KB.) >> >> The SEC phase is entered with paging enabled (using the tables created >> above), and those same tables are used in PEI as well. (Until the DXE >> IPL maps the full memory.) >> >>> But if X64 PEI only fills 4G page table by default, you can setup page >>> fault handler to dynamic fix the address. >>> >>> That is the way how we handle full memory access in memory constrain >>> environment, like SMM. >> >> I'm certain this is technically solvable, but let's make it clear that >> the burden to make OVMF cope with such a change is not on us. >> >> OvmfPkg lives in the upstream edk2 repository. In any open source >> project, in-tree client code that depends on central infrastructure is >> not left behind when the infrastructure changes, and the burden of >> adapting the client code to the infrastructure is also not on the >> maintainers of the client code. Whoever is updating the infrastructure >> is responsible for bringing forward all dependent client code, with a >> bisectable series. >> >>> Also, this platform can choose not report above 4G memory as tested for >>> DxeCore >> >> OVMF's PlatformPei already reports >=4GB memory as untested. >> >>> and report big enough ACPI/Reserved memory in >>> MemoryTypeInformation table. >> >> How big is big enough? >> >>> Then DxeCore will allocate BIN below 4G and all ACPI/Reserved memory >>> allocation will happen below 4G. >> >> Honestly, I find this way more complicated than a new PCD. A PCD is >> explicit about its purpose. >> >> Whereas your suggestion, if I understand it correctly, would require us >> to measure actual memory usage for the Memory Type Information HOB, and >> keep it up-to-date as development progresses, new drivers are pulled in, >> old drivers are modified, and so on. >> >> Also, the same OVMF binary is expected to run on QEMU when S3 support is >> disabled (on the QEMU command line). In this case, some of the >> foundational S3-related protocols are not installed, which prevents the >> rest of the S3-related drivers from working. This is intentional, and it >> impacts the amount of ACPI / Reserved memory needed. >> >>> We do not want to introduce a new PCD, because it is unnecessary burden >>> to maintain a new one, especially we already have solution to handle >>> most cases. >> >> The way I see it, support for a new platform family is being introduced. >> A platform family that invalidates some of the key assumptions held (and >> hard-coded!) in a number of core modules. I don't think a preexistent, >> well-working platform with a non-negligible user base should be broken >> for the newcomer's sake. >> >> PCDs exist for platform configuration. That's what the name says: >> Platform Configuration Database. If a new platform is being accommodated >> in the core modules, I think it is quite natural to reckon with some >> additional customization and maintenance. >> >> For example, I would think the following library instances present some >> additional maintenance burden too: >> >> - MdePkg/Library/BaseCpuLib/AArch64 >> - MdePkg/Library/BaseLib/AArch64 >> - MdePkg/Library/BaseSynchronizationLib/AArch64 >> >> I don't debate your claim that a new PCD presents new maintenance burden >> -- it does. I do question your claim that the burden is unnecessary. >> >> I've put insane amounts of work into OVMF's S3. If someone submits >> bisectable, regression-conscious patches for OVMF that adapt OVMF to the >> proposed repurposing of PcdDxeIplSwitchToLongMode, I will do my best to >> test and review those patches. If no such patches will be submitted, I >> will continue NACKing the repurposing of PcdDxeIplSwitchToLongMode. >> >> (Obviously, I'm also open to arguments that prove me wrong.) >> >> Thanks >> Laszlo >> >>> >>> >>> >>> Thank you >>> >>> Yao Jiewen >>> >>> >>> >>> *From:*Laszlo Ersek [mailto:[email protected]] >>> *Sent:* Tuesday, February 23, 2016 5:04 PM >>> *To:* Yao, Jiewen; Ard Biesheuvel >>> *Cc:* Ni, Ruiyu; [email protected]; Tian, Feng; Zeng, Star; >>> [email protected]; [email protected]; Fan, Jeff; Gao, Liming >>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe: >>> only allocate below 4 GB if needed >>> >>> >>> >>> On 02/23/16 05:31, Yao, Jiewen wrote: >>>> I discussed with Module owner. Below is summary for MdeModulePkg: >>>> >>>> 1) PCD Name: We suggest to use PcdDxeIplSwitchToLongMode as >>>> indicator that PEI is 32 bit and DXE is 64 bit for X86 platform. It >>>> might be not clear stated in PCD description, but it is the intent >>>> when this is introduced. We suggest to define this to be FALSE for >>>> ARM, so that our core module can use it. (Confirmed with Liming Gao) >>>> >>>> 2) AcpiTableDxe: As Star Zeng mentioned, we need make sure FACS is >>>> allocate below 4G when PcdDxeIplSwitchToLongMode is set, because it is >>>> used by S3Resume. >>> >>> This is not good enough. PEI might be restricted to the first 4GB of >>> memory *even though* it is built for X64. >>> >>> Consider a platform with the following characteristics: >>> - PEI is X64 >>> - DXE is X64 >>> - hence PcdDxeIplSwitchToLongMode is FALSE >>> - PEI has page tables only for the first 4GB of RAM >>> - S3 is supported >>> >>> If AcpiTableDxe keyed off of PcdDxeIplSwitchToLongMode *solely*, in >>> order to see whether the FACS needs to be allocated under 4GB, then said >>> platform would break. >>> >>> The most frequently used build of OVMF is exactly like written above. >>> >>>> >>>> 3) UefiBootManagerLib: As Ruiyu Ni mentioned, we need make allocate >>>> L"PerfDataMemAddr" below 4G for when PcdDxeIplSwitchToLongMode is set, >>>> because it is used by S3Resume. >>> >>> My counter-argument is the same as above, I think. >>> >>>> >>>> >>>> >>>> 4) BootGraphicsResourceTableDxe: There is no need to add 4G >>>> limitation, we can remove that directly. (Confirmed with Chao Zhang) >>>> >>>> 5) FirmwarePerformanceDataTableDxe: We can check >>>> PcdDxeIplSwitchToLongMode to allocate Below4G memory for >>>> mAcpiS3PerformanceTable which is used in S3. >>> >>> Same counter-argument again. >>> >>>> And we can remove 4G >>>> limitation for mAcpiBootPerformanceTable directly. (Confirmed with >>>> Chao >>>> Zhang) >>>> >>>> 6) PiDxeS3BootScriptLib: We can check PcdDxeIplSwitchToLongMode for >>>> S3TableBase and NewS3TableBase which is used in S3. (Confirmed with >>>> Star >>>> Zeng) >>> >>> Same counter-argument. >>> >>>> >>>> 7) BootScriptExecutorDxe: We can check PcdDxeIplSwitchToLongMode for >>>> FfsBuffer and BootScriptExecutorBuffer which is used in S3. (Confirmed >>>> with Star Zeng) >>> >>> Same counter-argument. >>> >>>> >>>> 8) CapsuleRuntimeDxe: Already check PcdDxeIplSwitchToLongMode.No >>>> update needed. (Confirmed with Star Zeng) >>> >>> The only reason this doesn't break under OVMF is because OVMF doesn't >>> implement capsule updates. >>> >>> In summary, PcdDxeIplSwitchToLongMode==TRUE implies, but is not >>> equivalent to, the requirement to allocate under 4GB for S3 purposes. A >>> platform may have *both* S3 *and* a 64-bit PEI phase without access to >>>> = 4GB memory. >>> >>> That's why I proposed a brand new PCD. PcdDxeIplSwitchToLongMode would >>> imply that PCD, but the new PCD wouldn't necessarily imply >>> PcdDxeIplSwitchToLongMode. >>> >>>> BTW: We are trying not use IntelFrameworkModulePkg. If there is >>>> replacement in MdeModulePkg, we suggest use the one in MdeModulePkg, >>>> for example, BDS or ACPI. >>> >>> Absolutely, but migrating a platform from IntelFrameworkModulePkg BDS to >>> MdeModulePkg BDS is not trivial, due to lack of documentation and >>> examples; plus some functionality that is publicly available from >>> GenericBdsLib (and is used by PlatformBdsLib instances) needs to be >>> exported from the new boot manager library. >>> >>> Thanks >>> Laszlo >>> >>>> >>>> >>>> >>>> Thank you >>>> >>>> Yao Jiewen >>>> >>>> >>>> >>>> >>>> >>>> *From:*Ard Biesheuvel [mailto:[email protected]] >>>> *Sent:* Monday, February 22, 2016 8:43 PM >>>> *To:* Yao, Jiewen >>>> *Cc:* Ni, Ruiyu; Laszlo Ersek; [email protected] >>>> <mailto:[email protected]>; Tian, Feng; >>>> Zeng, Star; [email protected] <mailto:[email protected]>; >>> [email protected] <mailto:[email protected]>; Fan, >>>> Jeff; Gao, Liming >>>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe: >>>> only allocate below 4 GB if needed >>>> >>>> >>>> >>>> On 22 February 2016 at 13:40, Yao, Jiewen wrote: >>>>> I did a search on current MdeModulePkg. I found there are more >>>>> modules allocating Below4G memory. >>>>> >>>>> Besides BDS, we have BootGraphicsResourceTableDxe, >>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe, >>>>> PiDxeS3BootScriptLib, CapsuleRuntimeDxe. >>>>> >>>>> >>>>> >>>>> BootGraphicsResourceTableDxe – I am not clear. >>>>> >>>>> CapsuleRuntimeDxe – Below4G is designed for PEI capsule. Or we do not >>>>> need such capability. >>>>> >>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe, >>>>> PiDxeS3BootScriptLib >>>>> - data need to be access in PEI phase. >>>>> >>>>> >>>>> >>>>> I think we might need more clean up for 32bit PEI. >>>>> >>>> >>>> Agreed. >>>> >>>>> I am still thinking if we can reuse old Pcd – we can clarify in PCD >>>>> description – it is only for 32bit PEI and 64bit DXE. >>>>> >>>>> Or a new PCD, and name to Pcd32BitPei? Please allow me collect more >>>>> feedback on that. >>>>> >>>> >>>> OK >>>> >>>>> In order to separate 32bit PEI issue from ACPI issue, I think we can >>>>> check in ACPI patch at first, if you want. >>>>> >>>> >>>> Should I wait for the maintainers of MdeModulePkg to give their >>>> Reviewed-by ? >>>> >>> >> _______________________________________________ edk2-devel mailing list [email protected] https://lists.01.org/mailman/listinfo/edk2-devel

