On 02/23/16 15:06, Yao, Jiewen wrote: > Yes, thanks to point out this. > > > > Now I am thinking the problem becomes much broader. > > For example, do we need support a platform only set 2G page table in PEI > phase? Or 1G? > > Or do we need support a platform only set 0~1G and 3G~4G page table? > > Or even more fragmented? > > > > I do not think a generic core module can handle the case that a platform > only set 0~1G and 3G~4G page table. > > There must be some rule pre-defined in specification, instead of using > PCD. (Or we need a PCD to define the range?) > > > > PI specification might be a good place to clarify that rule. Please > allow me to ask.
I agree that regulating this in the PI spec would be very helpful. Thanks! Laszlo > > > > Thank you > > Yao Jiewen > > > > > > *From:*edk2-devel [mailto:[email protected]] *On Behalf Of > *Laszlo Ersek > *Sent:* Tuesday, February 23, 2016 9:41 PM > *To:* Ard Biesheuvel > *Cc:* Ni, Ruiyu; Tian, Feng; [email protected]; Justen, Jordan > L; [email protected]; [email protected]; Yao, Jiewen; Gao, > Liming; Fan, Jeff; Zeng, Star > *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe: > only allocate below 4 GB if needed > > > > On 02/23/16 14:27, Ard Biesheuvel wrote: >> On 23 February 2016 at 13:58, Laszlo Ersek wrote: >>> On 02/23/16 11:42, Yao, Jiewen wrote: >>>> Laszlo >>>> >>>> I do not understand why you only set 4G page table for X64 PEI. >>> >> >> Actually, that question is not even relevant. The reality is simply >> that the value of PcdDxeIplSwitchToLongMode is not sufficient to >> decide whether PEI will have access to all of memory, and we need some >> other way to convey the PEI memory limit. Note that, if OVMF >> implements it that way, there may be other platforms in the wild that >> do the same, so 'fixing' OVMF does not help us at all. > > Thanks for pointing this out. > >> So we are back to a PCD that sets a memory limit for PEI. I suppose >> having a simple feature PCD should be sufficient here? Any suggestions >> for names? > > PcdFavor64BitPeiMemory? > > Thanks > Laszlo > > >> >> >> >> >>> Two reasons. >>> >>> * First, OVMF's PEI has minimal memory needs. The permanent PEI RAM >>> we install it is normally capped at 64MB. If the guest has large RAM, >>> then we install more PEI RAM, but only so that the DXE IPL can build >>> page tables for the entirety of the guest RAM. (In other words, in >>> such cases the permanent PEI RAM size is dictated by / dominated by >>> the DXE IPL's page table allocation.) >>> >>> * Second, OVMF is special in that it doesn't need to configure DRAM >>> hardware before it becomes usable. Therefore only the reset vector >>> and the SEC phase run from flash. SEC decompresses both PEI and DXE >>> firmware volumes into normal RAM, and PEI runs from RAM. >>> >>> In order to make this possible, originally OVMF's earliest page >>> tables were pre-built by the build process -- those page tables were >>> statically embedded into the read-only flash image, and SEC would >>> only point CR3 at them. These page tables mapped RAM up to 4GB. >>> >>> Later this broke on some KVM versions: even if we pre-set the >>> Accessed and Dirty bits in the statically pre-built page table >>> entries, the processor wanted to rewrite those PTEs (in read-only >>> flash) when accessing the mapped pages themselves. Since the PTEs >>> existed in read-only memory, this broke. (I don't recall the exact >>> symptoms, but it didn't work.) >>> >>> So Jordan rewrote the creation of the earliest page tables, for the >>> X64 SEC/PEI case: it is now being done in the reset vector code. The >>> reset vector code and SEC continue to run from flash (located just below >>> 4GB). >>> The reset vector creates the page tables (mapping memory up to 4GB) >>> dynamically, located in real RAM, at 8MB. (The size of the page >>> tables is 24KB.) >>> >>> The SEC phase is entered with paging enabled (using the tables >>> created above), and those same tables are used in PEI as well. (Until >>> the DXE IPL maps the full memory.) >>> >>>> But if X64 PEI only fills 4G page table by default, you can setup >>>> page fault handler to dynamic fix the address. >>>> >>>> That is the way how we handle full memory access in memory constrain >>>> environment, like SMM. >>> >>> I'm certain this is technically solvable, but let's make it clear >>> that the burden to make OVMF cope with such a change is not on us. >>> >>> OvmfPkg lives in the upstream edk2 repository. In any open source >>> project, in-tree client code that depends on central infrastructure >>> is not left behind when the infrastructure changes, and the burden of >>> adapting the client code to the infrastructure is also not on the >>> maintainers of the client code. Whoever is updating the >>> infrastructure is responsible for bringing forward all dependent >>> client code, with a bisectable series. >>> >>>> Also, this platform can choose not report above 4G memory as tested >>>> for DxeCore >>> >>> OVMF's PlatformPei already reports >=4GB memory as untested. >>> >>>> and report big enough ACPI/Reserved memory in MemoryTypeInformation >>>> table. >>> >>> How big is big enough? >>> >>>> Then DxeCore will allocate BIN below 4G and all ACPI/Reserved memory >>>> allocation will happen below 4G. >>> >>> Honestly, I find this way more complicated than a new PCD. A PCD is >>> explicit about its purpose. >>> >>> Whereas your suggestion, if I understand it correctly, would require >>> us to measure actual memory usage for the Memory Type Information >>> HOB, and keep it up-to-date as development progresses, new drivers >>> are pulled in, old drivers are modified, and so on. >>> >>> Also, the same OVMF binary is expected to run on QEMU when S3 support >>> is disabled (on the QEMU command line). In this case, some of the >>> foundational S3-related protocols are not installed, which prevents >>> the rest of the S3-related drivers from working. This is intentional, >>> and it impacts the amount of ACPI / Reserved memory needed. >>> >>>> We do not want to introduce a new PCD, because it is unnecessary >>>> burden to maintain a new one, especially we already have solution to >>>> handle most cases. >>> >>> The way I see it, support for a new platform family is being introduced. >>> A platform family that invalidates some of the key assumptions held >>> (and >>> hard-coded!) in a number of core modules. I don't think a >>> preexistent, well-working platform with a non-negligible user base >>> should be broken for the newcomer's sake. >>> >>> PCDs exist for platform configuration. That's what the name says: >>> Platform Configuration Database. If a new platform is being >>> accommodated in the core modules, I think it is quite natural to >>> reckon with some additional customization and maintenance. >>> >>> For example, I would think the following library instances present >>> some additional maintenance burden too: >>> >>> - MdePkg/Library/BaseCpuLib/AArch64 >>> - MdePkg/Library/BaseLib/AArch64 >>> - MdePkg/Library/BaseSynchronizationLib/AArch64 >>> >>> I don't debate your claim that a new PCD presents new maintenance >>> burden >>> -- it does. I do question your claim that the burden is unnecessary. >>> >>> I've put insane amounts of work into OVMF's S3. If someone submits >>> bisectable, regression-conscious patches for OVMF that adapt OVMF to >>> the proposed repurposing of PcdDxeIplSwitchToLongMode, I will do my >>> best to test and review those patches. If no such patches will be >>> submitted, I will continue NACKing the repurposing of >>> PcdDxeIplSwitchToLongMode. >>> >>> (Obviously, I'm also open to arguments that prove me wrong.) >>> >>> Thanks >>> Laszlo >>> >>>> >>>> >>>> >>>> Thank you >>>> >>>> Yao Jiewen >>>> >>>> >>>> >>>> *From:*Laszlo Ersek [mailto:[email protected]] >>>> *Sent:* Tuesday, February 23, 2016 5:04 PM >>>> *To:* Yao, Jiewen; Ard Biesheuvel >>>> *Cc:* Ni, Ruiyu; [email protected] <mailto:[email protected]>; >>>> Tian, Feng; > Zeng, Star; >>>> [email protected] <mailto:[email protected]>; > [email protected] <mailto:[email protected]>; Fan, Jeff; > Gao, >>>> Liming >>>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe: >>>> only allocate below 4 GB if needed >>>> >>>> >>>> >>>> On 02/23/16 05:31, Yao, Jiewen wrote: >>>>> I discussed with Module owner. Below is summary for MdeModulePkg: >>>>> >>>>> 1) PCD Name: We suggest to use PcdDxeIplSwitchToLongMode as >>>>> indicator that PEI is 32 bit and DXE is 64 bit for X86 platform. It >>>>> might be not clear stated in PCD description, but it is the intent >>>>> when this is introduced. We suggest to define this to be FALSE for >>>>> ARM, so that our core module can use it. (Confirmed with Liming >>>>> Gao) >>>>> >>>>> 2) AcpiTableDxe: As Star Zeng mentioned, we need make sure FACS is >>>>> allocate below 4G when PcdDxeIplSwitchToLongMode is set, because it >>>>> is used by S3Resume. >>>> >>>> This is not good enough. PEI might be restricted to the first 4GB of >>>> memory *even though* it is built for X64. >>>> >>>> Consider a platform with the following characteristics: >>>> - PEI is X64 >>>> - DXE is X64 >>>> - hence PcdDxeIplSwitchToLongMode is FALSE >>>> - PEI has page tables only for the first 4GB of RAM >>>> - S3 is supported >>>> >>>> If AcpiTableDxe keyed off of PcdDxeIplSwitchToLongMode *solely*, in >>>> order to see whether the FACS needs to be allocated under 4GB, then >>>> said platform would break. >>>> >>>> The most frequently used build of OVMF is exactly like written above. >>>> >>>>> >>>>> 3) UefiBootManagerLib: As Ruiyu Ni mentioned, we need make allocate >>>>> L"PerfDataMemAddr" below 4G for when PcdDxeIplSwitchToLongMode is >>>>> set, because it is used by S3Resume. >>>> >>>> My counter-argument is the same as above, I think. >>>> >>>>> >>>>> >>>>> >>>>> 4) BootGraphicsResourceTableDxe: There is no need to add 4G >>>>> limitation, we can remove that directly. (Confirmed with Chao >>>>> Zhang) >>>>> >>>>> 5) FirmwarePerformanceDataTableDxe: We can check >>>>> PcdDxeIplSwitchToLongMode to allocate Below4G memory for >>>>> mAcpiS3PerformanceTable which is used in S3. >>>> >>>> Same counter-argument again. >>>> >>>>> And we can remove 4G >>>>> limitation for mAcpiBootPerformanceTable directly. (Confirmed with >>>>> Chao >>>>> Zhang) >>>>> >>>>> 6) PiDxeS3BootScriptLib: We can check PcdDxeIplSwitchToLongMode for >>>>> S3TableBase and NewS3TableBase which is used in S3. (Confirmed with >>>>> Star >>>>> Zeng) >>>> >>>> Same counter-argument. >>>> >>>>> >>>>> 7) BootScriptExecutorDxe: We can check PcdDxeIplSwitchToLongMode >>>>> for FfsBuffer and BootScriptExecutorBuffer which is used in S3. >>>>> (Confirmed with Star Zeng) >>>> >>>> Same counter-argument. >>>> >>>>> >>>>> 8) CapsuleRuntimeDxe: Already check PcdDxeIplSwitchToLongMode.No >>>>> update needed. (Confirmed with Star Zeng) >>>> >>>> The only reason this doesn't break under OVMF is because OVMF >>>> doesn't implement capsule updates. >>>> >>>> In summary, PcdDxeIplSwitchToLongMode==TRUE implies, but is not >>>> equivalent to, the requirement to allocate under 4GB for S3 >>>> purposes. A platform may have *both* S3 *and* a 64-bit PEI phase >>>> without access to >>>>> = 4GB memory. >>>> >>>> That's why I proposed a brand new PCD. PcdDxeIplSwitchToLongMode >>>> would imply that PCD, but the new PCD wouldn't necessarily imply >>>> PcdDxeIplSwitchToLongMode. >>>> >>>>> BTW: We are trying not use IntelFrameworkModulePkg. If there is >>>>> replacement in MdeModulePkg, we suggest use the one in >>>>> MdeModulePkg, for example, BDS or ACPI. >>>> >>>> Absolutely, but migrating a platform from IntelFrameworkModulePkg >>>> BDS to MdeModulePkg BDS is not trivial, due to lack of documentation >>>> and examples; plus some functionality that is publicly available >>>> from GenericBdsLib (and is used by PlatformBdsLib instances) needs >>>> to be exported from the new boot manager library. >>>> >>>> Thanks >>>> Laszlo >>>> >>>>> >>>>> >>>>> >>>>> Thank you >>>>> >>>>> Yao Jiewen >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *From:*Ard Biesheuvel [mailto:[email protected]] >>>>> *Sent:* Monday, February 22, 2016 8:43 PM >>>>> *To:* Yao, Jiewen >>>>> *Cc:* Ni, Ruiyu; Laszlo Ersek; [email protected] >>>>> <mailto:[email protected]> >>>>> ; Tian, Feng; Zeng, Star; >>>>> [email protected] <mailto:[email protected]> ; >>>> [email protected] <mailto:[email protected]> ; Fan, >>>>> Jeff; Gao, Liming >>>>> *Subject:* Re: [edk2] [PATCH v3 2/4] IntelFrameworkModulePkg: BdsDxe: >>>>> only allocate below 4 GB if needed >>>>> >>>>> >>>>> >>>>> On 22 February 2016 at 13:40, Yao, Jiewen wrote: >>>>>> I did a search on current MdeModulePkg. I found there are more >>>>>> modules allocating Below4G memory. >>>>>> >>>>>> Besides BDS, we have BootGraphicsResourceTableDxe, >>>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe, >>>>>> PiDxeS3BootScriptLib, CapsuleRuntimeDxe. >>>>>> >>>>>> >>>>>> >>>>>> BootGraphicsResourceTableDxe – I am not clear. >>>>>> >>>>>> CapsuleRuntimeDxe – Below4G is designed for PEI capsule. Or we do >>>>>> not need such capability. >>>>>> >>>>>> BootScriptExecutorDxe, FirmwarePerformanceDataTableDxe, >>>>>> PiDxeS3BootScriptLib >>>>>> - data need to be access in PEI phase. >>>>>> >>>>>> >>>>>> >>>>>> I think we might need more clean up for 32bit PEI. >>>>>> >>>>> >>>>> Agreed. >>>>> >>>>>> I am still thinking if we can reuse old Pcd – we can clarify in >>>>>> PCD description – it is only for 32bit PEI and 64bit DXE. >>>>>> >>>>>> Or a new PCD, and name to Pcd32BitPei? Please allow me collect >>>>>> more feedback on that. >>>>>> >>>>> >>>>> OK >>>>> >>>>>> In order to separate 32bit PEI issue from ACPI issue, I think we >>>>>> can check in ACPI patch at first, if you want. >>>>>> >>>>> >>>>> Should I wait for the maintainers of MdeModulePkg to give their >>>>> Reviewed-by ? >>>>> >>>> >>> > > _______________________________________________ > edk2-devel mailing list > [email protected] <mailto:[email protected]> > https://lists.01.org/mailman/listinfo/edk2-devel > _______________________________________________ edk2-devel mailing list [email protected] https://lists.01.org/mailman/listinfo/edk2-devel

