On 2/15/24 10:31, Gerd Hoffmann wrote: > Add support for splitting Hand-Off data into multiple HOBs. This is > required for VMs with thousands of CPUs. The actual CPU count per HOB > is much smaller (128) for better test coverage. > > Signed-off-by: Gerd Hoffmann <kra...@redhat.com> > --- > UefiCpuPkg/Library/MpInitLib/PeiMpLib.c | 48 ++++++++++++++----------- > 1 file changed, 27 insertions(+), 21 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c > b/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c > index f80e00edcff3..a5710789c8c3 100644 > --- a/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c > @@ -126,35 +126,41 @@ SaveCpuMpData ( > IN CPU_MP_DATA *CpuMpData > ) > { > - UINT64 Data64; > - UINTN Index; > - CPU_INFO_IN_HOB *CpuInfoInHob; > - MP_HAND_OFF *MpHandOff; > - UINTN MpHandOffSize; > + STATIC CONST UINT32 CpusPerHob = 128;
This should be a fixed-at-build PCD. Easy to modify on the build command line, for test coverage, but for production builds, it should be as large as possible. In fact, the code should determine how many CPU entries fit in a single HOB [*], and the PCD should be checked against it: - PCD == 0: use the maximum - PCD > maximum: assert - otherwise: use the PCD as chunking factor [*] See BuildGuidHob() in "MdePkg/Library/PeiHobLib/HobLib.c": | // | // Make sure that data length is not too long. | // | ASSERT (DataLength <= (0xFFF8 - sizeof (EFI_HOB_GUID_TYPE))); So the max permitted payload size is 0xFFE0 bytes, if I count right. CpusPerHob = (0xFFE0 - sizeof (MP_HAND_OFF)) / sizeof (PROCESSOR_HAND_OFF); > + UINT64 Data64; > + UINT32 Index, HobBase; > + CPU_INFO_IN_HOB *CpuInfoInHob; > + MP_HAND_OFF *MpHandOff; > + UINTN MpHandOffSize; > > // > // When APs are in a state that can be waken up by a store operation to a > memory address, > // report the MP_HAND_OFF data for DXE to use. > // > - CpuInfoInHob = (CPU_INFO_IN_HOB *)(UINTN)CpuMpData->CpuInfoInHob; > - MpHandOffSize = sizeof (MP_HAND_OFF) + sizeof (PROCESSOR_HAND_OFF) * > CpuMpData->CpuCount; > - MpHandOff = (MP_HAND_OFF *)BuildGuidHob (&mMpHandOffGuid, > MpHandOffSize); > - ASSERT (MpHandOff != NULL); > - ZeroMem (MpHandOff, MpHandOffSize); > - MpHandOff->ProcessorIndex = 0; > + CpuInfoInHob = (CPU_INFO_IN_HOB *)(UINTN)CpuMpData->CpuInfoInHob; > > - MpHandOff->CpuCount = CpuMpData->CpuCount; > - if (CpuMpData->ApLoopMode != ApInHltLoop) { > - MpHandOff->StartupSignalValue = MP_HAND_OFF_SIGNAL; > - MpHandOff->WaitLoopExecutionMode = sizeof (VOID *); > - } > + for (Index = 0; Index < CpuMpData->CpuCount; Index++) { > + if (Index % CpusPerHob == 0) { > + MpHandOffSize = sizeof (MP_HAND_OFF) + sizeof (PROCESSOR_HAND_OFF) * > CpusPerHob; I don't like that the last HOB will only be partially filled in, most of the time. Especially the max chunking factor -- which strives to create HOBs that are approximately 64KB in size -- might waste ~32KB on average, using a flat multiplication like the above. How about: UINT32 FreeInHob; PROCESSOR_HAND_OFF *Info; FreeInHob = 0; for (Index = 0; Index < CpuMpData->CpuCount; Index++) { if (FreeInHob == 0) { FreeInHob = MIN (CpusPerHob, CpuMpData->CpuCount - Index); MpHandOffSize = sizeof *MpHandOff + FreeInHob * sizeof *Info; MpHandOff = BuildGuidHob (&mMpHandOffGuid, MpHandOffSize); ASSERT (MpHandOff != NULL); ZeroMem (MpHandOff, MpHandOffSize); MpHandOff->ProcessorIndex = Index; MpHandOff->CpuCount = FreeInHob; Info = MpHandOff->Info; } Info->ApicId = CpuInfoInHob[Index].ApicId; Info->Health = CpuInfoInHob[Index].Health; if (CpuMpData->ApLoopMode != ApInHltLoop) { Info->StartupSignalAddress = (UINT64)(UINTN)CpuMpData->CpuData[Index].StartupApSignal; Info->StartupProcedureAddress = (UINT64)(UINTN)&CpuMpData->CpuData[Index].ApFunction; } Info++; FreeInHob--; } And then "HobBase" becomes superfluous. (Well, having a HobBase that carries information between iterations of the loop, versus an Info pointer, is conceptually the same; it's just that the Info pointer allows for shorter expressions.) The UINTN -> UINT32 type change for Index looks fine, however. > + MpHandOff = (MP_HAND_OFF *)BuildGuidHob (&mMpHandOffGuid, > MpHandOffSize); > + ASSERT (MpHandOff != NULL); > + ZeroMem (MpHandOff, MpHandOffSize); > > - for (Index = 0; Index < MpHandOff->CpuCount; Index++) { > - MpHandOff->Info[Index].ApicId = CpuInfoInHob[Index].ApicId; > - MpHandOff->Info[Index].Health = CpuInfoInHob[Index].Health; > + HobBase = Index; > + MpHandOff->ProcessorIndex = HobBase; > + MpHandOff->CpuCount = MIN (CpuMpData->CpuCount - HobBase, > CpusPerHob); Yes, the MIN() here is my key point, but I think we should also let it control the allocation! > + > + if (CpuMpData->ApLoopMode != ApInHltLoop) { > + MpHandOff->StartupSignalValue = MP_HAND_OFF_SIGNAL; > + MpHandOff->WaitLoopExecutionMode = sizeof (VOID *); > + } > + } As noted elsewhere, these fields don't belong in the loop (they don't belong to MP_HAND_OFF, going forward); the two of them together should form a new singleton structure. > + > + MpHandOff->Info[Index-HobBase].ApicId = CpuInfoInHob[Index].ApicId; > + MpHandOff->Info[Index-HobBase].Health = CpuInfoInHob[Index].Health; > if (CpuMpData->ApLoopMode != ApInHltLoop) { > - MpHandOff->Info[Index].StartupSignalAddress = > (UINT64)(UINTN)CpuMpData->CpuData[Index].StartupApSignal; > - MpHandOff->Info[Index].StartupProcedureAddress = > (UINT64)(UINTN)&CpuMpData->CpuData[Index].ApFunction; > + MpHandOff->Info[Index-HobBase].StartupSignalAddress = > (UINT64)(UINTN)CpuMpData->CpuData[Index].StartupApSignal; > + MpHandOff->Info[Index-HobBase].StartupProcedureAddress = > (UINT64)(UINTN)&CpuMpData->CpuData[Index].ApFunction; > } > } > Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#115595): https://edk2.groups.io/g/devel/message/115595 Mute This Topic: https://groups.io/mt/104369848/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-