Re: [edk2] [Patch 1/2] UefiCpuPkg/RegisterCpuFeaturesLib: Separate semaphore container.

2018-11-09 Thread Ni, Ruiyu

Eric,
Patch is good.
Reviewed-by: Ruiyu Ni 

I made some modifications to the commit message to make the changes easier to understand.
Please reword it if you think any of them is not proper.

You can push the changes if you have no concerns about the new commit message.

-

In the current implementation, core level sync and package level sync use
the same semaphores. Sharing the semaphores may cause a wrong execution order.

For example:
1. Feature A has CPU_FEATURE_CORE_BEFORE dependency with Feature B.
2. Feature C has CPU_FEATURE_PACKAGE_AFTER dependency with Feature B.
The expected feature initialization order is A B C:
A --(Core Depends)--> B --(Package Depends)--> C

For a CPU that has 1 package, 2 cores and 4 threads, the feature
initialization order may look like below:


   Thread#1          Thread#2          Thread#3          Thread#4
   [A.Init]          [A.Init]                            [A.Init]
Release(S1, S2)   Release(S1, S2)                     Release(S3, S4)
Wait(S1) * 2      Wait(S2) * 2                                          <- Core sync
   [B.Init]          [B.Init]
Release (S1,S2,S3,S4)
Wait (S1) * 4                                                           <- Package sync
                                                      Wait(S4) * 2      <- Core sync
                                                         [B.Init]

In the above case, when thread#4 syncs at core level, Wait(S4) * 2
isn't blocked and [B.Init] runs. But [A.Init] hasn't run in thread#3.
That is wrong: thread#4 should execute [B.Init] only after thread#3
executes [A.Init], because B has a core level dependency on A.


The reason for the wrong execution order is that S4 is released both in
thread#1 by calling Release (S1, S2, S3, S4) and in thread#4 by calling
Release (S3, S4).


To fix this issue, core level sync and package level sync should use 
separate semaphores.


In the above example, the S4 released in Release (S1, S2, S3, S4) should not
be the same semaphore as the S4 released in Release (S3, S4).
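
The trace above can be replayed with a small single-threaded model. This is only a sketch under stated assumptions, not the edk2 code: SEMAPHORE_CONTAINER, Release, TryWait and Thread4PassesEarly are hypothetical names, counters are plain integers, and a blocking Wait(S) * n is replaced by a non-blocking check that reports whether the wait would pass immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define THREAD_COUNT 4

/* Hypothetical model: one counter per thread, like S1..S4 in the example. */
typedef struct {
  uint32_t Count[THREAD_COUNT];
} SEMAPHORE_CONTAINER;

/* Release (Sa..Sb): increment the counter of every thread in [First, Last]. */
static void Release(SEMAPHORE_CONTAINER *Container, int First, int Last) {
  assert(First >= 0 && First <= Last && Last < THREAD_COUNT);
  for (int Index = First; Index <= Last; Index++) {
    Container->Count[Index]++;
  }
}

/* Non-blocking stand-in for "Wait(S) * Times": consume Times units if they
   are available and report whether the wait would pass without blocking. */
static bool TryWait(SEMAPHORE_CONTAINER *Container, int Index, uint32_t Times) {
  assert(Index >= 0 && Index < THREAD_COUNT);
  if (Container->Count[Index] < Times) {
    return false;
  }
  Container->Count[Index] -= Times;
  return true;
}

/* Replays the trace from the example. Returns true if the core level
   Wait(S4) * 2 of thread#4 passes although thread#3 never ran [A.Init]. */
static bool Thread4PassesEarly(bool SeparateContainers) {
  SEMAPHORE_CONTAINER Core;
  SEMAPHORE_CONTAINER Package;
  memset(&Core, 0, sizeof(Core));
  memset(&Package, 0, sizeof(Package));

  /* With a shared container, package sync reuses the core counters. */
  SEMAPHORE_CONTAINER *PackageSync = SeparateContainers ? &Package : &Core;

  /* Core sync after [A.Init]; thread#3 has not arrived yet. */
  Release(&Core, 0, 1);                    /* thread#1: Release(S1, S2) */
  Release(&Core, 0, 1);                    /* thread#2: Release(S1, S2) */
  Release(&Core, 2, 3);                    /* thread#4: Release(S3, S4) */
  bool Thread1Passed = TryWait(&Core, 0, 2);
  bool Thread2Passed = TryWait(&Core, 1, 2);
  assert(Thread1Passed && Thread2Passed);

  /* Package sync after [B.Init] on thread#1: Release(S1, S2, S3, S4). */
  Release(PackageSync, 0, 3);

  /* Core sync on thread#4: Wait(S4) * 2. */
  return TryWait(&Core, 3, 2);
}
```

In this model, Thread4PassesEarly(false) reports that the wait passes with a shared container, while Thread4PassesEarly(true) reports that thread#4 would still be blocked when the two levels use separate containers.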


Thanks,
Ray
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] [Patch 1/2] UefiCpuPkg/RegisterCpuFeaturesLib: Separate semaphore container.

2018-11-07 Thread Eric Dong
In the current implementation, the core level semaphores use the same
container as the package level semaphores. With this design, the core
level semaphore does not work as expected in the case below:
1. Feature A has CPU_FEATURE_CORE_BEFORE dependence with Feature B.
2. Feature C has CPU_FEATURE_PACKAGE_AFTER dependence with Feature B.
In this case a core level semaphore will be added between A and B, and
a package level semaphore will be added between B and C.

For a CPU that has one package, two cores and four threads, execution may proceed as below:

  Thread 1                Thread 2          ...          Thread 4
ReleaseSemaph(1,2)  -|
WaitForSemaph(1(2)) -| <--- These two are Core Semaph
                        ReleaseSemaph(1,2) -|
                        WaitForSemaph(2)   -| <--- Core Semaph

ReleaseSemaph (1,2,3,4) -|
WaitForSemaph (1(4))    -| <--- Package Semaph

                                                       ReleaseSemaph(3,4)
                                                       WaitForSemaph(4(2)) <- Core Semaph

In the above case, when thread 4 executes a core semaphore, it will
find that WaitForSemaph(4(2)) is met because thread 1 has already executed
a package semaphore and called ReleaseSemaph(4) for it. This is not the
expected behavior. Thread 4 should wait for thread 3 to do this.

Fix this issue by separating the semaphore containers for core level and
package level.

Cc: Laszlo Ersek 
Cc: Ruiyu Ni 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong 
---
 .../Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c       | 9 ++++++---
 UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h  | 7 ++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
index 7f208dbe6a..4bed0ce3a4 100644
--- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
+++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
@@ -269,8 +269,10 @@ CpuInitDataInitialize (
     DEBUG ((DEBUG_INFO, "Package: %d, Valid Core : %d\n", Index, ValidCoreCountPerPackage[Index]));
   }
 
-  CpuFeaturesData->CpuFlags.SemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount);
-  ASSERT (CpuFeaturesData->CpuFlags.SemaphoreCount != NULL);
+  CpuFeaturesData->CpuFlags.CoreSemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount);
+  ASSERT (CpuFeaturesData->CpuFlags.CoreSemaphoreCount != NULL);
+  CpuFeaturesData->CpuFlags.PackageSemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount);
+  ASSERT (CpuFeaturesData->CpuFlags.PackageSemaphoreCount != NULL);
 
   //
   // Get support and configuration PCDs
@@ -933,9 +935,9 @@ ProgramProcessorRegister (
   //  V(0...n)   V(0...n)  ...   V(0...n)
   //  n * P(0)   n * P(1)  ...   n * P(n)
   //
-  SemaphorePtr = CpuFlags->SemaphoreCount;
   switch (RegisterTableEntry->Value) {
   case CoreDepType:
+    SemaphorePtr = CpuFlags->CoreSemaphoreCount;
     //
     // Get Offset info for the first thread in the core which current thread belongs to.
     //
@@ -956,6 +958,7 @@ ProgramProcessorRegister (
     break;
 
   case PackageDepType:
+    SemaphorePtr = CpuFlags->PackageSemaphoreCount;
     ValidCoreCountPerPackage = (UINT32 *)(UINTN)CpuStatus->ValidCoreCountPerPackage;
     //
     // Get Offset info for the first thread in the package which current thread belongs to.
diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h
index b4c8ab777e..4898a80827 100644
--- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h
+++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h
@@ -60,9 +60,10 @@ typedef struct {
 // Flags used when program the register.
 //
 typedef struct {
-  volatile UINTN   ConsoleLogLock;   // Spinlock used to control console.
-  volatile UINTN   MemoryMappedLock; // Spinlock used to program mmio
-  volatile UINT32  *SemaphoreCount;  // Semaphore used to program semaphore.
+  volatile UINTN   ConsoleLogLock;          // Spinlock used to control console.
+  volatile UINTN   MemoryMappedLock;        // Spinlock used to program mmio
+  volatile UINT32  *CoreSemaphoreCount;     // Semaphore containers used to program Core semaphore.
+  volatile UINT32  *PackageSemaphoreCount;  // Semaphore containers used to program Package semaphore.
 } PROGRAM_CPU_REGISTER_FLAGS;
 
 typedef struct {
-- 
2.15.0.windows.1
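
For readers who want to try the idea outside firmware, the two-container layout from the patch can be sketched in portable C11. This is an illustration under stated assumptions, not the edk2 implementation: SKETCH_FLAGS, SketchInit, SketchFree, SketchRelease and SketchTryWait are hypothetical names, malloc stands in for AllocateZeroPool, C11 atomics stand in for the firmware's interlocked operations, and the wait is non-blocking so it can be exercised from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counterpart of the two new fields in PROGRAM_CPU_REGISTER_FLAGS. */
typedef struct {
  atomic_uint *CoreSemaphoreCount;     /* counters used only for core level sync */
  atomic_uint *PackageSemaphoreCount;  /* counters used only for package level sync */
  size_t       SlotCount;              /* PackageCount * MaxCoreCount * MaxThreadCount */
} SKETCH_FLAGS;

/* Allocate Count counters and initialize each one to zero. */
static atomic_uint *AllocateCounters(size_t Count) {
  atomic_uint *Counters = malloc(Count * sizeof(*Counters));
  if (Counters != NULL) {
    for (size_t Index = 0; Index < Count; Index++) {
      atomic_init(&Counters[Index], 0);
    }
  }
  return Counters;
}

/* Release both containers; safe to call after a failed SketchInit. */
static void SketchFree(SKETCH_FLAGS *Flags) {
  free(Flags->CoreSemaphoreCount);
  free(Flags->PackageSemaphoreCount);
  Flags->CoreSemaphoreCount = NULL;
  Flags->PackageSemaphoreCount = NULL;
}

/* Allocate one container per sync level, sized as in the patch. */
static bool SketchInit(SKETCH_FLAGS *Flags, size_t PackageCount,
                       size_t MaxCoreCount, size_t MaxThreadCount) {
  assert(Flags != NULL);
  Flags->SlotCount = PackageCount * MaxCoreCount * MaxThreadCount;
  assert(Flags->SlotCount > 0);
  Flags->CoreSemaphoreCount = AllocateCounters(Flags->SlotCount);
  Flags->PackageSemaphoreCount = AllocateCounters(Flags->SlotCount);
  if (Flags->CoreSemaphoreCount == NULL || Flags->PackageSemaphoreCount == NULL) {
    SketchFree(Flags);
    return false;
  }
  return true;
}

/* Release one unit on a counter. */
static void SketchRelease(atomic_uint *Semaphore) {
  atomic_fetch_add(Semaphore, 1);
}

/* Take one unit if the counter is non-zero; never blocks. */
static bool SketchTryWait(atomic_uint *Semaphore) {
  unsigned int Value = atomic_load(Semaphore);
  while (Value != 0) {
    /* On failure Value is reloaded with the current counter value. */
    if (atomic_compare_exchange_weak(Semaphore, &Value, Value - 1)) {
      return true;
    }
  }
  return false;
}
```

With this layout, a release into PackageSemaphoreCount[3] cannot satisfy a wait on CoreSemaphoreCount[3], which is the property the patch is after.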
