Hi Laszlo, 2. Yes, I can change commit message/comments to separate split lock and #AC. 3. Yes it’s a close platform that is enabling #AC which hits double fault because split lock inside CpuExceptionHandlerLib.
Code: I was wondering same thing, why are they using locking mechanism. I wasn’t sure so that’s why keep xchg instead of changing to mov. I have yet to think of any reasons. I think it's a good idea to use mov instead it accomplishes the same thing and easier to understand. Thank you, John >-----Original Message----- >From: Laszlo Ersek [mailto:ler...@redhat.com] >Sent: Friday, September 13, 2019 9:32 AM >To: devel@edk2.groups.io; Lofgren, John E <john.e.lofg...@intel.com> >Cc: Ni, Ray <ray...@intel.com>; Gao, Liming <liming....@intel.com>; Dong, >Eric <eric.d...@intel.com> >Subject: Re: [edk2-devel] [Patch V2] UefiCpuPkg/CpuExceptionHandlerLib: Fix >#AC split lock > >On 09/09/19 20:40, John E Lofgren wrote: >> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2150 >> >> V2 changes: >> Add xchg 16 bit instructions to handle sgdt and sidt base >> 63:48 bits and 47:32 bits. >> Add comment to explain why xchg 64bit isnt being used >> >> Fix #AC split lock's caused by seperating base and limit from sgdt and >> sidt by changing xchg operands to 32-bit to stop from crossing >> cacheline. >> >> Signed-off-by: John E Lofgren <john.e.lofg...@intel.com> >> --- >> >> >UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nas >m >> | 20 ++++++++++++++------ >> 1 file changed, 14 insertions(+), 6 deletions(-) > >The commit message (and the bug report) are very difficult to understand. > >This is the first time I hear about "split lock" and "alignment check >exception", >so please bear with me. This is my take on the problem. > >(1) "split lock" is explained well here: > >https://lwn.net/Articles/784864/ > >In short, we can consider it a performance anti-pattern. A locking instruction >(such as XCHG) is invoked on incorrectly aligned data, which causes >performance degradation for the whole system. In some cases this can be a >security issue even, because less privileged code can block (slow down) more >privileged code running on a different CPU. > >(2) Alignment Check Exception is a way to detect the occurrence of "split >lock". It must be configured explicitly, when the system software wishes to be >informed explicitly about a split lock event. > >Therefore, the "#AC split lock" expression in the commit message is very >confusing. Those are two different things. One is the problem ("split lock"), >and the other is the (kind of) solution ("alignment check exception"). > >We don't care about #AC (the exception) here. My understanding is that the >open source edk2 tree does not enable #AC. The following include file: > > MdePkg/Include/Register/Intel/Msr/P6Msr.h > >defines MSR_P6_TEST_CTL (value 0x33), which the above LWN article >references as MSR_TEST_CTL. The article refers to bit 29, but that seems to be >part of the "Reserved1" bit-field in the MSR_P6_TEST_CTL_REGISTER. > >There seems to be a bit field that could be related (Disable_LOCK, bit#31), but >again, there is no reference to Disable_LOCK in the edk2 codebase. > >So my whole point here is that we should clearly separate "#AC" from "split >lock" in the commit message (and in the code comment). Those are separate >things. And "split lock" may apply to edk2, but "#AC" does not (to the open >source tree anyway). > >(3) OK, assuming this code indeed triggers a "split lock" event. Why is that a >problem? Yes, it may degrade performance for the system. Why do we care? >We are *already* handling an exception. That should not be a very frequent >event, for any processor (BSP or AP) in the system. > >Is the problem that a closed source platform enables #AC, and then the fault >handler in CpuExceptionHandlerLib -- which gets invoked due to an >independent fault -- runs into a *double* fault, due to the split lock raising >#AC? > >This should be clearly explained in the commit message. I'm not prying at >proprietary platform details, but the circumstances / symptoms of the issue >should be clearly described. > >More comments below, regarding the original code: > >> >> diff --git >> >a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na >> sm >> >b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na >> sm >> index 4db1a09f28..7b7642b290 100644 >> --- >> >a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na >> sm >> +++ >b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAs >> +++ m.nasm >> @@ -180,21 +180,29 @@ HasErrorCode: >> push qword [rbp + 24] >> >> ;; UINT64 Gdtr[2], Idtr[2]; >> + ; sidt and sgdt saves 10 bytes to memory, 8 bytes = base and 2 bytes = >limit. >> + ; To avoid #AC split lock when separating base and limit into their >> + ; own separate 64 bit memory, we can’t use 64 bit xchg since base >> [63:48] >bits >> + ; may cross the cache line. >> xor rax, rax >> push rax >> push rax > >So, the contents of RAX is 0 now, and we've made 16 bytes (filled with >0) room on the stack. > >> sidt [rsp] > >This instruction writes 10 bytes at the base of that "room" on the stack. >Offsets 0 and 1 contain the "limit", offsets 2-9 (inclusive) contain the "base >address". > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > |L|L|B|B|B|B|B|B|B|B|0|0|0|0|0|0| > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > >> - xchg rax, [rsp + 2] > >(I guess this is the instruction that splits the lock.) > >Now RAX has the base address, and the stack looks like: > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > |L|L|0|0|0|0|0|0|0|0|0|0|0|0|0|0| > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > >> - xchg rax, [rsp] > >Now RAX has the limit, and the stack is: > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > |B|B|B|B|B|B|B|B|0|0|0|0|0|0|0|0| > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > >> - xchg rax, [rsp + 8] > >Now RAX is zero again, and the stack is: > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > |B|B|B|B|B|B|B|B|L|L|0|0|0|0|0|0| > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > >This is very "clever" and all, but why do we have to use a locking instruction? >We save RBX on the stack, earlier in CommonInterruptEntry, and we restore it >in the end. So what's wrong with: > > sidt [rsp] > mov bx, word [rsp] ; read limit into bx > mov rax, qword [rsp + 2] ; read base address into rax > mov qword [rsp], rax ; write base address to qword#0 > mov word [rsp + 8], bx ; write limit to qword#1 > >The second MOV instruction may straddle a cache line, yes, but there is no >locking at all. > >Thanks, >Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#47232): https://edk2.groups.io/g/devel/message/47232 Mute This Topic: https://groups.io/mt/34083544/21656 Mute #ac: https://groups.io/mk?hashtag=ac&subid=3846945 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-