Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
Hi Andrew,

On 2016.09.22 21:27, Andrew Fish wrote:
> It seems like tracking the PUSHes would work?

First of all, as Ard pointed out, I need to mention that much of my earlier analysis, which you quoted, was based on a misreading of the spec, as it does mandate pushing 32-bit parameters as native size. So the output I gave for AARCH64 and X64 was erroneous: it was the result of using PUSH32, and the problematic output disappears when using PUSHn.

The only issue, therefore, is with 32-bit ARM.

> We could update the spec to require this behavior.

If the idea is to add the tracking when processing POP/PUSH on Arm (which is how I originally saw it), please note that we may also have to track stack manipulation with MOV against R0, the EBC stack pointer, if someone issues something like 'MOV R0, R0(-1,0)' (equivalent to PUSHn) or 'MOV R0, R0(0,-8)' (equivalent to PUSH64) for some 32 or 64 bit optional parameter...

However, this brings us back to our original issue if there is a succession of optional 32 and 64 bit parameters, which the application doesn't care about filling, and which may be handled with a single 'MOV R0, R0(-1,-8)' in the assembly. If we are faced with this in our tracking, then we still have no way of knowing which of the 32 or 64 bit parameters comes first.

So, notwithstanding other issues, we'd have to mandate something very specific against doing this in the UEFI/EBC specs (such as only using individual PUSHes)...

Regards,

/Pete

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
On 22 September 2016 at 12:26, Pete Batard wrote:
> On 2016.09.22 12:14, Ard Biesheuvel wrote:
>>
>> For X64 and AARCH64, the issue does not exist because the EBC spec
>> mandates that all function arguments are widened to the native word
>> size. So when executing on a 64-bit architecture, the EBC stack looks
>> differently from what you describe above, and maps seamlessly onto the
>> register assignment mandated by the respective calling conventions.
>
> Ah, I see that you're right, and that I was trying to solve an issue that
> shouldn't exist:
>
> From UEFI 2.6, paragraph 21.9.3:
>
> "32-bit integers are pushed as natural size (since they
> should be passed as 64-bit parameter values on 64-bit machines)."
>
> I must admit I was a bit curious as to why this problem wouldn't have been
> picked before.
>
> So that leaves only the issue you mentioned. But then I'm not too hopeful
> with the timeframe for Arm/EBC integration when you say "we need language
> spec and compiler updates before we can fully support this"...
>
> Do you know if work has been started on this? Or are we just going to
> consider that this is too troublesome a problem to fix?
>

We are debating this internally. Even on AArch64, there are numerous issues with EBC drivers, even if the code itself executes fine (this is mainly related to PCI drivers that don't bother to enable support for 64-bit DMA, since this is never needed on Intel platforms).

So while EBC is of high importance to many Linaro members, I am not sure if that includes 32-bit ARM support.

--
Ard.
Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
On 2016.09.22 12:14, Ard Biesheuvel wrote:
> For X64 and AARCH64, the issue does not exist because the EBC spec
> mandates that all function arguments are widened to the native word
> size. So when executing on a 64-bit architecture, the EBC stack looks
> differently from what you describe above, and maps seamlessly onto the
> register assignment mandated by the respective calling conventions.

Ah, I see that you're right, and that I was trying to solve an issue that shouldn't exist:

From UEFI 2.6, paragraph 21.9.3:

"32-bit integers are pushed as natural size (since they should be passed as 64-bit parameter values on 64-bit machines)."

I must admit I was a bit curious as to why this problem wouldn't have been picked before.

So that leaves only the issue you mentioned. But then I'm not too hopeful with the timeframe for Arm/EBC integration when you say "we need language spec and compiler updates before we can fully support this"...

Do you know if work has been started on this? Or are we just going to consider that this is too troublesome a problem to fix?

Regards,

/Pete
Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
On 2016.09.22 11:06, Ard Biesheuvel wrote:
> However, there is a fundamental issue with EBC on ARM that has not
> been addressed yet, which makes EBC support problematic:
> ARM uses natural alignment for 64-bit types, which means it leaves
> gaps in the stack frame, and the thunking code has no way of dealing
> with that.

I was hoping you would comment on this, as I believe the issue is larger than Arm (which is why I thought the Arm patch could be integrated), since I ran into something similar for X64 and AARCH64, with the conversion from stack to register parameters.

Let me start by showing a real-life example of what the current EBC implementation does, on different architectures, if you CALLEX from EBC into a native function call such as:

  VOID MultiParamNative(
    UINT32,
    UINT64,
    UINT64,
    UINT64,
    UINT32,
    UINT32,
    UINT64
    );

with values:

  0x1C1C1C1C,
  0x2B2B2B2B2A2A2A2A,
  0x3B3B3B3B3A3A3A3A,
  0x4B4B4B4B4A4A4A4A,
  0x5C5C5C5C,
  0x6C6C6C6C,
  0x7B7B7B7B7A7A7A7A

If you do that, then the parameter values seen by each Arch will be as follows:

  IA32:
    p1 = 0x1C1C1C1C
    p2 = 0x2B2B2B2B2A2A2A2A
    p3 = 0x3B3B3B3B3A3A3A3A
    p4 = 0x4B4B4B4B4A4A4A4A
    p5 = 0x5C5C5C5C
    p6 = 0x6C6C6C6C
    p7 = 0x7B7B7B7B7A7A7A7A

  X64:
    p1 = 0x1C1C1C1C
    p2 = 0x3A3A3A3A2B2B2B2B
    p3 = 0x4A4A4A4A3B3B3B3B
    p4 = 0x5C5C5C5C4B4B4B4B
    p5 = 0x6C6C6C6C
    p6 = 0x7B7B7B7B
    p7 = 0x06F23E4012345678

  ARM:
    p1 = 0x1C1C1C1C
    p2 = 0x3A3A3A3A2B2B2B2B
    p3 = 0x4A4A4A4A3B3B3B3B
    p4 = 0x5C5C5C5C4B4B4B4B
    p5 = 0x6C6C6C6C
    p6 = 0x7A7A7A7A
    p7 = 0x446EEC467B7B7B7B

  AA64:
    p1 = 0x1C1C1C1C
    p2 = 0x3A3A3A3A2B2B2B2B
    p3 = 0x4A4A4A4A3B3B3B3B
    p4 = 0x5C5C5C5C4B4B4B4B
    p5 = 0x6C6C6C6C
    p6 = 0x7B7B7B7B
    p7 = 0xFF91

Note that these are real-life results obtained from a native set of drivers [1] + EBC sample [2], specifically designed to test the above.

So, as you can see, only IA32 currently retrieves the parameters with their expected values. All the other platforms, and not just Arm, have an issue with parameter retrieval.
I too performed some analysis [3], to understand the problem, the result of which can be summarized as follows:

Let's say you have a native protocol function:

  ProtocolCall(UINT32, UINT64, UINT64)

to which you want to pass values:

  (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)

With the EBC VM, the parameters then get stacked as (little endian, CDECL and using 32-bit longwords to represent the stack):

  +--------+
  |1C1C1C1C|
  +--------+
  |2A2A2A2A|
  +--------+
  |2B2B2B2B|
  +--------+
  |3A3A3A3A|
  +--------+
  |3B3B3B3B|
  +--------+
  |        |
  +--------+

Now, if you are calling into an x86_32 arch, this is no issue, as the native call reads the parameters off the stack, and finds each one in its expected location.

But if, say, you are calling into native Arm, then the calling convention dictates that the first four 32-bit parameters must be placed into Arm registers r0-r3, rather than on the stack, and what's more, that if there exist 64-bit parameters among the register ones, they must start with an even register (r0 or r2).

What this means is that, with the current EBC handling, which simply maps the top of the stack onto registers for native CALLEX (as the VM cannot guess the parameter signature of the function it is calling into, and therefore will not do anything else but a blind mapping of the stack onto registers), the native Arm function effectively gets called with the following parameter mapping:

  +--------+
  |1C1C1C1C| -> r0 (32-bit first parameter)
  +--------+
  |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
  +--------+
  |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
  +--------+
  |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
  +--------+
  |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
  +--------+
  |        | -> upper half of 64-bit third parameter (stack)
  +--------+

The end result is that the Arm call ends up with these values:

  (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x3B3B3B3B)

However, while we used Arm for this example, this is not an Arm-specific issue, as x86_64 and Arm64 also expect any of the first eight parameters to a native call that are smaller than 64-bit to get passed as a 64-bit register, which means they too have the same issue as the one illustrated above.

Now, I'm not sure what the solution to that issue would be. I tend to agree that, short of including a parameter signature for function calls, this function argument marshalling issue between EBC and native will be difficult to solve.

A possible half-workaround I thought of could be to keep track of all the PUSHes having been carried out before a CALLEX, and *assume* (or mandate in the specs) that all the arguments were pushed individually and that the size of the PUSH matches the desired size for a register argument, but
Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
On 22 September 2016 at 11:06, Ard Biesheuvel wrote:
> On 22 September 2016 at 10:43, Pete Batard wrote:
>> Hi,
>>
>> The following is an updated/fixed version of the patch(es), put forward by
>> Ard Biesheuvel on August 9 ([1], [2]), and re-submitted for formal
>> inclusion, so that the EDK2 can provide EBC functionality for all of IA32,
>> IA64, X64, AARCH64 and ARM at last.
>>
>> This updated patch now includes the necessary corollary dsc/fdf updates as
>> well as fixes to the assembly's EbcLLCALLEXNative, as I found the following
>> issues there:
>> - At least gcc5 didn't seem to like the manually optimized branching for all
>> register args ("sub r1, r1, r3, lsr #1"), and one can never be sure of the
>> actual size instructions will be assembled into, in case of assembler
>> internal alignment/optimization, so I broke it down into actual labelled
>> branches. There are only 4 of those anyway.
>> - For register + stack calls, while 8 x 64 bit registers on AARCH64 do
>> equate to #64 bytes that need to be taken off the stack, on ARM the 4 x 32
>> bit registers equate to #16 bytes, not #32
>> - Even after fixing the above, I found some issues with the manual stack
>> duplication assembly code, so I switched to using a call to CopyMem(), like
>> IA32 does.
>>
>> With these changes, I believe that the ARM/EBC feature should be fully
>> functional, especially as I have heavily tested multiparameter calls from
>> EBC into native, using an fasmg-based EBC assembler [3], to confirm that
>> they performed just as well with ARM as with AARCH64, IA32 or X64.
>>
>
> Hello Pete,
>
> Thanks a lot for this contribution. I had spotted (and fixed) some of
> the above issues as well.
>
> However, there is a fundamental issue with EBC on ARM that has not
> been addressed yet, which makes EBC support problematic:
> ARM uses natural alignment for 64-bit types, which means it leaves
> gaps in the stack frame, and the thunking code has no way of dealing
> with that.
>
> I am pasting my analysis below, which I sent out internally a couple
> of weeks ago. In summary, we need language spec and compiler updates
> before we can fully support this on 32-bit ARM.
>

BTW, the EDK2 tree has an EBC version of the FAT filesystem driver, which is what I have been using to test EBC. I have a Frankenstein version of the 32-bit ARM one (shared below) that deals with the padding of known protocol methods that contain UINT64 arguments at odd positions, but it is not pretty, and a clear example of why the spec needs to be updated to accommodate ARM:

https://git.linaro.org/people/ard.biesheuvel/uefi-next.git/shortlog/refs/heads/ebc3
Re: [edk2] [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
On 22 September 2016 at 10:43, Pete Batard wrote:
> Hi,
>
> The following is an updated/fixed version of the patch(es), put forward by
> Ard Biesheuvel on August 9 ([1], [2]), and re-submitted for formal
> inclusion, so that the EDK2 can provide EBC functionality for all of IA32,
> IA64, X64, AARCH64 and ARM at last.
>
> This updated patch now includes the necessary corollary dsc/fdf updates as
> well as fixes to the assembly's EbcLLCALLEXNative, as I found the following
> issues there:
> - At least gcc5 didn't seem to like the manually optimized branching for all
> register args ("sub r1, r1, r3, lsr #1"), and one can never be sure of the
> actual size instructions will be assembled into, in case of assembler
> internal alignment/optimization, so I broke it down into actual labelled
> branches. There are only 4 of those anyway.
> - For register + stack calls, while 8 x 64 bit registers on AARCH64 do
> equate to #64 bytes that need to be taken off the stack, on ARM the 4 x 32
> bit registers equate to #16 bytes, not #32
> - Even after fixing the above, I found some issues with the manual stack
> duplication assembly code, so I switched to using a call to CopyMem(), like
> IA32 does.
>
> With these changes, I believe that the ARM/EBC feature should be fully
> functional, especially as I have heavily tested multiparameter calls from
> EBC into native, using an fasmg-based EBC assembler [3], to confirm that
> they performed just as well with ARM as with AARCH64, IA32 or X64.
>

Hello Pete,

Thanks a lot for this contribution. I had spotted (and fixed) some of the above issues as well.

However, there is a fundamental issue with EBC on ARM that has not been addressed yet, which makes EBC support problematic: ARM uses natural alignment for 64-bit types, which means it leaves gaps in the stack frame, and the thunking code has no way of dealing with that.

I am pasting my analysis below, which I sent out internally a couple of weeks ago.
In summary, we need language spec and compiler updates before we can fully support this on 32-bit ARM.

Thanks,
Ard.

--
This compares the EBC argument stack with the argument assignment across registers and stack expected by the respective Procedure Call Standards for AArch64 and AArch32.

Since the EBC thunking layer is not aware of the actual prototype signature of the function that is being called (it does not even know which part of the stack frame consists of outgoing arguments, and so it needs to assume that the entire stack frame needs to be copied into arguments and the native stack), the calls can only execute correctly if the EBC stack frame happens to align all arguments natively, in which case the AAPCS happens to agree with the EBC cdecl calling conventions (although the first 8 and 4 arguments, respectively, are passed via registers). In the diagrams below, this is the case if the diagrams line up horizontally.

In summary, EBC on AArch64 seems to be OK (although more testing is needed), as long as we don't pass arguments whose size exceeds 64 bits (which the EBC compiler is unlikely to support anyway).

EBC on AArch32 happens to work as long as no UINT64 values appear as the return value or as an odd-numbered argument. Since it is impossible to infer from EBC bytecode whether any such function calls are being performed, the only way to fix this is to update the EBC spec (and the compiler) to insert hints into the bytecode when such problematic values occur.
Below is a comparison between the stack frame layouts of various protocol entry points that are relevant to EBC drivers (i.e., PCI I/O, block I/O and network I/O).

typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
  IN EFI_BLOCK_IO_PROTOCOL  *This,
  IN UINT32                 MediaId,
  IN EFI_LBA                Lba,
  IN UINTN                  BufferSize,
  OUT VOID                  *Buffer
  );

Executing on 64-bit (ok)

        EBC stack             AArch64 registers
0x00 +--------------+        +--------------+
     | This         |   x0   | This         |
0x08 +--------------+        +--------------+
     | MediaId      |   x1   | MediaId      |
0x10 +--------------+        +--------------+
     | Lba          |   x2   | Lba          |
0x18 +--------------+        +--------------+
     | BufferSize   |   x3   | BufferSize   |
0x20 +--------------+        +--------------+
     | Buffer       |   x4   | Buffer       |
0x28 +--------------+        +--------------+
     :              :
     +--------------+        +--------------+
  R7 | Return value |   x0   | Return value |
     +--------------+        +--------------+

Executing on 32-bit (ok)