> On Sep 22, 2016, at 4:05 AM, Pete Batard <p...@akeo.ie> wrote:
> 
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
> 
> I was hoping you would comment on this, as I believe the issue is larger than 
> Arm (which is why I thought the Arm patch could be integrated), since I ran 
> into something similar for X64 and AARCH64, with the conversion from stack to 
> register parameters.
> 
> Let me start by showing a real-life example of what the current EBC 
> implementation does, on different architectures, if you CALLEX from EBC into 
> a native function call such as:
> 
> VOID MultiParamNative(
>    UINT32,
>    UINT64,
>    UINT64,
>    UINT64,
>    UINT32,
>    UINT32,
>    UINT64
> );
> 
> with values:
> 
>    0x1C1C1C1C,
>    0x2B2B2B2B2A2A2A2A,
>    0x3B3B3B3B3A3A3A3A,
>    0x4B4B4B4B4A4A4A4A,
>    0x5C5C5C5C,
>    0x6C6C6C6C,
>    0x7B7B7B7B7A7A7A7A
> 
> If you do that, then the parameter values seen by each Arch will be as 
> follows:
> 
> IA32:
>  p1 = 0x1C1C1C1C
>  p2 = 0x2B2B2B2B2A2A2A2A
>  p3 = 0x3B3B3B3B3A3A3A3A
>  p4 = 0x4B4B4B4B4A4A4A4A
>  p5 = 0x5C5C5C5C
>  p6 = 0x6C6C6C6C
>  p7 = 0x7B7B7B7B7A7A7A7A
> 
> X64:
>  p1 = 0x1C1C1C1C
>  p2 = 0x3A3A3A3A2B2B2B2B
>  p3 = 0x4A4A4A4A3B3B3B3B
>  p4 = 0x5C5C5C5C4B4B4B4B
>  p5 = 0x6C6C6C6C
>  p6 = 0x7B7B7B7B
>  p7 = 0x06F23E4012345678
> 
> ARM:
>  p1 = 0x1C1C1C1C
>  p2 = 0x3A3A3A3A2B2B2B2B
>  p3 = 0x4A4A4A4A3B3B3B3B
>  p4 = 0x5C5C5C5C4B4B4B4B
>  p5 = 0x6C6C6C6C
>  p6 = 0x7A7A7A7A
>  p7 = 0x446EEC467B7B7B7B
> 
> AA64:
>  p1 = 0x1C1C1C1C
>  p2 = 0x3A3A3A3A2B2B2B2B
>  p3 = 0x4A4A4A4A3B3B3B3B
>  p4 = 0x5C5C5C5C4B4B4B4B
>  p5 = 0x6C6C6C6C
>  p6 = 0x7B7B7B7B
>  p7 = 0x00000000FFFFFF91
> 
> Note that these are real-life results gotten from a native set of drivers [1] 
> + EBC sample [2], specifically designed to test the above.
> 
> So, as you can see, only IA32 currently retrieves the parameters with their 
> expected values. All the other platforms, and not just Arm, have an issue 
> with parameter retrieval.
> 
> I too performed some analysis [3], to understand the problem, the result of 
> which can be summarized as follows:
> 
> Let's say you have native protocol function:
> 
>  ProtocolCall(UINT32, UINT64, UINT64)
> 
> to which you want to pass values:
> 
>  (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
> 
> With the EBC VM, the parameters then get stacked as (little endian, CDECL and 
> using 32-bit longwords to represent the stack):
> 
>   +--------+
>   |1C1C1C1C|
>   +--------+
>   |2A2A2A2A|
>   +--------+
>   |2B2B2B2B|
>   +--------+
>   |3A3A3A3A|
>   +--------+
>   |3B3B3B3B|
>   +--------+
>   +????????+
>   +--------+
> 
> Now, if you are calling into an x86_32 arch, this is no issue, as the native 
> call reads the parameters off the stack, and finds each one it its expected 
> location.
> 
> But if, say, you are calling into native Arm, then the calling convention 
> dictates that the first four 32 bits parameters must be placed into Arm 
> registers r0-r3, rather than on the stack, and what's more, that if there 
> exist 64 bit parameters among the register ones, they must start with an even 
> register (r0 or r2).
> 
> What this means is that, with the current EBC handling, which simply maps the 
> top of the stack onto registers for native CALLEX (as the VM cannot guess the 
> parameter signature of the function it is calling into, and therefore will 
> not do anything else but a blind mapping of the stack onto registers), the 
> native Arm function effectively gets called with the following parameter 
> mapping:
> 
>   +--------+
>   |1C1C1C1C|  -> r0 (32-bit first parameter)
>   +--------+
>   |2A2A2A2A|  -> (r1/unused, since first parameter is 32-bit)
>   +--------+
>   |2B2B2B2B|  -> r2 (lower half of 64-bit second parameter)
>   +--------+
>   |3A3A3A3A|  -> r3 (upper half of 64-bit second parameter)
>   +--------+
>   |3B3B3B3B|  -> lower half of 64-bit third parameter (stack)
>   +--------+
>   +????????+  -> upper half of 64-bit third parameter (stack)
>   +--------+
> 
> The end result is that, the Arm call ends up with these values:
> 
>  (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
> 
> However, while we used Arm for this example, this is not an Arm specific 
> issue, as x86_64 and Arm64 also expect any of the first eight parameters to a 
> native call, that are smaller than 64-bit, to get passed as a 64-bit 
> register, which means they too have the same issue as the one illustrated 
> above.
> 
> Now, I'm not sure what the solution to that issue would be. I tend to agree 
> that, short of including a parameter signature for function calls, this 
> function argument marshalling issue between EBC and native will be difficult 
> to solve. A possible half-workaround I thought of could be to keep track of 
> all the PUSHes having been carried out before a CALLEX, and *assume* (or 
> mandate in the specs) that all the arguments were pushed individually and 
> that the size of the PUSH matches the desired size for a register argument, 
> but even that would still add a lot of complexity and be prone to breakage...
> 

It seems like tracking the PUSHes would work? We could update the spec to 
require this behavior. But it brings up another issue in that you don't know 
how many PUSHes to go back? What is a calling convention PUSH vs. a state save 
push? Maybe we could also add a max number of args that are supported? 

Thanks,

Andrew Fish

PS I'm guessing that from a clang/LLVM point of view enforcing the push rules 
in the code gen should not be hard if the bit code gets optimized and then is 
converted to EBC. As the bit code for the external code call will contain all 
the info EBC needs to know about. For example:
%7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)

Example:
~/work/Compiler/llvm>cat callit.c
typedef unsigned int UINT32;
typedef unsigned long long UINT64;

int MultiParamNative(UINT32, UINT64, UINT32);

int Example (UINT32 A, UINT64 B, UINT32 C)
{
  return MultiParamNative (A, B, C);
}
~/work/Compiler/llvm>clang -S -emit-llvm  callit.c 
~/work/Compiler/llvm>cat callit.ll
; ModuleID = 'callit.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"

; Function Attrs: nounwind ssp uwtable
define i32 @Example(i32 %A, i64 %B, i32 %C) #0 {
  %1 = alloca i32, align 4
  %2 = alloca i64, align 8
  %3 = alloca i32, align 4
  store i32 %A, i32* %1, align 4
  store i64 %B, i64* %2, align 8
  store i32 %C, i32* %3, align 4
  %4 = load i32* %1, align 4
  %5 = load i64* %2, align 8
  %6 = load i32* %3, align 4
  %7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)
  ret i32 %7
}

declare i32 @MultiParamNative(i32, i64, i32) #1

attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" 
"no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" 
"no-infs-fp-math"="false" "no-nans-fp-math"="false" 
"stack-protector-buffer-size"="8" "unsafe-fp-math"="false" 
"use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" 
"no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" 
"no-nans-fp-math"="false" "stack-protector-buffer-size"="8" 
"unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53) (based on 
LLVM 3.6.0svn)"}


> The other solution of course is to ask EBC code developers to be aware of and 
> compensate for the calling convention issue, and pad the stack as required 
> depending on the ISA they are calling into, which is how I made my 
> protocol.asm sample work [4], but this is still rather inconvenient, 
> especially if not coding in EBC assembly, and defeats the point of trying to 
> write Arch independent code.
> 
> Considering this, I'm not entirely convinced ARM EBC integration should be 
> held back, as the problem we are faced with is part of a larger issue that 
> we'll need to resolve for all archs (except IA32), and not just ARM...
> 
> Regards,
> 
> /Pete
> 
> [1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
> [2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
> [3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
> [4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to