> On May 12, 2016, at 8:49 AM, Kinney, Michael D <michael.d.kin...@intel.com> 
> wrote:
> 
> Andrew,
> 
> I recall an issue with var args in the past (not CLANG related) that 
> requires the use of EFIAPI on all var args functions, even internal ones.
> 

If that is part of the coding standard that would help. 

It is starting to make sense now. 
BaseTools/Conf(master)>git grep EFIAPI
tools_def.template:4356:DEFINE GCC44_X64_CC_FLAGS            = 
DEF(GCC44_ALL_CC_FLAGS) -m64 -fno-stack-protector 
"-DEFIAPI=__attribute__((ms_abi))" -DNO_BUILTIN_VA_FUNCS -mno-red-zone 
-Wno-address -mcmodel=large -fno-asynchronous-unwind-tables

So when you override EFIAPI on the compiler command line you also need to set 
NO_BUILTIN_VA_FUNCS, and that uses the edk2 made-up VA_* definitions that match 
the EFI ABI. Was Steven missing that step? This scheme forces all var arg 
functions to be EFIAPI. I'm not sure how portable that is to other 
architectures, but maybe that is OK if all the compilers support native EFIAPI 
for that arch. Thus GCC44 support is forcing all var arg functions to use EFIAPI. 

I pinged our compiler team to get the pedantic answer of how to do this kind of 
thing. Maybe the answer is don't do it :). 

Also as I mentioned in my other mail there may be a cross compiler target in 
clang that can produce EFIAPI natively. 

Thanks,

Andrew Fish

> This is similar to the rule that all functions implemented in assembly must 
> be EFIAPI so the register calling convention is matched.
> 
> Mike
> 
> 
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of 
>> Andrew Fish
>> Sent: Thursday, May 12, 2016 8:15 AM
>> To: Shi, Steven <steven....@intel.com>
>> Cc: edk2-devel@lists.01.org
>> Subject: Re: [edk2] edk2 llvm branch
>> 
>> 
>>> On May 12, 2016, at 7:51 AM, Shi, Steven <steven....@intel.com> wrote:
>>> 
>>> Hi Andrew,
>>> Below is my clang3.8 output:
>>> $ clang --version
>>> clang version 3.8.0 (tags/RELEASE_380/final)
>>> Target: x86_64-unknown-linux-gnu
>>> Thread model: posix
>>> InstalledDir: /usr/local/bin
>>> 
>>> $ clang -Os -flto ms_abi.c
>>> ms_abi.c:19:3: error: 'va_start' used in Win64 ABI function
>>> VA_START (Marker, len);
>>> ^
>>> ms_abi.c:6:38: note: expanded from macro 'VA_START'
>>> #define VA_START(Marker, Parameter)  __builtin_va_start (Marker, Parameter)
>>>                                    ^
>>> 1 error generated.
>>> 
>>> You can see the ms_abi va_list issue has been simply solved in LLVM3.8. Now 
>>> both LLVM and GCC have a workaround to support va_list with MS ABI on the 
>>> x64 platform.
>>> 
>>> Use
>>> __builtin_ms_va_list ap;
>>> __builtin_ms_va_start (ap, n);
>>> __builtin_ms_va_end (ap);
>>> 
>>> instead of
>>> 
>>> __builtin_va_list ap;
>>> __builtin_va_start (ap, n);
>>> __builtin_va_end (ap);
>>> 
>> 
>> Steven,
>> 
>> I don't think that fixed the problem; it moved the problem. Before, int call 
>> (int a, ...) worked and __attribute__((ms_abi)) int call (int a, ...) did 
>> not. I think you just flipped it so that int call (int a, ...) no longer 
>> works, but __attribute__((ms_abi)) int call (int a, ...) does. That has the 
>> side effect of making the libs work, as they are all EFIAPI. If an 
>> application/driver had an internal var arg function that was NOT EFIAPI, it 
>> might fail?
>> 
>> Try my example with your fix (__builtin_ms_va_start) but remove
>> __attribute__((ms_abi)) and see if you get the correct answer.
>> 
>> I'll ping our compiler team to see if there is a better answer.
>> 
>> Thanks,
>> 
>> Andrew Fish
>> 
>>> In my current edk2 llvm branch, I've already used the above workaround. See 
>>> lines 494~502 of 
>>> https://github.com/shijunjing/edk2/blob/llvm/MdePkg/Include/Base.h ; it 
>>> works. You can see how I pushed to fix this issue in the links below:
>>> Clang: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093778.html
>>> GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818
>>> CLANG3.8 is better than GCC7.0 on this issue because CLANG3.8 gives a 
>>> compiler error message like the one above to explicitly ban the va_list 
>>> builtins with MS ABI, but GCC gives no warning and continues to confuse the 
>>> user.
>>> 
>>> Maybe Xcode needs to sync the clang3.8 fix for the ms_abi va_list issue.
>>> 
>>> Below I applied the ms_va_list workaround to your code, and it looks like it 
>>> works now.
>>> $ cat ms_abi.c
>>> #include <stdio.h>
>>> 
>>> typedef __builtin_ms_va_list VA_LIST;
>>> typedef unsigned long long  UINTN;
>>> 
>>> #define VA_START(Marker, Parameter)  __builtin_ms_va_start (Marker, Parameter)
>>> 
>>> #define VA_ARG(Marker, TYPE)         ((sizeof (TYPE) < sizeof (UINTN)) ? (TYPE)(__builtin_va_arg (Marker, UINTN)) : (TYPE)(__builtin_va_arg (Marker, TYPE)))
>>> 
>>> #define VA_END(Marker)               __builtin_ms_va_end (Marker)
>>> 
>>> #define VA_COPY(Dest, Start)         __builtin_ms_va_copy (Dest, Start)
>>> 
>>> __attribute__((ms_abi)) int EFI_printf(const int len, ...)
>>> {
>>> VA_LIST Marker;
>>> int i;
>>> 
>>> VA_START (Marker, len);
>>> for (i=0; i < len; i++) {
>>>   printf ("%d - %d\n", i, VA_ARG(Marker, int));
>>> }
>>> 
>>> VA_END(Marker);
>>> return len;
>>> }
>>> 
>>> int
>>> main ()
>>> {
>>> EFI_printf (8, 10, 11, 12, 12, 14, 15, 16, 17);
>>> return 0;
>>> }
>>> 
>>> $ clang -flto -Wl,-Os ms_abi.c
>>> $ ./a.out
>>> 0 - 10
>>> 1 - 11
>>> 2 - 12
>>> 3 - 12
>>> 4 - 14
>>> 5 - 15
>>> 6 - 16
>>> 7 - 17
>>> 
>>> So, I believe my current Clang3.8 LTO stability issue is a new one, and I 
>>> will continue to debug it. I will let you know if I make progress.
>>> 
>>> 
>>> Steven Shi
>>> Intel\SSG\STO\UEFI Firmware
>>> 
>>> Tel: +86 021-61166522
>>> iNet: 821-6522
>>> 
>>>> -----Original Message-----
>>>> From: af...@apple.com [mailto:af...@apple.com]
>>>> Sent: Thursday, May 12, 2016 2:42 AM
>>>> To: Shi, Steven <steven....@intel.com>
>>>> Cc: edk2-devel@lists.01.org
>>>> Subject: Re: [edk2] edk2 llvm branch
>>>> 
>>>> 
>>>>> On May 11, 2016, at 9:38 AM, Shi, Steven <steven....@intel.com> wrote:
>>>>> 
>>>>> Hi Andrew,
>>>>> 
>>>>> Attached and below are my build map files and code sizes for the two
>>>>> modules you suggested, with CLANGLTO38 and VS2013x86. Maybe I should use
>>>>> the latest VS2015x86 for the comparison next time.
>>>>> 
>>>> 
>>>> Steven,
>>>> 
>>>> Thanks for the data.
>>>> 
>>>>> 
>>>>> 
>>>>> *         CLANGLTO38:
>>>>> 
>>>>>> build -a IA32 -t CLANGLTO38 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -b
>>>> RELEASE
>>>>> 
>>>>> 1,184
>>>>> 
>>>>>> build -a IA32 -t CLANGLTO38 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -b
>>>> DEBUG -DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>> 7,648
>>>>> 
>>>>>> build -a X64 -t CLANGLTO38 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b RELEASE
>>>>> 
>>>>> 1,600
>>>>> 
>>>>>> build -a X64 -t CLANGLTO38 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b DEBUG -
>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>> 11,712 (with -fno-lto to disable LTO in
>>>>> MdePkg\Library\BasePrintLib\BasePrintLib.inf, which works around a CPU
>>>>> exception in PrintLib during boot time)
>>>>> 
>>>>> 8,736 (with LTO enabled in BasePrintLib.inf)
>>>>> 
>>>>> 
>>>>> 
>>>>> *         VS2013x86:
>>>>> 
>>>>>> build -a IA32 -t VS2013x86 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -b
>>>> RELEASE
>>>>> 
>>>>> 1,280
>>>>> 
>>>>>> build -a IA32 -t VS2013x86 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -b
>>>> DEBUG -DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>> 8,576
>>>>> 
>>>>>> build -a X64 -t VS2013x86 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b RELEASE
>>>>> 
>>>>> 1,760
>>>>> 
>>>>>> build -a X64 -t VS2013x86 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b DEBUG -
>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>> 9,760
>>>>> 
>>>>> 
>>>>> 
>>>>> I believe clang3.8 with LTO enabled is good enough on code size. My
>>>>> biggest trouble right now is that Clang3.8 LTO is not stable with GNU ld
>>>>> when generating X64 code, and is worse with gold (the build cannot even
>>>>> finish). I've reported two LTO bugs against GNU ld and gold below, FYI.
>>>>> 
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=20062
>>>>> 
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=20070
>>>>> 
>>>>> 
>>>> 
>>>> Are those the linker command line failures? What about the code generation
>>>> issues?
>>>> 
>>>> On the code gen issues I would guess they have to do with
>>>> __attribute__((ms_abi)) (call x86_64_win64cc in bit code) and var args. You
>>>> might try building some simple examples at the command line and see if you
>>>> can find an error. So for example you could write a simple Unix command
>>>> line app to test calling a __attribute__((ms_abi)) var arg function that
>>>> prints out information.
>>>> 
>>>> What I usually do is try to make the simplest EFI case possible and then I
>>>> locally define the EFI types. That makes it really easy to reproduce the 
>>>> issue
>>>> for the compiler team.
>>>> 
>>>> ~/work/Compiler>cat ms_abi.c
>>>> #include <stdio.h>
>>>> 
>>>> typedef __builtin_va_list VA_LIST;
>>>> typedef unsigned long long  UINTN;
>>>> 
>>>> #define VA_START(Marker, Parameter)  __builtin_va_start (Marker, Parameter)
>>>> 
>>>> #define VA_ARG(Marker, TYPE)         ((sizeof (TYPE) < sizeof (UINTN)) ? (TYPE)(__builtin_va_arg (Marker, UINTN)) : (TYPE)(__builtin_va_arg (Marker, TYPE)))
>>>> 
>>>> #define VA_END(Marker)               __builtin_va_end (Marker)
>>>> 
>>>> #define VA_COPY(Dest, Start)         __builtin_va_copy (Dest, Start)
>>>> 
>>>> __attribute__((ms_abi)) int EFI_printf(const int len, ...)
>>>> {
>>>> VA_LIST Marker;
>>>> int i;
>>>> 
>>>> VA_START (Marker, len);
>>>> for (i=0; i < len; i++) {
>>>>   printf ("%d - %d\n", i, VA_ARG(Marker, int));
>>>> }
>>>> 
>>>> VA_END(Marker);
>>>> return len;
>>>> }
>>>> 
>>>> int
>>>> main ()
>>>> {
>>>> EFI_printf (8, 10, 11, 12, 12, 14, 15, 16, 17);
>>>> return 0;
>>>> }
>>>> ~/work/Compiler>clang -Os -flto ms_abi.c
>>>> ~/work/Compiler>./a.out
>>>> 0 - 10
>>>> 1 - 11
>>>> 2 - 12
>>>> 3 - 12
>>>> 4 - 14
>>>> 5 - 15
>>>> 6 - 10
>>>> 7 - 11
>>>> ~/work/Compiler>
>>>> 
>>>> Yikes, I think I just reproduced your bug in the Xcode clang. Can you try
>>>> this example on your toolchain and report the issue if you see it?
>>>> 
>>>>> 
>>>>> BTW, does the XCODE linker have a Linux version? If yes, I'd like to try
>>>>> it together with clang 3.8 as the CC compiler.
>>>>> 
>>>>> 
>>>> 
>>>> No it is Mac only, and only supports Mach-O not ELF. It is my understanding
>>>> that the Xcode linker just links the bit code like normal (pulls in the
>>>> symbols that are needed). Then this linked bit code blob is sent to an LLVM
>>>> dynamic library to do the code gen. Maybe different versions of LLVM are
>>>> used by the different linkers in your case, or maybe the LLVM stuff is
>>>> compiled in?
>>>> 
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Andrew Fish
>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Steven Shi
>>>>> 
>>>>> Intel\SSG\STO\UEFI Firmware
>>>>> 
>>>>> 
>>>>> 
>>>>> Tel: +86 021-61166522
>>>>> 
>>>>> iNet: 821-6522
>>>>> 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>> 
>>>>>> From: af...@apple.com [mailto:af...@apple.com]
>>>>> 
>>>>>> Sent: Wednesday, May 11, 2016 11:09 PM
>>>>> 
>>>>>> To: Shi, Steven <steven....@intel.com>
>>>>> 
>>>>>> Cc: edk2-devel@lists.01.org
>>>>> 
>>>>>> Subject: Re: [edk2] edk2 llvm branch
>>>>> 
>>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>>>>>> On May 11, 2016, at 5:08 AM, Shi, Steven <steven....@intel.com> wrote:
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Hi Andrew,
>>>>> 
>>>>>>> From your data, it looks like the XCode LTO is not enabled correctly for
>>>>>>> IA32, but is correct for X64. My build map files are attached, and my
>>>>>>> build commands are below. FYI.
>>>>> 
>>>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>>>>> Sorry had a typo in the tools_def.txt, here are the numbers with -flto
>>>>> 
>>>>>> correctly added to the CC_FLAGS:
>>>>> 
>>>>>> 
>>>>> 
>>>>>>> build -a IA32 -t XCODE5 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Pei/PeiMain.inf -b RELEASE
>>>>> 
>>>>>> 25K
>>>>> 
>>>>>>> build -a IA32 -t XCODE5 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Pei/PeiMain.inf -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>> 45K
>>>>> 
>>>>>>> build -a X64 -t XCODE5 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Dxe/DxeMain.inf -b RELEASE
>>>>> 
>>>>>> 103K
>>>>> 
>>>>>>> build -a X64  -t XCODE5 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Dxe/DxeMain.inf -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>> 133K
>>>>> 
>>>>>> 
>>>>> 
>>>>>> When doing size optimizations it is often easier to start with smaller
>>>> drivers.
>>>>> 
>>>>>> Can you send the sizes/map files for:
>>>>> 
>>>>>>> build -a IA32 -t XCODE5 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -
>>>> b
>>>>> 
>>>>>> RELEASE
>>>>> 
>>>>>> 1220
>>>>> 
>>>>>>> build -a IA32 -t XCODE5 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> IntelFrameworkModulePkg/Universal/StatusCode/Pei/StatusCodePei.inf -
>>>> b
>>>>> 
>>>>>> DEBUG -DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>> 8228
>>>>> 
>>>>>>> build -a X64 -t XCODE5 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b RELEASE
>>>>> 
>>>>>> 2464
>>>>> 
>>>>>>> build -a X64 -t XCODE5 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> PcAtChipsetPkg/8254TimerDxe/8254Timer.inf -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>> 9088
>>>>> 
>>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>>>>> I forgot to mention that X64 XCODE5/clang has a 5 byte overhead per
>>>>>> function vs. VS2013x86 due to supporting stack unwind. The upside is the
>>>>>> unexpected exception handler can print out a stack trace. VS2013x86
>>>>>> requires the debugger with symbols to unwind the stack.
>>>>> 
>>>>>> 
>>>>> 
>>>>>> ~/work/Compiler>cat call.c
>>>>> 
>>>>>> int main ()
>>>>> 
>>>>>> {
>>>>> 
>>>>>> return 0;
>>>>> 
>>>>>> }
>>>>> 
>>>>>> ~/work/Compiler>clang -Os call.c
>>>>> 
>>>>>> ~/work/Compiler>lldb a.out
>>>>> 
>>>>>> (lldb) target create "a.out"
>>>>> 
>>>>>> Current executable set to 'a.out' (x86_64).
>>>>> 
>>>>>> (lldb) dis -m -b -n main
>>>>> 
>>>>>> a.out`main
>>>>> 
>>>>>> a.out`main:
>>>>> 
>>>>>> a.out[0x100000f98] <+0>: 55        pushq  %rbp
>>>>> 
>>>>>> a.out[0x100000f99] <+1>: 48 89 e5  movq   %rsp, %rbp
>>>>> 
>>>>>> a.out[0x100000f9c] <+4>: 31 c0     xorl   %eax, %eax
>>>>> 
>>>>>> a.out[0x100000f9e] <+6>: 5d        popq   %rbp
>>>>> 
>>>>>> a.out[0x100000f9f] <+7>: c3        retq
>>>>> 
>>>>>> 
>>>>> 
>>>>>> Thanks,
>>>>> 
>>>>>> 
>>>>> 
>>>>>> Andrew Fish
>>>>> 
>>>>>> 
>>>>> 
>>>>>>> VS2013x86:
>>>>> 
>>>>>>> build -a IA32 -t VS2013x86 -p OvmfPkg\OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg\Core\Pei\PeiMain.inf -b RELEASE
>>>>> 
>>>>>>> build -a IA32 -t VS2013x86 -p OvmfPkg\OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg\Core\Pei\PeiMain.inf -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>>> build -a X64 -t VS2013x86 -p OvmfPkg\OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg\Core\Dxe\DxeMain.inf  -b RELEASE
>>>>> 
>>>>>>> build -a X64 -t VS2013x86 -p OvmfPkg\OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg\Core\Dxe\DxeMain.inf  -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> CLANGLTO38:
>>>>> 
>>>>>>> build -a IA32 -t CLANGLTO38 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Pei/PeiMain.inf  -b RELEASE
>>>>> 
>>>>>>> build -a IA32 -t CLANGLTO38 -p OvmfPkg/OvmfPkgIa32.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Pei/PeiMain.inf  -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>>> build -a X64 -t CLANGLTO38 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Dxe/DxeMain.inf  -b RELEASE
>>>>> 
>>>>>>> build -a X64 -t CLANGLTO38 -p OvmfPkg/OvmfPkgX64.dsc -n 5 -m
>>>>> 
>>>>>> MdeModulePkg/Core/Dxe/DxeMain.inf  -b DEBUG -
>>>>> 
>>>>>> DDEBUG_ON_SERIAL_PORT
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Steven Shi
>>>>> 
>>>>>>> Intel\SSG\STO\UEFI Firmware
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Tel: +86 021-61166522
>>>>> 
>>>>>>> iNet: 821-6522
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> From: af...@apple.com [mailto:af...@apple.com]
>>>>>>> Sent: Wednesday, May 11, 2016 2:03 AM
>>>>>>> To: Shi, Steven <steven....@intel.com>
>>>>>>> Cc: edk2-devel@lists.01.org
>>>>>>> Subject: Re: [edk2] edk2 llvm branch
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> On May 10, 2016, at 8:05 AM, Shi, Steven <steven....@intel.com> wrote:
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Hi Andrew,
>>>>> 
>>>>>>> Thank you for the suggestion. I will try it and respond to the other
>>>>>>> questions in your email later. I don't have an XCODE5 environment, but
>>>>>>> could you do me a favor and let me know the current XCODE5 code size for
>>>>>>> PeiCore.efi and DxeCore.efi on your side? On my side, as the data below
>>>>>>> shows, LTO brings a big code size improvement, which is quite important
>>>>>>> for firmware in many scenarios.
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> I forgot to mention. LTO or not it is good to check to make sure the
>>>>>>> assembly files are getting dead stripped. For example check to make sure
>>>>>>> you are not getting all the assembly functions in the BaseLib included in
>>>>>>> your executable. Some of the assembly is .S and some is .nasm so you may
>>>>>>> see different behavior depending on which assembler was used.
>>>>>>> 
>>>>>>> It is also useful to start looking at the smallest PEIM/DXE drivers first
>>>>>>> as it may be easier to spot what is different.
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Maybe it is also a good idea to enable LTO in XCODE.
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> For Xcode you add -object_path_lto
>>>>>>> $(DEST_DIR_DEBUG)/$(BASE_NAME).lto to *_XCODE5_*_DLINK_FLAGS.
>>>>>>> This places the intermediate link code gen in the Build/ directory vs. a
>>>>>>> temp directory and is important for source level debugging. To turn LTO
>>>>>>> on and off you add -flto to *_XCODE5_*_CC_FLAGS.
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> We ended up making LTO a configurable build option, so we control it in
>>>>> 
>>>>>> the DSC file. git
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> [BuildOptions]
>>>>> 
>>>>>>> !if $(PEI_LTO_ENABLE)
>>>>> 
>>>>>>> XCODE:*_*_IA32_PLATFORM_FLAGS = -flto
>>>>> 
>>>>>>> !endif
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> !if $(DXE_LTO_ENABLE)
>>>>> 
>>>>>>> XCODE:*_*_X64_PLATFORM_FLAGS = -flto
>>>>> 
>>>>>>> !endif
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> I included the Xcode 6.3.2 Numbers:
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> IA32 DEBUG PeiCore.efi on Ovmf build code size example:
>>>>>>> ToolChainName    PeiCore.efi file size
>>>>>>> VS2013x86:       40KB
>>>>>>> CLANGLTO38:      42KB
>>>>>>> Xcode            61K
>>>>>>> GCCLTO53:        44KB
>>>>>>> GCC49:           55KB
>>>>>>> CLANG38:         60KB
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> IA32 RELEASE PeiCore.efi on Ovmf build code size example:
>>>>>>> ToolChainName    PeiCore.efi file size
>>>>>>> VS2013x86:       20KB
>>>>>>> GCCLTO53:        23KB
>>>>>>> CLANGLTO38:      24KB
>>>>>>> Xcode            31K
>>>>>>> GCC49:           27KB
>>>>>>> Clang38:         29KB
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> X64 DEBUG DxeCore.efi on Ovmf build code size example:
>>>>>>> ToolChainName    .efi file size    LZMA Compressed size
>>>>>>> VS2013x86:       137KB             57KB
>>>>>>> CLANGLTO38:      145KB             61KB
>>>>>>> Xcode            157K              68K
>>>>>>> GCCLTO53:        161KB             63KB
>>>>>>> GCC49:           273KB             69KB
>>>>>>> CLANG38:         205KB             72KB
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> X64 RELEASE DxeCore.efi on Ovmf build code size example:
>>>>>>> ToolChainName    .efi file size    LZMA Compressed size
>>>>>>> VS2013x86:       95KB              44KB
>>>>>>> GCCLTO53:        101KB             46KB
>>>>>>> CLANGLTO38:      107KB             48KB
>>>>>>> Xcode            104K              49K
>>>>>>> GCC49:           184KB             52KB
>>>>>>> CLANG38:         133KB             53KB
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Can you send me the linker map files for VS2013 & CLANGLTO38 off list?
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Thanks,
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Andrew Fish
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Steven Shi
>>>>> 
>>>>>>> Intel\SSG\STO\UEFI Firmware
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> Tel: +86 021-61166522
>>>>> 
>>>>>>> iNet: 821-6522
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: af...@apple.com [mailto:af...@apple.com]
>>>>>>>> Sent: Tuesday, May 10, 2016 1:12 PM
>>>>>>>> To: Shi, Steven <steven....@intel.com>
>>>>>>>> Cc: edk2-devel@lists.01.org
>>>>>>>> Subject: Re: [edk2] edk2 llvm branch
>>>>> 
>>>>>>>> 
>>>>> 
>>>>>>>> 
>>>>> 
>>>>>>> 
>>>>> 
>>>>>>> _______________________________________________
>>>>> 
>>>>>>> edk2-devel mailing list
>>>>> 
>>>>>>> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
>>>>> 
>>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>> 
>>>>> 

