On Jan 23, 2014, at 11:02 AM, Andrew Fish <[email protected]> wrote:
>
> On Jan 23, 2014, at 10:27 AM, Olivier Martin <[email protected]> wrote:
>
>> I have only tried with GCC.
>> In the use case I used in my previous email,
>> _gPcd_FixedAtBuild_PcdRelocateVectorTable is initialized by another C file.
>> I am not a compiler expert, but I guess your compiler would need at least
>> two passes to optimise this specific code once it knows the value of ‘const
>> BOOLEAN _gPcd_FixedAtBuild_PcdRelocateVectorTable’ from another compilation
>> unit.
>>
>
> Olivier,
>
> The 2nd pass is link time code generation. Visual Studio does it via linker
> flags, and you can turn it on in clang with -flto. For clang this tells the
> compiler to emit LLVM bitcode objects (kind of a machine independent assembly
> language that gets assembled). The linker combines all the bitcode together
> in the link stage and only at that point is native code generated. Thus whole
> program optimization is possible, so the dead stripping works.
>
This is what bitcode looks like in case you are interested. It would work the
same way for IA32, X64, AArch64, etc., since the code gen happens in the linker.
Basically the linker links the bitcode objects together and dead strips
unneeded functions/globals, and then it passes the linked bitcode object into a
shared library produced by clang for the code generation to happen.
~/work/Compiler>cat main.c
#include <stdio.h>

int
foo ()
{
  return 0;
}

int
main (int argc, char **argv)
{
  int Test[100];

  for (;;) {
    printf ("[0x%02x]", getchar());
  }
  return argc;
}
~/work/Compiler>clang -flto -S main.c
~/work/Compiler>cat main.S
; ModuleID = 'main.c'
target datalayout =
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.9.0"
@.str = private unnamed_addr constant [9 x i8] c"[0x%02x]\00", align 1
define i32 @foo() nounwind ssp uwtable {
ret i32 0
}
define i32 @main(i32 %argc, i8** %argv) nounwind ssp uwtable {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i8**, align 8
%Test = alloca [100 x i32], align 16
store i32 0, i32* %1
store i32 %argc, i32* %2, align 4
store i8** %argv, i8*** %3, align 8
br label %4
; <label>:4 ; preds = %4, %0
%5 = call i32 @getchar()
%6 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([9 x i8]*
@.str, i32 0, i32 0), i32 %5)
br label %4
; No predecessors!
%8 = load i32* %1
ret i32 %8
}
declare i32 @printf(i8*, ...)
declare i32 @getchar()
~/work/Compiler>
Thanks,
Andrew Fish
> As far as I know GCC does not support this.
>
> Thanks,
>
> Andrew Fish
>
>> We have recently done some investigation to add two build passes in
>> BaseTools to take advantage of the ARM linker feedback
>> (see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/CHDFJGBE.html).
>> But we realized that would not be easy to maintain in build_rules.txt.
>>
>>
>> From: Tim Lewis [mailto:[email protected]]
>> Sent: 23 January 2014 18:09
>> To: [email protected]
>> Subject: Re: [edk2] Need for FixedFeaturePcdGet() ?
>>
>> Olivier –
>>
>> I think this came about because VS always removes the dead code in the
>> release build.
>>
>> However, I agree about the “doesn’t appear in the .DSC” problem. It has
>> forced us on a number of occasions to add dummy .DSC entries that have the
>> exact same default value as the .DEC.
>>
>> Tim
>>
>> From: Olivier Martin [mailto:[email protected]]
>> Sent: Thursday, January 23, 2014 10:00 AM
>> To: [email protected]
>> Subject: [edk2] Need for FixedFeaturePcdGet() ?
>>
>> Feature PCDs are mainly (only?) used to disable features (i.e., removing code)
>> at build time.
>> We found that it actually never removes code. Here is an example
>> (ArmPkg/Drivers/CpuDxe/ArmV6/Exception.c):
>>
>> --------------
>> ArmDisableFiq ();
>>
>> if (FeaturePcdGet(PcdRelocateVectorTable) == TRUE) {
>>   (...)
>> } else {
>>   // The Vector table must be 32-byte aligned
>>   ASSERT(((UINT32)ExceptionHandlersStart & ((1 << 5)-1)) == 0);
>>
>>   // We do not copy the Exception Table at PcdGet32(PcdCpuVectorBaseAddress).
>>   // We just set Vector Base Address to point into CpuDxe code.
>>   ArmWriteVBar ((UINT32)ExceptionHandlersStart);
>> }
>> --------------
>>
>> If we build it with the upstream UEFI, the C code becomes, after
>> preprocessing:
>> --------------
>> ArmDisableFiq ();
>>
>> if (_gPcd_FixedAtBuild_PcdRelocateVectorTable == ((BOOLEAN)(1==1))) {
>> (...)
>> --------------
>>
>> And the disassembly:
>> --------------
>> bl ArmDisableFiq
>> .LVL19:
>> .loc 1 147 0
>> ldr r3, .L44+4 ; Get the address of _gPcd_FixedAtBuild_PcdRelocateVectorTable
>> ldrb r3, [r3] ; Load its value
>> cmp r3, #1 ; if (_gPcd_FixedAtBuild_PcdRelocateVectorTable == ((BOOLEAN)(1==1))) {
>> bne .L19
>> .loc 1 151 0
>> ldr r3, .L44+8
>> --------------
>>
>> Now, let's have a look at the AutoGen.h for this PCD:
>> --------------
>> #define _PCD_TOKEN_PcdRelocateVectorTable 104U
>> #define _PCD_VALUE_PcdRelocateVectorTable ((BOOLEAN)0U)
>> extern const BOOLEAN _gPcd_FixedAtBuild_PcdRelocateVectorTable;
>> #define _PCD_GET_MODE_BOOL_PcdRelocateVectorTable _gPcd_FixedAtBuild_PcdRelocateVectorTable
>> //#define _PCD_SET_MODE_BOOL_PcdRelocateVectorTable ASSERT(FALSE) // It is not allowed to set value for a FIXED_AT_BUILD PCD
>> --------------
>>
>> And the definition of FeaturePcdGet() in PcdLib.h:
>> --------------
>> #define FeaturePcdGet(TokenName) _PCD_GET_MODE_BOOL_##TokenName
>> --------------
>>
>>
>> Workaround:
>>
>> I hacked PcdLib to use the PCD's value instead of accessing the global variable:
>> --------------
>> #define FeaturePcdGet(TokenName) _PCD_VALUE_##TokenName
>> --------------
>>
>> But this change was not enough because PCDs that were not set in the DSC did
>> not have a _PCD_VALUE_ definition.
>>
>> Example of PcdVerifyNodeInList:
>> --------------
>> #define _PCD_TOKEN_PcdVerifyNodeInList 5U
>> extern const BOOLEAN _gPcd_FixedAtBuild_PcdVerifyNodeInList;
>> #define _PCD_GET_MODE_BOOL_PcdVerifyNodeInList _gPcd_FixedAtBuild_PcdVerifyNodeInList
>> //#define _PCD_SET_MODE_BOOL_PcdVerifyNodeInList ASSERT(FALSE) // It is not allowed to set value for a FIXED_AT_BUILD PCD
>> --------------
>>
>> So, I hacked BaseTools to always add the value of PCDs:
>> --------------
>> --- a/BaseTools/Source/Python/AutoGen/GenC.py
>> +++ b/BaseTools/Source/Python/AutoGen/GenC.py
>> @@ -1077,6 +1077,8 @@ def CreateLibraryPcdCode(Info, AutoGenC, AutoGenH, Pcd):
>>
>>      if PcdItemType == TAB_PCDS_FIXED_AT_BUILD and key in Info.ConstPcd:
>>          AutoGenH.Append('#define _PCD_VALUE_%s %s\n' %(TokenCName, Pcd.DefaultValue))
>> +    else:
>> +        AutoGenH.Append('#define _PCD_VALUE_%s %s\n' %(TokenCName, Pcd.DefaultValue))
>> --------------
>>
>> After rebuilding my platform
>> (ArmPlatformPkg/ArmVExpressPkg/ArmVExpress-RTSM-A9x4.dsc) in RELEASE build,
>> here is the result:
>> --------------
>> bl ArmDisableFiq
>> .LVL19:
>> .loc 1 215 0
>> ldr r0, .L25+4
>> bl ArmWriteVBar
>> --------------
>> We can now see the dead code has been removed.
>>
>> In terms of size:
>>
>> Before:
>> FVMAIN_SEC [5%Full] 524288 total, 27232 used, 497056 free
>> FVMAIN_COMPACT [18%Full] 2621440 total, 481760 used, 2139680 free
>> FVMAIN [99%Full] 1161088 total, 1161056 used, 32 free
>>
>> After:
>> FVMAIN_SEC [5%Full] 524288 total, 27232 used, 497056 free
>> FVMAIN_COMPACT [18%Full] 2621440 total, 478632 used, 2142808 free
>> FVMAIN [99%Full] 1154432 total, 1154400 used, 32 free
>>
>> So, it saved 6656 bytes in the non-compressed FV.
>> ------------------------------------------------------------------------------
>> edk2-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/edk2-devel
>