I just remembered that GCC has a similar feature, also called LTO ('Link Time 
Optimization'): 
http://gcc.gnu.org/wiki/LinkTimeOptimization

I have just tried it using a quite recent toolchain (arm-none-eabi-gcc (GNU 
Tools for ARM Embedded Processors) 4.8.3 20131129 (release)).

And it saved me 8 bytes!

Without -flto:
FVMAIN_SEC [9%Full] 524288 total, 47648 used, 476640 free
FVMAIN_COMPACT [28%Full] 2621440 total, 748992 used, 1872448 free
FVMAIN [99%Full] 1941376 total, 1941344 used, 32 free

With -flto:
FVMAIN_SEC [9%Full] 524288 total, 47648 used, 476640 free
FVMAIN_COMPACT [28%Full] 2621440 total, 748984 used, 1872456 free
FVMAIN [99%Full] 1941376 total, 1941344 used, 32 free

I am now wondering if the 8 bytes come from the change of date from Thursday 
to Friday...

________________________________________
From: Andrew Fish [[email protected]]
Sent: 23 January 2014 19:15
To: [email protected]
Subject: Re: [edk2] Need for FixedFeaturePcdGet() ?

On Jan 23, 2014, at 11:02 AM, Andrew Fish 
<[email protected]<mailto:[email protected]>> wrote:


On Jan 23, 2014, at 10:27 AM, Olivier Martin 
<[email protected]<mailto:[email protected]>> wrote:

I have only tried with GCC.
In the use case I used in my previous email, 
_gPcd_FixedAtBuild_PcdRelocateVectorTable is initialized by another C file.
I am not a compiler expert, but I guess your compiler would need at least two 
passes to optimise this specific code, since it only learns the value of 
‘const BOOLEAN _gPcd_FixedAtBuild_PcdRelocateVectorTable’ from another 
compilation unit.


Olivier,

The 2nd pass is link time code generation. Visual Studio does it via linker 
flags, and you can turn it on in clang with -flto. For clang this tells the 
compiler to emit LLVM bitcode objects (kind of a machine independent assembly 
language that gets assembled). The linker combines all the bitcode together in 
the link stage and only at that point is native code generated. Thus whole 
program optimization is possible, so the dead stripping works.


This is what bitcode looks like, in case you are interested. It would work the 
same way for IA32, X64, AArch64, etc., since the code generation happens in the 
linker. Basically the linker links the bitcode objects together and dead strips 
unneeded functions/globals, then passes the linked bitcode object to a shared 
library produced by clang, where the code generation happens.

~/work/Compiler>cat main.c
#include <stdio.h>

int
foo ()
{
  return 0;
}

int
main (int argc, char **argv)
{
  int Test[100];

  for (;;) {
    printf ("[0x%02x]", getchar());
  }

  return argc;
}
~/work/Compiler>clang -flto -S main.c
~/work/Compiler>cat main.S
; ModuleID = 'main.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.9.0"

@.str = private unnamed_addr constant [9 x i8] c"[0x%02x]\00", align 1

define i32 @foo() nounwind ssp uwtable {
  ret i32 0
}

define i32 @main(i32 %argc, i8** %argv) nounwind ssp uwtable {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i8**, align 8
  %Test = alloca [100 x i32], align 16
  store i32 0, i32* %1
  store i32 %argc, i32* %2, align 4
  store i8** %argv, i8*** %3, align 8
  br label %4

; <label>:4                                       ; preds = %4, %0
  %5 = call i32 @getchar()
  %6 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([9 x i8]* @.str, i32 0, i32 0), i32 %5)
  br label %4
                                                  ; No predecessors!
  %8 = load i32* %1
  ret i32 %8
}

declare i32 @printf(i8*, ...)

declare i32 @getchar()
~/work/Compiler>

Thanks,

Andrew Fish

As far as I know GCC does not support this.

Thanks,

Andrew Fish

We have recently done some investigation into adding two build passes to 
BaseTools to take advantage of the ARM linker feedback 
(see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/CHDFJGBE.html).
But we realized that it would not be easy to maintain in build_rules.txt.


From: Tim Lewis [mailto:[email protected]]
Sent: 23 January 2014 18:09
To: [email protected]<mailto:[email protected]>
Subject: Re: [edk2] Need for FixedFeaturePcdGet() ?

Olivier –

I think this came about because VS always removes the dead code in the release 
build.

However, I agree about the “doesn’t appear in the .DSC” problem. It has forced 
us on a number of occasions to add dummy .DSC entries that have the exact same 
default value as the .DEC.

Tim

From: Olivier Martin [mailto:[email protected]]
Sent: Thursday, January 23, 2014 10:00 AM
To: [email protected]<mailto:[email protected]>
Subject: [edk2] Need for FixedFeaturePcdGet() ?

Feature PCDs are mainly (only?) used to disable features (i.e. removing code) at 
build time.
We found that this never actually removes code. Here is an example 
(ArmPkg/Drivers/CpuDxe/ArmV6/Exception.c):

--------------
  ArmDisableFiq ();

  if (FeaturePcdGet(PcdRelocateVectorTable) == TRUE) {
    (...)
  } else {
    // The Vector table must be 32-byte aligned
    ASSERT(((UINT32)ExceptionHandlersStart & ((1 << 5)-1)) == 0);

    // We do not copy the Exception Table at PcdGet32(PcdCpuVectorBaseAddress).
    // We just set Vector Base Address to point into CpuDxe code.
    ArmWriteVBar ((UINT32)ExceptionHandlersStart);
  }
--------------

If we build it with the upstream UEFI code, the C code becomes, after 
preprocessing:
--------------
  ArmDisableFiq ();

  if (_gPcd_FixedAtBuild_PcdRelocateVectorTable == ((BOOLEAN)(1==1))) {
    (...)
--------------

And the disassembly:
--------------
     bl   ArmDisableFiq
.LVL19:
     .loc 1 147 0
     ldr  r3, .L44+4  ; Get the address of _gPcd_FixedAtBuild_PcdRelocateVectorTable
     ldrb r3, [r3]    ; Load its value
     cmp  r3, #1      ; if (_gPcd_FixedAtBuild_PcdRelocateVectorTable == ((BOOLEAN)(1==1))) {
     bne  .L19
     .loc 1 151 0
     ldr  r3, .L44+8
--------------

Now, let's have a look at the AutoGen.h for this PCD:
--------------
#define _PCD_TOKEN_PcdRelocateVectorTable  104U
#define _PCD_VALUE_PcdRelocateVectorTable  ((BOOLEAN)0U)
extern const  BOOLEAN  _gPcd_FixedAtBuild_PcdRelocateVectorTable;
#define _PCD_GET_MODE_BOOL_PcdRelocateVectorTable  _gPcd_FixedAtBuild_PcdRelocateVectorTable
//#define _PCD_SET_MODE_BOOL_PcdRelocateVectorTable  ASSERT(FALSE)  // It is not allowed to set value for a FIXED_AT_BUILD PCD
--------------

And the definition of FeaturePcdGet() in PcdLib.h:
--------------
#define FeaturePcdGet(TokenName)            _PCD_GET_MODE_BOOL_##TokenName
--------------


Workaround:

I hacked PcdLib to use the PCD's value instead of accessing the global variable:
--------------
#define FeaturePcdGet(TokenName)            _PCD_VALUE_##TokenName
--------------
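
To make the difference between the two definitions concrete, here is a minimal 
sketch. The AutoGen.h macros are copied from the listing above. The names 
FeaturePcdGetViaGlobal / FeaturePcdGetViaValue and the two helper functions are 
hypothetical and only exist for this illustration. The variable is defined in 
the same file so the sketch links on its own; in the real build it lives in 
another compilation unit, which is why the compiler cannot fold the test 
without LTO.

```c
#include <assert.h>

typedef unsigned char BOOLEAN;

/* Copied from the AutoGen.h listing above. */
#define _PCD_VALUE_PcdRelocateVectorTable  ((BOOLEAN)0U)
#define _PCD_GET_MODE_BOOL_PcdRelocateVectorTable  _gPcd_FixedAtBuild_PcdRelocateVectorTable

/*
 * Assumption for this sketch: the variable is defined here so the example
 * is self-contained. In the real build only an 'extern' declaration is
 * visible, and the definition sits in another compilation unit.
 */
const BOOLEAN _gPcd_FixedAtBuild_PcdRelocateVectorTable = _PCD_VALUE_PcdRelocateVectorTable;

/* Hypothetical names for the upstream and the hacked FeaturePcdGet(). */
#define FeaturePcdGetViaGlobal(TokenName)  _PCD_GET_MODE_BOOL_##TokenName
#define FeaturePcdGetViaValue(TokenName)   _PCD_VALUE_##TokenName

/*
 * The value form is an integer constant expression, so it can initialise an
 * enumerator. In C the global-variable form could not be used here.
 */
enum { RelocateVectorTableAtBuild = FeaturePcdGetViaValue (PcdRelocateVectorTable) };

/* Expands to a read of the global variable: a load and compare at run time. */
int UsesRelocatedTableViaGlobal (void)
{
  return FeaturePcdGetViaGlobal (PcdRelocateVectorTable) ? 1 : 0;
}

/* Expands to ((BOOLEAN)0U): the compiler can drop the untaken branch. */
int UsesRelocatedTableViaValue (void)
{
  return FeaturePcdGetViaValue (PcdRelocateVectorTable) ? 1 : 0;
}
```

Both functions return the same result; the only difference is whether the 
compiler can see the value while compiling the caller.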

But this change was not enough, because PCDs that are not set in the DSC do not 
get a _PCD_VALUE_ definition in AutoGen.h.

Example of PcdVerifyNodeInList:
--------------
#define _PCD_TOKEN_PcdVerifyNodeInList  5U
extern const BOOLEAN _gPcd_FixedAtBuild_PcdVerifyNodeInList;
#define _PCD_GET_MODE_BOOL_PcdVerifyNodeInList  _gPcd_FixedAtBuild_PcdVerifyNodeInList
//#define _PCD_SET_MODE_BOOL_PcdVerifyNodeInList  ASSERT(FALSE)  // It is not allowed to set value for a FIXED_AT_BUILD PCD
--------------

So, I hacked BaseTools to always add the value of PCDs:
--------------
--- a/BaseTools/Source/Python/AutoGen/GenC.py
+++ b/BaseTools/Source/Python/AutoGen/GenC.py
@@ -1077,6 +1077,8 @@ def CreateLibraryPcdCode(Info, AutoGenC, AutoGenH, Pcd):

         if PcdItemType == TAB_PCDS_FIXED_AT_BUILD and key in Info.ConstPcd:
             AutoGenH.Append('#define _PCD_VALUE_%s %s\n' %(TokenCName, Pcd.DefaultValue))
+        else:
+            AutoGenH.Append('#define _PCD_VALUE_%s %s\n' %(TokenCName, Pcd.DefaultValue))
--------------

After rebuilding my platform 
(ArmPlatformPkg/ArmVExpressPkg/ArmVExpress-RTSM-A9x4.dsc) in RELEASE build, 
here is the result:
--------------
     bl   ArmDisableFiq
.LVL19:
     .loc 1 215 0
     ldr  r0, .L25+4
     bl   ArmWriteVBar
--------------
We can now see the dead code has been removed.

In terms of size:

Before:
FVMAIN_SEC [5%Full] 524288 total, 27232 used, 497056 free
FVMAIN_COMPACT [18%Full] 2621440 total, 481760 used, 2139680 free
FVMAIN [99%Full] 1161088 total, 1161056 used, 32 free

After:
FVMAIN_SEC [5%Full] 524288 total, 27232 used, 497056 free
FVMAIN_COMPACT [18%Full] 2621440 total, 478632 used, 2142808 free
FVMAIN [99%Full] 1154432 total, 1154400 used, 32 free

So, it saved 6656 bytes in the non-compressed FV.
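
For reference, the arithmetic behind that figure can be checked with a small 
sketch. The "used" sizes are taken from the Before/After listings above; the 
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes saved between two "used" sizes of an FV. */
uint32_t BytesSaved (uint32_t UsedBefore, uint32_t UsedAfter)
{
  return UsedBefore - UsedAfter;
}

/* Non-compressed FV (FVMAIN): 1161056 used before, 1154400 used after. */
uint32_t FvMainBytesSaved (void)
{
  return BytesSaved (1161056u, 1154400u);
}

/* Compressed FV (FVMAIN_COMPACT): 481760 used before, 478632 used after. */
uint32_t FvMainCompactBytesSaved (void)
{
  return BytesSaved (481760u, 478632u);
}
```

FvMainBytesSaved() gives the 6656 bytes quoted above, and 
FvMainCompactBytesSaved() gives 3128 bytes for the compressed FV.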
------------------------------------------------------------------------------
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments & Everything In Between.
Get a Quote or Start a Free Trial Today.
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
_______________________________________________
edk2-devel mailing list
[email protected]<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/edk2-devel



-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No:  2548782

