Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Ard Biesheuvel
On 15 July 2016 at 12:37, Laszlo Ersek  wrote:
> On 07/15/16 12:15, Ard Biesheuvel wrote:
>> On 15 July 2016 at 11:51, Laszlo Ersek  wrote:
>
>>> What I'd be most interested in is
>>> enabling optimization (saving space) for -b DEBUG -- I don't do source
>>> level debugging, but I want the debug messages and the ASSERT()s. Saving
>>> space with those compiled in would be superb -- I guess that's where the
>>> savings would really shine.
>>>
>>
>> Doing the same build for DEBUG gives me
>>
>> -O0
>>
>> SECFV [20%Full] 212992 total, 44464 used, 168528 free
>> FVMAIN_COMPACT [75%Full] 1753088 total, 1326592 used, 426496 free
>> DXEFV [67%Full] 10485760 total, 7077696 used, 3408064 free
>> PEIFV [28%Full] 917504 total, 257200 used, 660304 free
>>
>> -O1
>>
>> SECFV [16%Full] 212992 total, 35248 used, 177744 free
>> FVMAIN_COMPACT [76%Full] 1753088 total, 1339592 used, 413496 free
>> DXEFV [56%Full] 10485760 total, 5933120 used, 4552640 free
>> PEIFV [23%Full] 917504 total, 216240 used, 701264 free
>>
>> -O2
>>
>> SECFV [17%Full] 212992 total, 36688 used, 176304 free
>> FVMAIN_COMPACT [83%Full] 1753088 total, 1469904 used, 283184 free
>> DXEFV [58%Full] 10485760 total, 6157440 used, 4328320 free
>> PEIFV [24%Full] 917504 total, 227408 used, 690096 free
>>
>> which paints roughly the same picture: much smaller binaries, but O2
>> is worse than O0/O1 after compression.
>
> Thank you for checking this -- in absolute terms, -O1 shaves about 1.1MB
> off DXEFV. Not bad!
>
>> Another thing to take into account is that DXE code size translates
>> directly into RuntimeServicesCode memory footprint of
>> DXE_RUNTIME_DRIVER modules,
>
> Good point!
>
>> although I am not sure if that has ever
>> been a source of concern.
>
> I've never thought of it -- my gut feeling is that the AcpiNVS stuff is
> larger. (I could easily be biased though: the AcpiNVS allocations are
> all manual and require thought, while RuntimeServicesCode allocs are all
> automatic. I might be projecting "mental effort required" to "allocation
> size" :))
>

Actually, it appears -Os does work, as long as you also pass the
-maccumulate-outgoing-args switch. I am not sure what I did wrong
before, but all the builds below boot fine.

I get the following results:

RELEASE_GCC44

SECFV [6%Full] 212992 total, 14320 used, 198672 free
FVMAIN_COMPACT [56%Full] 1753088 total, 997184 used, 755904 free
DXEFV [33%Full] 10485760 total, 3508336 used, 6977424 free
PEIFV [11%Full] 917504 total, 109896 used, 807608 free

RELEASE_GCC46

SECFV [6%Full] 212992 total, 13936 used, 199056 free
FVMAIN_COMPACT [56%Full] 1753088 total, 987632 used, 765456 free
DXEFV [33%Full] 10485760 total, 3468304 used, 7017456 free
PEIFV [11%Full] 917504 total, 107784 used, 809720 free

RELEASE_GCC47

SECFV [6%Full] 212992 total, 13808 used, 199184 free
FVMAIN_COMPACT [56%Full] 1753088 total, 988072 used, 765016 free
DXEFV [32%Full] 10485760 total, 3449936 used, 7035824 free
PEIFV [11%Full] 917504 total, 106984 used, 810520 free

RELEASE_GCC48

SECFV [6%Full] 212992 total, 13744 used, 199248 free
FVMAIN_COMPACT [56%Full] 1753088 total, 992800 used, 760288 free
DXEFV [32%Full] 10485760 total, 3453456 used, 7032304 free
PEIFV [11%Full] 917504 total, 107464 used, 810040 free

DEBUG_GCC44

SECFV [14%Full] 212992 total, 31632 used, 181360 free
FVMAIN_COMPACT [74%Full] 1753088 total, 1300904 used, 452184 free
DXEFV [50%Full] 10485760 total, 5272464 used, 5213296 free
PEIFV [25%Full] 917504 total, 231496 used, 686008 free

DEBUG_GCC46

SECFV [14%Full] 212992 total, 31728 used, 181264 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1291672 used, 461416 free
DXEFV [50%Full] 10485760 total, 5263888 used, 5221872 free
PEIFV [25%Full] 917504 total, 231656 used, 685848 free

DEBUG_GCC47

SECFV [14%Full] 212992 total, 31440 used, 181552 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1290584 used, 462504 free
DXEFV [49%Full] 10485760 total, 5234224 used, 5251536 free
PEIFV [25%Full] 917504 total, 230152 used, 687352 free

DEBUG_GCC48

SECFV [14%Full] 212992 total, 31280 used, 181712 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1296208 used, 456880 free
DXEFV [49%Full] 10485760 total, 5229424 used, 5256336 free
PEIFV [25%Full] 917504 total, 229512 used, 687992 free

I am going to respin the series to use Os rather than O2.

Thanks,
Ard.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Laszlo Ersek
On 07/15/16 12:15, Ard Biesheuvel wrote:
> On 15 July 2016 at 11:51, Laszlo Ersek  wrote:

>> What I'd be most interested in is
>> enabling optimization (saving space) for -b DEBUG -- I don't do source
>> level debugging, but I want the debug messages and the ASSERT()s. Saving
>> space with those compiled in would be superb -- I guess that's where the
>> savings would really shine.
>>
> 
> Doing the same build for DEBUG gives me
> 
> -O0
> 
> SECFV [20%Full] 212992 total, 44464 used, 168528 free
> FVMAIN_COMPACT [75%Full] 1753088 total, 1326592 used, 426496 free
> DXEFV [67%Full] 10485760 total, 7077696 used, 3408064 free
> PEIFV [28%Full] 917504 total, 257200 used, 660304 free
> 
> -O1
> 
> SECFV [16%Full] 212992 total, 35248 used, 177744 free
> FVMAIN_COMPACT [76%Full] 1753088 total, 1339592 used, 413496 free
> DXEFV [56%Full] 10485760 total, 5933120 used, 4552640 free
> PEIFV [23%Full] 917504 total, 216240 used, 701264 free
> 
> -O2
> 
> SECFV [17%Full] 212992 total, 36688 used, 176304 free
> FVMAIN_COMPACT [83%Full] 1753088 total, 1469904 used, 283184 free
> DXEFV [58%Full] 10485760 total, 6157440 used, 4328320 free
> PEIFV [24%Full] 917504 total, 227408 used, 690096 free
> 
> which paints roughly the same picture: much smaller binaries, but O2
> is worse than O0/O1 after compression.

Thank you for checking this -- in absolute terms, -O1 shaves about 1.1MB
off DXEFV. Not bad!

> Another thing to take into account is that DXE code size translates
> directly into RuntimeServicesCode memory footprint of
> DXE_RUNTIME_DRIVER modules,

Good point!

> although I am not sure if that has ever
> been a source of concern.

I've never thought of it -- my gut feeling is that the AcpiNVS stuff is
larger. (I could easily be biased though: the AcpiNVS allocations are
all manual and require thought, while RuntimeServicesCode allocs are all
automatic. I might be projecting "mental effort required" to "allocation
size" :))

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Ard Biesheuvel
On 15 July 2016 at 11:51, Laszlo Ersek  wrote:
> On 07/15/16 11:25, Ard Biesheuvel wrote:
>> On 15 July 2016 at 11:06, Laszlo Ersek  wrote:
>
>>> And, to our collective relief, this means that there should be no
>>> problem with this series. :) I think I can add my Tested-by:
>>>
>>> Tested-by: Laszlo Ersek 
>>>
>>
>> Thanks a lot!
>>
>> I do have another question, though. Looking at your logs, I found the
>> following results for the sizes
>>
>> GCC48 before:
>>
>> SECFV [11%Full] 212992 total, 25392 used, 187600 free
>> FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free
>> DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free
>> PEIFV [21%Full] 917504 total, 200616 used, 716888 free
>>
>> GCC48 after:
>>
>> SECFV [7%Full] 212992 total, 15728 used, 197264 free
>> FVMAIN_COMPACT [65%Full] 1753088 total, 1145808 used, 607280 free
>> DXEFV [39%Full] 10485760 total, 4148384 used, 6337376 free
>> PEIFV [14%Full] 917504 total, 131208 used, 786296 free
>>
>>
>> GCC49 before:
>>
>> SECFV [12%Full] 212992 total, 25616 used, 187376 free
>> FVMAIN_COMPACT [64%Full] 1753088 total, 1122760 used, 630328 free
>> DXEFV [58%Full] 10485760 total, 6106048 used, 4379712 free
>> PEIFV [22%Full] 917504 total, 205992 used, 711512 free
>>
>> GCC49 after:
>>
>> SECFV [7%Full] 212992 total, 15824 used, 197168 free
>> FVMAIN_COMPACT [66%Full] 1753088 total, 1166952 used, 586136 free
>> DXEFV [40%Full] 10485760 total, 4225728 used, 6260032 free
>> PEIFV [14%Full] 917504 total, 133672 used, 783832 free
>>
>> So DXEFV is substantially smaller, but FVMAIN_COMPACT ends up slightly
>> bigger. Which one do we care about mostly? I would assume
>> FVMAIN_COMPACT, since that translates into flash footprint directly.
>
> Correct.
>
> DXEFV is less relevant, but not irrelevant. We've been increasing its
> allotted size in the FDF file(s) over time, cramming more and more DXE
> phase stuff into OVMF. Its size does have an impact on guest memory
> availability, when the SMM driver stack is built in, and S3 is enabled
> on the QEMU command line. (Under these circumstances DXEFV's location is
> covered as AcpiNVS. Not optimal, but that's what we have.)
>
> So, in practice, shrinking DXEFV is useful as well.
>
> FVMAIN_COMPACT is what ultimately determines the flash chip's size.
> We've never had to change that (after a historical bump from 1MB to 2MB,
> speaking unified flash terms -- for OVMF_CODE.fd, the bump means
> 1MB-128KB --> 2MB-128KB). If we have to increase OVMF_CODE.fd, that will
> be visible on the host filesystem, and might present compatibility
> problems for existing VMs after their next boot. (I hope not, but I've
> never actually tested this scenario.)
>
> Mike mentioned elsewhere in this thread that he measured the utility of
> new build flags on the change in compressed size.
>
> I guess it just tells us that TianoCompress performs incredibly well :)
>
>> So I did a quick test, comparing O0 / O1 / O2 on GCC48, and it seems
>> the sweet spot is O1 (due to lack of support for Os in older GCCs)
>>
>> -O0
>>
>> SECFV [11%Full] 212992 total, 23536 used, 189456 free
>> FVMAIN_COMPACT [58%Full] 1753088 total, 1016816 used, 736272 free
>> DXEFV [47%Full] 10485760 total, 4978656 used, 5507104 free
>> PEIFV [14%Full] 917504 total, 133328 used, 784176 free
>>
>> -O1
>>
>> SECFV [6%Full] 212992 total, 14768 used, 198224 free
>> FVMAIN_COMPACT [58%Full] 1753088 total, 1017856 used, 735232 free
>> DXEFV [37%Full] 10485760 total, 3907648 used, 6578112 free
>> PEIFV [10%Full] 917504 total, 95696 used, 821808 free
>>
>> -O2
>>
>> SECFV [7%Full] 212992 total, 15728 used, 197264 free
>> FVMAIN_COMPACT [63%Full] 1753088 total, 144 used, 641944 free
>> DXEFV [38%Full] 10485760 total, 4029984 used, 6455776 free
>> PEIFV [10%Full] 917504 total, 100400 used, 817104 free
>
> Huh, very interesting. -O1 vs. -O2 produce almost identically sized
> PEIFV and DXEFV (both optimization settings beating -O0 of course), but
> -O2 does something that makes the compression deteriorate, even to the
> point where FVMAIN_COMPACT ends up larger than with -O0!
>
> More agressive unrolling / inlining with -O2 perhaps?
>
> And, the effect of -O1 seems to be: practically unchanged FVMAIN_COMPACT
> size, relative to -O0 (it's negligibly larger than with -O0 I guess),
> and significantly smaller DXEFV and PEIFV usage.
>
> -O1 seems useful. Assuming that the "negligible" size increase in
> FVMAIN_COMPACT, relative to -O0, remains negligible in the long term. :)
>
>> Anyway, thanks again for your time and effort in testing this, much 
>> appreciated.
>
> No, thank you for doing this; I've always been intrigued by the
> optimization possibilities for OVMF. What I'd be most interested in is
> enabling optimization (saving space) for -b DEBUG -- I don't do source
> level debugging, but I want the debug messages and the ASSERT()s. Saving
> space with those compiled in would be superb -- I guess that's where the
> savings would really shine.
>

Doing the sam

Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Paolo Bonzini


On 14/07/2016 16:57, Ard Biesheuvel wrote:
>> >   On patch 5, I don't see any change for IA32 arch. is there no mode for 
>> > IA32 arch? Here, small and pic must be enabled together, right? Otherwise, 
>> > the assumption is to load driver below 2G address. Have you collected size 
>> > data before and after this change?
>> >
> To be honest, I know very little about the IA32 ABI, but I don't think
> it has different code models, does it? Since the ELF relocations are
> stripped by the PE/COFF conversion, I don't think it is necessary to
> take into account that X64 and IA32 modules may interact with each
> other when running on a X64 platform.

IIUC, IA32 cannot use memory above 4GB because it doesn't use virtual
addresses.  Also, IA32 doesn't have PC-relative addressing at all.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Laszlo Ersek
On 07/15/16 11:25, Ard Biesheuvel wrote:
> On 15 July 2016 at 11:06, Laszlo Ersek  wrote:

>> And, to our collective relief, this means that there should be no
>> problem with this series. :) I think I can add my Tested-by:
>>
>> Tested-by: Laszlo Ersek 
>>
> 
> Thanks a lot!
> 
> I do have another question, though. Looking at your logs, I found the
> following results for the sizes
> 
> GCC48 before:
> 
> SECFV [11%Full] 212992 total, 25392 used, 187600 free
> FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free
> DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free
> PEIFV [21%Full] 917504 total, 200616 used, 716888 free
> 
> GCC48 after:
> 
> SECFV [7%Full] 212992 total, 15728 used, 197264 free
> FVMAIN_COMPACT [65%Full] 1753088 total, 1145808 used, 607280 free
> DXEFV [39%Full] 10485760 total, 4148384 used, 6337376 free
> PEIFV [14%Full] 917504 total, 131208 used, 786296 free
> 
> 
> GCC49 before:
> 
> SECFV [12%Full] 212992 total, 25616 used, 187376 free
> FVMAIN_COMPACT [64%Full] 1753088 total, 1122760 used, 630328 free
> DXEFV [58%Full] 10485760 total, 6106048 used, 4379712 free
> PEIFV [22%Full] 917504 total, 205992 used, 711512 free
> 
> GCC49 after:
> 
> SECFV [7%Full] 212992 total, 15824 used, 197168 free
> FVMAIN_COMPACT [66%Full] 1753088 total, 1166952 used, 586136 free
> DXEFV [40%Full] 10485760 total, 4225728 used, 6260032 free
> PEIFV [14%Full] 917504 total, 133672 used, 783832 free
> 
> So DXEFV is substantially smaller, but FVMAIN_COMPACT ends up slightly
> bigger. Which one do we care about mostly? I would assume
> FVMAIN_COMPACT, since that translates into flash footprint directly.

Correct.

DXEFV is less relevant, but not irrelevant. We've been increasing its
allotted size in the FDF file(s) over time, cramming more and more DXE
phase stuff into OVMF. Its size does have an impact on guest memory
availability, when the SMM driver stack is built in, and S3 is enabled
on the QEMU command line. (Under these circumstances DXEFV's location is
covered as AcpiNVS. Not optimal, but that's what we have.)

So, in practice, shrinking DXEFV is useful as well.

FVMAIN_COMPACT is what ultimately determines the flash chip's size.
We've never had to change that (after a historical bump from 1MB to 2MB,
speaking unified flash terms -- for OVMF_CODE.fd, the bump means
1MB-128KB --> 2MB-128KB). If we have to increase OVMF_CODE.fd, that will
be visible on the host filesystem, and might present compatibility
problems for existing VMs after their next boot. (I hope not, but I've
never actually tested this scenario.)

Mike mentioned elsewhere in this thread that he measured the utility of
new build flags on the change in compressed size.

I guess it just tells us that TianoCompress performs incredibly well :)

> So I did a quick test, comparing O0 / O1 / O2 on GCC48, and it seems
> the sweet spot is O1 (due to lack of support for Os in older GCCs)
> 
> -O0
> 
> SECFV [11%Full] 212992 total, 23536 used, 189456 free
> FVMAIN_COMPACT [58%Full] 1753088 total, 1016816 used, 736272 free
> DXEFV [47%Full] 10485760 total, 4978656 used, 5507104 free
> PEIFV [14%Full] 917504 total, 133328 used, 784176 free
> 
> -O1
> 
> SECFV [6%Full] 212992 total, 14768 used, 198224 free
> FVMAIN_COMPACT [58%Full] 1753088 total, 1017856 used, 735232 free
> DXEFV [37%Full] 10485760 total, 3907648 used, 6578112 free
> PEIFV [10%Full] 917504 total, 95696 used, 821808 free
> 
> -O2
> 
> SECFV [7%Full] 212992 total, 15728 used, 197264 free
> FVMAIN_COMPACT [63%Full] 1753088 total, 144 used, 641944 free
> DXEFV [38%Full] 10485760 total, 4029984 used, 6455776 free
> PEIFV [10%Full] 917504 total, 100400 used, 817104 free

Huh, very interesting. -O1 vs. -O2 produce almost identically sized
PEIFV and DXEFV (both optimization settings beating -O0 of course), but
-O2 does something that makes the compression deteriorate, even to the
point where FVMAIN_COMPACT ends up larger than with -O0!

More agressive unrolling / inlining with -O2 perhaps?

And, the effect of -O1 seems to be: practically unchanged FVMAIN_COMPACT
size, relative to -O0 (it's negligibly larger than with -O0 I guess),
and significantly smaller DXEFV and PEIFV usage.

-O1 seems useful. Assuming that the "negligible" size increase in
FVMAIN_COMPACT, relative to -O0, remains negligible in the long term. :)

> Anyway, thanks again for your time and effort in testing this, much 
> appreciated.

No, thank you for doing this; I've always been intrigued by the
optimization possibilities for OVMF. What I'd be most interested in is
enabling optimization (saving space) for -b DEBUG -- I don't do source
level debugging, but I want the debug messages and the ASSERT()s. Saving
space with those compiled in would be superb -- I guess that's where the
savings would really shine.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Ard Biesheuvel
On 15 July 2016 at 11:06, Laszlo Ersek  wrote:
> On 07/15/16 10:58, Ard Biesheuvel wrote:
>> On 15 July 2016 at 10:48, Laszlo Ersek  wrote:
>>> On 07/15/16 10:40, Ard Biesheuvel wrote:
 On 15 July 2016 at 08:33, Laszlo Ersek  wrote:
> On 07/15/16 08:07, Ard Biesheuvel wrote:
>> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
>
>>> First, the build tests. I built OVMF 84 times, with the following 
>>> settings:
>>>
>>> * Dimension 1: whether your and Steven's patches were applied or not.
>>
>> I take it this means this series only?
>
> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
> meant the five patches in this series -- one patch from Steven, four
> patches from you.
>
>>> * However, then I built my program -- find it attached -- with -b DEBUG
>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
>>> difference in output:
>>>
>>> FS2:\> DebugEnrollDefaultKeys.efi
>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
>>> VendorKeys=0
>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 
>>> VendorKeys=0
>>> info: success
>>>
>>> versus
>>>
>>> FS2:\> OptEnrollDefaultKeys.efi
>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
>>> VendorKeys=0
>>> error: EnrollListOfX509Certs("db",
>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter
>>>
>>> I don't know why this happens, but it definitely relates to varargs --
>>> the program uses them liberally.
>>>
>>
>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
>> it makes sense to report it to the GCC maintainers (assuming we can
>> create a test case). I noticed that this code iterates over the same
>> VA_LIST twice, I wonder how well that was tested ...
>
> You make a good point about using VA_LIST twice possibly tickling gcc
> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
> code -- in order to step over some arguments --, which might not be all
> that usual as well.
>
> In any case, the way I use varargs seems to be standards conformant.
>
> With regard to reporting this to gcc developers: I won't try that. I'm
> discouraged by two facts:
>
> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
> October 2011. You'll find some recent comments from Steven and David in
> it :)
>
> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
> shortly after I ran into the -Os corruption issue with GCC48, I created
> and posted this small reproducer. Creating the reproducer wasn't
> trivial. In parallel I sent the same reproducer to one of my (indirect)
> colleagues at Red Hat, who had been a veteran in upstream GCC
> development and maintenance. I also asked him how/where I should report
> the bug. I got no answer from him.
>
> If you'd like to report a GCC bug, please go ahead, but I won't waste my
> time :)
>

 It seems that it is your testcase that is broken:

 --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
 +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
 @@ -663,6 +663,8 @@ EnrollListOfX509Certs (
UINT8*Data;
UINT8*Position;

 +  Status = EFI_SUCCESS;
 +
//
// compute total size first, for UINT32 range check, and allocation
//

 With that change, everything works fine for me
>>>
>>> I agree that proving my code wrong would be the best outcome here, but
>>> can you please explain to me on what path we return from
>>> EnrollListOfX509Certs() without setting Status?
>>>
>>
>> Well, it seems that in the success case, we are supposed to end up at line 
>> 692
>>
>>   if (EFI_ERROR (Status)) {
>> goto Out;
>>   }
>>
>> and expect the condition to be FALSE. If we have successfully tallied
>> the size of at least one cert, 'DataSize == sizeof *SingleHeader' will
>> be FALSE, and so Status will never have been assigned a value.
>> Apparently, the optimizer sees some kind of undefined behavior here
>> which it can exploit to bail unconditionally (or Status happens to be
>> 0 under -O0)
>>
>
> You are perfectly right. Thank you for catching this!
>
> And, to our collective relief, this means that there should be no
> problem with this series. :) I think I can add my Tested-by:
>
> Tested-by: Laszlo Ersek 
>

Thanks a lot!

I do have another question, though. Looking at your logs, I found the
following results for the sizes

GCC48 before:

SECFV [11%Full] 212992 total, 25392 used, 187600 free
FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free
DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free
PEIFV [21%Full] 917504 total, 200616 used, 716888 free

GCC48 after:

SECFV [7%Full] 212992 total, 15728 used

Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Laszlo Ersek
On 07/15/16 10:58, Ard Biesheuvel wrote:
> On 15 July 2016 at 10:48, Laszlo Ersek  wrote:
>> On 07/15/16 10:40, Ard Biesheuvel wrote:
>>> On 15 July 2016 at 08:33, Laszlo Ersek  wrote:
 On 07/15/16 08:07, Ard Biesheuvel wrote:
> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:

>> First, the build tests. I built OVMF 84 times, with the following 
>> settings:
>>
>> * Dimension 1: whether your and Steven's patches were applied or not.
>
> I take it this means this series only?

 Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
 meant the five patches in this series -- one patch from Steven, four
 patches from you.

>> * However, then I built my program -- find it attached -- with -b DEBUG
>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
>> difference in output:
>>
>> FS2:\> DebugEnrollDefaultKeys.efi
>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
>> VendorKeys=0
>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 
>> VendorKeys=0
>> info: success
>>
>> versus
>>
>> FS2:\> OptEnrollDefaultKeys.efi
>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
>> VendorKeys=0
>> error: EnrollListOfX509Certs("db",
>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter
>>
>> I don't know why this happens, but it definitely relates to varargs --
>> the program uses them liberally.
>>
>
> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
> it makes sense to report it to the GCC maintainers (assuming we can
> create a test case). I noticed that this code iterates over the same
> VA_LIST twice, I wonder how well that was tested ...

 You make a good point about using VA_LIST twice possibly tickling gcc
 the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
 code -- in order to step over some arguments --, which might not be all
 that usual as well.

 In any case, the way I use varargs seems to be standards conformant.

 With regard to reporting this to gcc developers: I won't try that. I'm
 discouraged by two facts:

 * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
 October 2011. You'll find some recent comments from Steven and David in
 it :)

 * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
 shortly after I ran into the -Os corruption issue with GCC48, I created
 and posted this small reproducer. Creating the reproducer wasn't
 trivial. In parallel I sent the same reproducer to one of my (indirect)
 colleagues at Red Hat, who had been a veteran in upstream GCC
 development and maintenance. I also asked him how/where I should report
 the bug. I got no answer from him.

 If you'd like to report a GCC bug, please go ahead, but I won't waste my
 time :)

>>>
>>> It seems that it is your testcase that is broken:
>>>
>>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>>> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (
>>>UINT8*Data;
>>>UINT8*Position;
>>>
>>> +  Status = EFI_SUCCESS;
>>> +
>>>//
>>>// compute total size first, for UINT32 range check, and allocation
>>>//
>>>
>>> With that change, everything works fine for me
>>
>> I agree that proving my code wrong would be the best outcome here, but
>> can you please explain to me on what path we return from
>> EnrollListOfX509Certs() without setting Status?
>>
> 
> Well, it seems that in the success case, we are supposed to end up at line 692
> 
>   if (EFI_ERROR (Status)) {
> goto Out;
>   }
> 
> and expect the condition to be FALSE. If we have successfully tallied
> the size of at least one cert, 'DataSize == sizeof *SingleHeader' will
> be FALSE, and so Status will never have been assigned a value.
> Apparently, the optimizer sees some kind of undefined behavior here
> which it can exploit to bail unconditionally (or Status happens to be
> 0 under -O0)
> 

You are perfectly right. Thank you for catching this!

And, to our collective relief, this means that there should be no
problem with this series. :) I think I can add my Tested-by:

Tested-by: Laszlo Ersek 

Cheers!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Ard Biesheuvel
On 15 July 2016 at 10:48, Laszlo Ersek  wrote:
> On 07/15/16 10:40, Ard Biesheuvel wrote:
>> On 15 July 2016 at 08:33, Laszlo Ersek  wrote:
>>> On 07/15/16 08:07, Ard Biesheuvel wrote:
 On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
>>>
> First, the build tests. I built OVMF 84 times, with the following 
> settings:
>
> * Dimension 1: whether your and Steven's patches were applied or not.

 I take it this means this series only?
>>>
>>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
>>> meant the five patches in this series -- one patch from Steven, four
>>> patches from you.
>>>
> * However, then I built my program -- find it attached -- with -b DEBUG
> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
> difference in output:
>
> FS2:\> DebugEnrollDefaultKeys.efi
> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
> VendorKeys=0
> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 
> VendorKeys=0
> info: success
>
> versus
>
> FS2:\> OptEnrollDefaultKeys.efi
> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 
> VendorKeys=0
> error: EnrollListOfX509Certs("db",
> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter
>
> I don't know why this happens, but it definitely relates to varargs --
> the program uses them liberally.
>

 If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
 it makes sense to report it to the GCC maintainers (assuming we can
 create a test case). I noticed that this code iterates over the same
 VA_LIST twice, I wonder how well that was tested ...
>>>
>>> You make a good point about using VA_LIST twice possibly tickling gcc
>>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
>>> code -- in order to step over some arguments --, which might not be all
>>> that usual as well.
>>>
>>> In any case, the way I use varargs seems to be standards conformant.
>>>
>>> With regard to reporting this to gcc developers: I won't try that. I'm
>>> discouraged by two facts:
>>>
>>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
>>> October 2011. You'll find some recent comments from Steven and David in
>>> it :)
>>>
>>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
>>> shortly after I ran into the -Os corruption issue with GCC48, I created
>>> and posted this small reproducer. Creating the reproducer wasn't
>>> trivial. In parallel I sent the same reproducer to one of my (indirect)
>>> colleagues at Red Hat, who had been a veteran in upstream GCC
>>> development and maintenance. I also asked him how/where I should report
>>> the bug. I got no answer from him.
>>>
>>> If you'd like to report a GCC bug, please go ahead, but I won't waste my
>>> time :)
>>>
>>
>> It seems that it is your testcase that is broken:
>>
>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (
>>UINT8*Data;
>>UINT8*Position;
>>
>> +  Status = EFI_SUCCESS;
>> +
>>//
>>// compute total size first, for UINT32 range check, and allocation
>>//
>>
>> With that change, everything works fine for me
>
> I agree that proving my code wrong would be the best outcome here, but
> can you please explain to me on what path we return from
> EnrollListOfX509Certs() without setting Status?
>

Well, it seems that in the success case, we are supposed to end up at line 692

  if (EFI_ERROR (Status)) {
goto Out;
  }

and expect the condition to be FALSE. If we have successfully tallied
the size of at least one cert, 'DataSize == sizeof *SingleHeader' will
be FALSE, and so Status will never have been assigned a value.
Apparently, the optimizer sees some kind of undefined behavior here
which it can exploit to bail unconditionally (or Status happens to be
0 under -O0)
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Laszlo Ersek
On 07/15/16 10:40, Ard Biesheuvel wrote:
> On 15 July 2016 at 08:33, Laszlo Ersek  wrote:
>> On 07/15/16 08:07, Ard Biesheuvel wrote:
>>> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
>>
 First, the build tests. I built OVMF 84 times, with the following settings:

 * Dimension 1: whether your and Steven's patches were applied or not.
>>>
>>> I take it this means this series only?
>>
>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
>> meant the five patches in this series -- one patch from Steven, four
>> patches from you.
>>
 * However, then I built my program -- find it attached -- with -b DEBUG
 and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
 difference in output:

 FS2:\> DebugEnrollDefaultKeys.efi
 info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
 info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0
 info: success

 versus

 FS2:\> OptEnrollDefaultKeys.efi
 info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
 error: EnrollListOfX509Certs("db",
 D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

 I don't know why this happens, but it definitely relates to varargs --
 the program uses them liberally.

>>>
>>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
>>> it makes sense to report it to the GCC maintainers (assuming we can
>>> create a test case). I noticed that this code iterates over the same
>>> VA_LIST twice, I wonder how well that was tested ...
>>
>> You make a good point about using VA_LIST twice possibly tickling gcc
>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
>> code -- in order to step over some arguments --, which might not be all
>> that usual as well.
>>
>> In any case, the way I use varargs seems to be standards conformant.
>>
>> With regard to reporting this to gcc developers: I won't try that. I'm
>> discouraged by two facts:
>>
>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
>> October 2011. You'll find some recent comments from Steven and David in
>> it :)
>>
>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
>> shortly after I ran into the -Os corruption issue with GCC48, I created
>> and posted this small reproducer. Creating the reproducer wasn't
>> trivial. In parallel I sent the same reproducer to one of my (indirect)
>> colleagues at Red Hat, who had been a veteran in upstream GCC
>> development and maintenance. I also asked him how/where I should report
>> the bug. I got no answer from him.
>>
>> If you'd like to report a GCC bug, please go ahead, but I won't waste my
>> time :)
>>
> 
> It seems that it is your testcase that is broken:
> 
> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (
>UINT8*Data;
>UINT8*Position;
> 
> +  Status = EFI_SUCCESS;
> +
>//
>// compute total size first, for UINT32 range check, and allocation
>//
> 
> With that change, everything works fine for me

I agree that proving my code wrong would be the best outcome here, but
can you please explain to me on what path we return from
EnrollListOfX509Certs() without setting Status?

Thanks
Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Ard Biesheuvel
On 15 July 2016 at 08:33, Laszlo Ersek  wrote:
> On 07/15/16 08:07, Ard Biesheuvel wrote:
>> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
>
>>> First, the build tests. I built OVMF 84 times, with the following settings:
>>>
>>> * Dimension 1: whether your and Steven's patches were applied or not.
>>
>> I take it this means this series only?
>
> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
> meant the five patches in this series -- one patch from Steven, four
> patches from you.
>
>>> * However, then I built my program -- find it attached -- with -b DEBUG
>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
>>> difference in output:
>>>
>>> FS2:\> DebugEnrollDefaultKeys.efi
>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0
>>> info: success
>>>
>>> versus
>>>
>>> FS2:\> OptEnrollDefaultKeys.efi
>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
>>> error: EnrollListOfX509Certs("db",
>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter
>>>
>>> I don't know why this happens, but it definitely relates to varargs --
>>> the program uses them liberally.
>>>
>>
>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
>> it makes sense to report it to the GCC maintainers (assuming we can
>> create a test case). I noticed that this code iterates over the same
>> VA_LIST twice, I wonder how well that was tested ...
>
> You make a good point about using VA_LIST twice possibly tickling gcc
> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
> code -- in order to step over some arguments --, which might not be all
> that usual as well.
>
> In any case, the way I use varargs seems to be standards conformant.
>
> With regard to reporting this to gcc developers: I won't try that. I'm
> discouraged by two facts:
>
> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
> October 2011. You'll find some recent comments from Steven and David in
> it :)
>
> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
> shortly after I ran into the -Os corruption issue with GCC48, I created
> and posted this small reproducer. Creating the reproducer wasn't
> trivial. In parallel I sent the same reproducer to one of my (indirect)
> colleagues at Red Hat, who had been a veteran in upstream GCC
> development and maintenance. I also asked him how/where I should report
> the bug. I got no answer from him.
>
> If you'd like to report a GCC bug, please go ahead, but I won't waste my
> time :)
>

It seems that it is your testcase that is broken:

--- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
+++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
@@ -663,6 +663,8 @@ EnrollListOfX509Certs (
   UINT8*Data;
   UINT8*Position;

+  Status = EFI_SUCCESS;
+
   //
   // compute total size first, for UINT32 range check, and allocation
   //

With that change, everything works fine for me

Thanks,
Ard.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-15 Thread Gao, Liming
Ard:
  The error message shows the error happens in the first loop in VA_START and 
VA_END. Could we update the logic to use single va list and verify it again? If 
the single va list passes, it can prove your suspect. If the single va list 
still fail, it may be other issue. I think we need to know which usage will 
cause break with your patches. 

Thanks
Liming
> -Original Message-
> From: Ard Biesheuvel [mailto:ard.biesheu...@linaro.org]
> Sent: Friday, July 15, 2016 2:07 PM
> To: Laszlo Ersek 
> Cc: Shi, Steven ; Zhu, Yonghong
> ; Gao, Liming ; Kinney,
> Michael D ; Justen, Jordan L
> ; af...@apple.com; edk2-devel@lists.01.org
> 
> Subject: Re: [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64
> 
> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
> > On 07/14/16 15:16, Ard Biesheuvel wrote:
> >> This series is not an attempt to steal Steven's thunder. I was merely
> >> inspired by some of the changes he is proposing for GCC 5 and Clang
> >> support, which I noticed would be applicable to older versions of GCC
> >> as well.
> >>
> >> The first patch fixes the issue that __builtin_unreachable() is not
> >> implemented by GCC 4.4 or earlier.
> >>
> >> Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
> >> explicitly modeled after the MS implementation. This by itself is an
> >> improvement, since the open coded implementation that performs
> arithmetic
> >> on the address of explicit arguments to obtain the variadic arguments is
> >> fragile and difficult to maintain, and should be best avoided.
> >>
> >> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to
> GCC44.
> >> I tested this change with both Ovmf and EmulatorPkg built in various ways
> >> and with various versions, with the caveat that I did not always use a
> matching
> >> binutils (i.e., of the same era). Since the issues this series deal with 
> >> are
> >> all code generation issues, I think this is reasonable, but more testing
> would
> >> be appreciated.
> >>
> >> Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer 
> >> to
> >> the commit log for the motivation.
> >>
> >> Patch #5 switches GCC/X64 to the PIC small code model, which results in
> smaller
> >> code.
> >>
> >> Ard Biesheuvel (4):
> >>   MdePkg: avoid __builtin_unreachable() on GCC v4.4
> >>   BaseTools/tools_def: enable O2 optimization for GCC X64 builds
> >>   MdePkg X64: force 'hidden' visibility when building with -fpic
> >>   BaseTools/tools_def: switch GCC/X64 to the PIC small model
> >>
> >> Shi, Steven (1):
> >>   MdePkg: Enable new MS VA intrinsics for GNUC x86 64bits build
> >>
> >>  BaseTools/Conf/tools_def.template  | 18 +++-
> >>  MdePkg/Include/Base.h  | 29 +++-
> >>  MdePkg/Include/X64/ProcessorBind.h | 10 +++
> >>  3 files changed, 49 insertions(+), 8 deletions(-)
> >>  mode change 100644 => 100755 MdePkg/Include/Base.h
> >>
> >
> > So, I did some tests.
> >
> 
> Thanks a lot for the awesome work, and for the amount of effort you
> have put in. I could never have tested this as methodical and with the
> same coverage as you have.
> 
> > First, the build tests. I built OVMF 84 times, with the following settings:
> >
> > * Dimension 1: whether your and Steven's patches were applied or not.
> 
> I take it this means this series only?
> 
> > The basis was always:
> > - upstream 04147690b59b,
> > - plus my personal patches that I constantly rebase,
> > - plus my pending "feat_ctrl_issue97_v3" patches,
> > - plus my work-in-progress build fixes for GCCxx and VS2015x86 warnings,
> > - plus OpenSSL 1.0.2g.
> >
> > These patches don't interfere with your and Steven's work. So on top of
> > this basis, your and Steven's patches were either applied (rebased) or
> > not. This is a multiplier of 2.
> >
> > * Dimension 2: the GCC toolchain used. Ranging from GCC44 through
> GCC49,
> > plus gcc-6.1 at the end (using GCC49 settings). Multiplier of 7.
> >
> > * Dimension 3: DEBUG versus RELEASE build target. Factor of 2.
> >
> > * Dimension 4: the OVMF platform. I used the following command lines:
> >
> > - build -a IA32-p OvmfPkg/OvmfPkgIa32.dsc-D SMM_REQUIRE \
> > -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
> > - build -a IA32 -a X64 -p OvmfPkg/OvmfPkgIa32X64.dsc -D SMM_REQUIRE \
> > -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
> > - build -a X64 -p OvmfPkg/OvmfPkgX64.dsc \
> > -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
> >
> > As you can see, Ia32 and Ia32X64 are built with the SMM driver stack;
> > X64 is built without it. (This is because edk2 S3 doesn't support
> > X64+SMM at the moment, so I either have to build X64 without SMM, or
> > disable S3 support on the QEMU command line. I find the latter more
> > inconvenient, plus Ia32X64 is just fine for 64-bit OS + SMM + S3.)
> > Anyhow, this is a factor of 3.
> >
> > For these builds, I collected the log files and the report files. Values
> > in dimension 3 are not separated into separate 

Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Laszlo Ersek
On 07/15/16 08:07, Ard Biesheuvel wrote:
> On 15 July 2016 at 01:27, Laszlo Ersek  wrote:

>> First, the build tests. I built OVMF 84 times, with the following settings:
>>
>> * Dimension 1: whether your and Steven's patches were applied or not.
> 
> I take it this means this series only?

Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I
meant the five patches in this series -- one patch from Steven, four
patches from you.

>> * However, then I built my program -- find it attached -- with -b DEBUG
>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the
>> difference in output:
>>
>> FS2:\> DebugEnrollDefaultKeys.efi
>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0
>> info: success
>>
>> versus
>>
>> FS2:\> OptEnrollDefaultKeys.efi
>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0
>> error: EnrollListOfX509Certs("db",
>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter
>>
>> I don't know why this happens, but it definitely relates to varargs --
>> the program uses them liberally.
>>
> 
> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,
> it makes sense to report it to the GCC maintainers (assuming we can
> create a test case). I noticed that this code iterates over the same
> VA_LIST twice, I wonder how well that was tested ...

You make a good point about using VA_LIST twice possibly tickling gcc
the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my
code -- in order to step over some arguments --, which might not be all
that usual as well.

In any case, the way I use varargs seems to be standards conformant.

With regard to reporting this to gcc developers: I won't try that. I'm
discouraged by two facts:

* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since
October 2011. You'll find some recent comments from Steven and David in
it :)

* http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --
shortly after I ran into the -Os corruption issue with GCC48, I created
and posted this small reproducer. Creating the reproducer wasn't
trivial. In parallel I sent the same reproducer to one of my (indirect)
colleagues at Red Hat, who had been a veteran in upstream GCC
development and maintenance. I also asked him how/where I should report
the bug. I got no answer from him.

If you'd like to report a GCC bug, please go ahead, but I won't waste my
time :)

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Ard Biesheuvel
On 15 July 2016 at 01:27, Laszlo Ersek  wrote:
> On 07/14/16 15:16, Ard Biesheuvel wrote:
>> This series is not an attempt to steal Steven's thunder. I was merely
>> inspired by some of the changes he is proposing for GCC 5 and Clang
>> support, which I noticed would be applicable to older versions of GCC
>> as well.
>>
>> The first patch fixes the issue that __builtin_unreachable() is not
>> implemented by GCC 4.4 or earlier.
>>
>> Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
>> explicitly modeled after the MS implementation. This by itself is an
>> improvement, since the open coded implementation that performs arithmetic
>> on the address of explicit arguments to obtain the variadic arguments is
>> fragile and difficult to maintain, and should be best avoided.
>>
>> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
>> I tested this change with both Ovmf and EmulatorPkg built in various ways
>> and with various versions, with the caveat that I did not always use a 
>> matching
>> binutils (i.e., of the same era). Since the issues this series deal with are
>> all code generation issues, I think this is reasonable, but more testing 
>> would
>> be appreciated.
>>
>> Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer to
>> the commit log for the motivation.
>>
>> Patch #5 switches GCC/X64 to the PIC small code model, which results in 
>> smaller
>> code.
>>
>> Ard Biesheuvel (4):
>>   MdePkg: avoid __builtin_unreachable() on GCC v4.4
>>   BaseTools/tools_def: enable O2 optimization for GCC X64 builds
>>   MdePkg X64: force 'hidden' visibility when building with -fpic
>>   BaseTools/tools_def: switch GCC/X64 to the PIC small model
>>
>> Shi, Steven (1):
>>   MdePkg: Enable new MS VA intrinsics for GNUC x86 64bits build
>>
>>  BaseTools/Conf/tools_def.template  | 18 +++-
>>  MdePkg/Include/Base.h  | 29 +++-
>>  MdePkg/Include/X64/ProcessorBind.h | 10 +++
>>  3 files changed, 49 insertions(+), 8 deletions(-)
>>  mode change 100644 => 100755 MdePkg/Include/Base.h
>>
>
> So, I did some tests.
>

Thanks a lot for the awesome work, and for the amount of effort you
have put in. I could never have tested this as methodical and with the
same coverage as you have.

> First, the build tests. I built OVMF 84 times, with the following settings:
>
> * Dimension 1: whether your and Steven's patches were applied or not.

I take it this means this series only?

> The basis was always:
> - upstream 04147690b59b,
> - plus my personal patches that I constantly rebase,
> - plus my pending "feat_ctrl_issue97_v3" patches,
> - plus my work-in-progress build fixes for GCCxx and VS2015x86 warnings,
> - plus OpenSSL 1.0.2g.
>
> These patches don't interfere with your and Steven's work. So on top of
> this basis, your and Steven's patches were either applied (rebased) or
> not. This is a multiplier of 2.
>
> * Dimension 2: the GCC toolchain used. Ranging from GCC44 through GCC49,
> plus gcc-6.1 at the end (using GCC49 settings). Multiplier of 7.
>
> * Dimension 3: DEBUG versus RELEASE build target. Factor of 2.
>
> * Dimension 4: the OVMF platform. I used the following command lines:
>
> - build -a IA32-p OvmfPkg/OvmfPkgIa32.dsc-D SMM_REQUIRE \
> -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
> - build -a IA32 -a X64 -p OvmfPkg/OvmfPkgIa32X64.dsc -D SMM_REQUIRE \
> -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
> - build -a X64 -p OvmfPkg/OvmfPkgX64.dsc \
> -D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
>
> As you can see, Ia32 and Ia32X64 are built with the SMM driver stack;
> X64 is built without it. (This is because edk2 S3 doesn't support
> X64+SMM at the moment, so I either have to build X64 without SMM, or
> disable S3 support on the QEMU command line. I find the latter more
> inconvenient, plus Ia32X64 is just fine for 64-bit OS + SMM + S3.)
> Anyhow, this is a factor of 3.
>
> For these builds, I collected the log files and the report files. Values
> in dimension 3 are not separated into separate files, because you can
> pass -b DEBUG -b RELEASE at once, and the build utility will do the
> right thing -- it will build both targets in succession. However, the
> log file and the report file each will contain both DEBUG and RELEASE
> output.
>
> For the builds I also collected the OVMF_CODE.fd firmware binaries.
>
> The good news is that all of the builds succeeded. You can draw size
> comparisons from the build log files: see SECFV, PEIFV and DXEFV for
> uncompressed sizes, and FVMAIN_COMPACT for the compressed size of
> PEIFV+DXEFV together. (Thanks Mike for the reminder!)
>
> I'll send you the link to the results archive off-list. (I tend not to
> post stuff publicly that I cannot review in detail, purely out of
> privacy concerns, and this archive is just too big for that. For example
> I dislike disclosing full pathnames from my system: in commit messages I
> always edit them out.)
>
> Clearly if

Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Ard Biesheuvel
On 15 July 2016 at 03:28, Gao, Liming  wrote:
> I see Steven patch uses -Os after enables new MS VA intrinsics. So, -Os 
> should work. But, I am not sure whether -Os is better than -O2 for edk2.
>

-Os did not work for me with GCC49 or earlier. It complains about a
missing -maccumulate-outgoing-args, but adding that option results in
a broken binary.

>> -Original Message-
>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
>> Laszlo Ersek
>> Sent: Friday, July 15, 2016 6:20 AM
>> To: Bruce Cran ; Ard Biesheuvel
>> ; Shi, Steven ; Zhu,
>> Yonghong ; Gao, Liming ;
>> Kinney, Michael D ; Justen, Jordan L
>> ; af...@apple.com; edk2-de...@ml01.01.org
>> Subject: Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64
>>
>> On 07/14/16 23:57, Bruce Cran wrote:
>> > On 7/14/2016 7:16 AM, Ard Biesheuvel wrote:
>> >
>> >> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to
>> >> GCC44.
>> >
>> > I wonder if -Os might be a better default optimization? Or perhaps
>> > there's plenty of flash on devices nowadays and performance is more
>> > important than size?
>> >
>> > At least for our driver, I see a 33% size reduction with -O2 and 44%
>> > with -Os for the driver .efi file. Perhaps more importantly, the .rom
>> > file (i.e. efirom output) has no reduction with -O2 and a 14% reduction
>> > with -Os.
>> >
>>
>> Beware of -Os with GCC48:
>>
>> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963
>>
>> Thanks
>> Laszlo
>> ___
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Gao, Liming
I see Steven patch uses -Os after enables new MS VA intrinsics. So, -Os should 
work. But, I am not sure whether -Os is better than -O2 for edk2. 

> -Original Message-
> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
> Laszlo Ersek
> Sent: Friday, July 15, 2016 6:20 AM
> To: Bruce Cran ; Ard Biesheuvel
> ; Shi, Steven ; Zhu,
> Yonghong ; Gao, Liming ;
> Kinney, Michael D ; Justen, Jordan L
> ; af...@apple.com; edk2-de...@ml01.01.org
> Subject: Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64
> 
> On 07/14/16 23:57, Bruce Cran wrote:
> > On 7/14/2016 7:16 AM, Ard Biesheuvel wrote:
> >
> >> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to
> >> GCC44.
> >
> > I wonder if -Os might be a better default optimization? Or perhaps
> > there's plenty of flash on devices nowadays and performance is more
> > important than size?
> >
> > At least for our driver, I see a 33% size reduction with -O2 and 44%
> > with -Os for the driver .efi file. Perhaps more importantly, the .rom
> > file (i.e. efirom output) has no reduction with -O2 and a 14% reduction
> > with -Os.
> >
> 
> Beware of -Os with GCC48:
> 
> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963
> 
> Thanks
> Laszlo
> ___
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Laszlo Ersek
On 07/14/16 15:16, Ard Biesheuvel wrote:
> This series is not an attempt to steal Steven's thunder. I was merely
> inspired by some of the changes he is proposing for GCC 5 and Clang
> support, which I noticed would be applicable to older versions of GCC
> as well.
> 
> The first patch fixes the issue that __builtin_unreachable() is not
> implemented by GCC 4.4 or earlier.
> 
> Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
> explicitly modeled after the MS implementation. This by itself is an
> improvement, since the open coded implementation that performs arithmetic
> on the address of explicit arguments to obtain the variadic arguments is
> fragile and difficult to maintain, and should be best avoided.
> 
> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
> I tested this change with both Ovmf and EmulatorPkg built in various ways
> and with various versions, with the caveat that I did not always use a 
> matching
> binutils (i.e., of the same era). Since the issues this series deal with are
> all code generation issues, I think this is reasonable, but more testing would
> be appreciated.
> 
> Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer to
> the commit log for the motivation.
> 
> Patch #5 switches GCC/X64 to the PIC small code model, which results in 
> smaller
> code.
> 
> Ard Biesheuvel (4):
>   MdePkg: avoid __builtin_unreachable() on GCC v4.4
>   BaseTools/tools_def: enable O2 optimization for GCC X64 builds
>   MdePkg X64: force 'hidden' visibility when building with -fpic
>   BaseTools/tools_def: switch GCC/X64 to the PIC small model
> 
> Shi, Steven (1):
>   MdePkg: Enable new MS VA intrinsics for GNUC x86 64bits build
> 
>  BaseTools/Conf/tools_def.template  | 18 +++-
>  MdePkg/Include/Base.h  | 29 +++-
>  MdePkg/Include/X64/ProcessorBind.h | 10 +++
>  3 files changed, 49 insertions(+), 8 deletions(-)
>  mode change 100644 => 100755 MdePkg/Include/Base.h
> 

So, I did some tests.

First, the build tests. I built OVMF 84 times, with the following settings:

* Dimension 1: whether your and Steven's patches were applied or not.
The basis was always:
- upstream 04147690b59b,
- plus my personal patches that I constantly rebase,
- plus my pending "feat_ctrl_issue97_v3" patches,
- plus my work-in-progress build fixes for GCCxx and VS2015x86 warnings,
- plus OpenSSL 1.0.2g.

These patches don't interfere with your and Steven's work. So on top of
this basis, your and Steven's patches were either applied (rebased) or
not. This is a multiplier of 2.

* Dimension 2: the GCC toolchain used. Ranging from GCC44 through GCC49,
plus gcc-6.1 at the end (using GCC49 settings). Multiplier of 7.

* Dimension 3: DEBUG versus RELEASE build target. Factor of 2.

* Dimension 4: the OVMF platform. I used the following command lines:

- build -a IA32-p OvmfPkg/OvmfPkgIa32.dsc-D SMM_REQUIRE \
-D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
- build -a IA32 -a X64 -p OvmfPkg/OvmfPkgIa32X64.dsc -D SMM_REQUIRE \
-D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE
- build -a X64 -p OvmfPkg/OvmfPkgX64.dsc \
-D SECURE_BOOT_ENABLE -D HTTP_BOOT_ENABLE

As you can see, Ia32 and Ia32X64 are built with the SMM driver stack;
X64 is built without it. (This is because edk2 S3 doesn't support
X64+SMM at the moment, so I either have to build X64 without SMM, or
disable S3 support on the QEMU command line. I find the latter more
inconvenient, plus Ia32X64 is just fine for 64-bit OS + SMM + S3.)
Anyhow, this is a factor of 3.

For these builds, I collected the log files and the report files. Values
in dimension 3 are not separated into separate files, because you can
pass -b DEBUG -b RELEASE at once, and the build utility will do the
right thing -- it will build both targets in succession. However, the
log file and the report file each will contain both DEBUG and RELEASE
output.

For the builds I also collected the OVMF_CODE.fd firmware binaries.

The good news is that all of the builds succeeded. You can draw size
comparisons from the build log files: see SECFV, PEIFV and DXEFV for
uncompressed sizes, and FVMAIN_COMPACT for the compressed size of
PEIFV+DXEFV together. (Thanks Mike for the reminder!)

I'll send you the link to the results archive off-list. (I tend not to
post stuff publicly that I cannot review in detail, purely out of
privacy concerns, and this archive is just too big for that. For example
I dislike disclosing full pathnames from my system: in commit messages I
always edit them out.)

Clearly if you distill any numbers from the archive, you are going to
review those in detail (and hopefully you won't disclose the entire
archive), so it should suffice to constrain my privacy concerns to you
personally. :)

-o-

Second, the run test. I executed two test scenarios here. For these, the
following dimensions were fixed:

- Dimension 1: your and Steven's patches were always applied.
- Dimension 

Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Kinney, Michael D
Hi Bruce,

I think size is most important for the default settings.  This is why it
it important to measure the size changes for each set of flag options.
Size measurements are a bit complex due to the interaction with compressors.
I have found that measuring the used space in an FV with compression enabled
is the best way to determine if a compiler/linker flag has a positive or
negative influence on the size.  For efirom, I recommend enabling UEFI 
compression (-ec option).

If a specific module needs a flag a for performance optimization, that can 
be done in a  section of DSC or [BuildOptions] section of INF.

Mike

> -Original Message-
> From: Bruce Cran [mailto:br...@cran.org.uk]
> Sent: Thursday, July 14, 2016 2:57 PM
> To: Ard Biesheuvel ; Shi, Steven 
> ;
> Zhu, Yonghong ; Gao, Liming ; 
> Kinney,
> Michael D ; Justen, Jordan L 
> ;
> ler...@redhat.com; af...@apple.com; edk2-devel@lists.01.org
> Subject: Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64
> 
> On 7/14/2016 7:16 AM, Ard Biesheuvel wrote:
> 
> > Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
> 
> I wonder if -Os might be a better default optimization? Or perhaps
> there's plenty of flash on devices nowadays and performance is more
> important than size?
> 
> At least for our driver, I see a 33% size reduction with -O2 and 44%
> with -Os for the driver .efi file. Perhaps more importantly, the .rom
> file (i.e. efirom output) has no reduction with -O2 and a 14% reduction
> with -Os.
> 
> --
> Bruce
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Bruce Cran

On 7/14/2016 4:20 PM, Laszlo Ersek wrote:


On 07/14/16 23:57, Bruce Cran wrote:


I wonder if -Os might be a better default optimization? Or perhaps
there's plenty of flash on devices nowadays and performance is more
important than size?

Beware of -Os with GCC48:

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963


Oh, good point. Also, at this point any optimization is a _huge_ 
improvement! :)


--
Bruce
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Laszlo Ersek
On 07/14/16 23:57, Bruce Cran wrote:
> On 7/14/2016 7:16 AM, Ard Biesheuvel wrote:
> 
>> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to
>> GCC44.
> 
> I wonder if -Os might be a better default optimization? Or perhaps
> there's plenty of flash on devices nowadays and performance is more
> important than size?
> 
> At least for our driver, I see a 33% size reduction with -O2 and 44%
> with -Os for the driver .efi file. Perhaps more importantly, the .rom
> file (i.e. efirom output) has no reduction with -O2 and a 14% reduction
> with -Os.
> 

Beware of -Os with GCC48:

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Bruce Cran

On 7/14/2016 7:16 AM, Ard Biesheuvel wrote:


Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.


I wonder if -Os might be a better default optimization? Or perhaps 
there's plenty of flash on devices nowadays and performance is more 
important than size?


At least for our driver, I see a 33% size reduction with -O2 and 44% 
with -Os for the driver .efi file. Perhaps more importantly, the .rom 
file (i.e. efirom output) has no reduction with -O2 and a 14% reduction 
with -Os.


--
Bruce
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Ard Biesheuvel
On 14 July 2016 at 16:42, Gao, Liming  wrote:
> Ard:
>   On patch 4, the visibility 'hidden' GCC pragma globally. Does it work 
> together with other mode, such as large? Or it only works with small and pic? 
> I want to clarify this change has no negative impact.

The #if checks for 'defined(__pic__)', which is only true if -fpic is
being passed on the command line. It works fine with other code
models, it only affects the compiler's decision whether to reference
symbols indirectly via the GOT.

>   On patch 5, I don't see any change for IA32 arch. is there no mode for IA32 
> arch? Here, small and pic must be enabled together, right? Otherwise, the 
> assumption is to load driver below 2G address. Have you collected size data 
> before and after this change?
>

To be honest, I know very little about the IA32 ABI, but I don't think
it has different code models, does it? Since the ELF relocations are
stripped by the PE/COFF conversion, I don't think it is necessary to
take into account that X64 and IA32 modules may interact with each
other when running on a X64 platform.

As far as the size is concerned, I get the following results building
DxeCore.efi with GCC47 (under OvmfPkgX64.dsc)

Before:

Section Headers:
  [Nr] Name  Type Address   Offset
   Size  EntSize  Flags  Link  Info  Align
  [ 0]   NULL   
        0 0 0
  [ 1] .text PROGBITS 0240  00c0
   0001e3d0    AX   0 0 32
  [ 2] .rela.textRELA   00029528
   f360  0018   I   6 1 8
  [ 3] .data PROGBITS 0001e620  0001e4a0
   3448    WA   0 0 32
  [ 4] .rela.dataRELA   0003
   1158  0018   I   6 3 8

After:

Section Headers:
  [Nr] Name  Type Address   Offset
   Size  EntSize  Flags  Link  Info  Align
  [ 0]   NULL   
        0 0 0
  [ 1] .text PROGBITS 0240  00c0
   0001a5a0    AX   0 0 32
  [ 2] .rela.textRELA   00025d48
   00010218  0018   I   6 1 8
  [ 3] .data PROGBITS 0001a7e0  0001a660
   3668    WA   0 0 32
  [ 4] .rela.dataRELA   00035f60
   1788  0018   I   6 3 8


So the text section shrinks by 16 KB (13%) in this case, and I got
similar results for other modules.

-- 
Ard.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Gao, Liming
Ard:
  On patch 4, the visibility 'hidden' GCC pragma globally. Does it work 
together with other mode, such as large? Or it only works with small and pic? I 
want to clarify this change has no negative impact. 
  On patch 5, I don't see any change for IA32 arch. is there no mode for IA32 
arch? Here, small and pic must be enabled together, right? Otherwise, the 
assumption is to load driver below 2G address. Have you collected size data 
before and after this change?

Thanks
Liming
-Original Message-
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Ard 
Biesheuvel
Sent: Thursday, July 14, 2016 9:16 PM
To: Shi, Steven ; Zhu, Yonghong ; 
Gao, Liming ; Kinney, Michael D 
; Justen, Jordan L ; 
ler...@redhat.com; af...@apple.com; edk2-devel@lists.01.org
Cc: Ard Biesheuvel 
Subject: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

This series is not an attempt to steal Steven's thunder. I was merely
inspired by some of the changes he is proposing for GCC 5 and Clang
support, which I noticed would be applicable to older versions of GCC
as well.

The first patch fixes the issue that __builtin_unreachable() is not
implemented by GCC 4.4 or earlier.

Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
explicitly modeled after the MS implementation. This by itself is an
improvement, since the open coded implementation that performs arithmetic
on the address of explicit arguments to obtain the variadic arguments is
fragile and difficult to maintain, and should be best avoided.

Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
I tested this change with both Ovmf and EmulatorPkg built in various ways
and with various versions, with the caveat that I did not always use a matching
binutils (i.e., of the same era). Since the issues this series deal with are
all code generation issues, I think this is reasonable, but more testing would
be appreciated.

Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer to
the commit log for the motivation.

Patch #5 switches GCC/X64 to the PIC small code model, which results in smaller
code.

Ard Biesheuvel (4):
  MdePkg: avoid __builtin_unreachable() on GCC v4.4
  BaseTools/tools_def: enable O2 optimization for GCC X64 builds
  MdePkg X64: force 'hidden' visibility when building with -fpic
  BaseTools/tools_def: switch GCC/X64 to the PIC small model

Shi, Steven (1):
  MdePkg: Enable new MS VA intrinsics for GNUC x86 64bits build

 BaseTools/Conf/tools_def.template  | 18 +++-
 MdePkg/Include/Base.h  | 29 +++-
 MdePkg/Include/X64/ProcessorBind.h | 10 +++
 3 files changed, 49 insertions(+), 8 deletions(-)
 mode change 100644 => 100755 MdePkg/Include/Base.h

-- 
2.7.4

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Ard Biesheuvel
On 14 July 2016 at 15:16, Ard Biesheuvel  wrote:
> This series is not an attempt to steal Steven's thunder. I was merely
> inspired by some of the changes he is proposing for GCC 5 and Clang
> support, which I noticed would be applicable to older versions of GCC
> as well.
>
> The first patch fixes the issue that __builtin_unreachable() is not
> implemented by GCC 4.4 or earlier.
>
> Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
> explicitly modeled after the MS implementation. This by itself is an
> improvement, since the open coded implementation that performs arithmetic
> on the address of explicit arguments to obtain the variadic arguments is
> fragile and difficult to maintain, and should be best avoided.
>
> Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
> I tested this change with both Ovmf and EmulatorPkg built in various ways
> and with various versions, with the caveat that I did not always use a 
> matching
> binutils (i.e., of the same era). Since the issues this series deal with are
> all code generation issues, I think this is reasonable, but more testing would
> be appreciated.
>
> Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer to
> the commit log for the motivation.
>
> Patch #5 switches GCC/X64 to the PIC small code model, which results in 
> smaller
> code.
>

Code can be found here
https://git.linaro.org/people/ard.biesheuvel/uefi-next.git/shortlog/refs/heads/gcc-x64-opt
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] [PATCH 0/5] MdePkg BaseTools: GCC optimization for X64

2016-07-14 Thread Ard Biesheuvel
This series is not an attempt to steal Steven's thunder. I was merely
inspired by some of the changes he is proposing for GCC 5 and Clang
support, which I noticed would be applicable to older versions of GCC
as well.

The first patch fixes the issue that __builtin_unreachable() is not
implemented by GCC 4.4 or earlier.

Patch #2 is Steven's patch to switch to the flavor of VA_LIST that is
explicitly modeled after the MS implementation. This by itself is an
improvement, since the open coded implementation that performs arithmetic
on the address of explicit arguments to obtain the variadic arguments is
fragile and difficult to maintain, and should be best avoided.

Patch #3 enabled -O2 optimization for X64/RELEASE all the way back to GCC44.
I tested this change with both Ovmf and EmulatorPkg built in various ways
and with various versions, with the caveat that I did not always use a matching
binutils (i.e., of the same era). Since the issues this series deal with are
all code generation issues, I think this is reasonable, but more testing would
be appreciated.

Patch #4 applies the visibility 'hidden' GCC pragma globally. Please refer to
the commit log for the motivation.

Patch #5 switches GCC/X64 to the PIC small code model, which results in smaller
code.

Ard Biesheuvel (4):
  MdePkg: avoid __builtin_unreachable() on GCC v4.4
  BaseTools/tools_def: enable O2 optimization for GCC X64 builds
  MdePkg X64: force 'hidden' visibility when building with -fpic
  BaseTools/tools_def: switch GCC/X64 to the PIC small model

Shi, Steven (1):
  MdePkg: Enable new MS VA intrinsics for GNUC x86 64bits build

 BaseTools/Conf/tools_def.template  | 18 +++-
 MdePkg/Include/Base.h  | 29 +++-
 MdePkg/Include/X64/ProcessorBind.h | 10 +++
 3 files changed, 49 insertions(+), 8 deletions(-)
 mode change 100644 => 100755 MdePkg/Include/Base.h

-- 
2.7.4

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel