Re: [edk2] Enable optimization for gcc x64 builds

2015-07-23 Thread Bruce Cran
Going back to the original description
(http://comments.gmane.org/gmane.comp.bios.tianocore.devel/10741), I noticed in
https://firmware.intel.com/sites/default/files/MinnowBoard_MAX-Rel_0.81-ReleaseNotes.txt
that Intel say:

2. Because the binary size created using GCC (Linux environment) is ~20% larger
than the size of the binary created using the Microsoft toolchain (Windows
Environment), the firmware build in the Linux environment (GCC build) uses the
minimum shell instead of fullshell. The image built in the Linux environment
(GCC build) may have some limitation for UEFI shell application.


I guess that will be because there's no optimization! If this patch gets 
committed, could it be backported to the UDK2014.SP1 branch so the 
MinnowboardMAX project can make use of it, or would it be too risky?

By the way, messages are still going through to the 
[email protected] list. Shouldn't it be set to read-only now the 
01.org ML is active?

-- 
Bruce


--
___
edk2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/edk2-devel


Re: [edk2] Enable optimization for gcc x64 builds

2015-07-23 Thread David Woodhouse
On Thu, 2015-07-23 at 12:04 +0200, Paolo Bonzini wrote:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39472#c8 suggests that the
> > support was backported to GCC 4.4 too.
> 
> "ix86/gcc-4_4-branch" sounds like an internal branch for use by Intel
> engineers.  Features are not backported to stable branches.

Hm, yes. I was misled by that final comment. It doesn't seem to be in
gcc-4_4-branch; you're right.

I don't suppose we can ditch GCC 4.4 support too?

I hesitate slightly because *last* time I said 'here's a nickel, kid.
Get yourself a better compiler' I then ended up spending a month or so
hacking LLVM to add .code16 support... :)

> > If we *can't* kill EFIAPI completely, then we need to get GCC's
> > __builtin_va_list to do the right thing according to the ABI of the
> > function it happens to be compiling at the time. This is 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818
> 
> Am I CCed because you'd like me to fix it? :)  I can take a look.

You were Cc'd because I just revived an old thread and you were already
on it. But don't let me discourage you!

My *primary* motivation right now is getting our OpenSSL patches
upstream though, and fixing PR50818 doesn't really help with that in
the short term. But it *would* be nice.

-- 
dwmw2




Re: [edk2] Enable optimization for gcc x64 builds

2015-07-23 Thread Paolo Bonzini


On 23/07/2015 11:46, David Woodhouse wrote:
> On Tue, 2014-11-04 at 14:32 -0800, Jordan Justen wrote:
>> On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
>>> So my 1st question is why do you need to mix calling conventions, 
>>> and depend
>>> on EFIAPI for interoperability. Why not just change the ABI on all
>>> functions?
>>
>> GCC 4.4 doesn't support the command line option to change everything
>> over. So, EFIAPI was the only option then.
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39472#c8 suggests that the
> support was backported to GCC 4.4 too.

"ix86/gcc-4_4-branch" sounds like an internal branch for use by Intel
engineers.  Features are not backported to stable branches.

>> For GCC >= 4.5, I actually think we should convert *RELEASE* builds
>> over to using the ms-abi all the time to generate smaller code. I
>> think we should leave DEBUG builds as mixed to help clean up EFIAPI
>> issues.
> 
> Is it feasible to just kill EFIAPI completely, if GCC4X toolchains are
> the only ones that use it and we can move them to -mabi=ms?
> 
> Tim pointed out that vendors use it on 32-bit targets to use __fastcall
> for internal functions. If that's a worthwhile optimisation it might
> make sense to merge that properly — perhaps using an annotation for the
> *internal* functions where it makes most sense, rather than having to
> annotate the EFIAPI functions? But quite frankly, if they're not
> contributing optimisations upstream in a timely fashion then we
> *really* shouldn't be pandering to them. If that's the only remaining
> use case then we should let it die.
> 
> If we *can't* kill EFIAPI completely, then we need to get GCC's
> __builtin_va_list to do the right thing according to the ABI of the
> function it happens to be compiling at the time. This is 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818

Am I CCed because you'd like me to fix it? :)  I can take a look.

Paolo



Re: [edk2] Enable optimization for gcc x64 builds

2015-07-23 Thread David Woodhouse
On Tue, 2014-11-04 at 14:32 -0800, Jordan Justen wrote:
> On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
> > So my 1st question is why do you need to mix calling conventions, 
> > and depend
> > on EFIAPI for interoperability. Why not just change the ABI on all
> > functions?
> 
> GCC 4.4 doesn't support the command line option to change everything
> over. So, EFIAPI was the only option then.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39472#c8 suggests that the
support was backported to GCC 4.4 too.

> > Problems with the mixed calling convention:
> > 1) All assembly routines must be marked as EFIAPI, or the C code will
> > generate the wrong calling convention. Not an issue in the MdePkg, but
> > potentially an issue in other packages.
> 
> I don't see this as a problem. I think this is the rules that we have
> set up for EDK II. It just so happens that the GCC4X toolchains are
> the only ones that use EFIAPI, and thus are the only ones that allow
> us to keep our codebase clean with regards to EFIAPI.
> 
> For GCC >= 4.5, I actually think we should convert *RELEASE* builds
> over to using the ms-abi all the time to generate smaller code. I
> think we should leave DEBUG builds as mixed to help clean up EFIAPI
> issues.

Is it feasible to just kill EFIAPI completely, if GCC4X toolchains are
the only ones that use it and we can move them to -mabi=ms?

Tim pointed out that vendors use it on 32-bit targets to use __fastcall
for internal functions. If that's a worthwhile optimisation it might
make sense to merge that properly — perhaps using an annotation for the
*internal* functions where it makes most sense, rather than having to
annotate the EFIAPI functions? But quite frankly, if they're not
contributing optimisations upstream in a timely fashion then we
*really* shouldn't be pandering to them. If that's the only remaining
use case then we should let it die.

If we *can't* kill EFIAPI completely, then we need to get GCC's
__builtin_va_list to do the right thing according to the ABI of the
function it happens to be compiling at the time. This is 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]  Intel Corporation




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-11 Thread Scott Duplichan
Laszlo Ersek [mailto:[email protected]] wrote:

]On 11/07/14 21:29, Scott Duplichan wrote:
]> Jordan Justen [mailto:[email protected]] wrote:
]> 
]> ]On 2014-11-07 08:16:23, Scott Duplichan wrote:
]> ]> These are all good answers. I can't come up with a strong argument for the
]> ]> mixed sysv/ms ABI. Maybe the next step is to test -mabi=ms using several gcc
]> ]> versions (I think -mabi=ms was introduced with gcc 4.5). If that works, I could
]> ]> submit a patch and see what happens..
]> ]
]> ]I mentioned a reason in this thread a few days back. But, we should
]> ]look into -mabi=ms for RELEASE builds.
]> ]
]> ]-Jordan
]> 
]> I agree, the approach in your previous email is a good one. Prototyping
]> asm functions to enforce calling convention is always a good idea. In theory
]> an IA32 build could be done with a Microsoft compiler with option /Gr
]> (__fastcall calling convention) and it would work. This would not be possible
]> if asm function calling convention information were missing. If I make this
]> patch, I will add the gcc -mabi=ms to the release build.
]> 
]> Now for rants...
]> 1) Why do so many developers never want to test release builds? To me, code
]> is not clean until both debug and release builds work smoothly.
]
]In my experience, release (== optimized) builds are practically
]unsupportable. Even the Linux kernel disables some optimizations that
]make the disassembly unreadable. Unless stuff is power and/or
]performance critical, I prefer if the code does exactly what I tell it
]to do. (Case in point: the -Os bug with recursion + ellipsis. It works
]with -O0. Compilers have bugs.)
]
]*All* software is chock full of bugs, and having to figure out what goes
]wrong at a customer's site is a question of "when", not "if". They
]either won't be able, or willing, to attempt to reproduce the issue with
]a debug build, or they will try and the bug might disappear.
]
]Consequently, since I'm not keen on shipping anything but a debug build,
]I don't feel like putting many resources into release builds.

Release builds are/were shipped out of necessity, at least in the past.
This was due to a desire to cut board cost by using the smallest possible
flash chip. But times are changing and NOR flash capacity is growing 
even faster than code size. So the flash size reduction motivation for
optimizing code is losing importance I guess. In my experience, getting
a release build to work sometimes results in uncovering hidden coding
errors. A bigger reason to use release builds is boot time reduction.
While UEFI will never boot as fast as coreboot, it can narrow the gap
some by minimizing the time spent reading data from the flash chip.
Adding -Os and link time optimization can cut the image size in half.
That saves significant time when the image is read from the flash chip.
]
]> 2) Why is the NOOPT build missing from virtually every DSC file in EDK2?
]
]I guess in OVMF we never needed it?

I got OVMF booting on a real server a few years ago. Adding the NOOPT
build was the first thing I did. That let me step through the source code
and see all local variables. I couldn't have gotten it working as quickly
as I did without source level debugging.

Instead of adding NOOPT to every project, adding it to one or two might
be a better idea. I don't want to see a lot of code change just for the
sake of debugger support.

]> The EDK NOOPT build is most like what developers call a DEBUG build. It is
]> the only one setup for source level debugging, at least for Microsoft tool
]> chains.
]
]I don't use Microsoft tool chains. And source level debugging with gdb
]is hardly possible even on DEBUG; you have to jump through incredible
]hoops. NOOPT doesn't solve anything for Linux-based developers & users
]of OVMF.

I understand. But Microsoft tool chains are still supported by the EDK2
project. Dropping support for Microsoft tool chains would solve some
problems. But clearly that isn't going to happen any time soon. It is
surprising to see the strengths and weaknesses of different tool chains.
Microsoft perfected link time optimization 10+ years ago. Yet they 
didn't even bother with C99 support until recently. GCC had C99 10+
years ago, yet only recently perfected link time optimization.

It is unfortunate there is no nice open source debugger for use with
EDK2 and other embedded projects. Some of the OEM and IBV debuggers
I used were really nice, though at the time they supported only
Microsoft debug symbols.

]> The Duet DSC files are missing both RELEASE and NOOPT options.
]
]It's an emulator platform, isn't it? You probably won't use it in
]production.

Of all EDK2 projects, Duet seems to need the fewest changes for use
on real hardware. You would be surprised to know how many Duet based
systems have shipped from tier one oems.

]Thanks
]Laszlo


Re: [edk2] Enable optimization for gcc x64 builds

2014-11-11 Thread Laszlo Ersek
On 11/07/14 21:29, Scott Duplichan wrote:
> Jordan Justen [mailto:[email protected]] wrote:
> 
> ]On 2014-11-07 08:16:23, Scott Duplichan wrote:
> ]> These are all good answers. I can't come up with a strong argument for the
> ]> mixed sysv/ms ABI. Maybe the next step is to test -mabi=ms using several gcc
> ]> versions (I think -mabi=ms was introduced with gcc 4.5). If that works, I could
> ]> submit a patch and see what happens..
> ]
> ]I mentioned a reason in this thread a few days back. But, we should
> ]look into -mabi=ms for RELEASE builds.
> ]
> ]-Jordan
> 
> I agree, the approach in your previous email is a good one. Prototyping
> asm functions to enforce calling convention is always a good idea. In theory
> an IA32 build could be done with a Microsoft compiler with option /Gr
> (__fastcall calling convention) and it would work. This would not be possible
> if asm function calling convention information were missing. If I make this
> patch, I will add the gcc -mabi=ms to the release build.
> 
> Now for rants...
> 1) Why do so many developers never want to test release builds? To me, code
> is not clean until both debug and release builds work smoothly.

In my experience, release (== optimized) builds are practically
unsupportable. Even the Linux kernel disables some optimizations that
make the disassembly unreadable. Unless stuff is power and/or
performance critical, I prefer if the code does exactly what I tell it
to do. (Case in point: the -Os bug with recursion + ellipsis. It works
with -O0. Compilers have bugs.)

*All* software is chock full of bugs, and having to figure out what goes
wrong at a customer's site is a question of "when", not "if". They
either won't be able, or willing, to attempt to reproduce the issue with
a debug build, or they will try and the bug might disappear.

Consequently, since I'm not keen on shipping anything but a debug build,
I don't feel like putting many resources into release builds.

> 2) Why is the NOOPT build missing from virtually every DSC file in EDK2?

I guess in OVMF we never needed it?

> The EDK NOOPT build is most like what developers call a DEBUG build. It is
> the only one setup for source level debugging, at least for Microsoft tool
> chains.

I don't use Microsoft tool chains. And source level debugging with gdb
is hardly possible even on DEBUG; you have to jump through incredible
hoops. NOOPT doesn't solve anything for Linux-based developers & users
of OVMF.

> The Duet DSC files are missing both RELEASE and NOOPT options.

It's an emulator platform, isn't it? You probably won't use it in
production.

Thanks
Laszlo



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-07 Thread Andrew Fish
> 
> Now for rants...
> 1) Why do so many developers never want to test release builds? To me, code
> is not clean until both debug and release builds work smoothly.
> 2) Why is the NOOPT build missing from virtually every DSC file in EDK2?
> The EDK NOOPT build is most like what developers call a DEBUG build. It is
> the only one setup for source level debugging, at least for Microsoft tool
> chains. The Duet DSC files are missing both RELEASE and NOOPT options. I
> may submit a patch to allow all 3 builds to every DSC file.
> 

Scott,

Historically, if I remember correctly, the NOOPT builds were added to support 
Nt32Pkg/EmulatorPkg. 

“Back in the day” a NOOPT build would not generally fit in a ROM. But given 
folks could be building an option ROM, shell, OS loader that don’t have size 
constraints I think you are right in pointing out the NOOPT builds should be in 
all the supported tool chains. 

Thanks,

Andrew Fish

> Thanks,
> Scott




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-07 Thread Scott Duplichan
Jordan Justen [mailto:[email protected]] wrote:

]On 2014-11-07 08:16:23, Scott Duplichan wrote:
]> These are all good answers. I can't come up with a strong argument for the
]> mixed sysv/ms ABI. Maybe the next step is to test -mabi=ms using several gcc
]> versions (I think -mabi=ms was introduced with gcc 4.5). If that works, I could
]> submit a patch and see what happens..
]
]I mentioned a reason in this thread a few days back. But, we should
]look into -mabi=ms for RELEASE builds.
]
]-Jordan

I agree, the approach in your previous email is a good one. Prototyping
asm functions to enforce calling convention is always a good idea. In theory
an IA32 build could be done with a Microsoft compiler with option /Gr
(__fastcall calling convention) and it would work. This would not be possible
if asm function calling convention information were missing. If I make this
patch, I will add the gcc -mabi=ms to the release build.

Now for rants...
1) Why do so many developers never want to test release builds? To me, code
is not clean until both debug and release builds work smoothly.
2) Why is the NOOPT build missing from virtually every DSC file in EDK2?
The EDK NOOPT build is most like what developers call a DEBUG build. It is
the only one setup for source level debugging, at least for Microsoft tool
chains. The Duet DSC files are missing both RELEASE and NOOPT options. I
may submit a patch to allow all 3 builds to every DSC file.

Thanks,
Scott




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-07 Thread Jordan Justen
On 2014-11-07 08:16:23, Scott Duplichan wrote:
> These are all good answers. I can't come up with a strong argument for the
> mixed sysv/ms ABI. Maybe the next step is to test -mabi=ms using several gcc
> versions (I think -mabi=ms was introduced with gcc 4.5). If that works, I could
> submit a patch and see what happens..

I mentioned a reason in this thread a few days back. But, we should
look into -mabi=ms for RELEASE builds.

-Jordan



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-07 Thread Scott Duplichan
Andrew Fish [mailto:[email protected]] wrote:


Sent: Tuesday, November 04, 2014 03:40 PM
To: [email protected]
Cc: Paolo Bonzini
Subject: Re: [edk2] Enable optimization for gcc x64 builds

 

 

On Nov 4, 2014, at 12:15 PM, Scott Duplichan wrote:

 

But pedantically you need to change the definition of BasePrintLibSPrint() to 
include EFIAPI. 

If you look at BasePrintLibSPrintMarker() (and some of the other routines) you 
will notice a BASE_LIST and a VA_LIST. We had an API that exposed a VA_LIST 
(well code that was casting to VA_LIST) in the report status code stack and it 
forced us to use our own made-up BASE_LIST concept to get it to work. I think 
you are going to hit similar issues mixing calling conventions. 

 

So my 1st question is why do you need to mix calling conventions, and depend on 
EFIAPI for interoperability. Why not just change the ABI on all functions?

 

If I understand your question "Why not just change the ABI on all functions", 
you mean use Microsoft ABI throughout the code even when compiled with gcc? The 
gcc option -mabi=ms makes this easy, and it reduces code size too (8% in one 
test). Part of that code size reduction is because it removes the requirement 
to save xmm6-xmm15 when calling msabi code. Gcc doesn't optimize the 
save/restore of xmm6-xmm15, it just does them all. The problems with ms abi I 
can think of are:

1) Linux developers accustomed to stepping through the sysv calling convention 
would have to adapt to the ms calling convention.

 

Well all the public interfaces (EFI service, protocol member functions, etc.) 
are EFIAPI, so there will be a mix. 





2) -mabi=ms is probably  little used and therefore more likely to have bugs. 
This might require dropping support for older gcc tool chains.

 

Looks like mixing has bugs too. It looks like this issue is caused by a 
mismatch in the vararg definitions between the two worlds. You can’t really mix 
styles in a given call stack. It almost seems like you want to force one var 
arg scheme every place possible. 





3) According to an email from you in April, ms abi might not support stack 
trace without debug symbols.

 

 

This is not the ABI; it is VC++ code generation. There is nothing in the ABI 
about how to unwind a stack frame, it is about how to call code in C. 

 

In my clang examples, in this thread, we have an EFIAPI compatible calling 
convention with stack unwind. -target x86_64-pc-win32-macho means build an X64 
image using EFIAPI, do the standard frame pointer, and we kick out a Mach-O 
executable for the debugger. We convert the Mach-O to PE/COFF for EFI 
compatibility. So on clang it is EFIABI, but with stack unwind. We can always 
unwind a frame without symbols, until we hit code compiled with VC++. 





Even if ms abi is never made the default for gcc code, adding an environment 
variable such as EXTRA_CC_FLAGS would allow its use in custom builds by those 
who need the code size reduction it brings.

 

What about switching EDK2 to sysv abi? I assume that would require dropping 
support for Microsoft compilers. 

 

 

The EFI calling convention is in the spec, so all things EFI would break. 
Option ROMs on cards, installed Operating system, etc…. The edk2 is an open 
source project that implements industry standard, not just a chunk of code. 

 

Thanks,

 

Andrew Fish

 

These are all good answers. I can't come up with a strong argument for the 
mixed sysv/ms ABI. Maybe the next step is to test -mabi=ms using several gcc 
versions (I think -mabi=ms was introduced with gcc 4.5). If that works, I could 
submit a patch and see what happens..

Thanks,

Scott





 

Thanks,

Scott

 



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-05 Thread Andrew Fish

> On Nov 5, 2014, at 2:00 AM, Laszlo Ersek  wrote:
> 
> On 11/05/14 00:02, Andrew Fish wrote:
>> 
>>> On Nov 4, 2014, at 2:32 PM, Jordan Justen  wrote:
>>> 
>>> On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
>>>> So my 1st question is why do you need to mix calling conventions, and depend
>>>> on EFIAPI for interoperability. Why not just change the ABI on all
>>>> functions?
>>> 
>>> GCC 4.4 doesn't support the command line option to change everything
>>> over. So, EFIAPI was the only option then.
>>> 
>>>> Problems with the mixed calling convention:
>>>> 1) All assembly routines must be marked as EFIAPI, or the C code will
>>>> generate the wrong calling convention. Not an issue in the MdePkg, but
>>>> potentially an issue in other packages.
>>> 
>>> I don't see this as a problem. I think this is the rules that we have
>>> set up for EDK II. It just so happens that the GCC4X toolchains are
>>> the only ones that use EFIAPI, and thus are the only ones that allow
>>> us to keep our codebase clean with regards to EFIAPI.
>>> 
>> 
>> I agree it is good in keeping the edk2 code clean. I was more concerned 
>> about code from 3rd parties. 
>> So sorry this was more about a warning when you are porting code on what to 
>> look out for. 
>> 
>> I’m really only concerned about how the VA_LIST stuff is going to
>> work? Does it need to shift for native vs. EFIAPI? If so how you pass
>> the VA_LIST around if the code is not all the same ABI?
> 
> We need to distinguish passing arguments through the ellipsis (...) from
> passing VA_LIST (as a normal, named parameter).
> 
> For passing arguments through the ellipsis (...), the called function
> *must* be EFIAPI (in the current state of the tree). Otherwise
> VA_START() won't work in the callee.
> 
> (BTW I have no problem with the above "restriction".)
> 
> Regarding passing VA_LIST by name (which is identical to passing a
> simple CHAR8* by name) -- it already works fine, regardless of EFIAPI
> vs. no-EFIAPI.
> 

OK thanks for the info. For some flavors of GCC the __builtin_va_list is a
structure. Since the size of the structure is > 8 bytes it is passed via a
pointer per the calling conventions. For X64 EFIAPI, VA_LIST is a pointer to
the frame where the register-based arguments have been spilled.

For clang x86_64, __builtin_va_list is also a structure, so you can’t mix and
match.

Thanks,

Andrew Fish

~/work/Compiler>cat vv.c
#include <stdio.h>
#include <stdarg.h>

int
main ()
{
  printf ("sizeof __builtin_va_list %lu\n", sizeof (__builtin_va_list));
  return 0;
}
~/work/Compiler>clang vv.c
~/work/Compiler>./a.out
sizeof __builtin_va_list 24
~/work/Compiler>clang -arch i386 vv.c
~/work/Compiler>./a.out
sizeof __builtin_va_list 4
~/work/Compiler>


> The problem discussed in this thread is unrelated to EFIAPI. The problem
> is (apparently) that gcc's -Os optimization corrupts a local variable in
> a chain of recursive calls.
> 
>>> For GCC >= 4.5, I actually think we should convert *RELEASE* builds
>>> over to using the ms-abi all the time to generate smaller code. I
>>> think we should leave DEBUG builds as mixed to help clean up EFIAPI
>>> issues.
>>> 
>> 
>> You guys should figure out if you can have a ms-abi but add the frame 
>> pointers. The compiler is open source ….
> 
> In my experimentation yesterday, one of the first things I tried (on
> "gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16)") was
> '-fno-omit-frame-pointer'.
> 
> Because, '-Os' implies '-fomit-frame-pointer', and at that point I
> thought that maybe '-fomit-frame-pointer', incurred by '-Os', was
> causing the issue.
> 
> So, I added '-fno-omit-frame-pointer' *after* -Os.
> 
> It triggered a "sorry, unimplemented" bug, which was very similar to
>  >:
> 
>sorry, unimplemented: ms_abi attribute requires
>-maccumulate-outgoing-args or subtarget optimization implying it
> 
> However, after appending '-maccumulate-outgoing-args' as well, the build
> resumed. (To clarify, this meant:
> 
>  -Os -fno-omit-frame-pointer -maccumulate-outgoing-args
> 
> .) Unfortunately, although the tree did build like this, the original
> issue persisted. Which I took as proof that the bug was unrelated to
> reserving or not reserving %rbp as frame pointer.
> 
> I wish I could write a small reproducer for this problem...
> 
>> 



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-05 Thread Laszlo Ersek
On 11/05/14 11:00, Laszlo Ersek wrote:

> I wish I could write a small reproducer for this problem...

I wrote such a reproducer. I'll post it as a separate patch, just for
discussion. If we all agree that the code should work, then I could turn
it into a gcc bug report.

Thanks!
Laszlo




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-05 Thread Laszlo Ersek
On 11/05/14 00:02, Andrew Fish wrote:
> 
>> On Nov 4, 2014, at 2:32 PM, Jordan Justen  wrote:
>>
>> On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
>>> So my 1st question is why do you need to mix calling conventions, and depend
>>> on EFIAPI for interoperability. Why not just change the ABI on all
>>> functions?
>>
>> GCC 4.4 doesn't support the command line option to change everything
>> over. So, EFIAPI was the only option then.
>>
>>> Problems with the mixed calling convention:
>>> 1) All assembly routines must be marked as EFIAPI, or the C code will
>>> generate the wrong calling convention. Not an issue in the MdePkg, but
>>> potentially an issue in other packages.
>>
>> I don't see this as a problem. I think this is the rules that we have
>> set up for EDK II. It just so happens that the GCC4X toolchains are
>> the only ones that use EFIAPI, and thus are the only ones that allow
>> us to keep our codebase clean with regards to EFIAPI.
>>
> 
> I agree it is good in keeping the edk2 code clean. I was more concerned about 
> code from 3rd parties. 
> So sorry this was more about a warning when you are porting code on what to 
> look out for. 
> 
> I’m really only concerned about how the VA_LIST stuff is going to
> work? Does it need to shift for native vs. EFIAPI? If so how you pass
> the VA_LIST around if the code is not all the same ABI?

We need to distinguish passing arguments through the ellipsis (...) from
passing VA_LIST (as a normal, named parameter).

For passing arguments through the ellipsis (...), the called function
*must* be EFIAPI (in the current state of the tree). Otherwise
VA_START() won't work in the callee.

(BTW I have no problem with the above "restriction".)

Regarding passing VA_LIST by name (which is identical to passing a
simple CHAR8* by name) -- it already works fine, regardless of EFIAPI
vs. no-EFIAPI.

The problem discussed in this thread is unrelated to EFIAPI. The problem
is (apparently) that gcc's -Os optimization corrupts a local variable in
a chain of recursive calls.

>> For GCC >= 4.5, I actually think we should convert *RELEASE* builds
>> over to using the ms-abi all the time to generate smaller code. I
>> think we should leave DEBUG builds as mixed to help clean up EFIAPI
>> issues.
>>
> 
> You guys should figure out if you can have a ms-abi but add the frame 
> pointers. The compiler is open source ….

In my experimentation yesterday, one of the first things I tried (on
"gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16)") was
'-fno-omit-frame-pointer'.

Because, '-Os' implies '-fomit-frame-pointer', and at that point I
thought that maybe '-fomit-frame-pointer', incurred by '-Os', was
causing the issue.

So, I added '-fno-omit-frame-pointer' *after* -Os.

It triggered a "sorry, unimplemented" bug, which was very similar to
:

sorry, unimplemented: ms_abi attribute requires
-maccumulate-outgoing-args or subtarget optimization implying it

However, after appending '-maccumulate-outgoing-args' as well, the build
resumed. (To clarify, this meant:

  -Os -fno-omit-frame-pointer -maccumulate-outgoing-args

.) Unfortunately, although the tree did build like this, the original
issue persisted. Which I took as proof that the bug was unrelated to
reserving or not reserving %rbp as frame pointer.

I wish I could write a small reproducer for this problem...

> 
> Thanks,
> 
> Andrew Fish
> 
>> -Jordan
> 
> 
> 




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Andrew Fish

> On Nov 4, 2014, at 2:32 PM, Jordan Justen  wrote:
> 
> On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
>> So my 1st question is why do you need to mix calling conventions, and depend
>> on EFIAPI for interoperability. Why not just change the ABI on all
>> functions?
> 
> GCC 4.4 doesn't support the command line option to change everything
> over. So, EFIAPI was the only option then.
> 
>> Problems with the mixed calling convention:
>> 1) All assembly routines must be marked as EFIAPI, or the C code will
>> generate the wrong calling convention. Not an issue in the MdePkg, but
>> potentially an issue in other packages.
> 
> I don't see this as a problem. I think these are the rules that we have
> set up for EDK II. It just so happens that the GCC4X toolchains are
> the only ones that use EFIAPI, and thus are the only ones that allow
> us to keep our codebase clean with regards to EFIAPI.
> 

I agree it is good in keeping the edk2 code clean. I was more concerned about 
code from 3rd parties. 
Sorry, this was meant more as a warning about what to look out for when 
porting code. 

I’m really only concerned about how the VA_LIST stuff is going to work. Does it 
need to shift for native vs. EFIAPI? If so, how do you pass the VA_LIST around if 
the code is not all the same ABI?
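
As a rough illustration of why the marker cannot simply be handed across
the boundary, here is a sketch that assumes GCC or clang targeting x86-64
with the default SysV ABI. The two function names are made up for this
example; only the __builtin_* types are real compiler builtins.

```c
#include <stddef.h>

/* Hypothetical helpers, not edk2 code. They only report the size of
   the two variable-argument list types the compiler knows about. */

/* Native (SysV) list type on x86-64. */
size_t
SysvVaListSize (void)
{
  return sizeof (__builtin_va_list);
}

/* Microsoft ABI list type, as used for ms_abi (EFIAPI) functions. */
size_t
MsVaListSize (void)
{
  return sizeof (__builtin_ms_va_list);
}
```

On such a target the SysV list is a one-element array of a 24-byte
structure, while the MS list is a plain pointer, so a VA_LIST built by one
side is not something the other side's va_arg can walk.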

> For GCC >= 4.5, I actually think we should convert *RELEASE* builds
> over to using the ms-abi all the time to generate smaller code. I
> think we should leave DEBUG builds as mixed to help clean up EFIAPI
> issues.
> 

You guys should figure out if you can have a ms-abi but add the frame pointers. 
The compiler is open source ….

Thanks,

Andrew Fish

> -Jordan




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Tim Lewis
Another note (from the archives): vendors used mixed builds for VS2003/VS2005 
on 32-bit in order to use __fastcall for internal function calls and then 
EFIAPI for all the various UEFI calls. 

Tim

-----Original Message-----
From: Jordan Justen [mailto:[email protected]] 
Sent: Tuesday, November 04, 2014 2:33 PM
To: Andrew J. Fish
Cc: Paolo Bonzini; [email protected]
Subject: Re: [edk2] Enable optimization for gcc x64 builds

On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
> So my 1st question is why do you need to mix calling conventions, and 
> depend on EFIAPI for interoperability. Why not just change the ABI on 
> all functions?

GCC 4.4 doesn't support the command line option to change everything over. So, 
EFIAPI was the only option then.

> Problems with the mixed calling convention:
> 1) All assembly routines must be marked as EFIAPI, or the C code will 
> generate the wrong calling convention. Not an issue in the MdePkg, but 
> potentially an issue in other packages.

I don't see this as a problem. I think these are the rules that we have set up 
for EDK II. It just so happens that the GCC4X toolchains are the only ones that 
use EFIAPI, and thus are the only ones that allow us to keep our codebase clean 
with regards to EFIAPI.

For GCC >= 4.5, I actually think we should convert *RELEASE* builds over to 
using the ms-abi all the time to generate smaller code. I think we should leave 
DEBUG builds as mixed to help clean up EFIAPI issues.

-Jordan



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Jordan Justen
On Tue, Nov 4, 2014 at 9:28 AM, Andrew Fish  wrote:
> So my 1st question is why do you need to mix calling conventions, and depend
> on EFIAPI for interoperability. Why not just change the ABI on all
> functions?

GCC 4.4 doesn't support the command line option to change everything
over. So, EFIAPI was the only option then.

> Problems with the mixed calling convention:
> 1) All assembly routines must be marked as EFIAPI, or the C code will
> generate the wrong calling convention. Not an issue in the MdePkg, but
> potentially an issue in other packages.

I don't see this as a problem. I think these are the rules that we have
set up for EDK II. It just so happens that the GCC4X toolchains are
the only ones that use EFIAPI, and thus are the only ones that allow
us to keep our codebase clean with regards to EFIAPI.

For GCC >= 4.5, I actually think we should convert *RELEASE* builds
over to using the ms-abi all the time to generate smaller code. I
think we should leave DEBUG builds as mixed to help clean up EFIAPI
issues.

-Jordan



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Andrew Fish

> On Nov 4, 2014, at 12:15 PM, Scott Duplichan  wrote:
> 
> But pedantically you need to change the definition of BasePrintLibSPrint() to 
> include EFIAPI. 
>  
> If you look at BasePrintLibSPrintMarker() (and some of the other routines) 
> you will notice a BASE_LIST and a VA_LIST. We had an API that exposed a 
> VA_LIST (well code that was casting to VA_LIST) in the report status code 
> stack and it forced us to use our own made up BASE_LIST concept to get it to 
> work. I think you are going to hit similar issues mixing calling conventions. 
>  
> So my 1st question is why do you need to mix calling conventions, and depend 
> on EFIAPI for interoperability. Why not just change the ABI on all functions?
>  
> If I understand your question "Why not just change the ABI on all functions", 
> you mean use Microsoft ABI throughout the code even when compiled with gcc? 
> The gcc option -mabi=ms makes this easy, and it reduces code size too (8% in 
> one test). Part of that code size reduction is because it removes the 
> requirement to save xmm6-xmm15 when calling msabi code. Gcc doesn't optimize 
> the save/restore of xmm6-xmm15, it just does them all. The problems with ms 
> abi I can think of are:
> 1) Linux developers accustomed to stepping through the sysv calling 
> convention would have to adapt to the ms calling convention.

Well all the public interfaces (EFI service, protocol member functions, etc.) 
are EFIAPI, so there will be a mix. 

> 2) -mabi=ms is probably  little used and therefore more likely to have bugs. 
> This might require dropping support for older gcc tool chains.

Looks like mixing has bugs too. It looks like this issue is caused by a 
mismatch in the vararg definitions between the two worlds. You can’t really mix 
styles in a given call stack. It almost seems like you want to force one var 
arg scheme every place possible. 

> 3) According to an email from you in April, ms abi might not support stack 
> trace without debug symbols.
>  

This is not the ABI; it is VC++ code generation. There is nothing in the ABI 
about how to unwind a stack frame; it is about how to call code in C. 

In my clang examples in this thread, we have an EFIAPI compatible calling 
convention with stack unwind. -target x86_64-pc-win32-macho means build an X64 
image using EFIAPI, do the standard frame pointer, and we kick out a Mach-O 
executable for the debugger. We convert the Mach-O to PE/COFF for EFI 
compatibility. So on clang it is EFIAPI, but with stack unwind. We can always 
unwind a frame without symbols, until we hit code compiled with VC++. 

> Even if ms abi is never made the default for gcc code, adding an environment 
> variable such as EXTRA_CC_FLAGS would allow its use in custom builds by those 
> who need the code size reduction it brings.
>  
> What about switching EDK2 to sysv abi? I assume that would require dropping 
> support for Microsoft compilers. 


The EFI calling convention is in the spec, so all things EFI would break: 
Option ROMs on cards, installed operating systems, etc. The edk2 is an open 
source project that implements an industry standard, not just a chunk of code. 

Thanks,

Andrew Fish

>  
> Thanks,
> Scott



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Scott Duplichan
Andrew Fish [mailto:[email protected]]  wrote:


Sent: Tuesday, November 04, 2014 11:29 AM
To: [email protected]
Cc: Paolo Bonzini
Subject: Re: [edk2] Enable optimization for gcc x64 builds

 

 

On Nov 4, 2014, at 5:48 AM, Laszlo Ersek wrote:

 

diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
index 8dc5ec7..fbb3726 100644
--- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
+++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
@@ -680,10 +680,20 @@ BasePrintLibSPrintMarker (
if (TmpGuid == NULL) {
  ArgumentString = "";
} else {
+  UINTN (EFIAPI * volatile PrintFunction) (
+ OUT CHAR8        *StartOfBuffer,
+ IN  UINTN        BufferSize,
+ IN  UINTN        Flags,
+ IN  CONST CHAR8  *FormatString,
+ ...
+ );
+
+  PrintFunction = BasePrintLibSPrint;
+
  GuidData1 = ReadUnaligned32 (&(TmpGuid->Data1));
  GuidData2 = ReadUnaligned16 (&(TmpGuid->Data2));
  GuidData3 = ReadUnaligned16 (&(TmpGuid->Data3));
-  BasePrintLibSPrint (
+  PrintFunction (
ValueBuffer,
MAXIMUM_VALUE_CHARACTERS,
0,


With this patch, GUIDs are printed okay with -Os.

(Of course it's not edk2 that needs to be fixed.)



 

But pedantically you need to change the definition of BasePrintLibSPrint() to 
include EFIAPI. 

 

If you look at BasePrintLibSPrintMarker() (and some of the other routines) you 
will notice a BASE_LIST and a VA_LIST. We had an API that exposed a VA_LIST 
(well code that was casting to VA_LIST) in the report status code stack and it 
forced us to use our own made up BASE_LIST concept to get it to work. I think 
you are going to hit similar issues mixing calling conventions. 

 

So my 1st question is why do you need to mix calling conventions, and depend on 
EFIAPI for interoperability. Why not just change the ABI on all functions?

 

If I understand your question "Why not just change the ABI on all functions", 
you mean use Microsoft ABI throughout the code even when compiled with gcc? The 
gcc option -mabi=ms makes this easy, and it reduces code size too (8% in one 
test). Part of that code size reduction is because it removes the requirement 
to save xmm6-xmm15 when calling msabi code. Gcc doesn't optimize the 
save/restore of xmm6-xmm15, it just does them all. The problems with ms abi I 
can think of are:

1) Linux developers accustomed to stepping through the sysv calling convention 
would have to adapt to the ms calling convention.

2) -mabi=ms is probably  little used and therefore more likely to have bugs. 
This might require dropping support for older gcc tool chains.

3) According to an email from you in April, ms abi might not support stack 
trace without debug symbols.

 

Even if ms abi is never made the default for gcc code, adding an environment 
variable such as EXTRA_CC_FLAGS would allow its use in custom builds by those 
who need the code size reduction it brings.

 

What about switching EDK2 to sysv abi? I assume that would require dropping 
support for Microsoft compilers. 

 

Thanks,

Scott

 

Problems with the mixed calling convention:

1) All assembly routines must be marked as EFIAPI, or the C code will generate 
the wrong calling convention. Not an issue in the MdePkg, but potentially an 
issue in other packages. 

2) The var arg chain needs to be constant. I think for i386 you get lucky and 
the calling conventions are close enough it kind of works, but for X64 they are 
very different. Even if you force VA_LIST to be the Microsoft one, it is not 
clear to me that forces the compiler to treat every … the Microsoft way. 

 

Thanks,

 

Andrew Fish

 



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Laszlo Ersek
On 11/04/14 18:28, Andrew Fish wrote:
> 
>> On Nov 4, 2014, at 5:48 AM, Laszlo Ersek wrote:
>>
>> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
>> b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
>> index 8dc5ec7..fbb3726 100644
>> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
>> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
>> @@ -680,10 +680,20 @@ BasePrintLibSPrintMarker (
>> if (TmpGuid == NULL) {
>>   ArgumentString = "";
>> } else {
>> +  UINTN (EFIAPI * volatile PrintFunction) (
>> + OUT CHAR8        *StartOfBuffer,
>> + IN  UINTN        BufferSize,
>> + IN  UINTN        Flags,
>> + IN  CONST CHAR8  *FormatString,
>> + ...
>> + );
>> +
>> +  PrintFunction = BasePrintLibSPrint;
>> +
>>   GuidData1 = ReadUnaligned32 (&(TmpGuid->Data1));
>>   GuidData2 = ReadUnaligned16 (&(TmpGuid->Data2));
>>   GuidData3 = ReadUnaligned16 (&(TmpGuid->Data3));
>> -  BasePrintLibSPrint (
>> +  PrintFunction (
>> ValueBuffer,
>> MAXIMUM_VALUE_CHARACTERS,
>> 0,
>> 
>>
>> With this patch, GUIDs are printed okay with -Os.
>>
>> (Of course it's not edk2 that needs to be fixed.)
>>
> 
> But pedantically you need to change the definition of BasePrintLibSPrint()
> to include EFIAPI. 

I tried that (without applying above patch, before posting my message);
it didn't help.

Thanks,
Laszlo

> 
> If you look at BasePrintLibSPrintMarker() (and some of the other
> routines) you will notice a BASE_LIST and a VA_LIST. We had an API that
> exposed a VA_LIST (well code that was casting to VA_LIST) in the report
> status code stack and it forced us to use our own made up BASE_LIST
> concept to get it to work. I think you are going to hit similar issues
> mixing calling conventions. 
> 
> So my 1st question is why do you need to mix calling conventions, and
> depend on EFIAPI for interoperability. Why not just change the ABI on
> all functions?
> 
> Problems with the mixed calling convention:
> 1) All assembly routines must be marked as EFIAPI, or the C code will
> generate the wrong calling convention. Not an issue in the MdePkg, but
> potentially an issue in other packages. 
> 2) The var arg chain needs to be constant. I think for i386 you get
> lucky and the calling conventions are close enough it kind of works, but
> for X64 they are very different. Even if you force VA_LIST to be the
> Microsoft one, it is not clear to me that forces the compiler to treat
> every … the Microsoft way. 
> 
> Thanks,
> 
> Andrew Fish
> 
> 
> 
> 




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-04 Thread Andrew Fish

> On Nov 4, 2014, at 5:48 AM, Laszlo Ersek  wrote:
> 
> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> index 8dc5ec7..fbb3726 100644
> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> @@ -680,10 +680,20 @@ BasePrintLibSPrintMarker (
> if (TmpGuid == NULL) {
>   ArgumentString = "";
> } else {
> +  UINTN (EFIAPI * volatile PrintFunction) (
> + OUT CHAR8        *StartOfBuffer,
> + IN  UINTN        BufferSize,
> + IN  UINTN        Flags,
> + IN  CONST CHAR8  *FormatString,
> + ...
> + );
> +
> +  PrintFunction = BasePrintLibSPrint;
> +
>   GuidData1 = ReadUnaligned32 (&(TmpGuid->Data1));
>   GuidData2 = ReadUnaligned16 (&(TmpGuid->Data2));
>   GuidData3 = ReadUnaligned16 (&(TmpGuid->Data3));
> -  BasePrintLibSPrint (
> +  PrintFunction (
> ValueBuffer,
> MAXIMUM_VALUE_CHARACTERS,
> 0,
> 
> 
> With this patch, GUIDs are printed okay with -Os.
> 
> (Of course it's not edk2 that needs to be fixed.)
> 

But pedantically you need to change the definition of BasePrintLibSPrint() to 
include EFIAPI. 

If you look at BasePrintLibSPrintMarker() (and some of the other routines) you 
will notice a BASE_LIST and a VA_LIST. We had an API that exposed a VA_LIST 
(well code that was casting to VA_LIST) in the report status code stack and it 
forced us to use our own made up BASE_LIST concept to get it to work. I think 
you are going to hit similar issues mixing calling conventions. 

So my 1st question is why do you need to mix calling conventions, and depend on 
EFIAPI for interoperability. Why not just change the ABI on all functions?

Problems with the mixed calling convention:
1) All assembly routines must be marked as EFIAPI, or the C code will generate 
the wrong calling convention. Not an issue in the MdePkg, but potentially an 
issue in other packages. 
2) The var arg chain needs to be constant. I think for i386 you get lucky and 
the calling conventions are close enough it kind of works, but for X64 they are 
very different. Even if you force VA_LIST to be the Microsoft one, it is not 
clear to me that forces the compiler to treat every "..." the Microsoft way. 
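
To make point 2 concrete, here is a minimal sketch of a chain that keeps
one var arg scheme end to end. It assumes GCC or clang targeting x86-64;
SumLongs and SumMarker are hypothetical names that only mirror the shape of
a variadic EFIAPI entry point handing its marker to an internal worker.
This is not the BasePrintLib code.

```c
/* Internal worker compiled with the native (SysV) ABI. It receives the
   marker by value, so its parameter type has to be the Microsoft list
   type, because that is what the variadic caller built. */
static long
SumMarker (int Count, __builtin_ms_va_list Marker)
{
  long  Sum;

  Sum = 0;
  while (Count-- > 0) {
    /* va_arg is lowered according to the type of Marker. */
    Sum += __builtin_va_arg (Marker, long);
  }
  return Sum;
}

/* Variadic entry point using the Microsoft calling convention,
   which is what EFIAPI expands to for GCC X64 builds. */
__attribute__((ms_abi))
long
SumLongs (int Count, ...)
{
  __builtin_ms_va_list  Marker;
  long                  Sum;

  __builtin_ms_va_start (Marker, Count);
  Sum = SumMarker (Count, Marker);
  __builtin_ms_va_end (Marker);
  return Sum;
}
```

The worker itself does not need to be ms_abi here, since the marker is an
ordinary pointer-sized value once it has been created; what has to stay
constant along the chain is the list type and the function that owns the
"...".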

Thanks,

Andrew Fish



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Andrew Fish

> On Nov 3, 2014, at 12:24 PM, Scott Duplichan  wrote:
> 
> Laszlo Ersek [mailto:[email protected] ] wrote:
> 
> ]On 10/29/14 05:59, Scott Duplichan wrote:
> ]> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32
> ]> builds, -Os (optimize for small code size) is used. Why is this? Apparently
> ]> it is because variable argument list handling fails when gcc X64 optimization
> ]> is enabled. The solution is an improvement to the patch of SVN rev 10440:
> ]> http://sourceforge.net/p/edk2/mailman/message/2512/ 
> 
> ]
> ]My reading of r10440 is different. As far as I understand,
> ]
> ]  (gcc-4.4, X64, stdarg builtins)
> ]
> ]is simply a broken combination, regardless of optimization.
> 
> You are right about gcc X64 builds using the standard (native)
> stdarg builtins. Without the original r10440 patch, a test using
> Duet crashes early on. The exception handler dump has bogus values,
> probably due to the same stdarg problem.
> 
> My point was why can't -Os be used for the current gcc X64 build like
> it is for the IA32 build? Maybe r10440 is not relevant enough to even 
> be mentioned. What I found is adding -Os to the X64 Duet project
> causes the 'g' (GUID) format to malfunction. There may be other 
> formatting problems, but this one is most obvious in the log file:
> 
> X64 Duet boot log from gcc X64 build (standard, correct):
> WELCOME TO EFI WORLD!
> InstallProtocolInterface: D2B2B828-0826-48A7-B3DF-983C006024F0 1FDF9D58
> HOBLIST address in DXE = 0x1F3DA018
> Memory Allocation 0x0004 0x1FD69000 - 0x1FD88FFF
> Memory Allocation 0x0004 0x1F964000 - 0x1FD68FFF
> Memory Allocation 0x0003 0x1FDB9000 - 0x1FE0
> FV Hob0x1FE1 - 0x1FFA
> InstallProtocolInterface: D8117CFE-94A6-11D4-9A3A-0090273FC14D 1FDF2AC0
> InstallProtocolInterface: 8F644FA9-E850-4DB1-9CE2-0B44698E8DA4 1F3D6A30
> InstallProtocolInterface: 09576E91-6D3F-11D2-8E39-00A0C969723B 1F3D7818
> 
> X64 Duet boot log from gcc X64 build (with -Os added):
> WELCOME TO EFI WORLD!
> InstallProtocolInterface: -3030-332D3030-2D30303030 1FE6F8C0
> HOBLIST address in DXE = 0x1F461018
> Memory Allocation 0x0004 0x1FDF - 0x1FE0
> Memory Allocation 0x0004 0x1F9EB000 - 0x1FDE
> Memory Allocation 0x0003 0x1FE4 - 0x1FE7
> FV Hob0x1FE8 - 0x1FFA
> InstallProtocolInterface: 1F9EAB08-46312000-342D3830-2D30303030 1FE69070
> InstallProtocolInterface: 0001-30300200-332D3130-2D30303230 1F45DA30
> InstallProtocolInterface: 0001-30308FA1-332D3130-2D31414630 1F45E818
> 
> If "-Os -mabi=ms" is used for the gcc X64 build, then the pre-r10440
> method (using the native stdarg builtins) works. But that is just hiding
> the problem.
> 
> The __builtin_ms_va_* macros for cross ABI use are not well documented
> as far as I can find. File cross-stdarg.h is about it. But they have been
> around for a long time, at least since gcc 4.4.
> 

It seems that if you make the EFI VA_LIST point to __builtin_ms_va_* then you 
need to decorate any function using VA_LIST with EFIAPI to make sure the code 
gen, calling, and va_list all match up?

The EFI rules are documented here: 
http://msdn.microsoft.com/en-us/library/9b372w95.aspx

From debugging problems like this in the past you can usually figure it out 
from the assembly. 
The EFI/VC++ rules are very simple, like passing parameters, and the marker is 
a pointer to a stack frame looking thing. The Unix version is much more 
complicated and the marker is sometimes a data structure. The rules in Unix for 
floating point are very complex so you tend to see more code overhead in the 
Unix flow.

In the following example I compiled it 1st for Unix and then for EFI calling 
convention. Note the added complexity in the p() function assembly introduced 
by the more complex Unix rules. Also note that Unix passes 6 values in 
registers while EFI/VC++ passes just 4, and the order of the registers is 
different. 
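
For quick reference while reading the assembly, the integer/pointer
argument registers for the two conventions can be written down as a small
lookup. This is only a sketch; the array and function names are invented
for this note.

```c
#include <stddef.h>

/* Integer/pointer argument registers, in order, for the two x86-64
   calling conventions. Names below are illustrative only. */
static const char *const mSysvIntRegs[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
static const char *const mMsIntRegs[]   = { "rcx", "rdx", "r8", "r9" };

/* Returns the register that carries the Index-th (0-based) integer
   argument, or NULL when that argument goes on the stack. */
const char *
IntArgRegister (int UseMsAbi, size_t Index)
{
  const char *const  *Regs;
  size_t             Count;

  if (UseMsAbi) {
    Regs  = mMsIntRegs;
    Count = sizeof (mMsIntRegs) / sizeof (mMsIntRegs[0]);
  } else {
    Regs  = mSysvIntRegs;
    Count = sizeof (mSysvIntRegs) / sizeof (mSysvIntRegs[0]);
  }

  return (Index < Count) ? Regs[Index] : NULL;
}
```

So the first argument lands in rdi for Unix and in rcx for EFI/VC++, and a
fifth integer argument is still in a register (r8) for Unix but already on
the stack for EFI/VC++.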

When we were adding the x86_64-pc-win32-macho target to clang we found a few 
places where the compiler emitted the wrong form of the var args. So we tracked 
it down by looking at the assembly, then we built a simple standalone case to 
file a bug against the compiler. 

~/work/Compiler>cat va.c

#include 

int printf (const char *, ...);

void 
p2 (int a, __builtin_va_list *valist)
{
  int i;

  for (i=0; i
[remainder of va.c missing from the archive]

~/work/Compiler>clang -S -Os va.c
~/work/Compiler>cat va.S
        .section        __TEXT,__text,regular,pure_instructions
        .globl  _p2
_p2:                                    ## @p2
        .cfi_startproc
## BB#0:
        pushq   %rbp
Ltmp3:
        .cfi_def_cfa_offset 16
Ltmp4:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
Ltmp5:
        .cfi_def_cfa_register %rbp
        pushq   %r15
        pushq   %r14
        pushq   %rbx
        pushq   %rax
Ltmp6:
.cfi_offs

Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Scott Duplichan
Bruce Cran [mailto:[email protected]]  wrote:

]On 11/3/2014 9:21 AM, Laszlo Ersek wrote:
]
]> So Scott's patch seems to be aligned with the
]> tradition. (Currently no gcc optimizations are enabled at all when
]> building for X64, neither for speed nor for size.)
]
]Doesn't the following cause X64 builds to use -Os (and IA32 to use -O2, 
]if it wasn't overridden by -Os by later, version-specific flags)?
]
]DEFINE GCC_ALL_CC_FLAGS= -g -Os -fshort-wchar 
]-fno-strict-aliasing -Wall -Werror -Wno-array-bounds -c -include AutoGen.h
]
]DEFINE GCC_IA32_CC_FLAGS   = DEF(GCC_ALL_CC_FLAGS) -m32 
]-malign-double -freorder-blocks -freorder-blocks-and-partition -O2 
]-mno-stack-arg-probe
]
]DEFINE GCC_X64_CC_FLAGS= DEF(GCC_ALL_CC_FLAGS) -mno-red-zone 
]-Wno-address -mno-stack-arg-probeq
]
]-- 
]Bruce

I believe you are right for the case of UNIXGCC and CYGGCC X64. ELFGCC
X64 also uses -Os. But I happened to be testing with GCC49 X64, and that
one doesn't use any -O flag. Reducing/consolidating the various gcc tool
chain definitions would be helpful.

Thanks,
Scott




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Bruce Cran
Ah, thanks - I failed to trace it all the way through and just assumed
it *was* used by one of the BUILDTYPE_TOOLCHAIN_ARCH_CC_FLAGS lines.

-- 
Bruce

On Mon, Nov 3, 2014 at 1:20 PM, Laszlo Ersek  wrote:

> No; GCC_X64_CC_FLAGS is an "internal use" define. Its name doesn't
> follow the
>
>   (DEBUG|RELEASE|NOOPT)_<TOOLCHAIN>_<ARCH>_CC_FLAGS
>
> pattern, hence it has no direct effect. It only has an effect via
> assignment to macros that *do* follow the above pattern, and such
> assignments don't seem to exist (for the cases that we presently care
> about):



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Scott Duplichan
Laszlo Ersek [mailto:[email protected]] wrote:

]On 10/29/14 05:59, Scott Duplichan wrote:
]> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32
]> builds, -Os (optimize for small code size) is used. Why is this? Apparently
]> it is because variable argument list handling fails when gcc X64 optimization
]> is enabled. The solution is an improvement to the patch of SVN rev 10440:
]> http://sourceforge.net/p/edk2/mailman/message/2512/
]
]My reading of r10440 is different. As far as I understand,
]
]  (gcc-4.4, X64, stdarg builtins)
]
]is simply a broken combination, regardless of optimization.

You are right about gcc X64 builds using the standard (native)
stdarg builtins. Without the original r10440 patch, a test using
Duet crashes early on. The exception handler dump has bogus values,
probably due to the same stdarg problem.

My point was why can't -Os be used for the current gcc X64 build like
it is for the IA32 build? Maybe r10440 is not relevant enough to even 
be mentioned. What I found is adding -Os to the X64 Duet project
causes the 'g' (GUID) format to malfunction. There may be other 
formatting problems, but this one is most obvious in the log file:

X64 Duet boot log from gcc X64 build (standard, correct):
 WELCOME TO EFI WORLD!
InstallProtocolInterface: D2B2B828-0826-48A7-B3DF-983C006024F0 1FDF9D58
HOBLIST address in DXE = 0x1F3DA018
Memory Allocation 0x0004 0x1FD69000 - 0x1FD88FFF
Memory Allocation 0x0004 0x1F964000 - 0x1FD68FFF
Memory Allocation 0x0003 0x1FDB9000 - 0x1FE0
FV Hob0x1FE1 - 0x1FFA
InstallProtocolInterface: D8117CFE-94A6-11D4-9A3A-0090273FC14D 1FDF2AC0
InstallProtocolInterface: 8F644FA9-E850-4DB1-9CE2-0B44698E8DA4 1F3D6A30
InstallProtocolInterface: 09576E91-6D3F-11D2-8E39-00A0C969723B 1F3D7818

X64 Duet boot log from gcc X64 build (with -Os added):
 WELCOME TO EFI WORLD!
InstallProtocolInterface: -3030-332D3030-2D30303030 1FE6F8C0
HOBLIST address in DXE = 0x1F461018
Memory Allocation 0x0004 0x1FDF - 0x1FE0
Memory Allocation 0x0004 0x1F9EB000 - 0x1FDE
Memory Allocation 0x0003 0x1FE4 - 0x1FE7
FV Hob0x1FE8 - 0x1FFA
InstallProtocolInterface: 1F9EAB08-46312000-342D3830-2D30303030 1FE69070
InstallProtocolInterface: 0001-30300200-332D3130-2D30303230 1F45DA30
InstallProtocolInterface: 0001-30308FA1-332D3130-2D31414630 1F45E818

If "-Os -mabi=ms" is used for the gcc X64 build, then the pre-r10440
method (using the native stdarg builtins) works. But that is just hiding
the problem.

The __builtin_ms_va_* macros for cross ABI use are not well documented
as far as I can find. File cross-stdarg.h is about it. But they have been
around for a long time, at least since gcc 4.4.
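
For reference, a minimal sketch of how these builtins fit together. It
assumes GCC 4.4 or later (or a recent clang) targeting x86-64, and SumInts
is a hypothetical function written for this note, not something from edk2.

```c
/* Variadic function using the Microsoft calling convention. The list
   type and the start/end builtins are the "ms" variants; fetching an
   argument goes through the ordinary __builtin_va_arg, which is also
   what cross-stdarg.h does. */
__attribute__((ms_abi))
int
SumInts (int Count, ...)
{
  __builtin_ms_va_list  Marker;
  int                   Sum;
  int                   Index;

  Sum = 0;
  __builtin_ms_va_start (Marker, Count);
  for (Index = 0; Index < Count; Index++) {
    Sum += __builtin_va_arg (Marker, int);
  }
  __builtin_ms_va_end (Marker);

  return Sum;
}
```

A caller compiled with the native ABI can call SumInts (3, 1, 2, 3)
directly; the compiler sees the ms_abi attribute on the prototype and sets
up the call accordingly.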

Thanks,
scott 


]> The patch in this email only adds gcc X64 optimization for gcc versions 4.8
]> and newer.
]
]What happens if you add -Os for gcc-4.8+ (X64) without touching
]NO_BUILTIN_VA_FUNCS and the VA_* macros? Just curious.

See 'g' formatting problem above.

]What implies any connection between lack of -Os and VA_*?

The fact that -Os is used throughout edk2, except for the one
case where VA_* prevents it.

]Thanks!
]Laszlo
]
]> This is because testing with older versions of gcc is a lot of
]> work. On the other hand, the patch could be a lot simpler if it were to
]> ignore gcc version. The patch is boot tested using Duet with gcc 4.8.2 and
]> gcc 4.9.1. For these two cases, the print formatting problem is resolved
]> by the patch.
]> 
]> Should we:
]> 1) Restrict the change to recent gcc versions where testing is easy
]>    (approach of included patch)
]> 2) Apply the change to all gcc versions, and let older versions go
]>    untested?
]> 3) Try to find/build the needed older gcc versions so that the patch
]>    can apply to all versions and be tested too
]> 
]> Thanks,
]> Scott





Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Laszlo Ersek
On 11/03/14 20:54, Bruce Cran wrote:
> On 11/3/2014 9:21 AM, Laszlo Ersek wrote:
> 
>> So Scott's patch seems to be aligned with the
>> tradition. (Currently no gcc optimizations are enabled at all when
>> building for X64, neither for speed nor for size.)
> 
> Doesn't the following cause X64 builds to use -Os (and IA32 to use -O2, 
> if it wasn't overridden by -Os by later, version-specific flags)?
> 
> DEFINE GCC_ALL_CC_FLAGS= -g -Os -fshort-wchar 
> -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -c -include AutoGen.h
> 
> DEFINE GCC_IA32_CC_FLAGS   = DEF(GCC_ALL_CC_FLAGS) -m32 
> -malign-double -freorder-blocks -freorder-blocks-and-partition -O2 
> -mno-stack-arg-probe
> 
> DEFINE GCC_X64_CC_FLAGS= DEF(GCC_ALL_CC_FLAGS) -mno-red-zone 
> -Wno-address -mno-stack-arg-probeq
> 

No; GCC_X64_CC_FLAGS is an "internal use" define. Its name doesn't
follow the

  (DEBUG|RELEASE|NOOPT)_<TOOLCHAIN>_<ARCH>_CC_FLAGS

pattern, hence it has no direct effect. It only has an effect via
assignment to macros that *do* follow the above pattern, and such
assignments don't seem to exist (for the cases that we presently care
about):

*_GCC44_X64_CC_FLAGS = DEF(GCC44_X64_CC_FLAGS)
*_GCC45_X64_CC_FLAGS = DEF(GCC45_X64_CC_FLAGS)
*_GCC46_X64_CC_FLAGS = DEF(GCC46_X64_CC_FLAGS)
*_GCC47_X64_CC_FLAGS = DEF(GCC47_X64_CC_FLAGS)
*_GCC48_X64_CC_FLAGS = DEF(GCC48_X64_CC_FLAGS)
*_GCC49_X64_CC_FLAGS = DEF(GCC49_X64_CC_FLAGS)

The RHS macros all chain to GCC44_ALL_CC_FLAGS (recursively), which is
ultimately open-coded as:

DEFINE GCC44_ALL_CC_FLAGS= -g -fshort-wchar
-fno-strict-aliasing -Wall -Werror -Wno-array-bounds -ffunction-sections
-fdata-sections -c -include AutoGen.h
-DSTRING_ARRAY_NAME=$(BASE_NAME)Strings

In short, GCC_X64_CC_FLAGS is not used where it would matter (for this
case).

Thanks
Laszlo




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Bruce Cran
On 11/3/2014 9:21 AM, Laszlo Ersek wrote:

> So Scott's patch seems to be aligned with the
> tradition. (Currently no gcc optimizations are enabled at all when
> building for X64, neither for speed nor for size.)

Doesn't the following cause X64 builds to use -Os (and IA32 to use -O2, 
if it wasn't overridden by -Os by later, version-specific flags)?

DEFINE GCC_ALL_CC_FLAGS= -g -Os -fshort-wchar 
-fno-strict-aliasing -Wall -Werror -Wno-array-bounds -c -include AutoGen.h

DEFINE GCC_IA32_CC_FLAGS   = DEF(GCC_ALL_CC_FLAGS) -m32 
-malign-double -freorder-blocks -freorder-blocks-and-partition -O2 
-mno-stack-arg-probe

DEFINE GCC_X64_CC_FLAGS= DEF(GCC_ALL_CC_FLAGS) -mno-red-zone 
-Wno-address -mno-stack-arg-probeq

-- 
Bruce



Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Laszlo Ersek
On 10/29/14 17:18, Tim Lewis wrote:
> Scott --
> 
> For historical perspective, the EDK2 build flags have focused on
> space over speed because of the code size constraints placed on
> flash-resident code. Not being as familiar with gcc as I am with
> VS20xx, I don't know whether these can be set together.

Maybe I'm misunderstanding (and in that case I apologize), but I'm under
the impression that you are misunderstanding '-Os'. '-Os' optimizes for
size, not speed. So Scott's patch seems to be aligned with the
tradition. (Currently no gcc optimizations are enabled at all when
building for X64, neither for speed nor for size.)

   -Os Optimize for size.  -Os enables all -O2 optimizations that
   do not typically increase code size.  It also performs
   further optimizations designed to reduce code size.

   -Os disables the following optimization flags:
   -falign-functions  -falign-jumps  -falign-loops
   -falign-labels  -freorder-blocks
   -freorder-blocks-and-partition -fprefetch-loop-arrays
   -ftree-vect-loop-version

Again I'm sorry if I misunderstood you.

Thanks
Laszlo

> 
> Tim
> 
> -----Original Message-----
> From: Scott Duplichan [mailto:[email protected]] 
> Sent: Tuesday, October 28, 2014 9:59 PM
> To: [email protected]
> Subject: [edk2] Enable optimization for gcc x64 builds
> 
> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For
> IA32 builds, -Os (optimize for small code size) is used. Why is this?
> Apparently it is because variable argument list handling fails when
> gcc X64 optimization is enabled. The solution is an improvement to
> the patch of SVN rev 10440: 
> http://sourceforge.net/p/edk2/mailman/message/2512/
> 
> The patch in this email only adds gcc X64 optimization for gcc
> versions 4.8 and newer. This is because testing with older versions
> of gcc is a lot of work. On the other hand, the patch could be a lot
> simpler if it were to ignore gcc version. The patch is boot tested
> using Duet with gcc 4.8.2 and gcc 4.9.1. For these two cases, the
> print formatting problem is resolved by the patch.
> 
> Should we:
> 1) Restrict the change to recent gcc versions where testing is easy
>    (approach of included patch)
> 2) Apply the change to all gcc versions, and let older versions go
>    untested?
> 3) Try to find/build the needed older gcc versions so that the patch
>    can apply to all versions and be tested too
> 
> Thanks,
> Scott
> 




Re: [edk2] Enable optimization for gcc x64 builds

2014-11-03 Thread Laszlo Ersek
On 10/29/14 05:59, Scott Duplichan wrote:
> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32
> builds, -Os (optimize for small code size) is used. Why is this? Apparently
> it is because variable argument list handling fails when gcc X64 optimization
> is enabled. The solution is an improvement to the patch of SVN rev 10440:
> http://sourceforge.net/p/edk2/mailman/message/2512/

My reading of r10440 is different. As far as I understand,

  (gcc-4.4, X64, stdarg builtins)

is simply a broken combination, regardless of optimization.

> The patch in this email only adds gcc X64 optimization for gcc versions 4.8
> and newer.

What happens if you add -Os for gcc-4.8+ (X64) without touching
NO_BUILTIN_VA_FUNCS and the VA_* macros? Just curious.

What implies any connection between lack of -Os and VA_*?

Thanks!
Laszlo
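
For background on why the VA_* macros come up at all on X64: the X64 flags define EFIAPI as __attribute__((ms_abi)), while GCC's plain __builtin_va_list follows the System V convention. Compilers that support it provide separate __builtin_ms_va_* builtins for ms_abi functions. The following is a minimal sketch, assuming an x86-64 GCC or Clang that has these builtins (with a stdarg fallback elsewhere); the DEMO_* names and demo_sum are hypothetical and are not the macros from the patch.

```c
#include <stdarg.h>

#if defined(__GNUC__) && defined(__x86_64__)
/* Variadic function using the Microsoft x64 calling convention. */
#define DEMO_EFIAPI __attribute__((ms_abi))
typedef __builtin_ms_va_list DEMO_VA_LIST;
#define DEMO_VA_START(ap, last) __builtin_ms_va_start(ap, last)
#define DEMO_VA_ARG(ap, type)   __builtin_va_arg(ap, type)
#define DEMO_VA_END(ap)         __builtin_ms_va_end(ap)
#else
/* Fallback for other targets: the platform's native convention. */
#define DEMO_EFIAPI
typedef va_list DEMO_VA_LIST;
#define DEMO_VA_START(ap, last) va_start(ap, last)
#define DEMO_VA_ARG(ap, type)   va_arg(ap, type)
#define DEMO_VA_END(ap)         va_end(ap)
#endif

/* Sums `count` arguments, each passed as long long. */
static long long DEMO_EFIAPI demo_sum(int count, ...)
{
    DEMO_VA_LIST ap;
    long long total = 0;

    DEMO_VA_START(ap, count);
    for (int i = 0; i < count; i++) {
        total += DEMO_VA_ARG(ap, long long);
    }
    DEMO_VA_END(ap);
    return total;
}
```

A call such as demo_sum(3, 1LL, 2LL, 3LL) walks the argument list with the list type that matches the function's own calling convention, which is the property the VA_* macros need to preserve.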

> This is because testing with older versions of gcc is a lot of
> work. On the other hand, the patch could be a lot simpler if it were to
> ignore gcc version. The patch is boot tested using Duet with gcc 4.8.2 and
> gcc 4.9.1. For these two cases, the print formatting problem is resolved
> by the patch.
> 
> Should we:
> 1) Restrict the change to recent gcc versions where testing is easy
>    (approach of included patch)
> 2) Apply the change to all gcc versions, and let older versions go
>    untested?
> 3) Try to find/build the needed older gcc versions so that the patch
>    can apply to all versions and be tested too
> 
> Thanks,
> Scott




Re: [edk2] Enable optimization for gcc x64 builds

2014-10-29 Thread Scott Duplichan
Bruce Cran [mailto:[email protected]] wrote:

]On Tue, Oct 28, 2014 at 10:59 PM, Scott Duplichan  wrote:
]> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32
]> builds, -Os (optimize for small code size) is used. Why is this? Apparently
]> it is because variable argument list handling fails when gcc X64 optimization
]> is enabled. The solution is an improvement to the patch of SVN rev 10440:
]> http://sourceforge.net/p/edk2/mailman/message/2512/
]
]Related to gcc flags, should we also enable -ffreestanding by default,
]or at least -fno-builtin?

Those flags certainly seem appropriate for UEFI use. In fact, AARCH64, XCLANG,
and XCODE5 already use -fno-builtin. But the reality is that most of the
patches I submit get rejected, even when there is a clear benefit. To even have
a chance of getting these options approved, there would have to be some
easy-to-demonstrate benefit. For example, maybe using these options can prevent
gcc from replacing copy loops in source code with a call to memcpy.

Thanks,
Scott

]-- 
]Bruce
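
The copy-loop case mentioned above can be reproduced with a loop like the one below. This is a minimal sketch; demo_copy is a hypothetical function, and whether -ffreestanding or -fno-builtin changes the generated code is not established in this thread. GCC documents -ftree-loop-distribute-patterns as the option controlling the transformation of such loops into library calls, so compiling with -S and searching the assembly for a memcpy call is one way to check a given flag set.

```c
#include <stddef.h>

/* demo_copy is a hypothetical byte-copy loop of the kind discussed above.
 * At higher optimization levels GCC may recognize this idiom and emit a
 * call to memcpy, which a firmware image would then have to provide. */
void *demo_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < len; i++) {
        d[i] = s[i];
    }
    return dst;
}
```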





Re: [edk2] Enable optimization for gcc x64 builds

2014-10-29 Thread Scott Duplichan
Tim Lewis [mailto:[email protected]] wrote:

]Scott --
]
]For historical perspective, the EDK2 build flags have focused on space over
]speed because of the code size constraints placed on flash-resident code. Not
]being as familiar with gcc as I am with VS20xx, I don't know whether these can
]be set together.
]
]Tim

Hello Tim,

I agree targeting small image size is a reasonable goal. Optimizations that
trade size for speed generally do not benefit UEFI or other boot code due to
I/O overhead. Though flash capacity has increased a lot lately, the access time
can have a negative effect on boot time. The current gcc x64 flags are not
consistent with the goal of small code size. This patch addresses that. For
example, shell.efi x64 release build:

gcc 4.9.1 current  1295 KB
gcc 4.9.1 w/patch  1013 KB
VS2010              865 KB

What do you refer to by "set together"?

Thanks,
Scott
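
The relative differences between the three shell.efi sizes quoted above can be worked out with a short helper. This is a sketch only; demo_percent_change is a hypothetical name, and the inputs are the figures from the message.

```c
/* Relative size change in percent, going from `before` to `after`.
 * Hypothetical helper for the shell.efi figures quoted above. */
double demo_percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With those figures, demo_percent_change(1295, 1013) is about -21.8, so the patch shrinks the gcc 4.9.1 image by roughly 22%, and demo_percent_change(865, 1013) is about +17.1, so the patched gcc image is still roughly 17% larger than the VS2010 one.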



Re: [edk2] Enable optimization for gcc x64 builds

2014-10-29 Thread Bruce Cran
On 10/29/2014 10:18 AM, Tim Lewis wrote:

> For historical perspective, the EDK2 build flags have focused on space over 
> speed because of the code size constraints placed on flash-resident code. Not 
> being as familiar with gcc as I am with VS20xx, I don't know whether these 
> can be set together.

It looks like there are several places where -Os and -O2 are used:

GCC_ALL_CC_FLAGS: -Os
GCC_IA32_CC_FLAGS: -O2
*_GCC44_IA32_CC_FLAGS: -Os
*_GCC45_IA32_CC_FLAGS: -Os
*_GCC46_IA32_CC_FLAGS: -Os
*_GCC47_IA32_CC_FLAGS: -Os
*_GCC48_IA32_CC_FLAGS: -Os
*_GCC49_IA32_CC_FLAGS: -Os
*_ELFGCC_X64_CC_FLAGS: -Os

The XCODE and XCLANG definitions seem more consistent between IA32 and 
X64 versions.

-- 
Bruce




Re: [edk2] Enable optimization for gcc x64 builds

2014-10-29 Thread Tim Lewis
Scott --

For historical perspective, the EDK2 build flags have focused on space over 
speed because of the code size constraints placed on flash-resident code. Not 
being as familiar with gcc as I am with VS20xx, I don't know whether these can 
be set together.

Tim

-----Original Message-----
From: Scott Duplichan [mailto:[email protected]] 
Sent: Tuesday, October 28, 2014 9:59 PM
To: [email protected]
Subject: [edk2] Enable optimization for gcc x64 builds

Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32 builds, 
-Os (optimize for small code size) is used. Why is this? Apparently it is 
because variable argument list handling fails when gcc X64 optimization is 
enabled. The solution is an improvement to the patch of SVN rev 10440:
http://sourceforge.net/p/edk2/mailman/message/2512/

The patch in this email only adds gcc X64 optimization for gcc versions 4.8 and 
newer. This is because testing with older versions of gcc is a lot of work. On 
the other hand, the patch could be a lot simpler if it were to ignore gcc 
version. The patch is boot tested using Duet with gcc 4.8.2 and gcc 4.9.1. For 
these two cases, the print formatting problem is resolved by the patch.

Should we:
1) Restrict the change to recent gcc versions where testing is easy
   (approach of included patch)
2) Apply the change to all gcc versions, and let older versions go
   untested?
3) Try to find/build the needed older gcc versions so that the patch
   can apply to all versions and be tested too

Thanks,
Scott

-- 

Enable optimization for gcc x64 builds.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Scott Duplichan 

--

Index: BaseTools/Conf/tools_def.template 
===
--- BaseTools/Conf/tools_def.template   (revision 16254)
+++ BaseTools/Conf/tools_def.template   (working copy)
@@ -3843,7 +3843,7 @@
 
 DEFINE GCC44_ALL_CC_FLAGS = -g -fshort-wchar -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -ffunction-sections -fdata-sections -c -include AutoGen.h -DSTRING_ARRAY_NAME=$(BASE_NAME)Strings
 DEFINE GCC44_IA32_CC_FLAGS = DEF(GCC44_ALL_CC_FLAGS) -m32 -malign-double -fno-stack-protector -D EFI32
-DEFINE GCC44_X64_CC_FLAGS = DEF(GCC44_ALL_CC_FLAGS) -m64 -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" -DNO_BUILTIN_VA_FUNCS -mno-red-zone -Wno-address -mcmodel=large
+DEFINE GCC44_X64_CC_FLAGS = DEF(GCC44_ALL_CC_FLAGS) -m64 -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" -mno-red-zone -Wno-address -mcmodel=large
 DEFINE GCC44_IA32_X64_DLINK_COMMON = -nostdlib -n -q --gc-sections --script=$(EDK_TOOLS_PATH)/Scripts/gcc4.4-ld-script
 DEFINE GCC44_IA32_X64_ASLDLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_COMMON) --entry ReferenceAcpiTable -u ReferenceAcpiTable
 DEFINE GCC44_IA32_X64_DLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_COMMON) --entry $(IMAGE_ENTRY_POINT) -u $(IMAGE_ENTRY_POINT) -Map $(DEST_DIR_DEBUG)/$(BASE_NAME).map
@@ -4420,7 +4420,7 @@
 *_GCC48_X64_ASLCC_FLAGS = DEF(GCC_ASLCC_FLAGS) -m64
 *_GCC48_X64_ASLDLINK_FLAGS = DEF(GCC48_IA32_X64_ASLDLINK_FLAGS) -m elf_x86_64
 *_GCC48_X64_ASM_FLAGS = DEF(GCC48_ASM_FLAGS) -m64
-*_GCC48_X64_CC_FLAGS = DEF(GCC48_X64_CC_FLAGS)
+*_GCC48_X64_CC_FLAGS = DEF(GCC48_X64_CC_FLAGS) -Os
 *_GCC48_X64_DLINK_FLAGS = DEF(GCC48_X64_DLINK_FLAGS)
 *_GCC48_X64_RC_FLAGS = DEF(GCC_X64_RC_FLAGS)
 *_GCC48_X64_OBJCOPY_FLAGS =
@@ -4542,7 +4542,7 @@
 *_GCC49_X64_ASLCC_FLAGS = DEF(GCC_ASLCC_FLAGS) -m64
 *_GCC49_X64_ASLDLINK_FLAGS = DEF(GCC49_IA32_X64_ASLDLINK_FLAGS) -m elf_x86_64
 *_GCC49_X64_ASM_FLAGS = DEF(GCC49_ASM_FLAGS) -m64
-*_GCC49_X64_CC_FLAGS = DEF(GCC49_X64_CC_FLAGS)
+*_GCC49_X64_CC_FLAGS = DEF(GCC49_X64_CC_FLAGS) -Os
 *_GCC49_X64_DLINK_FLAGS = DEF(GCC49_X64_DLINK_FLAGS)
 *_GCC49_X64_RC_FLAGS = DEF(GCC_X64_RC_FLAGS)
 *_GCC49_X64_OBJCOPY_FLAGS =
Index: EdkCompatibilityPkg/Foundation/Include/EfiStdArg.h
===
--- EdkCompatibilityPkg/Foundation/Include/EfiStdArg.h  (revision 16254)
+++ EdkCompatibilityPkg/Foundation/Include/EfiStdArg.h  (working copy)
@@ -66,6 +66,7 @@
   @return The aligned size.
 **/
 #define _INT_SIZE_OF(n) ((sizeof (n) + sizeof (UINTN) - 1) &~(sizeof (UINTN) - 1))
+#define GCC_VERSION (__GNUC__ * 10 + __GNUC_MINOR__)
 
 #if defined(__CC_ARM)
 //
@@ -92,25 +93,37 @@
 
 #define VA_COPY(Dest, Start)  __va_copy (Dest, Start)
 
-#elif defined(__GNUC__) && !defined(NO_BUILTIN_VA_FUNCS)
+#elif defined(__GNUC__) && !defined(__x86_64__)
 //
 // Use GCC built-in macros for variable argument lists.
 //
 
 ///
-/// Variable used to traverse the list of arguments. This type can vary by 
-/// implementation and could be an array or structure. 
+/// Variable used to traverse the list of argum
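
The two macros visible in the EfiStdArg.h hunk above are plain arithmetic and can be tried in isolation. The following is a minimal sketch: DEMO_UINTN is an assumed stand-in for EDK2's UINTN, and the DEMO_* and demo_* names are hypothetical. Only the expressions themselves are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t DEMO_UINTN;  /* assumption: stand-in for EDK2's UINTN */

/* Same arithmetic as _INT_SIZE_OF: round sizeof(n) up to the next
 * multiple of sizeof(DEMO_UINTN), i.e. one or more argument slots. */
#define DEMO_INT_SIZE_OF(n) \
    ((sizeof (n) + sizeof (DEMO_UINTN) - 1) & ~(sizeof (DEMO_UINTN) - 1))

/* A type one byte larger than a slot, so it needs two slots. */
struct demo_big { char bytes[sizeof (DEMO_UINTN) + 1]; };

/* Same arithmetic as the patch's GCC_VERSION, with explicit inputs
 * instead of __GNUC__ and __GNUC_MINOR__. */
int demo_gcc_version(int major, int minor)
{
    return major * 10 + minor;
}
```

Here DEMO_INT_SIZE_OF(char) equals sizeof (DEMO_UINTN), and DEMO_INT_SIZE_OF(struct demo_big) equals twice that. demo_gcc_version(4, 8) gives 48 and demo_gcc_version(4, 9) gives 49; note that this encoding only stays unambiguous while the minor version is a single digit.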

Re: [edk2] Enable optimization for gcc x64 builds

2014-10-29 Thread Bruce Cran
On Tue, Oct 28, 2014 at 10:59 PM, Scott Duplichan  wrote:
> Optimization is not enabled for x64 builds using gcc 4.4-4.9. For IA32
> builds, -Os (optimize for small code size) is used. Why is this? Apparently
> it is because variable argument list handling fails when gcc X64 optimization
> is enabled. The solution is an improvement to the patch of SVN rev 10440:
> http://sourceforge.net/p/edk2/mailman/message/2512/

Related to gcc flags, should we also enable -ffreestanding by default,
or at least -fno-builtin?

-- 
Bruce
