Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Paolo Bonzini
On 22/08/2017 18:04, Laszlo Ersek wrote:
>> That said, the extra "-Wl," in "-Wl,-pie" is not necessary; the compiler
>> driver knows "-pie" and swallows it when compiling (and passes it to the
>> linker).
> Now *that* I can get behind. If this works, then please let us do it --
> replace "-fpie" with "-pie" in GCC44_X64_CC_FLAGS, and add no "-Wl,"
> stuff to any DLINK defines.

Note that you still need -fpie in the CC flags (and it _should_ not need
"-pie" in CC flags, only in DLINK).

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Laszlo Ersek
On 08/22/17 16:15, Paolo Bonzini wrote:
> On 22/08/2017 16:03, Ard Biesheuvel wrote:
>> On 22 August 2017 at 14:27, Paolo Bonzini  wrote:
>>> On 22/08/2017 13:59, Laszlo Ersek wrote:
 This seems to suggest that "-pie" is the *master* switch (used only when
 linking), and "-fpie" is a *prerequisite* for it (to be used both when
 linking and compiling). Is this right?

 If so, then I think this is a gcc usability bug. We don't generally
 start our thinking from the linker side. The above implies that the
 simple (hosted) command line:

 $ gcc -o example -fpie source1.c source2.c

 could also result in miscompilation, because "-pie" is not given, only
 "-fpie".
>>>
>>> No, GCC should add -pie on its own.
>>>
>>
>> I disagree. PIE linking and PIE code generation are two completely
>> different things.
> 
> What I'm saying is that GCC should add -pie on its own if you add -fpie
> to the linker command line.

Yes, that's my point, from a usability perspective.

While I hope to understand Ard's explanation (and it seems to confirm
that "-fpie" at compilation is a 'prerequisite' for "-pie" at linking),
again, this simply isn't how humans think about building binaries. If we
are required to call the compiler frontend for all purposes -- and we
seem to be required --, then we shouldn't have to express the same end
goal in different ways for different link-editing phases.

> But that would require changes to the
> compiler driver.
> 
> That said, the extra "-Wl," in "-Wl,-pie" is not necessary; the compiler
> driver knows "-pie" and swallows it when compiling (and passes it to the
> linker).

Now *that* I can get behind. If this works, then please let us do it --
replace "-fpie" with "-pie" in GCC44_X64_CC_FLAGS, and add no "-Wl,"
stuff to any DLINK defines.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Paolo Bonzini
On 22/08/2017 16:03, Ard Biesheuvel wrote:
> On 22 August 2017 at 14:27, Paolo Bonzini  wrote:
>> On 22/08/2017 13:59, Laszlo Ersek wrote:
>>> This seems to suggest that "-pie" is the *master* switch (used only when
>>> linking), and "-fpie" is a *prerequisite* for it (to be used both when
>>> linking and compiling). Is this right?
>>>
>>> If so, then I think this is a gcc usability bug. We don't generally
>>> start our thinking from the linker side. The above implies that the
>>> simple (hosted) command line:
>>>
>>> $ gcc -o example -fpie source1.c source2.c
>>>
>>> could also result in miscompilation, because "-pie" is not given, only
>>> "-fpie".
>>
>> No, GCC should add -pie on its own.
>>
> 
> I disagree. PIE linking and PIE code generation are two completely
> different things.

What I'm saying is that GCC should add -pie on its own if you add -fpie
to the linker command line.  But that would require changes to the
compiler driver.

That said, the extra "-Wl," in "-Wl,-pie" is not necessary; the compiler
driver knows "-pie" and swallows it when compiling (and passes it to the
linker).

Paolo

> PIE linking (in the absence of LTO) only involves emitting the
> sections containing the metadata required by the loader at runtime.
> Because dynamic ELF relocations are more restricted than static ones,
> and only operate on native pointer sized quantities, this imposes
> constraints on the code generation, which is why we need the -fpic or
> -fpie compiler switch. On ARM, this means you should not emit absolute
> symbol references where the address is encoded in the immediate field
> of a sequence of movw/movt/movz/movk instructions. I'm sure there are
> similar restrictions on other architectures. Note that the arm64 KASLR
> kernel does use PIE linking but omits -fpic/-fpie simply because the
> default small code model never uses such instructions, but always uses
> relative references (and statically initialized function pointers etc
> are guaranteed to be dynamically relocatable)
> 
> IIRC, PIE linking predates the availability of the -fpie GCC flag, and
> so -fpic objects were used to create PIE binaries, because they
> happened to fulfil these requirements, given that they apply to shared
> libraries as well. However, -fpic is geared towards ELF symbol
> preemption and other restrictions that do apply to shared libraries
> but not to PIE executables, and so the -fpie switch was introduced as
> an alternative, which generates code that may not be suitable for a
> shared library but can be used in a PIE or ordinary executable.
> 
> None of this really applies to bare metal binaries, which is why we
> need the visibility tweaks when using -fpic/-fpie, in which case the
> compiler will relative references over absolute ones. Such objects
> could be combined in different ways, and PIE linking is only one of
> them.
> 

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Ard Biesheuvel
On 22 August 2017 at 14:27, Paolo Bonzini  wrote:
> On 22/08/2017 13:59, Laszlo Ersek wrote:
>> This seems to suggest that "-pie" is the *master* switch (used only when
>> linking), and "-fpie" is a *prerequisite* for it (to be used both when
>> linking and compiling). Is this right?
>>
>> If so, then I think this is a gcc usability bug. We don't generally
>> start our thinking from the linker side. The above implies that the
>> simple (hosted) command line:
>>
>> $ gcc -o example -fpie source1.c source2.c
>>
>> could also result in miscompilation, because "-pie" is not given, only
>> "-fpie".
>
> No, GCC should add -pie on its own.
>

I disagree. PIE linking and PIE code generation are two completely
different things.

PIE linking (in the absence of LTO) only involves emitting the
sections containing the metadata required by the loader at runtime.
Because dynamic ELF relocations are more restricted than static ones,
and only operate on native pointer sized quantities, this imposes
constraints on the code generation, which is why we need the -fpic or
-fpie compiler switch. On ARM, this means you should not emit absolute
symbol references where the address is encoded in the immediate field
of a sequence of movw/movt/movz/movk instructions. I'm sure there are
similar restrictions on other architectures. Note that the arm64 KASLR
kernel does use PIE linking but omits -fpic/-fpie simply because the
default small code model never uses such instructions, but always uses
relative references (and statically initialized function pointers etc
are guaranteed to be dynamically relocatable)

IIRC, PIE linking predates the availability of the -fpie GCC flag, and
so -fpic objects were used to create PIE binaries, because they
happened to fulfil these requirements, given that they apply to shared
libraries as well. However, -fpic is geared towards ELF symbol
preemption and other restrictions that do apply to shared libraries
but not to PIE executables, and so the -fpie switch was introduced as
an alternative, which generates code that may not be suitable for a
shared library but can be used in a PIE or ordinary executable.

None of this really applies to bare metal binaries, which is why we
need the visibility tweaks when using -fpic/-fpie, in which case the
compiler will relative references over absolute ones. Such objects
could be combined in different ways, and PIE linking is only one of
them.
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Paolo Bonzini
On 22/08/2017 13:59, Laszlo Ersek wrote:
> This seems to suggest that "-pie" is the *master* switch (used only when
> linking), and "-fpie" is a *prerequisite* for it (to be used both when
> linking and compiling). Is this right?
> 
> If so, then I think this is a gcc usability bug. We don't generally
> start our thinking from the linker side. The above implies that the
> simple (hosted) command line:
> 
> $ gcc -o example -fpie source1.c source2.c
> 
> could also result in miscompilation, because "-pie" is not given, only
> "-fpie".

No, GCC should add -pie on its own.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Gao, Liming
Laszlo:
  I will take BZ#671 and create the formal patch for review. 

Thanks
Liming
> -Original Message-
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Tuesday, August 22, 2017 7:59 PM
> To: Shi, Steven ; Ard Biesheuvel 
> 
> Cc: edk2-devel-01 ; Alex Williamson 
> ; Justen, Jordan L
> ; Gao, Liming ; Kinney, 
> Michael D ; Paolo Bonzini
> 
> Subject: Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large 
> code model for X64/GCC5/LTO
> 
> On 08/22/17 10:00, Shi, Steven wrote:
> > It is a link flag misuse in our GCC build toolchains, not
> > compiler/linker's problem. Below patch can fix the wrong assembly
> > function relocation type in PIE binary. With below patch, all the
> > GCC5, GCC49 and GCC48 can build correctly images of OvmfPkgIa32X64 and
> > OvmfPkgX64 platforms without my previous simple work around in my
> > side. Please try it in your side.
> >
> > Since we are using GCC as linker command, we MUST pass -pie to ld with
> > "-Wl,-pie", not just "--pie"  or "-fpie".
> >
> > So, this means we never correctly build small+PIE 64bits code with GCC
> > toolchains before (CLANG38 is correct). If you failed to enable
> > PIE/LTO build before, it is worthy to revisit those failures with
> > "-Wl,-pie". FYI.
> 
> The first question of Paolo (CC'd) was, when he saw this issue in my
> last status report, whether we added "-fpie" to the link command line as
> well. And, I confirmed, we did. This is how I responded to him:
> 
> > - in "BaseTools/Conf/build_rule.template", there's
> >
> > > [Static-Library-File]
> > > 
> > > *.lib
> > >
> > > 
> > > $(MAKE_FILE)
> > >
> > > 
> > > $(DEBUG_DIR)(+)$(MODULE_NAME).dll
> > >
> > > [...]
> > >
> > > 
> > > "$(DLINK)" -o ${dst} $(DLINK_FLAGS) 
> > > -Wl,--start-group,@$(STATIC_LIBRARY_FILES_LIST),--end-group $(CC_FLAGS)
> $(DLINK2_FLAGS)
> > > "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
> >
> > (see "$(CC_FLAGS)")
> >
> > - and in the build log, I see
> >
> > > "gcc" \
> > >   -o 
> > > Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/DEBUG/CpuMpPei.dll
> > >  \
> > >   -nostdlib \
> > >   -Wl,-n,-q,--gc-sections \
> > >   -z common-page-size=0x40 \
> > >   -Wl,--entry,_ModuleEntryPoint \
> > >   -u _ModuleEntryPoint \
> > >   
> > > -Wl,-Map,Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/DEBUG/CpuMpPei.map
> > >  \
> > >   -Wl,-melf_x86_64,--oformat=elf64-x86-64 \
> > >   -flto \
> > >   -Os \
> > >
> -Wl,--start-group,@Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/OUTPUT/static_library_files.lst,--end-group
> \
> > >   -g \
> > >   -fshort-wchar \
> > >   -fno-builtin \
> > >   -fno-strict-aliasing \
> > >   -Wall \
> > >   -Werror \
> > >   -Wno-array-bounds \
> > >   -ffunction-sections \
> > >   -fdata-sections \
> > >   -include AutoGen.h \
> > >   -fno-common \
> > >   -DSTRING_ARRAY_NAME=CpuMpPeiStrings \
> > >   -m64 \
> > >   -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" \
> > >   -maccumulate-outgoing-args \
> > >   -mno-red-zone \
> > >   -Wno-address \
> > >   -mcmodel=small \
> > >   -fpie \
> > >   -fno-asynchronous-unwind-tables \
> > >   -Wno-address \
> > >   -flto \
> > >   -DUSING_LTO \
> > >   -Os \
> > >   -mno-mmx \
> > >   -mno-sse \
> > >   -D DISABLE_NEW_DEPRECATED_INTERFACES \
> > >   -Wl,--defsym=PECOFF_HEADER_SIZE=0x228 \
> > >   -Wl,--script=BaseTools/Scripts/GccBase.lds \
> > >   -Wno-error
> 
> I don't understand why we need "-Wl,-pie" separately. The "gcc" binary
> should pass it through to "ld" as necessary, IMO. This is what the gcc
> documentation says:
> 
> > 3.13 Options for Linking
> > 
> > '-pie'
> >  Produce a position independent executable on targets that support
> >  it.  For predictable results, you must also specify the same set
> >  of options used for compilation ('-fpie', '-fPIE', or model
> >  suboptions) when you specify this linker option.
> >
> > 3.18 Op

Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Laszlo Ersek
On 08/22/17 10:00, Shi, Steven wrote:
> It is a link flag misuse in our GCC build toolchains, not
> compiler/linker's problem. Below patch can fix the wrong assembly
> function relocation type in PIE binary. With below patch, all the
> GCC5, GCC49 and GCC48 can build correctly images of OvmfPkgIa32X64 and
> OvmfPkgX64 platforms without my previous simple work around in my
> side. Please try it in your side.
>
> Since we are using GCC as linker command, we MUST pass -pie to ld with
> "-Wl,-pie", not just "--pie"  or "-fpie".
>
> So, this means we never correctly build small+PIE 64bits code with GCC
> toolchains before (CLANG38 is correct). If you failed to enable
> PIE/LTO build before, it is worthy to revisit those failures with
> "-Wl,-pie". FYI.

The first question of Paolo (CC'd) was, when he saw this issue in my
last status report, whether we added "-fpie" to the link command line as
well. And, I confirmed, we did. This is how I responded to him:

> - in "BaseTools/Conf/build_rule.template", there's
>
> > [Static-Library-File]
> > 
> > *.lib
> >
> > 
> > $(MAKE_FILE)
> >
> > 
> > $(DEBUG_DIR)(+)$(MODULE_NAME).dll
> >
> > [...]
> >
> > 
> > "$(DLINK)" -o ${dst} $(DLINK_FLAGS) 
> > -Wl,--start-group,@$(STATIC_LIBRARY_FILES_LIST),--end-group $(CC_FLAGS) 
> > $(DLINK2_FLAGS)
> > "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
>
> (see "$(CC_FLAGS)")
>
> - and in the build log, I see
>
> > "gcc" \
> >   -o 
> > Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/DEBUG/CpuMpPei.dll
> >  \
> >   -nostdlib \
> >   -Wl,-n,-q,--gc-sections \
> >   -z common-page-size=0x40 \
> >   -Wl,--entry,_ModuleEntryPoint \
> >   -u _ModuleEntryPoint \
> >   
> > -Wl,-Map,Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/DEBUG/CpuMpPei.map
> >  \
> >   -Wl,-melf_x86_64,--oformat=elf64-x86-64 \
> >   -flto \
> >   -Os \
> >   
> > -Wl,--start-group,@Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/CpuMpPei/CpuMpPei/OUTPUT/static_library_files.lst,--end-group
> >  \
> >   -g \
> >   -fshort-wchar \
> >   -fno-builtin \
> >   -fno-strict-aliasing \
> >   -Wall \
> >   -Werror \
> >   -Wno-array-bounds \
> >   -ffunction-sections \
> >   -fdata-sections \
> >   -include AutoGen.h \
> >   -fno-common \
> >   -DSTRING_ARRAY_NAME=CpuMpPeiStrings \
> >   -m64 \
> >   -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" \
> >   -maccumulate-outgoing-args \
> >   -mno-red-zone \
> >   -Wno-address \
> >   -mcmodel=small \
> >   -fpie \
> >   -fno-asynchronous-unwind-tables \
> >   -Wno-address \
> >   -flto \
> >   -DUSING_LTO \
> >   -Os \
> >   -mno-mmx \
> >   -mno-sse \
> >   -D DISABLE_NEW_DEPRECATED_INTERFACES \
> >   -Wl,--defsym=PECOFF_HEADER_SIZE=0x228 \
> >   -Wl,--script=BaseTools/Scripts/GccBase.lds \
> >   -Wno-error

I don't understand why we need "-Wl,-pie" separately. The "gcc" binary
should pass it through to "ld" as necessary, IMO. This is what the gcc
documentation says:

> 3.13 Options for Linking
> 
> '-pie'
>  Produce a position independent executable on targets that support
>  it.  For predictable results, you must also specify the same set
>  of options used for compilation ('-fpie', '-fPIE', or model
>  suboptions) when you specify this linker option.
>
> 3.18 Options for Code Generation Conventions
> 
> '-fpie'
> '-fPIE'
>  These options are similar to '-fpic' and '-fPIC', but generated
>  position independent code can be only linked into executables.
>  Usually these options are used when '-pie' GCC option is used
>  during linking.
>
>  '-fpie' and '-fPIE' both define the macros '__pie__' and
>  '__PIE__'. The macros have the value 1 for '-fpie' and 2 for
>  '-fPIE'.

This seems to suggest that "-pie" is the *master* switch (used only when
linking), and "-fpie" is a *prerequisite* for it (to be used both when
linking and compiling). Is this right?

If so, then I think this is a gcc usability bug. We don't generally
start our thinking from the linker side. The above implies that the
simple (hosted) command line:

$ gcc -o example -fpie source1.c source2.c

could also result in miscompilation, because "-pie" is not given, only
"-fpie".

On 08/22/17 10:00, Shi, Steven wrote:

> diff --git a/BaseTools/Conf/tools_def.template 
> b/BaseTools/Conf/tools_def.template
> index 1fa3ca3..9e46d65 100755
> --- a/BaseTools/Conf/tools_def.template
> +++ b/BaseTools/Conf/tools_def.template
> @@ -4375,7 +4375,7 @@ DEFINE GCC44_IA32_X64_DLINK_COMMON   = -nostdlib 
> -Wl,-n,-q,--gc-sections -z comm
> DEFINE GCC44_IA32_X64_ASLDLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_COMMON) 
> -Wl,--entry,ReferenceAcpiTable -u ReferenceAcpiTable
> DEFINE GCC44_IA32_X64_DLINK_FLAGS= DEF(GCC44_IA32_X64_DLINK_COMMON) 
> -Wl,--entry,$(IMAGE_ENTRY_POINT) -u $(IMAGE_ENTRY_POINT) 
> -Wl,-Map,$(DEST_DIR_DEBUG)/$(BASE_NAME).map
> DEFINE GCC44_IA32_DLINK2_FLAGS   = -Wl,--defsym=PECO

Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-22 Thread Shi, Steven
It is a link flag misuse in our GCC build toolchains, not compiler/linker’s 
problem. Below patch can fix the wrong assembly function relocation type in PIE 
binary. With below patch, all the GCC5, GCC49 and GCC48 can build correctly 
images of OvmfPkgIa32X64 and OvmfPkgX64 platforms without my previous simple 
work around in my side. Please try it in your side.

Since we are using GCC as linker command, we MUST pass -pie to ld with 
“-Wl,-pie”, not just “--pie”  or “-fpie”.

So, this means we never correctly build small+PIE 64bits code with GCC 
toolchains before (CLANG38 is correct). If you failed to enable PIE/LTO build 
before, it is worthy to revisit those failures with “-Wl,-pie”. FYI.






diff --git a/BaseTools/Conf/tools_def.template 
b/BaseTools/Conf/tools_def.template
index 1fa3ca3..9e46d65 100755
--- a/BaseTools/Conf/tools_def.template
+++ b/BaseTools/Conf/tools_def.template
@@ -4375,7 +4375,7 @@ DEFINE GCC44_IA32_X64_DLINK_COMMON   = -nostdlib 
-Wl,-n,-q,--gc-sections -z comm
DEFINE GCC44_IA32_X64_ASLDLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_COMMON) 
-Wl,--entry,ReferenceAcpiTable -u ReferenceAcpiTable
DEFINE GCC44_IA32_X64_DLINK_FLAGS= DEF(GCC44_IA32_X64_DLINK_COMMON) 
-Wl,--entry,$(IMAGE_ENTRY_POINT) -u $(IMAGE_ENTRY_POINT) 
-Wl,-Map,$(DEST_DIR_DEBUG)/$(BASE_NAME).map
DEFINE GCC44_IA32_DLINK2_FLAGS   = -Wl,--defsym=PECOFF_HEADER_SIZE=0x220 
DEF(GCC_DLINK2_FLAGS_COMMON)
-DEFINE GCC44_X64_DLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_FLAGS) 
-Wl,-melf_x86_64,--oformat=elf64-x86-64
+DEFINE GCC44_X64_DLINK_FLAGS = DEF(GCC44_IA32_X64_DLINK_FLAGS) 
-Wl,-melf_x86_64,--oformat=elf64-x86-64 -Wl,-pie
DEFINE GCC44_X64_DLINK2_FLAGS= -Wl,--defsym=PECOFF_HEADER_SIZE=0x228 
DEF(GCC_DLINK2_FLAGS_COMMON)
DEFINE GCC44_ASM_FLAGS   = DEF(GCC_ASM_FLAGS)

@@ -4455,7 +4455,7 @@ DEFINE GCC49_IA32_X64_DLINK_COMMON   = -nostdlib 
-Wl,-n,-q,--gc-sections -z comm
DEFINE GCC49_IA32_X64_ASLDLINK_FLAGS = DEF(GCC49_IA32_X64_DLINK_COMMON) 
-Wl,--entry,ReferenceAcpiTable -u ReferenceAcpiTable
DEFINE GCC49_IA32_X64_DLINK_FLAGS= DEF(GCC49_IA32_X64_DLINK_COMMON) 
-Wl,--entry,$(IMAGE_ENTRY_POINT) -u $(IMAGE_ENTRY_POINT) 
-Wl,-Map,$(DEST_DIR_DEBUG)/$(BASE_NAME).map
DEFINE GCC49_IA32_DLINK2_FLAGS   = DEF(GCC48_IA32_DLINK2_FLAGS)
-DEFINE GCC49_X64_DLINK_FLAGS = DEF(GCC49_IA32_X64_DLINK_FLAGS) 
-Wl,-melf_x86_64,--oformat=elf64-x86-64
+DEFINE GCC49_X64_DLINK_FLAGS = DEF(GCC49_IA32_X64_DLINK_FLAGS) 
-Wl,-melf_x86_64,--oformat=elf64-x86-64 -Wl,-pie
DEFINE GCC49_X64_DLINK2_FLAGS= DEF(GCC48_X64_DLINK2_FLAGS)
DEFINE GCC49_ASM_FLAGS   = DEF(GCC48_ASM_FLAGS)
DEFINE GCC49_ARM_ASM_FLAGS   = DEF(GCC48_ARM_ASM_FLAGS)





Steven Shi

Intel\SSG\STO\UEFI Firmware



Tel: +86 021-61166522

iNet: 821-6522



> -Original Message-

> From: Laszlo Ersek [mailto:ler...@redhat.com]

> Sent: Tuesday, August 15, 2017 11:46 PM

> To: Shi, Steven 

> Cc: edk2-devel-01 ; Alex Williamson

> ; Ard Biesheuvel

> ; Justen, Jordan L ;

> Gao, Liming ; Kinney, Michael D

> 

> Subject: Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to

> large code model for X64/GCC5/LTO

>

> On 08/12/17 05:05, Shi, Steven wrote:

> > OK. I can reproduce the failure with -smp 4 and -m 5120 in my side.

> >

> > It looks a linker bug about assemble function support in PIC/PIE

> > code.  You know, if we only have C code, the compiler/linker will

> > generate all the machine code and guarantee all the address reference

> > are position independent under PIC/PIE build. But if we mix manually

> > written assemble code in the C code,  the linker cannot really control

> > the address reference in the assemble code, and  might got confused.

>

> This is an incorrect description of the situation.

>

> The address reference is *not* in assembly code. (It used to be in

> assembly code, but Mike changed that earlier, for XCODE5 compatibility.)

>

> At this moment the address reference is in *C code*. The C code takes

> the address of an external function, and assigns it to a field in a data

> structure. The assembly code calls the function through this field. The

> assembly code makes no reference to the called function by name. See

> Mike's commit 3b2928b46987:

>

> -movrax, ASM_PFX(InitializeFloatingPointUnits)

> +movrax, qword [esi + InitializeFloatingPointUnitsAddress]

>  subrsp, 20h

>  call   rax   ; Call assembly function to initialize FPU 
> per UEFI spec

>

> And, indeed, it is *not* the assembly code that's being miscompiled. It

> is the C-language assignment below that is miscompiled:

>

> +  ExchangeInfo->InitializeFloatingPointUnitsAddress =

> (UINTN)InitializeFloatingPointUnits;

>

> > So, it is not seldom we could s

Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-15 Thread Laszlo Ersek
On 08/12/17 05:05, Shi, Steven wrote:
> OK. I can reproduce the failure with -smp 4 and -m 5120 in my side.
>
> It looks a linker bug about assemble function support in PIC/PIE
> code.  You know, if we only have C code, the compiler/linker will
> generate all the machine code and guarantee all the address reference
> are position independent under PIC/PIE build. But if we mix manually
> written assemble code in the C code,  the linker cannot really control
> the address reference in the assemble code, and  might got confused.

This is an incorrect description of the situation.

The address reference is *not* in assembly code. (It used to be in
assembly code, but Mike changed that earlier, for XCODE5 compatibility.)

At this moment the address reference is in *C code*. The C code takes
the address of an external function, and assigns it to a field in a data
structure. The assembly code calls the function through this field. The
assembly code makes no reference to the called function by name. See
Mike's commit 3b2928b46987:

-movrax, ASM_PFX(InitializeFloatingPointUnits)
+movrax, qword [esi + InitializeFloatingPointUnitsAddress]
 subrsp, 20h
 call   rax   ; Call assembly function to initialize FPU 
per UEFI spec

And, indeed, it is *not* the assembly code that's being miscompiled. It
is the C-language assignment below that is miscompiled:

+  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
(UINTN)InitializeFloatingPointUnits;

> So, it is not seldom we could see the compiler/linker generate wrong
> code for mixed code,  especially with very high level optimization,
> e.g. LTO.
>
> Globally change memory model from small to large will bring not
> trivial impact (+15%) to code size, espcial for the uncomperssed
> option rom dirver. Below is some data of OvmfPkgX64.dsc platform.
>
>   Dxecore.efi CpuDxe.efi  CpuMpPeiPeiCore.efi
> Small+PIE:139520  47360   30144   46720
> Large:165696  55360   34496   53504

My argument is that the current "-mcmodel=small" option actually *lies*
to the compiler about our binaries. According to the GCC documentation,
"-mcmodel=small" implies that a binary built like this will never be
executed from above 2GB in the address space. This is why gcc-7 believes
it is allowed to generate a MOV instruction that sign-extends a 32-bit
address to 64-bit -- because we promise GCC that the sign bit will
always be clear to begin with.

IOW, we make a promise, gcc-7 generates code accordingly, and then we
break the promise, by executing the binary (the assignment in the C code
of MpInitLib) from above 2GB.


In particular, an X64 UEFI_DRIVER module, shipped as an option ROM on a
physical PCI(E) card, could be loaded anywhere at all in the 64-bit
address space (given sufficient memory in the computer). Building such a
driver with "-mcmodel=small" is wrong therefore; we cannot guarantee
that the driver will be executed from under 2GB.

Perhaps we should use "-mcmodel=large", but *keep* "-fpie". ... I've now
tried that, but it doesn't work. With "-mcmodel=large -fpie", the
compiler emits R_X86_64_GOTOFF64 relocations (type 25 decimal), and I
get errors like:

GenFw: ERROR 3000: Invalid
  
Build/OvmfX64/DEBUG_GCC5/X64/MdeModulePkg/Universal/ReportStatusCodeRouter/Pei/ReportStatusCodeRouterPei/DEBUG/ReportStatusCodeRouterPei.dll
  unsupported ELF EM_X86_64 relocation 0x19.

I believe Ard's commit 28ade7b802e0 ("MdePkg: move to 'hidden'
visibility for all symbols under GCC/X64", 2016-08-01) was meant to
prevent this, but apparently it's not enough with gcc-7.1.

> A simpler workaround could be to add a C function wrapper around the
> assemble lib function as below. This simple workaround works in my
> side.  But it is necessary to find this issue's root cause and fix it
> in the compiler/linker. I will try to raise this issue to
> compiler/linker guys.
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> old mode 100644
> new mode 100755
> index a3eea29..7afe434
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -738,6 +738,15 @@ WaitApWakeup (
>}
>  }
>
> +VOID
> +EFIAPI
> +InitializeFloatingPointUnitsWrapper (
> +  VOID
> +  )
> +{
> +  InitializeFloatingPointUnits();
> +}
> +
>  /**
>This function will fill the exchange info structure.
>
> @@ -771,7 +780,7 @@ FillExchangeInfoData (
>
>ExchangeInfo->EnableExecuteDisable = IsBspExecuteDisableEnabled ();
>
> -  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
> (UINTN)InitializeFloatingPointUnits;
> +  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
> (UINTN)InitializeFloatingPointUnitsWrapper;
>
>//
>// Get the BSP's data of GDT and IDT

I'm not convinced that this is the right fix, until we know exactly why
and how it changes the behavior of gcc-7. So I've now filed


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-11 Thread Shi, Steven
OK. I can reproduce the failure with -smp 4 and -m 5120 in my side. 
It looks a linker bug about assemble function support in PIC/PIE code.  You 
know, if we only have C code, the compiler/linker will generate all the machine 
code and guarantee all the address reference are position independent under 
PIC/PIE build. But if we mix manually written assemble code in the C code,  the 
linker cannot really control the address reference in the assemble code, and  
might got confused. So, it is not seldom we could see the compiler/linker 
generate wrong code for mixed code,  especially with very high level 
optimization, e.g. LTO.

Globally change memory model from small to large will bring not trivial impact 
(+15%) to code size, espcial for the uncomperssed option rom dirver. Below is 
some data of OvmfPkgX64.dsc platform.
Dxecore.efi CpuDxe.efi  CpuMpPeiPeiCore.efi
Small+PIE:  139520  47360   30144   46720
Large:  165696  55360   34496   53504

A simpler workaround could be to add a C function wrapper around the assemble 
lib function as below. This simple workaround works in my side.  But it is 
necessary to find this issue's root cause and fix it in the compiler/linker. I 
will try to raise this issue to compiler/linker guys.

diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
b/UefiCpuPkg/Library/MpInitLib/MpLib.c
old mode 100644
new mode 100755
index a3eea29..7afe434
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -738,6 +738,15 @@ WaitApWakeup (
   }
 }

+VOID
+EFIAPI
+InitializeFloatingPointUnitsWrapper (
+  VOID
+  )
+{
+  InitializeFloatingPointUnits();
+}
+
 /**
   This function will fill the exchange info structure.

@@ -771,7 +780,7 @@ FillExchangeInfoData (

   ExchangeInfo->EnableExecuteDisable = IsBspExecuteDisableEnabled ();

-  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
(UINTN)InitializeFloatingPointUnits;
+  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
(UINTN)InitializeFloatingPointUnitsWrapper;

   //
   // Get the BSP's data of GDT and IDT


Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522

> -Original Message-
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Friday, August 11, 2017 7:18 PM
> To: Shi, Steven ; edk2-devel-01  de...@lists.01.org>
> Cc: Ard Biesheuvel ; Justen, Jordan L
> ; Gao, Liming ; Kinney,
> Michael D 
> Subject: Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large
> code model for X64/GCC5/LTO
> 
> Hi Steven,
> 
> On 08/11/17 07:28, Shi, Steven wrote:
> > Hi Laszlo,
> >
> > I'm trying to reproduce your boot failure with OVMF in my Ubuntu
> > system, but not succeed.  My GCC was built from GCC main trunk code
> > in 20170601, and my ld linker is version 2.28. Could you try the ld
> > 2.28 with your gcc-7.1 and check whether it works in your side?  Or,
> > do you know where can I download the gcc-7.1 pre-built binaries?
> This was reproduced on a stock Fedora 26 installation:
> 
> https://getfedora.org/
> 
> > jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p
> OvmfPkg/OvmfPkgX64.dsc -t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a X64 -b
> DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5
> >
> > jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p
> OvmfPkg/OvmfPkgIa32X64.dsc -t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a IA32
> -a X64 -b DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5
> 
> These build commands look good.
> 
> > jshi19@jshi19-
> desktop:~/wksp_efi/Laszlo/edk2/Build/OvmfX64/DEBUG_GCC5/FV$ qemu-
> system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
> -monitor
> stdio --enable-kvm
> >
> > jshi19@jshi19-
> desktop:~/wksp_efi/Laszlo/edk2/Build/Ovmf3264/DEBUG_GCC5/FV$ qemu-
> system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
> -monitor
> stdio --enable-kvm
> 
> The QEMU command lines are missing two options:
> 
> * First, you have to make sure that the VM has enough memory under 4GB
> so that the PEI RAM can grow above the 2GB limit. You are using the
> i440fx board type, so that's OK (its 32-bit memory can go up to 3GB),
> but the "-m" switch is too small. Instead, I recommend:
> 
>   -m 5120
> 
> This will give the VM 5GB of RAM, 3GB of which will be placed under the
> 4GB mark, and 2GB will be placed above.
> 
> * Second, there must be at least one AP (application processor) for
> CpuMpPei to boot up. So please add something like:
> 
>   -smp 4
> 
> Thanks
> Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-11 Thread Alex Williamson
On Fri, 11 Aug 2017 02:34:26 +0200
Laszlo Ersek  wrote:
> Cc: Alex Williamson 
> Cc: Ard Biesheuvel 
> Cc: Jordan Justen 
> Cc: Liming Gao 
> Cc: Michael Kinney 
> Cc: Yonghong Zhu 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Laszlo Ersek 
> ---
>  BaseTools/Conf/tools_def.template | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/BaseTools/Conf/tools_def.template 
> b/BaseTools/Conf/tools_def.template
> index 1fa3ca3ceae9..2ef495540e5f 100755
> --- a/BaseTools/Conf/tools_def.template
> +++ b/BaseTools/Conf/tools_def.template
> @@ -5335,10 +5335,10 @@ RELEASE_GCC5_IA32_DLINK_FLAGS= 
> DEF(GCC5_IA32_X64_DLINK_FLAGS) -flto -Os -Wl,
>  *_GCC5_X64_OBJCOPY_FLAGS =
>  *_GCC5_X64_NASM_FLAGS= -f elf64
>  
> -  DEBUG_GCC5_X64_CC_FLAGS= DEF(GCC5_X64_CC_FLAGS) -flto -DUSING_LTO 
> -Os
> +  DEBUG_GCC5_X64_CC_FLAGS= DEF(GCC5_X64_CC_FLAGS) -fno-pie 
> -mcmodel=large -flto -DUSING_LTO -Os
>DEBUG_GCC5_X64_DLINK_FLAGS = DEF(GCC5_X64_DLINK_FLAGS) -flto -Os
>  
> -RELEASE_GCC5_X64_CC_FLAGS= DEF(GCC5_X64_CC_FLAGS) -flto -DUSING_LTO 
> -Os -Wno-unused-but-set-variable
> +RELEASE_GCC5_X64_CC_FLAGS= DEF(GCC5_X64_CC_FLAGS) -fno-pie 
> -mcmodel=large -flto -DUSING_LTO -Os -Wno-unused-but-set-variable
>  RELEASE_GCC5_X64_DLINK_FLAGS = DEF(GCC5_X64_DLINK_FLAGS) -flto -Os
>  
>NOOPT_GCC5_X64_CC_FLAGS= DEF(GCC5_X64_CC_FLAGS) -O0

VM boots again, Thanks Laszlo!

Tested-by: Alex Williamson 
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-11 Thread Laszlo Ersek
Hi Steven,

On 08/11/17 07:28, Shi, Steven wrote:
> Hi Laszlo,
> 
> I'm trying to reproduce your boot failure with OVMF in my Ubuntu
> system, but not succeed.  My GCC was built from GCC main trunk code
> in 20170601, and my ld linker is version 2.28. Could you try the ld
> 2.28 with your gcc-7.1 and check whether it works in your side?  Or,
> do you know where can I download the gcc-7.1 pre-built binaries?
This was reproduced on a stock Fedora 26 installation:

https://getfedora.org/

> jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p OvmfPkg/OvmfPkgX64.dsc 
> -t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a X64 -b DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5
> 
> jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p 
> OvmfPkg/OvmfPkgIa32X64.dsc -t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a IA32 -a X64 
> -b DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5

These build commands look good.

> jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/OvmfX64/DEBUG_GCC5/FV$ 
> qemu-system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
> -monitor stdio --enable-kvm
> 
> jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/Ovmf3264/DEBUG_GCC5/FV$ 
> qemu-system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
> -monitor stdio --enable-kvm

The QEMU command lines are missing two options:

* First, you have to make sure that the VM has enough memory under 4GB
so that the PEI RAM can grow above the 2GB limit. You are using the
i440fx board type, so that's OK (its 32-bit memory can go up to 3GB),
but the "-m" switch is too small. Instead, I recommend:

  -m 5120

This will give the VM 5GB of RAM, 3GB of which will be placed under the
4GB mark, and 2GB will be placed above.

* Second, there must be at least one AP (application processor) for
CpuMpPei to boot up. So please add something like:

  -smp 4

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-11 Thread Laszlo Ersek
On 08/11/17 12:03, Ard Biesheuvel wrote:
> On 11 August 2017 at 01:34, Laszlo Ersek  wrote:

>> (2) Unfortunately, the C-language assignment to
>> "MP_CPU_EXCHANGE_INFO.InitializeFloatingPointUnitsAddress" is
>> miscompiled under the following circumstances:
>>
>> - the edk2 build target is DEBUG (or RELEASE),
>> - the edk2 toolchain selection is GCC5,
>> - (from the above two it follows that -Os and LTO are enabled,)
>> - MpInitLib is built for X64,
>> - and the compiler is gcc-7.1 (shipped in Fedora-26 e.g.).
>>
>> (If the PEI phase is 64-bit, then MpInitLib breaks in CpuMpPei. If the
>> PEI phase is 32-bit and the DXE phase is 64-bit, then MpInitLib breaks
>> in CpuDxe. If the DXE phase is 32-bit, then nothing breaks.)
>>
>> Below I'll use the NOOPT/GCC5/X64/gcc-7.1 build of CpuMpPei to show
>> the correct assembly code generated by gcc for the assignment, and
>> flip the build target to DEBUG for showing the broken assembly code.
>>
>> (3) Correct code for the assignment:
>>
>>> 5406:   48 8d 15 73 24 00 00lea0x2473(%rip),%rdx# 
>>> 7880 
>>> 540d:   48 8b 45 f8 mov-0x8(%rbp),%rax
>>> 5411:   48 89 90 84 00 00 00mov%rdx,0x84(%rax)
>>
>> Here the absolute address of InitializeFloatingPointUnits() is loaded
>> into RDX with RIP-relative addressing. Then RAX is pointed to the
>> ExchangeInfo structure, and finally RDX is stored into
>> ExchangeInfo->InitializeFloatingPointUnitsAddress (the field offset is
>> 0x84 in the structure).
>>
>> Here the only relocation happens when CpuMpPei is linked. The distance
>> between the call site and the callee is calculated and encoded in the
>> LEA instruction. The same distance is valid when CpuMpPei runs, so the
>> PEI core doesn't have to relocate the LEA instruction when it loads
>> CpuMpPei.
>>
>> (4) Incorrect code for the assignment:
>>
>>> 2d69:   48 c7 c2 60 58 00 00mov$0x5860,%rdx
>>> ...
>>> 2d7b:   48 89 95 84 00 00 00mov%rdx,0x84(%rbp)
>>> ...
>>> 5860 :
>>> 5860:   9b db e3finit
>>
>> This is not RIP-relative addressing. Therefore gcc generates a
>> relocation record for the address encoded in the MOV instruction, at
>> offset 0x2d69+0x3 -- initial value being 0x5860:
>>
>>> Sections:
>>> Idx Name  Size  VMA   LMA   File off  
>>> Algn
>>>   0 .text 6e66  0240  0240  00c0  
>>> 2**6
>>>   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>> ...
>>> RELOCATION RECORDS FOR [.text]:
>>> OFFSET   TYPE  VALUE
>>> ...
>>> 2b2c R_X86_64_32S  InitializeFloatingPointUnits
>>
>> (The offset of the address to relocate is 0x2d69+3 == 0x2d6c, which
>> equals the .text section base plus the relocation offset, i.e.,
>> 0x240+0x2b2c.)
>>
>> This is the only R_X86_64_32S relocation record in the linked-together
>> CpuMpPei.debug file (still an ELF file). In the rest of the build
>> process, the GenFw utility translates CpuMpPei.dll to a PE/COFF image
>> (CpuMpPei.efi), and converts the ELF relocation record to a PE/COFF
>> relocation record (of type EFI_IMAGE_REL_BASED_HIGHLOW).

> Thanks for the interesting read. One question: is all code potentially
> miscompiled? Or does it only affect code executing in real mode?

The code that is miscompiled is a normal assignment, in normal C code.
The structure type itself is packed [1], and the pointer to access it is
volatile-qualified [2], so those could be contributing factors.

[1] UefiCpuPkg/Library/MpInitLib/MpLib.h

#pragma pack(1)

//
// MP CPU exchange information for AP reset code
// This structure is required to be packed because fixed field offsets
// into this structure are used in assembly code in this module
//
typedef struct {
  UINTN Lock;
  UINTN StackStart;
  UINTN StackSize;
  UINTN CFunction;
  IA32_DESCRIPTOR   GdtrProfile;
  IA32_DESCRIPTOR   IdtrProfile;
  UINTN BufferStart;
  UINTN ModeOffset;
  UINTN NumApsExecuting;
  UINTN CodeSegment;
  UINTN DataSegment;
  UINTN EnableExecuteDisable;
  UINTN Cr3;
  UINTN InitFlag;
  CPU_INFO_IN_HOB   *CpuInfo;
  CPU_MP_DATA   *CpuMpData;
  UINTN InitializeFloatingPointUnitsAddress;
} MP_CPU_EXCHANGE_INFO;

#pragma pack()

[2] UefiCpuPkg/Library/MpInitLib/MpLib.c

VOID
FillExchangeInfoData (
  IN CPU_MP_DATA   *CpuMpData
  )
{
  volatile MP_CPU_EXCHANGE_INFO*ExchangeInfo;
...
  ExchangeInfo->InitializeFloatingPointUnitsAddress =
(UINTN)InitializeFloatingPointUnits;


The only reason the miscompilation affects real mode is that the
assig

Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-11 Thread Ard Biesheuvel
On 11 August 2017 at 01:34, Laszlo Ersek  wrote:
> Several users have recently reported boot failures with OVMF. The symptoms
> are: blank screen and all VCPUs pegged at 100%. Alex Williamson has found
> the commit (via bisection) that exposes the issue. In this patch I'll
> attempt to analyze the symptoms and fix the root problem.
>
> (1) "UefiCpuPkg/Library/MpInitLib" is an IA32/X64 library instance that
> provides multiprocessing services. It is built into the CpuMpPei
> module (for the PEI phase) and the CpuDxe module (for the DXE and BDS
> phases). When either of these modules starts up during boot, it
> briefly boots up all of the CPUs, perfoms some counting /
> synchronization, and then all the APs (application processors) are
> quiesced until further work arrives.
>
> The APs boot in real mode, and need assembly language init code (a
> "reset vector") that gradually takes them into 64-bit mode, before
> they can call back into normal C code. The reset vector code is
> assembled at build time, but it is copied as data under 1MB (for real
> mode execution by the APs) during boot, whenever CpuMpPei or CpuDxe
> launches.
>
> Part of the reset vector code is FPU initialization. The reset vector
> in "UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm" calls the assembly
> language function InitializeFloatingPointUnits(), which is however
> provided by an external library. Normally, when the object files were
> linked together, this function call would result in an absolute
> reference (and matching relocation, performed at module loading), and
> the instance of the reset vector that was copied under 1MB during boot
> would refer to the same, unchanged, absolute address of
> InitializeFloatingPointUnits(), residing in CpuMpPei or CpuDxe.
>
> This didn't work with X64 XCODE5 builds however. XCODE5 prefers
> RIP-relative addressing for X64 (i.e., the assembly language function
> call depended on the distance between the call site and the callee).
> When the call site was copied under 1MB but
> InitializeFloatingPointUnits() would not move, the call broke.
>
> This was tracked in TianoCore BZ#565 and fixed in commit 3b2928b46987
> ("UefiCpuPkg/MpInitLib: Fix X64 XCODE5/NASM compatibility issues",
> 2017-05-17). The solution was to add another function pointer to the
> MP_CPU_EXCHANGE_INFO communication area. (The function pointer would
> have size 4 in Ia32 builds, and size 8 in X64 builds.) MpInitLib would
> assign the absolute address of InitializeFloatingPointUnits() to the
> new field, and the APs would retrieve it -- matching the pointer size
> correctly --, and call it. The C assignment can be seen in the
> following hunk (from commit 3b2928b46987):
>
>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> index 407c44c95e62..735e099b32f2 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> @@ -751,6 +751,8 @@ FillExchangeInfoData (
>>
>>ExchangeInfo->EnableExecuteDisable = IsBspExecuteDisableEnabled ();
>>
>> +  ExchangeInfo->InitializeFloatingPointUnitsAddress = 
>> (UINTN)InitializeFloatingPointUnits;
>> +
>>//
>>// Get the BSP's data of GDT and IDT
>>//
>
> This moved the relocation of the InitializeFloatingPointUnits()
> reference (or, with XCODE5, the RIP-relative
> InitializeFloatingPointUnits() reference) from the assembly code
> (which is copied around) to the C code (which is not copied around).
>
> (2) Unfortunately, the C-language assignment to
> "MP_CPU_EXCHANGE_INFO.InitializeFloatingPointUnitsAddress" is
> miscompiled under the following circumstances:
>
> - the edk2 build target is DEBUG (or RELEASE),
> - the edk2 toolchain selection is GCC5,
> - (from the above two it follows that -Os and LTO are enabled,)
> - MpInitLib is built for X64,
> - and the compiler is gcc-7.1 (shipped in Fedora-26 e.g.).
>
> (If the PEI phase is 64-bit, then MpInitLib breaks in CpuMpPei. If the
> PEI phase is 32-bit and the DXE phase is 64-bit, then MpInitLib breaks
> in CpuDxe. If the DXE phase is 32-bit, then nothing breaks.)
>
> Below I'll use the NOOPT/GCC5/X64/gcc-7.1 build of CpuMpPei to show
> the correct assembly code generated by gcc for the assignment, and
> flip the build target to DEBUG for showing the broken assembly code.
>
> (3) Correct code for the assignment:
>
>> 5406:   48 8d 15 73 24 00 00lea0x2473(%rip),%rdx# 
>> 7880 
>> 540d:   48 8b 45 f8 mov-0x8(%rbp),%rax
>> 5411:   48 89 90 84 00 00 00mov%rdx,0x84(%rax)
>
> Here the absolute address of InitializeFloatingPointUnits() is loaded
> into RDX with RIP-relative addressing. Then RAX is pointed to the
> ExchangeInfo structure, and finally RDX is stored into
>  

Re: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO

2017-08-10 Thread Shi, Steven
Hi Laszlo,

I'm trying to reproduce your boot failure with OVMF in my Ubuntu system, but 
not succeed.  My GCC was built from GCC main trunk code in 20170601, and my ld 
linker is version 2.28. Could you try the ld 2.28 with your gcc-7.1 and check 
whether it works in your side?  Or, do you know where can I download the 
gcc-7.1 pre-built binaries?





jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ gcc --version

gcc (GCC) 8.0.0 20170601 (experimental)



jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ ld --version

GNU ld (GNU Binutils) 2.28



Below are my trying steps, and the Qemu can boot well into shell:

jshi19@jshi19-desktop:~/wksp_efi$ git clone https://github.com/lersek/edk2.git

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ git checkout gcc7_lto

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ git checkout 
7ef0dae092afcfb6fab7e8372c78097672168c4a

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Build -r

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/tools_def.txt

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/target.txt

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/build_rule.txt

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ source edksetup.sh

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ make -C BaseTools/Source/C/ clean

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ make -C BaseTools/Source/C/
jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p OvmfPkg/OvmfPkgX64.dsc 
-t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a X64 -b DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p 
OvmfPkg/OvmfPkgIa32X64.dsc -t GCC5 -DSECURE_BOOT_ENABLE=TRUE -a IA32 -a X64 -b 
DEBUG  -DDEBUG_ON_SERIAL_PORT -n 5

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/OvmfX64/DEBUG_GCC5/FV$ 
qemu-system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
-monitor stdio --enable-kvm

jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/Ovmf3264/DEBUG_GCC5/FV$ 
qemu-system-x86_64  -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. 
-monitor stdio --enable-kvm







Steven Shi

Intel\SSG\STO\UEFI Firmware



Tel: +86 021-61166522

iNet: 821-6522





> -Original Message-

> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of

> Laszlo Ersek

> Sent: Friday, August 11, 2017 8:34 AM

> To: edk2-devel-01 

> Cc: Ard Biesheuvel ; Justen, Jordan L

> ; Gao, Liming ; Kinney,

> Michael D 

> Subject: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large

> code model for X64/GCC5/LTO

>

> Several users have recently reported boot failures with OVMF. The symptoms

> are: blank screen and all VCPUs pegged at 100%. Alex Williamson has found

> the commit (via bisection) that exposes the issue. In this patch I'll

> attempt to analyze the symptoms and fix the root problem.

>

> (1) "UefiCpuPkg/Library/MpInitLib" is an IA32/X64 library instance that

> provides multiprocessing services. It is built into the CpuMpPei

> module (for the PEI phase) and the CpuDxe module (for the DXE and BDS

> phases). When either of these modules starts up during boot, it

> briefly boots up all of the CPUs, perfoms some counting /

> synchronization, and then all the APs (application processors) are

> quiesced until further work arrives.

>

> The APs boot in real mode, and need assembly language init code (a

> "reset vector") that gradually takes them into 64-bit mode, before

> they can call back into normal C code. The reset vector code is

> assembled at build time, but it is copied as data under 1MB (for real

> mode execution by the APs) during boot, whenever CpuMpPei or CpuDxe

> launches.

>

> Part of the reset vector code is FPU initialization. The reset vector

> in "UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm" calls the assembly

> language function InitializeFloatingPointUnits(), which is however

> provided by an external library. Normally, when the object files were

> linked together, this function call would result in an absolute

> reference (and matching relocation, performed at module loading), and

> the instance of the reset vector that was copied under 1MB during boot

> would refer to the same, unchanged, absolute address of

> InitializeFloatingPointUnits(), residing in CpuMpPei or CpuDxe.

>

> This didn't work with X64 XCODE5 builds however. XCODE5 prefers

> RIP-relative addressing for X64 (i.e., the assembly language function

> call depended on the distance between the call site and the callee).

> When the call site was copied under 1MB but

> InitializeFloatingPointUnits() would not move, the call broke.

>

> This was tracked in TianoCore BZ#565 and fixed in commit 3b2928b46987

> ("UefiCpuPkg/MpInitLib: Fix X64 XCODE5/NASM compatibility issues",

> 2017-05-17). The solution was to add another function pointer to the

> MP_CPU_EXCHANGE_INFO communication area. (The function pointer

> would