Re: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-27 Thread Laszlo Ersek
On 09/27/18 11:28, Shao, Ming wrote:
> Hi Laszlo,
>
> I build Ruiyu's code with gcc 4.8.5 as X64. And got below disassembled
> code:
>
> [cid:image002.jpg@01D45686.AFEDF140]
>
> So I didn't see register used as both destination and source of xadd
> instruction.
>
> Then I build your patch, and got exactly the same disassembled code.
>
> Both Ray's patch and yours can pass my test.

That's not surprising: the gcc documentation says that not marking
input-output operands as such may, or may not, break the code. Such a
bug in the operand list of the inline assembly is not *required* to
break the code, it just *can*. The only guarantee given is that, if the
input-output operands are all marked properly as such, then the code
will not break.

> Could you provide more details about your environment so I can
> reproduce this issue? Thanks.

Sure. My compiler version is the following (on RHEL-7.5 Workstation):

- gcc-4.8.5-28.el7_5.1.x86_64

Binutils (not that I think it matters in this case, but anyway):

- binutils-2.27-28.base.el7_5.1.x86_64

The OVMF build command line was:

  build \
-a X64 \
-p OvmfPkg/OvmfPkgX64.dsc \
-t GCC48 \
-b NOOPT \
-D SECURE_BOOT_ENABLE \
-D TLS_ENABLE \
-D NETWORK_IP6_ENABLE \
-D HTTP_BOOT_ENABLE \
-n 4 \
--report-file=.../build.ovmf.report \
--log=.../build.ovmf.log

It's very likely that the gcc version in RHEL-7.5 has a large number of
backports from later upstream changes. The base version is indeed 4.8.5,
but the internal build number is "-28.el7_5.1". That could mean a huge
number of patches. I'll append the relevant part of the RPM changelog at
the bottom of my email.

I'd just like to point out that this is not a gcc bug; the compiler
behaves according to its documentation.

Thanks,
Laszlo

> * Tue Mar 27 2018 Jeff Law  4.8.5-29
> - s390 retpoline support for spectre mitigation (#1552021)
>
> * Fri Jan 19 2018 Marek Polacek  4.8.5-28
> - Minor testsuite fixes to clean up test results (#1469697)
> - retpoline support for spectre mitigation (#1535655)
>
> * Thu Jan 11 2018 Marek Polacek  4.8.5-27
> - bump for rebuild with RELRO enabled even for ppc64/ppc64le
>
> * Thu Jan 04 2018 Jeff Law  4.8.5-26
> - Avoid red zone probing for zero residual dynamic allocation (#1469697)
> - Avoid bogus CFIs for probes in noreturn fucntions on x86/x86_64 (#1469697)
>
> * Tue Nov 14 2017 Jeff Law  4.8.5-25
> - Avoid red zone probe on aarch64 (#1469697)
>
> * Mon Nov 06 2017 Jeff Law  4.8.5-24
> - Sync gcc48-rh1469697-13 patch to upstream (#1469697)
> - Avoid probing in the red zone for noreturn functions (#1507980, #1469697)
> - Avoid infinite loop if probing interval is less than guard size (#1469697)
> - Fix debug information for large probing interval on aarch64 (#1469697)
> - Fix ICE on ppc port with large probing interval (#1469697)
>
> - rebuild to remove static relocations not known to older linkers (#1508968)
>
> * Thu Nov 02 2017 Marek Polacek  4.8.5-23
> - rebuild to remove static relocations not known to older linkers (#1508968)
>
> * Thu Oct 19 2017 Marek Polacek  4.8.5-22
> - fix gcc.c-torture/execute/pr80692.x
> - fix divmod expansion (PR middle-end/78416)
>
> * Wed Oct 18 2017 Marek Polacek  4.8.5-21
> - fix 27_io/basic_fstream/53984.cc
> - fix for classes with bases with mutable members (PR c++/77375)
> - fix handling side-effects of parameters (PR c/77767)
> - fix combine's make_extraction (PR rtl-optimization/78378)
> - fix gimplification of const var initialization from COND_EXPR (PR c++/80129)
> - fix -A / -B to A / B folding (PR middle-end/80362)
> - fix comparison of decimal float zeroes (PR middle-end/80692)
> - fix __mulv[dt]i3 and expand_mul_overflow (PR target/82274)
>
> * Thu Oct 12 2017 Marek Polacek  4.8.5-20
> - handle exceptions in basic_istream::sentry (#1469384)
> - don't run pr63354.c on ppc (#1468546)
> - ensure proxy privatization safety (#1491395)
> - fix incorrect codegen from rdseed intrinsic use (#1482762, CVE-2017-11671)
> - on aarch64, remove libatomic.so (#1465510)
>
> * Sun Oct 08 2017 Jeff Law  4.8.5-19
> - Backport stack clash protection from upstream (#1469697)
>
> * Fri Oct 06 2017 Marek Polacek  4.8.5-18
> - backport several -mprofile-kernel fixes (#1468546)
>
> * Thu Sep 14 2017 Marek Polacek  4.8.5-17
> - fix -mcpu=power8 atomic expansion (#1437220, PR target/69644)
> - fix .toc alignment (#1487434)
>
> * Thu Jun 01 2017 Marek Polacek  4.8.5-16
> - disable emitting profiling in functions marked with a special
>   attribute (#1457969)
>
> * Wed May 31 2017 Marek Polacek  4.8.5-15
> - properly apply the PR70549 patch (#1349067)
>
> * Mon Mar 13 2017 Marek Polacek  4.8.5-14
> - promote reloads of a PLUS to RELOAD_OTHER (#1402585)
>
> * Fri Mar 10 2017 Jakub Jelinek  4.8.5-13
> - add -mstack-protector-guard={tls,global}, -mstack-protector-guard-reg=
>   and -mstack-protector-guard-offset= options on ppc* (#1415952,
>   PR target/78875)
>
> * Tue Mar 07 2017 Jakub Jelinek  4.8.5-12
> - fix up 

Re: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-27 Thread Shao, Ming
Hi Laszlo,



I build Ruiyu's code with gcc 4.8.5 as X64. And got below disassembled code:



[cid:image001.jpg@01D45689.F8A3F530]



So I didn't see register used as both destination and source of xadd 
instruction.



Then I build your patch, and got exactly the same disassembled code.



Both Ruiyu's patch and yours can pass my test.



Could you provide more details about your environment so I can reproduce this 
issue? Thanks.



My environment:

l  Ubuntu 18.04.1 LTS

l  gcc (Ubuntu 4.8.5-4ubuntu8) 4.8.5





-Ming


The content of this message is my personal opinion only and although I am an 
employee of Intel, the statements I make here in no way represent Intel's 
position on the issue, nor am I authorized to speak on behalf of Intel on this 
matter.





-Original Message-
From: Ni, Ruiyu
Sent: Wednesday, September 26, 2018 5:35 PM
To: Shao, Ming 
Subject: Fwd: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands 
in GCC IA32/X64 assembly









 Forwarded Message 

Subject: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC 
IA32/X64 assembly

Date: Tue, 25 Sep 2018 21:48:57 +0200

From: Laszlo Ersek mailto:ler...@redhat.com>>

To: edk2-devel-01 mailto:edk2-devel@lists.01.org>>

CC: Michael Kinney 
mailto:michael.d.kin...@intel.com>>, Ruiyu Ni 
mailto:ruiyu...@intel.com>>, Jiewen Yao 
mailto:jiewen@intel.com>>, Liming Gao 
mailto:liming@intel.com>>



Currently, "gcc-4.8.5-28.el7_5.1.x86_64" generates the following code for me, 
from the XADD inline assembly added to "X64/GccInline.c" in commit

17634d026f96:



> 4383 :

> UINT32

> EFIAPI

> InternalSyncIncrement (

>   IN  volatile UINT32*Value

>   )

> {

> 4383:   55  push   %rbp

> 4384:   48 89 e5mov%rsp,%rbp

> 4387:   48 83 ec 10 sub$0x10,%rsp

> 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)

>   UINT32  Result;

>

>   __asm__ __volatile__ (

> 438f:   48 8b 55 10 mov0x10(%rbp),%rdx

> 4393:   48 8b 45 10 mov0x10(%rbp),%rax

> 4397:   b8 01 00 00 00  mov$0x1,%eax

> 439c:   f0 0f c1 00 lock xadd %eax,(%rax)

> 43a0:   ff c0   inc%eax

> 43a2:   89 45 fcmov%eax,-0x4(%rbp)

> : "m"  (*Value)   // %2

> : "memory",

>   "cc"

> );

>

>   return Result;

> 43a5:   8b 45 fcmov-0x4(%rbp),%eax

> }

> 43a8:   c9  leaveq

> 43a9:   c3  retq

>



The MOV $0X1,%EAX instruction corrupts the address of Value in %RAX before we 
reach the XADD instruction. In fact, it makes no sense for XADD to use %EAX as 
source operand and (%RAX) as destination operand at the same time.



The XADD instruction's destination operand is a read-write operand. The GCC 
documentation states:



> The ordinary output operands must be write-only; GCC will assume that

> the values in these operands before the instruction are dead and need

> not be generated. Extended asm supports input-output or read-write

> operands. Use the constraint character `+' to indicate such an operand

> and list it with the output operands. You should only use read-write

> operands when the constraints for the operand (or the operand in which

> only some of the bits are to be changed) allow a register.



(The above is intentionally quoted from the oldest GCC release that edk2 
supports, namely gcc-4.4:

<https://gcc.gnu.org/onlinedocs/gcc-4.4.7/gcc/Extended-Asm.html>.)



Fix the operand list accordingly.



With the patch applied, I get:



> 4383 :

> UINT32

> EFIAPI

> InternalSyncIncrement (

>   IN  volatile UINT32*Value

>   )

> {

> 4383:   55  push   %rbp

> 4384:   48 89 e5mov%rsp,%rbp

> 4387:   48 83 ec 10 sub$0x10,%rsp

> 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)

>   UINT32  Result;

>

>   __asm__ __volatile__ (

> 438f:   48 8b 55 10 mov0x10(%rbp),%rdx

> 4393:   48 8b 45 10 mov0x10(%rbp),%rax

> 4397:   b8 01 00 00 00  mov$0x1,%eax

> 439c:   f0 0f c1 02 lock xadd %eax,(%rdx)

> 43a0:   ff c0   inc%eax

> 43a2:   89 45 fcmov%eax,-0x4(%rbp)

> : // no inputs that aren't also outputs

> : "memory",

>   "cc"

> );

>

>   return

Re: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-26 Thread Laszlo Ersek
On 09/26/18 11:34, Ni, Ruiyu wrote:
> On 9/26/2018 5:05 PM, Laszlo Ersek wrote:
>> Hi,
>>
>> On 09/25/18 21:48, Laszlo Ersek wrote:
>>> Currently, "gcc-4.8.5-28.el7_5.1.x86_64" generates the following code
>>> for
>>> me, from the XADD inline assembly added to "X64/GccInline.c" in commit
>>> 17634d026f96:
>>>
 4383 :
 UINT32
 EFIAPI
 InternalSyncIncrement (
    IN  volatile UINT32    *Value
    )
 {
  4383:   55  push   %rbp
  4384:   48 89 e5    mov    %rsp,%rbp
  4387:   48 83 ec 10 sub    $0x10,%rsp
  438b:   48 89 4d 10 mov    %rcx,0x10(%rbp)
    UINT32  Result;

    __asm__ __volatile__ (
  438f:   48 8b 55 10 mov    0x10(%rbp),%rdx
  4393:   48 8b 45 10 mov    0x10(%rbp),%rax
  4397:   b8 01 00 00 00  mov    $0x1,%eax
  439c:   f0 0f c1 00 lock xadd %eax,(%rax)
  43a0:   ff c0   inc    %eax
  43a2:   89 45 fc    mov    %eax,-0x4(%rbp)
  : "m"  (*Value)   // %2
  : "memory",
    "cc"
  );

    return Result;
  43a5:   8b 45 fc    mov    -0x4(%rbp),%eax
 }
  43a8:   c9  leaveq
  43a9:   c3  retq

>>>
>>> The MOV $0X1,%EAX instruction corrupts the address of Value in %RAX
>>> before
>>> we reach the XADD instruction. In fact, it makes no sense for XADD to
>>> use
>>> %EAX as source operand and (%RAX) as destination operand at the same
>>> time.
>>
>> may I get a fast review for this patch, please? The regression from
>> commit 17634d026f96 prevents OVMF from booting.
> 
> Sure. Reviewed-by: Ruiyu Ni 

Thanks, Ray! Commit 8a94eb9283fa.
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-26 Thread Ni, Ruiyu

On 9/26/2018 5:05 PM, Laszlo Ersek wrote:

Hi,

On 09/25/18 21:48, Laszlo Ersek wrote:

Currently, "gcc-4.8.5-28.el7_5.1.x86_64" generates the following code for
me, from the XADD inline assembly added to "X64/GccInline.c" in commit
17634d026f96:


4383 :
UINT32
EFIAPI
InternalSyncIncrement (
   IN  volatile UINT32*Value
   )
{
 4383:   55  push   %rbp
 4384:   48 89 e5mov%rsp,%rbp
 4387:   48 83 ec 10 sub$0x10,%rsp
 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)
   UINT32  Result;

   __asm__ __volatile__ (
 438f:   48 8b 55 10 mov0x10(%rbp),%rdx
 4393:   48 8b 45 10 mov0x10(%rbp),%rax
 4397:   b8 01 00 00 00  mov$0x1,%eax
 439c:   f0 0f c1 00 lock xadd %eax,(%rax)
 43a0:   ff c0   inc%eax
 43a2:   89 45 fcmov%eax,-0x4(%rbp)
 : "m"  (*Value)   // %2
 : "memory",
   "cc"
 );

   return Result;
 43a5:   8b 45 fcmov-0x4(%rbp),%eax
}
 43a8:   c9  leaveq
 43a9:   c3  retq



The MOV $0X1,%EAX instruction corrupts the address of Value in %RAX before
we reach the XADD instruction. In fact, it makes no sense for XADD to use
%EAX as source operand and (%RAX) as destination operand at the same time.


may I get a fast review for this patch, please? The regression from
commit 17634d026f96 prevents OVMF from booting.


Sure. Reviewed-by: Ruiyu Ni 


Thanks
Laszlo




--
Thanks,
Ray
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-26 Thread Laszlo Ersek
Hi,

On 09/25/18 21:48, Laszlo Ersek wrote:
> Currently, "gcc-4.8.5-28.el7_5.1.x86_64" generates the following code for
> me, from the XADD inline assembly added to "X64/GccInline.c" in commit
> 17634d026f96:
> 
>> 4383 :
>> UINT32
>> EFIAPI
>> InternalSyncIncrement (
>>   IN  volatile UINT32*Value
>>   )
>> {
>> 4383:   55  push   %rbp
>> 4384:   48 89 e5mov%rsp,%rbp
>> 4387:   48 83 ec 10 sub$0x10,%rsp
>> 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)
>>   UINT32  Result;
>>
>>   __asm__ __volatile__ (
>> 438f:   48 8b 55 10 mov0x10(%rbp),%rdx
>> 4393:   48 8b 45 10 mov0x10(%rbp),%rax
>> 4397:   b8 01 00 00 00  mov$0x1,%eax
>> 439c:   f0 0f c1 00 lock xadd %eax,(%rax)
>> 43a0:   ff c0   inc%eax
>> 43a2:   89 45 fcmov%eax,-0x4(%rbp)
>> : "m"  (*Value)   // %2
>> : "memory",
>>   "cc"
>> );
>>
>>   return Result;
>> 43a5:   8b 45 fcmov-0x4(%rbp),%eax
>> }
>> 43a8:   c9  leaveq
>> 43a9:   c3  retq
>>
> 
> The MOV $0X1,%EAX instruction corrupts the address of Value in %RAX before
> we reach the XADD instruction. In fact, it makes no sense for XADD to use
> %EAX as source operand and (%RAX) as destination operand at the same time.

may I get a fast review for this patch, please? The regression from
commit 17634d026f96 prevents OVMF from booting.

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] [PATCH] MdePkg/BaseSynchronizationLib: fix XADD operands in GCC IA32/X64 assembly

2018-09-25 Thread Laszlo Ersek
Currently, "gcc-4.8.5-28.el7_5.1.x86_64" generates the following code for
me, from the XADD inline assembly added to "X64/GccInline.c" in commit
17634d026f96:

> 4383 :
> UINT32
> EFIAPI
> InternalSyncIncrement (
>   IN  volatile UINT32*Value
>   )
> {
> 4383:   55  push   %rbp
> 4384:   48 89 e5mov%rsp,%rbp
> 4387:   48 83 ec 10 sub$0x10,%rsp
> 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)
>   UINT32  Result;
>
>   __asm__ __volatile__ (
> 438f:   48 8b 55 10 mov0x10(%rbp),%rdx
> 4393:   48 8b 45 10 mov0x10(%rbp),%rax
> 4397:   b8 01 00 00 00  mov$0x1,%eax
> 439c:   f0 0f c1 00 lock xadd %eax,(%rax)
> 43a0:   ff c0   inc%eax
> 43a2:   89 45 fcmov%eax,-0x4(%rbp)
> : "m"  (*Value)   // %2
> : "memory",
>   "cc"
> );
>
>   return Result;
> 43a5:   8b 45 fcmov-0x4(%rbp),%eax
> }
> 43a8:   c9  leaveq
> 43a9:   c3  retq
>

The MOV $0X1,%EAX instruction corrupts the address of Value in %RAX before
we reach the XADD instruction. In fact, it makes no sense for XADD to use
%EAX as source operand and (%RAX) as destination operand at the same time.

The XADD instruction's destination operand is a read-write operand. The
GCC documentation states:

> The ordinary output operands must be write-only; GCC will assume that
> the values in these operands before the instruction are dead and need
> not be generated. Extended asm supports input-output or read-write
> operands. Use the constraint character `+' to indicate such an operand
> and list it with the output operands. You should only use read-write
> operands when the constraints for the operand (or the operand in which
> only some of the bits are to be changed) allow a register.

(The above is intentionally quoted from the oldest GCC release that edk2
supports, namely gcc-4.4:
.)

Fix the operand list accordingly.

With the patch applied, I get:

> 4383 :
> UINT32
> EFIAPI
> InternalSyncIncrement (
>   IN  volatile UINT32*Value
>   )
> {
> 4383:   55  push   %rbp
> 4384:   48 89 e5mov%rsp,%rbp
> 4387:   48 83 ec 10 sub$0x10,%rsp
> 438b:   48 89 4d 10 mov%rcx,0x10(%rbp)
>   UINT32  Result;
>
>   __asm__ __volatile__ (
> 438f:   48 8b 55 10 mov0x10(%rbp),%rdx
> 4393:   48 8b 45 10 mov0x10(%rbp),%rax
> 4397:   b8 01 00 00 00  mov$0x1,%eax
> 439c:   f0 0f c1 02 lock xadd %eax,(%rdx)
> 43a0:   ff c0   inc%eax
> 43a2:   89 45 fcmov%eax,-0x4(%rbp)
> : // no inputs that aren't also outputs
> : "memory",
>   "cc"
> );
>
>   return Result;
> 43a5:   8b 45 fcmov-0x4(%rbp),%eax
> }
> 43a8:   c9  leaveq
> 43a9:   c3  retq

Note that some other bugs remain in
"BaseSynchronizationLib/*/GccInline.c"; those should be addressed later,
under .

Cc: Jiewen Yao 
Cc: Liming Gao 
Cc: Michael Kinney 
Cc: Ruiyu Ni 
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1207
Fixes: 17634d026f968c404b039a8d8431b6389dd396ea
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Laszlo Ersek 
---

Notes:
Repo:   https://github.com/lersek/edk2.git
Branch: xadd_rw

 MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c | 12 ++--
 MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c  | 12 ++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c 
b/MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
index d82e0205f553..fa2be7f4b35c 100644
--- a/MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
+++ b/MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
@@ -38,11 +38,11 @@ InternalSyncIncrement (
   __asm__ __volatile__ (
 "movl$1, %%eax  \n\t"
 "lock   \n\t"
-"xadd%%eax, %2  \n\t"
+"xadd%%eax, %1  \n\t"
 "inc %%eax  "
 : "=a" (Result),  // %0
-  "=m" (*Value)   // %1
-: "m"  (*Value)   // %2
+  "+m" (*Value)   // %1
+: // no inputs that aren't also outputs
 : "memory",
   "cc"
 );
@@ -75,11 +75,11 @@ InternalSyncDecrement (
   __asm__ __volatile__ (
 "movl$-1, %%eax  \n\t"
 "lock\n\t"
-"xadd%%eax, %2   \n\t"
+"xadd%%eax, %1   \n\t"