About 31109 - gprofng not built and installed in a combined binutils+gcc build

2024-01-30 Thread Vladimir Mezentsev

Hi,

I asked in https://sourceware.org/bugzilla/show_bug.cgi?id=31109
> I prepared a patch for the releases/gcc-13 branch.
> Richard Biener  rejected my patch for 
this branch.

> Which branch should I use? master, trunk or something else?

Do you really need gprofng in the gcc repo ?
if yes:
  the fix is trivial.
  I did for the releases/gcc-13 branch:
 git cherry-pick 24552056fd5fc677c0d032f54a5cad1c4303d312
  Can anyone do the same for the correct branch.
  I have no write permissions for gcc.gnu.org/git/gcc.git

  I maintain binutils-gdb/gprofng. Who will maintain gcc/gprofng ?

If no:
 may I close 31109 ?

Thank you,
-Vladimir

.



Fwd: [PATCH] gprofng: a new GNU profiler

2023-12-19 Thread Vladimir Mezentsev

PING.

If the patch is approved, can anyone apply this patch.
I don't have permissions for `git push`.

Thank you,
-Vladimir


 Forwarded Message 
Subject:[PATCH] gprofng: a new GNU profiler
Date:   Wed, 13 Dec 2023 17:23:01 -0800
From:   vladimir.mezent...@oracle.com
To: gcc-patches@gcc.gnu.org
CC: Vladimir Mezentsev 



From: Vladimir Mezentsev 

This is fixes for releases/gcc-13 for 31109 gprofng not built and 
installed in a combined binutils+gcc build
I only cherry-picked 24552056fd5fc677c0d032f54a5cad1c4303d312 and tested 
my build.


ChangeLog:

* Makefile.def: Add gprofng module.
* configure.ac: Add --enable-gprofng option.
* Makefile.in: Regenerate.
* configure: Regenerate.

include/ChangeLog:

* collectorAPI.h: New file.
* libcollector.h: New file.
* libfcollector.h: New file.
---
Makefile.def | 11 +
Makefile.in | 497 
configure | 18 ++
configure.ac | 14 ++
include/collectorAPI.h | 73 ++
include/libcollector.h | 89 +++
include/libfcollector.h | 42 
7 files changed, 744 insertions(+)
create mode 100644 include/collectorAPI.h
create mode 100644 include/libcollector.h
create mode 100644 include/libfcollector.h

diff --git a/Makefile.def b/Makefile.def
index 35e994eb77e..9d43b78b51b 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -72,6 +72,7 @@ host_modules= { module= isl; lib_path=.libs; 
bootstrap=true;

no_install= true; };
host_modules= { module= gold; bootstrap=true; };
host_modules= { module= gprof; };
+host_modules= { module= gprofng; };
// intl acts on 'host_shared' directly, and does not support --with-pic.
host_modules= { module= intl; bootstrap=true; };
host_modules= { module= tcl;
@@ -510,6 +511,16 @@ dependencies = { module=all-gprof; on=all-bfd; };
dependencies = { module=all-gprof; on=all-opcodes; };
dependencies = { module=all-gprof; on=all-intl; };
dependencies = { module=all-gprof; on=all-gas; };
+
+dependencies = { module=configure-gprofng; on=configure-intl; };
+dependencies = { module=all-gprofng; on=all-libiberty; };
+dependencies = { module=all-gprofng; on=all-bfd; };
+dependencies = { module=all-gprofng; on=all-opcodes; };
+dependencies = { module=all-gprofng; on=all-intl; };
+dependencies = { module=all-gprofng; on=all-gas; };
+dependencies = { module=install-gprofng; on=install-opcodes; };
+dependencies = { module=install-gprofng; on=install-bfd; };
+
dependencies = { module=configure-ld; on=configure-intl; };
dependencies = { module=all-ld; on=all-libiberty; };
dependencies = { module=all-ld; on=all-bfd; };
diff --git a/Makefile.in b/Makefile.in
index 06a9398e172..2ea8943dc3b 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -1084,6 +1084,7 @@ configure-host: \
maybe-configure-isl \
maybe-configure-gold \
maybe-configure-gprof \
+ maybe-configure-gprofng \
maybe-configure-intl \
maybe-configure-tcl \
maybe-configure-itcl \
@@ -1237,6 +1238,7 @@ all-host: maybe-all-isl
all-host: maybe-all-gold
@endif gold-no-bootstrap
all-host: maybe-all-gprof
+all-host: maybe-all-gprofng
@if intl-no-bootstrap
all-host: maybe-all-intl
@endif intl-no-bootstrap
@@ -1376,6 +1378,7 @@ info-host: maybe-info-mpc
info-host: maybe-info-isl
info-host: maybe-info-gold
info-host: maybe-info-gprof
+info-host: maybe-info-gprofng
info-host: maybe-info-intl
info-host: maybe-info-tcl
info-host: maybe-info-itcl
@@ -1466,6 +1469,7 @@ dvi-host: maybe-dvi-mpc
dvi-host: maybe-dvi-isl
dvi-host: maybe-dvi-gold
dvi-host: maybe-dvi-gprof
+dvi-host: maybe-dvi-gprofng
dvi-host: maybe-dvi-intl
dvi-host: maybe-dvi-tcl
dvi-host: maybe-dvi-itcl
@@ -1556,6 +1560,7 @@ pdf-host: maybe-pdf-mpc
pdf-host: maybe-pdf-isl
pdf-host: maybe-pdf-gold
pdf-host: maybe-pdf-gprof
+pdf-host: maybe-pdf-gprofng
pdf-host: maybe-pdf-intl
pdf-host: maybe-pdf-tcl
pdf-host: maybe-pdf-itcl
@@ -1646,6 +1651,7 @@ html-host: maybe-html-mpc
html-host: maybe-html-isl
html-host: maybe-html-gold
html-host: maybe-html-gprof
+html-host: maybe-html-gprofng
html-host: maybe-html-intl
html-host: maybe-html-tcl
html-host: maybe-html-itcl
@@ -1736,6 +1742,7 @@ TAGS-host: maybe-TAGS-mpc
TAGS-host: maybe-TAGS-isl
TAGS-host: maybe-TAGS-gold
TAGS-host: maybe-TAGS-gprof
+TAGS-host: maybe-TAGS-gprofng
TAGS-host: maybe-TAGS-intl
TAGS-host: maybe-TAGS-tcl
TAGS-host: maybe-TAGS-itcl
@@ -1826,6 +1833,7 @@ install-info-host: maybe-install-info-mpc
install-info-host: maybe-install-info-isl
install-info-host: maybe-install-info-gold
install-info-host: maybe-install-info-gprof
+install-info-host: maybe-install-info-gprofng
install-info-host: maybe-install-info-intl
install-info-host: maybe-install-info-tcl
install-info-host: maybe-install-info-itcl
@@ -1916,6 +1924,7 @@ install-dvi-host: maybe-install-dvi-mpc
install-dvi-host: maybe-install-dvi-isl
install-dvi-host: maybe-install-dvi-gold
install-dvi-host: maybe-install-dvi-gprof
+install-dvi-host: maybe-install-dvi-gprofng
install-dvi-host: maybe-install-dvi-intl
install-dvi-host: maybe-install-dvi-tcl
install-dvi-host: maybe-i

[PATCH] gprofng: a new GNU profiler

2023-12-13 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

This is fixes for releases/gcc-13 for 31109 gprofng not built and installed in 
a combined binutils+gcc build
I only cherry-picked 24552056fd5fc677c0d032f54a5cad1c4303d312 and tested my 
build.

ChangeLog:

* Makefile.def: Add gprofng module.
* configure.ac: Add --enable-gprofng option.
* Makefile.in: Regenerate.
* configure: Regenerate.

include/ChangeLog:

* collectorAPI.h: New file.
* libcollector.h: New file.
* libfcollector.h: New file.
---
 Makefile.def|  11 +
 Makefile.in | 497 
 configure   |  18 ++
 configure.ac|  14 ++
 include/collectorAPI.h  |  73 ++
 include/libcollector.h  |  89 +++
 include/libfcollector.h |  42 
 7 files changed, 744 insertions(+)
 create mode 100644 include/collectorAPI.h
 create mode 100644 include/libcollector.h
 create mode 100644 include/libfcollector.h

diff --git a/Makefile.def b/Makefile.def
index 35e994eb77e..9d43b78b51b 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -72,6 +72,7 @@ host_modules= { module= isl; lib_path=.libs; bootstrap=true;
no_install= true; };
 host_modules= { module= gold; bootstrap=true; };
 host_modules= { module= gprof; };
+host_modules= { module= gprofng; };
 // intl acts on 'host_shared' directly, and does not support --with-pic.
 host_modules= { module= intl; bootstrap=true; };
 host_modules= { module= tcl;
@@ -510,6 +511,16 @@ dependencies = { module=all-gprof; on=all-bfd; };
 dependencies = { module=all-gprof; on=all-opcodes; };
 dependencies = { module=all-gprof; on=all-intl; };
 dependencies = { module=all-gprof; on=all-gas; };
+
+dependencies = { module=configure-gprofng; on=configure-intl; };
+dependencies = { module=all-gprofng; on=all-libiberty; };
+dependencies = { module=all-gprofng; on=all-bfd; };
+dependencies = { module=all-gprofng; on=all-opcodes; };
+dependencies = { module=all-gprofng; on=all-intl; };
+dependencies = { module=all-gprofng; on=all-gas; };
+dependencies = { module=install-gprofng; on=install-opcodes; };
+dependencies = { module=install-gprofng; on=install-bfd; };
+
 dependencies = { module=configure-ld; on=configure-intl; };
 dependencies = { module=all-ld; on=all-libiberty; };
 dependencies = { module=all-ld; on=all-bfd; };
diff --git a/Makefile.in b/Makefile.in
index 06a9398e172..2ea8943dc3b 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -1084,6 +1084,7 @@ configure-host:  \
 maybe-configure-isl \
 maybe-configure-gold \
 maybe-configure-gprof \
+maybe-configure-gprofng \
 maybe-configure-intl \
 maybe-configure-tcl \
 maybe-configure-itcl \
@@ -1237,6 +1238,7 @@ all-host: maybe-all-isl
 all-host: maybe-all-gold
 @endif gold-no-bootstrap
 all-host: maybe-all-gprof
+all-host: maybe-all-gprofng
 @if intl-no-bootstrap
 all-host: maybe-all-intl
 @endif intl-no-bootstrap
@@ -1376,6 +1378,7 @@ info-host: maybe-info-mpc
 info-host: maybe-info-isl
 info-host: maybe-info-gold
 info-host: maybe-info-gprof
+info-host: maybe-info-gprofng
 info-host: maybe-info-intl
 info-host: maybe-info-tcl
 info-host: maybe-info-itcl
@@ -1466,6 +1469,7 @@ dvi-host: maybe-dvi-mpc
 dvi-host: maybe-dvi-isl
 dvi-host: maybe-dvi-gold
 dvi-host: maybe-dvi-gprof
+dvi-host: maybe-dvi-gprofng
 dvi-host: maybe-dvi-intl
 dvi-host: maybe-dvi-tcl
 dvi-host: maybe-dvi-itcl
@@ -1556,6 +1560,7 @@ pdf-host: maybe-pdf-mpc
 pdf-host: maybe-pdf-isl
 pdf-host: maybe-pdf-gold
 pdf-host: maybe-pdf-gprof
+pdf-host: maybe-pdf-gprofng
 pdf-host: maybe-pdf-intl
 pdf-host: maybe-pdf-tcl
 pdf-host: maybe-pdf-itcl
@@ -1646,6 +1651,7 @@ html-host: maybe-html-mpc
 html-host: maybe-html-isl
 html-host: maybe-html-gold
 html-host: maybe-html-gprof
+html-host: maybe-html-gprofng
 html-host: maybe-html-intl
 html-host: maybe-html-tcl
 html-host: maybe-html-itcl
@@ -1736,6 +1742,7 @@ TAGS-host: maybe-TAGS-mpc
 TAGS-host: maybe-TAGS-isl
 TAGS-host: maybe-TAGS-gold
 TAGS-host: maybe-TAGS-gprof
+TAGS-host: maybe-TAGS-gprofng
 TAGS-host: maybe-TAGS-intl
 TAGS-host: maybe-TAGS-tcl
 TAGS-host: maybe-TAGS-itcl
@@ -1826,6 +1833,7 @@ install-info-host: maybe-install-info-mpc
 install-info-host: maybe-install-info-isl
 install-info-host: maybe-install-info-gold
 install-info-host: maybe-install-info-gprof
+install-info-host: maybe-install-info-gprofng
 install-info-host: maybe-install-info-intl
 install-info-host: maybe-install-info-tcl
 install-info-host: maybe-install-info-itcl
@@ -1916,6 +1924,7 @@ install-dvi-host: maybe-install-dvi-mpc
 install-dvi-host: maybe-install-dvi-isl
 install-dvi-host: maybe-install-dvi-gold
 install-dvi-host: maybe-install-dvi-gprof
+install-dvi-host: maybe-install-dvi-gprofng
 install-dvi-host: maybe-install-dvi-intl
 install-dvi-host: maybe-install-dvi-tcl
 install-dvi-host: maybe-install-dvi-itcl
@@ -2006,6 +2015,7 @@ install-pdf-host: maybe-install-pdf-mpc
 install-pdf-host: maybe-install-pdf-isl
 install-pdf-h

Re: [PATCH] PR gcc/84923 - gcc.dg/attr-weakref-1.c failed on aarch64

2018-05-21 Thread vladimir . mezentsev


On 05/21/2018 01:30 PM, Jeff Law wrote:
> On 05/17/2018 02:42 AM, Richard Biener wrote:
>> On Thu, 17 May 2018, Kyrill Tkachov wrote:
>>
>>> Hi,
>>>
>>> Given this is a midend change it's a good idea to CC some of the maintainers
>>> of that area.
>>> I've copied richi and Honza.
>> The patch is ok for trunk (it's actually mine...) and for the branch
>> after a while.
> Installed.
>
> Vladimir -- do you have write access?  
  No. I don't.

> ISTM it would make sense for you
> to have access if you plan to continue submitting changes.

yes. I plan to continue submitting changes.

-Vladimir

>
> Jeff



Re: [PATCH] PR gcc/84923 - gcc.dg/attr-weakref-1.c failed on aarch64

2018-05-16 Thread vladimir . mezentsev
Ping.

-Vladimir


On 05/10/2018 11:30 PM, vladimir.mezent...@oracle.com wrote:
> From: Vladimir Mezentsev 
>
> When weakref_targets is not empty a target cannot be removed from the weak 
> list.
> A small example is below when 'wv12' is removed from the weak list on aarch64:
>   static vtype Wv12 __attribute__((weakref ("wv12")));
>   extern vtype wv12 __attribute__((weak));
>
> Bootstrapped on aarch64-unknown-linux-gnu including (c,c++ and go).
> Tested on aarch64-linux-gnu.
> No regression. The attr-weakref-1.c test passed.
>
> ChangeLog:
> 2018-05-10  Vladimir Mezentsev  
>
> PR gcc/84923
> * varasm.c (weak_finish): clean up weak_decls
> ---
>  gcc/varasm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 85296b4..8cf6e1e 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -5652,7 +5652,8 @@ weak_finish (void)
>tree alias_decl = TREE_PURPOSE (t);
>tree target = ultimate_transparent_alias_target (&TREE_VALUE (t));
>  
> -  if (! TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (alias_decl)))
> +  if (! TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (alias_decl))
> + || TREE_SYMBOL_REFERENCED (target))
>   /* Remove alias_decl from the weak list, but leave entries for
>  the target alone.  */
>   target = NULL_TREE;



[PATCH] PR gcc/84923 - gcc.dg/attr-weakref-1.c failed on aarch64

2018-05-10 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

When weakref_targets is not empty a target cannot be removed from the weak list.
A small example is below when 'wv12' is removed from the weak list on aarch64:
  static vtype Wv12 __attribute__((weakref ("wv12")));
  extern vtype wv12 __attribute__((weak));

Bootstrapped on aarch64-unknown-linux-gnu including (c,c++ and go).
Tested on aarch64-linux-gnu.
No regression. The attr-weakref-1.c test passed.

ChangeLog:
2018-05-10  Vladimir Mezentsev  

PR gcc/84923
* varasm.c (weak_finish): clean up weak_decls
---
 gcc/varasm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 85296b4..8cf6e1e 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -5652,7 +5652,8 @@ weak_finish (void)
   tree alias_decl = TREE_PURPOSE (t);
   tree target = ultimate_transparent_alias_target (&TREE_VALUE (t));
 
-  if (! TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (alias_decl)))
+  if (! TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (alias_decl))
+ || TREE_SYMBOL_REFERENCED (target))
/* Remove alias_decl from the weak list, but leave entries for
   the target alone.  */
target = NULL_TREE;
-- 
1.8.3.1



[PATCH] PR gcc/84923 - gcc.dg/attr-weakref-1.c failed on aarch64

2018-04-12 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

When weakref_targets is not empty a target cannot be removed from weak_decls.
A small example is below when 'wv12' is removed from the weak list on aarch64:
  static vtype Wv12 __attribute__((weakref ("wv12")));
  extern vtype wv12 __attribute__((weak));

Bootstrapped on aarch64-unknown-linux-gnu including (c,c++ and go).
Tested on aarch64-linux-gnu.
No regression. The attr-weakref-1.c test passed.

ChangeLog:
2018-04-12  Vladimir Mezentsev  

PR gcc/84923
* varasm.c (weak_finish): clean up weak_decls
---
 gcc/varasm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/varasm.c b/gcc/varasm.c
index d24bac4..2a70234 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -5683,8 +5683,7 @@ weak_finish (void)
   nor multiple .weak directives for the latter.  */
for (p = &weak_decls; (t2 = *p) ; )
  {
-   if (TREE_VALUE (t2) == alias_decl
-   || target == DECL_ASSEMBLER_NAME (TREE_VALUE (t2)))
+   if (TREE_VALUE (t2) == alias_decl)
  *p = TREE_CHAIN (t2);
else
  p = &TREE_CHAIN (t2);
-- 
1.8.3.1



Re: [PATCH] PR gcc/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

2018-03-15 Thread vladimir . mezentsev

On 03/15/2018 07:07 AM, Bin.Cheng wrote:
> On Fri, Feb 16, 2018 at 5:18 PM,   wrote:
>> From: Vladimir Mezentsev 
>>
>> Ramana Radhakrishnan made a workaround in gcc/config/aarch64/aarch64.c to 
>> resolve
>> bootstrap comparison failure (2015-11-10, commit 
>> bc443a71dafa2e707bae4b2fa74f83b05dea37ab).
>> The real bug is in gcc/varasm.c.
>> hash_section() returns an unstable value.
>> As result, two blocks are created in get_block_for_section() for one unnamed 
>> section.
>> A list of objects in these blocks depends on the -gtoggle option.
>> I removed Ramana's workaround in gcc/config/aarch64/aarch64.c and  
>> I fixed hash_section() in gcc/varasm.c
>>
>> Bootstrapped on aarch64-unknown-linux-gnu including (c,c++ and go).
>> Testing finished ok.
> Hi Vladimir,
> Thanks for fixing the long standing issue, but this change causes below 
> failure,
> could you have a look?  Thanks
I did not see this failure.
I cannot tell why.
I use one CFARM machine (gcc116) and when my fix was approved I clean up
my directories
on this machine (including binaries and log files).

ramana.radhakrish...@arm.com implemented (2015-11-06, commit
cd4fcdb8ff16ec2dad56f9e736ac4552f8f52001)
a new feature ('Switch constant pools to separate rodata sections').
An additional fix is needed for this feature.

The test below demonstrates problem:
% cat t.c
extern void abort(void);
typedef int vtype;
static vtype Wv12 __attribute__((weakref ("wv12")));
extern vtype wv12 __attribute__((weak));

#define chk(p) do { if (!p) abort (); } while (0)

int main () {
  chk (!&Wv12);
  chk (!&wv12);
  return (0);
}

% gcc  t.c
/tmp/cciZgRxK.o:(.rodata+0x0): undefined reference to `wv12'
/tmp/cciZgRxK.o:(.rodata+0x8): undefined reference to `wv12'

% gcc  t.c -S
% cat t.s

    .size   main, .-main
*    .weakref    Wv12,wv12   *<<<<<<   Not here. This should be
after definitions of Wv12 and wv12.
Wv12    .section    .rodata
    .align  3
.LC0:
    .xword  Wv12
    .align  3
.LC1:
    .xword  wv12


I can file a new PR and fix it by Tuesday/Wednesday
or we can  temporary restore Ramana's workaround in 
aarch64_use_blocks_for_constant_p:
% git diff
gcc/config/aarch64/aarch64.c
   

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4b5183b..222ea33 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7735,7 +7735,11 @@ aarch64_use_blocks_for_constant_p (machine_mode,
const_rtx)
 {
   /* We can't use blocks for constants when we're using a per-function
  constant pool.  */
-  return !aarch64_can_use_per_function_literal_pools_p ();
+  /* Fixme::
+ return !aarch64_can_use_per_function_literal_pools_p ();
+ This return statement breaks gcc.dg/attr-weakref-1.c test.
+ For now we workaround this by returning false here.  */
+  return false;
 }
 
 /* Select appropriate section for constants depending


-Vladimir

>
> Failures:
>     gcc.dg/attr-weakref-1.c
>
> Bisected to:
>
>
> commit 536728c16d6a0173930ecfe370302baa471c299e
> Author: rguenth 
> Date:   Thu Mar 15 08:55:04 2018 +
>
> 2018-03-15  Vladimir Mezentsev  
>
> PR target/68256
> * varasm.c (hash_section): Return an unchangeble hash value
> * config/aarch64/aarch64.c (aarch64_use_blocks_for_constant_p):
> Return !aarch64_can_use_per_function_literal_pools_p ().
>
>
> Thanks,
> bin
>> ChangeLog:
>> 2018-02-15  Vladimir Mezentsev  
>>
>> PR gcc/68256
>> * varasm.c (hash_section): Return an unchangeble hash value
>> * config/aarch64/aarch64.c (aarch64_use_blocks_for_constant_p):
>> Return !aarch64_can_use_per_function_literal_pools_p ();
>> ---
>>  gcc/config/aarch64/aarch64.c | 8 +++-
>>  gcc/varasm.c | 2 +-
>>  2 files changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 174310c..a0a495d 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -7596,11 +7596,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
>>  static bool
>>  aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
>>  {
>> -  /* Fixme:: In an ideal world this would work similar
>> - to the logic in aarch64_select_rtx_section but this
>> - breaks bootstrap in gcc go.  For now we workaround
>> - this by returning false here.  */
>> -  return false;
>> +  /* We can't use blocks for constants when we're using a per-function
>

[PATCH] PR gcc/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

2018-02-16 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

Ramana Radhakrishnan made a workaround in gcc/config/aarch64/aarch64.c to 
resolve
bootstrap comparison failure (2015-11-10, commit 
bc443a71dafa2e707bae4b2fa74f83b05dea37ab).
The real bug is in gcc/varasm.c.
hash_section() returns an unstable value.
As result, two blocks are created in get_block_for_section() for one unnamed 
section.
A list of objects in these blocks depends on the -gtoggle option.
I removed Ramana's workaround in gcc/config/aarch64/aarch64.c and
I fixed hash_section() in gcc/varasm.c

Bootstrapped on aarch64-unknown-linux-gnu including (c,c++ and go).
Testing finished ok.

ChangeLog:
2018-02-15  Vladimir Mezentsev  

PR gcc/68256
* varasm.c (hash_section): Return an unchangeble hash value
* config/aarch64/aarch64.c (aarch64_use_blocks_for_constant_p):
Return !aarch64_can_use_per_function_literal_pools_p ();
---
 gcc/config/aarch64/aarch64.c | 8 +++-
 gcc/varasm.c | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 174310c..a0a495d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7596,11 +7596,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
 static bool
 aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
 {
-  /* Fixme:: In an ideal world this would work similar
- to the logic in aarch64_select_rtx_section but this
- breaks bootstrap in gcc go.  For now we workaround
- this by returning false here.  */
-  return false;
+  /* We can't use blocks for constants when we're using a per-function
+ constant pool.  */
+  return !aarch64_can_use_per_function_literal_pools_p ();
 }
 
 /* Select appropriate section for constants depending
diff --git a/gcc/varasm.c b/gcc/varasm.c
index b045efa..5aae5b4 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -225,7 +225,7 @@ hash_section (section *sect)
 {
   if (sect->common.flags & SECTION_NAMED)
 return htab_hash_string (sect->named.name);
-  return sect->common.flags;
+  return sect->common.flags & ~SECTION_DECLARED;
 }
 
 /* Helper routines for maintaining object_block_htab.  */
-- 
1.8.3.1



Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2018-02-06 Thread vladimir . mezentsev
On 02/06/2018 08:53 AM, Joseph Myers wrote:
> The files in libgcc/soft-fp must be verbatim copies of the master sources
> in glibc.  So you can't make any local changes to them, and if you think 
> changes are needed you need to patch things upstream in glibc first, with 
> a proper extended explanation of why the fix is needed and why it is safe 
> and the appropriate design for the fix.  There's nothing at all in this 
> patch submission to explain that change.

It is a bug/typo  in gcc/libgcc/soft-fp/quad.h when _FP_W_TYPE_SIZE is
64 (for example, on aarch64) .
It looks like a code after line 202 was never used.
% cat -n libgcc/soft-fp/quad.h
...
   201   
   202    #else   /* not _FP_W_TYPE_SIZE < 64 */
   203    union _FP_UNION_Q
   204    {
   205      TFtype flt /* __attribute__ ((mode (TF))) */ ;
   206      struct *_FP_STRUCT_LAYOUT*
   207      {
   208        _FP_W_TYPE a, b;
   209      } longs;
   210      struct *_FP_STRUCT_LAYOUT*
   211      {
   212    # if __BYTE_ORDER == __BIG_ENDIAN
   213        unsigned sign    : 1;
   214        unsigned exp : _FP_EXPBITS_Q;
   215        _FP_W_TYPE frac1 : _FP_FRACBITS_Q - (_FP_IMPLBIT_Q != 0) -
_FP_W_TYPE_SIZE;
   216        _FP_W_TYPE frac0 : _FP_W_TYPE_SIZE;
   217    # else
   218        _FP_W_TYPE frac0 : _FP_W_TYPE_SIZE;
   219        _FP_W_TYPE frac1 : _FP_FRACBITS_Q - (_FP_IMPLBIT_Q != 0) -
_FP_W_TYPE_SIZE;
   220        unsigned exp : _FP_EXPBITS_Q;
   221        unsigned sign    : 1;
   222    # endif
   223      } bits;
   224    };

We see 'struct _FP_STRUCT_LAYOUT' is declared twice (in lines 206 and
210) inside union _FP_UNION_Q.
Compiler reports warning.


-Vladimir

P.S.:
gcc/libgcc/soft-fp/ and  glibc/soft-fp/ are not synchronized now. For
example:

% diff gcc/libgcc/soft-fp/quad.h  glibc/soft-fp/
3c3
<    Copyright (C) 1997-2016 Free Software Foundation, Inc.
---
>    Copyright (C) 1997-2018 Free Software Foundation, Inc.
96c96
<   } bits __attribute__ ((packed));
---
>   } bits;



Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2018-02-06 Thread vladimir . mezentsev
On 01/29/2018 12:51 PM, Joseph Myers wrote:
> On Mon, 29 Jan 2018, vladimir.mezent...@oracle.com wrote:
>
>>> What about powerpc __divkc3?
>>>
>>> What was the rationale for using soft-fp rather than adding appropriate 
>>> built-in functions as suggested in a comment?
>> I had a version with built-in functions and I can restore it.
>>
>> Let's design what we want to use soft-fp or built-in function.
>> I'd prefer to use built-in functions but performance is in two times worse.
>>
>> The test below demonstrates performance degradation:
> So, this is comparing __builtin_frexp (extracting both exponent and 
> mantissa, including appropriate handling of exponents of subnormal values) 
> with just extracting the exponent and with no special handling of zero / 
> infinity / NaN / subnormal values.
 I compared with __builtin_ilogb. There is a same performance degradation.

> We need to consider what functionality is actually needed for this 
> scaling, if we do wish to use such scaling based on integer exponents.  If 
> what's needed corresponds exactly to some standard functions such as ilogb 
> and scalbn with all their special cases, then built-in versions of those 
> standard functions may make sense.

 The IEEE format for double is      .
We need a function like extract_exponent_field.

  All standard function make an additional work to get  exponent.
It is a reason why we see performance degradation.

% cat r2.c
#include 
int main(int argc, char** argv)
{
  double d = 0x0.004p-1023;
  printf("d=%-24.13a  exp=%d\n", d, __builtin_ilogb(d));
  return 0;
}

% gcc -O2 -lm r2.c ; ./a.out
d=0x0.00200p-1022   *exp=-1033
*
-Vladimir

>   If what's needed does not correspond 
> to standard functions and does not need to handle such special cases, 
> that's more of an indication for examining the representation in libgcc - 
> but it's still necessary to handle the many different variant 
> floating-point formats supported, or to safely avoid using the new code 
> for formats it can't handle.
>



[PATCH] PR libgcc/59714 complex division is surprising on aarch64

2018-02-05 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

Tested on aarch64-linux-gnu, x86_64-pc-linux-gnu and sparc64-unknown-linux-gnu.
No regression. New tests now passed.
There is a performance degradation for complex double type:

  failed cases  |time
   old   new|   old  new
complex-32  150 |  3.51  3.56
complex-64  190 |  1.88  2.34
complex-128 340 |  3.71  2.88

The formula for complex division (a+bi)/(c+di) is (a*c+b*d)/(c*c+d*d) + 
(b*c-a*d)/(c*c+d*d)i
The current implementation avoids overflow in (c*c + d*d) using
  x=MAX(|c|, |d|);  c1=c/x; d1=d/x;
and then calculating the result as
  (a*c1+b*d1)/(c1*c1+d1*d1)*(1.0/x) + (b*c1-a*d1)/(c1*c1+d1*d1)*(1.0/x)i

This approach has two issues:
  1. Unexpected rounding when x is not a power of 2.0
 For example, (1.0+3.0i) /(1.0+3.0i) should be 1.0
 but it is (1.0 - 1.66533e-17i) on aarch64 (PR59714)
  2. Unexpected underflow in c/x or d/x.
 For example, (0x1.0p+1023 + 0x0.8p-1022 i)/(0x1.0p+677 + 0x1.0p-677 i)
 should be (0x1.0p+346 - 0x1.0p-1008 i) but it is (0x1.0p+346)

My implementation avoids these issues.
I use soft-fp instead built-in functions because performance with built-in
functions (like __builtin_ilogb) is in two times worse.
Also libm is needed for built-in functions.

Restrictions:
  An old implementation is used when an architecture doesn't provide 
sfp-machine.h
  An old implementation is used for __divkc3 (I have no access on rs6000 for 
testing)

ChangeLog:
2018-02-05  Vladimir Mezentsev  

PR libgcc/59714
* libgcc/libgcc2.c: New complex division implementation
* libgcc/soft-fp/quad.h: struct _FP_STRUCT_LAYOUT was redefined
* libgcc/config/no-sfp-machine.h: Add _NO_SFP_MACHINE_H
* libgcc/config/sparc/sfp-machine.h: New
* gcc/testsuite/gcc.c-torture/execute/pr59714_128.c: New test
* gcc/testsuite/gcc.c-torture/execute/pr59714_32.c: New test
* gcc/testsuite/gcc.c-torture/execute/pr59714_64.c: New test
---
 gcc/testsuite/gcc.c-torture/execute/pr59714_128.c | 506 ++
 gcc/testsuite/gcc.c-torture/execute/pr59714_32.c  | 506 ++
 gcc/testsuite/gcc.c-torture/execute/pr59714_64.c  | 506 ++
 libgcc/config/no-sfp-machine.h|   3 +
 libgcc/config/sparc/sfp-machine.h |  14 +
 libgcc/libgcc2.c  | 178 +++-
 libgcc/soft-fp/quad.h |   2 +-
 7 files changed, 1710 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_32.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_64.c
 create mode 100644 libgcc/config/sparc/sfp-machine.h

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c 
b/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
new file mode 100644
index 000..e26f40a
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
@@ -0,0 +1,506 @@
+#include 
+#include 
+
+extern void abort (void);
+#define FABS(x) ((x) >= 0 ? (x) : -(x))
+#define FTYPE   long double
+#define FMT "%-39.28La"
+
+/*
+   Tables of numbers for complex division testing
+   All values in hex format
+   z[i] == correctly rounded value of x[i]/y[i];
+ */
+
+#define NUMCNT 100
+
+__complex FTYPE x[NUMCNT] =
+  {
+0x1.0p0L + 0x1.0p0Li,  /* #1 */
+0x1.0p0L + 0x1.0p0Li,  /* #2 */
+0x1.0p1023L + 0x1.0p-1023Li,   /* #3 */
+0x1.0p1023L + 0x1.0p1023Li,/* #4 */
+0x1.0p1020L + 0x1.0p-844Li,/* #5 */
+0x1.0p-71L + 0x1.0p1021Li, /* #6 */
+0x1.0p-347L + 0x1.0p-54Li, /* #7 */
+0x1.0p-1074L + 0x1.0p-1074Li,  /* #8 */
+0x1.0p1015L + 0x1.0p-989Li,/* #9 */
+0x1.0p-622L + 0x1.0p-1071Li,   /* #10 */
+0x1.0p0L + 0x3.0p0Li,  /* #11 */
+-0x1.0p0L + 0x3.0p0Li, /* #12 */
+0x1.0p0L - 0x3.0p0Li,  /* #13 */
+0x3.0p0L + 0x1.0p0Li,  /* #14 */
+0x3.0p0L + 0x4.0p0Li,  /* #15 */
+0x1.0p0L - 0x3.1p-20Li,/* #16 */
+0x1.0p0L + 0x3.001p-20Li,  /* #17 */
+0x1.0p0L + 0x1.80002p1Li,  /* #18 */
+0x1.0p0L + 0x1.80004p1Li,  /* #19 */
+0x1.0p0L + 0x1.80008p1Li,  /* #20 */
+0x1.0p0L + 0x1.80010p1Li,  /* #21 */
+0x1.0p0L + 0x1.80020p1Li,  /* #22 */
+0x1.0p0L + 0x1.80040p1Li,  /* #23 */
+0x1.0p0L + 0x1.80080p1Li,  /* #24 */
+0x1.0p0L + 0x1.7ff8p1Li,   /* #25 */
+0x1.0p0L + 0x1.7ff0p1Li,   /* #26 */
+0x1.0p0L + 0x1.7f80p1Li,   /* #27 */
+0x1.0p0L + 0x1.7f00p1Li,   /* #28 */
+0x5.0p0L + 0x5.0p0Li,  /* #29 */
+0x4.0p0L + 0x3.0p0Li,  /* #30 */
+-0x4.0p0L + 0x3.1p0Li, /* #31 */
+0x3.001p-20L + 0x1.0p0Li,  /* #32 */
+  

Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2018-01-29 Thread vladimir . mezentsev

On 01/25/2018 04:14 PM, Joseph Myers wrote:
> On Thu, 25 Jan 2018, vladimir.mezent...@oracle.com wrote:
>
>> From: Vladimir Mezentsev 
>>
>> Tested on aarch64-linux-gnu, x86_64-pc-linux-gnu and 
>> sparc64-unknown-linux-gnu.
>> No regression. New tests now passed.
>> There is a performance degradation for complex double type:
> This is, of course, something to consider for GCC 9; it's not suitable for 
> the current development stage.
  Why ?

>
> Could you provide a more extended writeup of the issue (starting with the 
> argument for it being a bug at all), the approach you took and the 
> rationale for the approach, when you resubmit the patch for GCC 9 stage 1?

The formula for complex division (a+bi)/(c+di) is (a*c+b*d)/(c*c+d*d) +
(b*c-a*d)/(c*c+d*d)i

The current implementation avoids overflow in (c*c + d*d) using
x=MAX(|c|, |d|);  c1=c/x; d1=d/x;
and then calculating the result as
(a*c1+b*d1)/(c1*c1+d1*d1)*(1.0/x) + (b*c1-a*d1)/(c1*c1+d1*d1)*(1.0/x)i

This approach has two issues:
1. Unexpected rounding when x is not a power of 2.0
For example, (1.0+3.0i) /(1.0+3.0i) should be 1.0
but it is (1.0 - 1.66533e-17i) on aarch64 (PR59714)

2. Unexpected underflow in c/x or d/x.
For example, (0x1.0p+1023 + 0x0.8p-1022 i)/(0x1.0p+677 + 0x1.0p-677 i)
should be (0x1.0p+346 - 0x1.0p-1008 i) but it is (0x1.0p+346)

My implementation avoids these issues.


>
>> * libgcc/config/sparc/sfp-machine.h: New
> Are you introducing a requirement for all architectures to provide 
> sfp-machine.h?  If so, did you determine that SPARC was the only one 
> lacking it?  If not the only one lacking it, you need to make sure that 
> you do not break any existing architecture in GCC.
>
> What about architectures using non-IEEE formats?  soft-fp is not suitable 
> for those, but you mustn't break them.  Remember that e.g. TFmode can be 
> IBM long double, and other ?Fmode similarly need not be IEEE; the name 
> only indicates how many bytes are in the format, nothing else about it.
Agree. 
An additional fix is needed.
Build  can be broken without sfp-mashine.h or on  architectures using
non-IEEE format.

>
> What about powerpc __divkc3?
>
> What was the rationale for using soft-fp rather than adding appropriate 
> built-in functions as suggested in a comment?

I had a version with built-in functions and I can restore it.

Let's design what we want to use soft-fp or built-in function.
I'd prefer to use built-in functions but performance is in two times worse.

The test below demonstrates performance degradation:
% cat r1.c
#include 

int main(int argc, char** argv)
{
  int i;
  long long sum = 0;
  for(i = 1; i < 10; i++)
    {
  int exp;
  double d = (double) i;
#ifdef BUITIN
  double v = __builtin_frexp(d, &exp);
#else
  union _FP_UNION_D
  {
    double flt;
    struct _FP_STRUCT_LAYOUT
    {
  unsigned frac0 : 32;
  unsigned frac1 : 20;
  unsigned exp   : 11;
  unsigned sign  : 1;
    } bits __attribute__ ((packed));
    long long LL;
  } u;
  u.flt = d;
  exp = u.bits.exp - 1022;
#endif
  sum += exp * (i % 2 == 0 ? 1 : -1);
    }
  printf("sum = %d\n", sum);
  return 0;
}

%gcc -O2 -DBUITIN r1.c ; time ./a.out
real    0m4.247s
user    0m4.243s
sys    0m0.001s

%gcc -O2 r1.c ; time ./a.out
real    0m1.977s
user    0m1.973s
sys    0m0.001s


  -Vladimir


[PATCH] PR libgcc/59714 complex division is surprising on aarch64

2018-01-25 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

Tested on aarch64-linux-gnu, x86_64-pc-linux-gnu and sparc64-unknown-linux-gnu.
No regression. New tests now passed.
There is a performance degradation for complex double type:

  failed cases  |time
   old   new|   old  new
complex-32  150 |  3.51  3.56
complex-64  190 |  1.88  2.34
complex-128 340 |  3.71  2.88

ChangeLog:
2018-01-25  Vladimir Mezentsev  

PR libgcc/59714
* libgcc/libgcc2.c: New complex division implementation
* libgcc/config/sparc/sfp-machine.h: New
* gcc/testsuite/gcc.c-torture/execute/pr59714_128.c: New test
* gcc/testsuite/gcc.c-torture/execute/pr59714_32.c: New test
* gcc/testsuite/gcc.c-torture/execute/pr59714_64.c: New test
---
 gcc/testsuite/gcc.c-torture/execute/pr59714_128.c | 506 ++
 gcc/testsuite/gcc.c-torture/execute/pr59714_32.c  | 506 ++
 gcc/testsuite/gcc.c-torture/execute/pr59714_64.c  | 506 ++
 libgcc/config/sparc/sfp-machine.h |  13 +
 libgcc/libgcc2.c  | 175 +++-
 5 files changed, 1700 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_32.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr59714_64.c
 create mode 100644 libgcc/config/sparc/sfp-machine.h

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c 
b/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
new file mode 100644
index 000..e26f40a
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr59714_128.c
@@ -0,0 +1,506 @@
+#include 
+#include 
+
+extern void abort (void);
+#define FABS(x) ((x) >= 0 ? (x) : -(x))
+#define FTYPE   long double
+#define FMT "%-39.28La"
+
+/*
+   Tables of numbers for complex division testing
+   All values in hex format
+   z[i] == correctly rounded value of x[i]/y[i];
+ */
+
+#define NUMCNT 100
+
+__complex FTYPE x[NUMCNT] =
+  {
+0x1.0p0L + 0x1.0p0Li,  /* #1 */
+0x1.0p0L + 0x1.0p0Li,  /* #2 */
+0x1.0p1023L + 0x1.0p-1023Li,   /* #3 */
+0x1.0p1023L + 0x1.0p1023Li,/* #4 */
+0x1.0p1020L + 0x1.0p-844Li,/* #5 */
+0x1.0p-71L + 0x1.0p1021Li, /* #6 */
+0x1.0p-347L + 0x1.0p-54Li, /* #7 */
+0x1.0p-1074L + 0x1.0p-1074Li,  /* #8 */
+0x1.0p1015L + 0x1.0p-989Li,/* #9 */
+0x1.0p-622L + 0x1.0p-1071Li,   /* #10 */
+0x1.0p0L + 0x3.0p0Li,  /* #11 */
+-0x1.0p0L + 0x3.0p0Li, /* #12 */
+0x1.0p0L - 0x3.0p0Li,  /* #13 */
+0x3.0p0L + 0x1.0p0Li,  /* #14 */
+0x3.0p0L + 0x4.0p0Li,  /* #15 */
+0x1.0p0L - 0x3.1p-20Li,/* #16 */
+0x1.0p0L + 0x3.001p-20Li,  /* #17 */
+0x1.0p0L + 0x1.80002p1Li,  /* #18 */
+0x1.0p0L + 0x1.80004p1Li,  /* #19 */
+0x1.0p0L + 0x1.80008p1Li,  /* #20 */
+0x1.0p0L + 0x1.80010p1Li,  /* #21 */
+0x1.0p0L + 0x1.80020p1Li,  /* #22 */
+0x1.0p0L + 0x1.80040p1Li,  /* #23 */
+0x1.0p0L + 0x1.80080p1Li,  /* #24 */
+0x1.0p0L + 0x1.7ff8p1Li,   /* #25 */
+0x1.0p0L + 0x1.7ff0p1Li,   /* #26 */
+0x1.0p0L + 0x1.7f80p1Li,   /* #27 */
+0x1.0p0L + 0x1.7f00p1Li,   /* #28 */
+0x5.0p0L + 0x5.0p0Li,  /* #29 */
+0x4.0p0L + 0x3.0p0Li,  /* #30 */
+-0x4.0p0L + 0x3.1p0Li, /* #31 */
+0x3.001p-20L + 0x1.0p0Li,  /* #32 */
+0x1.001p-80L + 0x1.0p0Li,  /* #33 */
+0x1.001p-8000L + 0x1.0p0Li,/* #34 */
+0x1.0p0L + 0x1.001p-8000Li,/* #35 */
+-0x1.0p0L + 0x1.001p-8000Li,/* #36 */
+-0x1.0p0L + 0x1.001p-8000Li,/* #37 */
+-0x1.0p0L + 0x1.001p-8000Li,/* #38 */
+0x1.0p0L - 0x1.001p-8000Li,/* #39 */
+0x1.0p1000L - 0x1.001p-8000Li, /* #40 */
+0x1.97530p1000L - 0x1.001p-8000Li, /* #41 */
+0x1.97530p3000L - 0x1.001p-8000Li, /* #42 */
+0x1.97530p3770L - 0x1.001p-8000Li, /* #43 */
+0x1.97530p4770L - 0x1.001p-7000Li, /* #44 */
+0x1.97530p0L - 0x1.001p-12000Li,   /* #45 */
+0x1.97530p1000L - 0x1.001p-11000Li,/* #46 */
+0x1.97530p2000L - 0x1.001p-9000Li, /* #47 */
+0x1.97530p3000L - 0x1.001p-8000Li, /* #48 */
+0x1.97530p4000L - 0x1.001p-7000Li, /* #49 */
+0x1.97530p5000L - 0x1.001p-6000Li, /* #50 */
+0x1.97530p6000L - 0x1.001p-5000Li, /* #51 */
+0x1.97530p7000L - 0x1.001p-4000Li, /* #52 */
+0x1.97530p8000L - 0x1.001p-1000Li, /* #53 */
+0x1.97530p9000L - 0x1.001p-0Li,/* #54 */
+0x1.97530p1L + 0x1.001p2000Li, /* #55 */
+0x1

Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2017-10-25 Thread vladimir . mezentsev
On 10/25/2017 10:28 AM, Joseph Myers wrote:
> On Wed, 25 Oct 2017, vladimir.mezent...@oracle.com wrote:
>
>> +# Disable FMA (floating-point multiply-add) instructions for complex 
>> division.
>> +# These instructions can produce different result if two operations 
>> executed separately.
>> +LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
>> +LIB2_DIV3_FUNCS = _divdc3 _divhc3 _divsc3
> Without regard to whether the change is appropriate in the first place, 
> I'm doubtful of the logic for selecting floating-point modes.  I'd expect 
> something that applies to all floating-point modes.  Even if nothing has 
> fused XFmode operations, some architectures (e.g. s390, powerpc) have 
> fused operations on TFmode / KFmode.

   Hello Joseph and GCC-team,
Different people are giving me different directions.
I am glad to change my fix but please provide clear directions on how
you want to see it.

  michael.hud...@linaro.org created  PR 59714 and suggested to set 
-ffp-contract=off:

> int main(int argc, char** argv)
> {
>  __complex double c = 1.0 + 3.0i;
>  printf("%g\n", __imag__ (c/c));
> }
> ubuntu@arm64:~$ gcc cplx.c -o cplx
> ubuntu@arm64:~$ ./cplx
> -1.66533e-17
>
> This is because libgcc2.c is compiled in a way that lets the compiler used 
> fused multiply add instructions.
> It shouldn't be! > ...

> In general, fma makes it hard to reason about expressions
> like "a*b-c*d" -- which of the multiplications is being done at the
higher precision?
> But I guess arguing to fp-contract to default to off is a war I don't
want to get into fighting.

Thank you,
-Vladimir



[PATCH] PR libgcc/59714 complex division is surprising on aarch64

2017-10-25 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

FMA (floating-point multiply-add) instructions are supported on aarch64.
These instructions can produce different result if two operations executed 
separately.
-ffp-contract=off doesn't allow the FMA instructions.

Tested on aarch64-unknown-linux-gnu. No regression. Two failed tests now passed.

ChangeLog:
2017-10-25  Vladimir Mezentsev  

PR libgcc/59714
* libgcc/Makefile.in: Add -ffp-contract=off
---
 libgcc/Makefile.in | 8 
 1 file changed, 8 insertions(+)

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index a1a392d..f313556 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -455,6 +455,14 @@ ifeq ($(LIB2_SIDITI_CONV_FUNCS),)
   lib2funcs += $(subst XX,di,$(dwfloatfuncs))
 endif
 
+# Disable FMA (floating-point multiply-add) instructions for complex division.
+# These instructions can produce different result if two operations executed 
separately.
+LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
+LIB2_DIV3_FUNCS = _divdc3 _divhc3 _divsc3
+lib2-div3-o = $(patsubst %,%$(objext),$(LIB2_DIV3_FUNCS))
+lib2-div3-s-o = $(patsubst %,%_s$(objext),$(LIB2_DIV3_FUNCS))
+$(lib2-div3-o) $(lib2-div3-s-o): LIBGCC2_CFLAGS += 
$(LIBGCC2_FFP_CONTRAST_CFLAGS)
+
 # These might cause a divide overflow trap and so are compiled with
 # unwinder info.
 LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _divmoddi4 \
-- 
1.8.3.1



Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2017-10-24 Thread vladimir . mezentsev
On 10/19/2017 06:07 AM, Wilco Dijkstra wrote:
> Vladimir wrote:
>
> +# Disable floating-point expression contraction
> +LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
> +
>
> It looks like this disables fp-contract in all of libgcc...
> What is the the number of FMAs in libgcc before/after?
In original (without my fix) libgcc_s.so:
% objdump -d libgcc_s.so| egrep '(fmadd|fmsub|fnmadd|fnmsub)'|wc -l
  38

In my new libgcc_s.so ( see my fix below)
% objdump -d libgcc_s.so| egrep '(fmadd|fmsub|fnmadd|fnmsub)'|wc -l
8
> If it disables anything other than the ones in complex division, it
> would have an unintended performance impact. It's best to do
> this just for complex division.

On 10/19/2017 10:17 AM, Wilco Dijkstra wrote:
> Vladimir Mezentsev 
>> On 10/19/2017 06:37 AM, Richard Earnshaw (lists) wrote:
>>> On 19/10/17 14:07, Wilco Dijkstra wrote:
>>>> Vladimir wrote:
>>>>
>>>> +# Disable floating-point expression contraction
>>>> +LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
>>>> +
>>>>
>>>> It looks like this disables fp-contract in all of libgcc...
>>>> What is the the number of FMAs in libgcc before/after?
>>    How can I find this number ?
>> dis  | grep  | wc
> Eg. objdump -d ~/install/gcc/lib64/libgcc_s.so | grep -c fmadd
> Also grep fmsub, fnmadd, fnmsub.
>
>>> It's probably better to do this with an attribute
>>>
>>>     __attribute__((optimize("fp-contract=off")))
>>>
>>> on the affected functions.
>>    I like your suggestion.
> I don't think this will work in general, IIRC only a few targets correctly
> implement the optimize pragma. Changing the makefile to build only
> __divdc3/__divtc3/__divsc3/__divhc3 without FMA looks like a better
> approach.

Only 3 (_divdc3 / _divhc3 / _divsc3) have FMAs.
The other FMAs are in:
_fixunsdfdi.o _fixunsdfdi_s.o _fixunssfdi.o _fixunssfdi_s.o _muldc3.o
_muldc3_s.o _mulhc3.o _mulhc3_s.o _mulsc3.o _mulsc3_s.o

I suggest the following fix:

% git diff
diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index a1a392d..f313556 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -455,6 +455,14 @@ ifeq ($(LIB2_SIDITI_CONV_FUNCS),)
   lib2funcs += $(subst XX,di,$(dwfloatfuncs))
 endif
 
+# Disable FMA (floating-point multiply-add) instructions for complex
division.
+# These instructions can produce different result if two operations
executed separately.
+LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
+LIB2_DIV3_FUNCS = _divdc3 _divhc3 _divsc3
+lib2-div3-o = $(patsubst %,%$(objext),$(LIB2_DIV3_FUNCS))
+lib2-div3-s-o = $(patsubst %,%_s$(objext),$(LIB2_DIV3_FUNCS))
+$(lib2-div3-o) $(lib2-div3-s-o): LIBGCC2_CFLAGS +=
$(LIBGCC2_FFP_CONTRAST_CFLAGS)
+



 If my fix will be approved I send ChangeLog and a patch file tomorrow.

-Vladimir

> However we also need to decide whether this is the best possible fix.
> It will affect accuracy of other inputs as well...
>
> Wilco



Re: [PATCH] PR libgcc/59714 complex division is surprising on aarch64

2017-10-19 Thread Vladimir Mezentsev

On 10/19/2017 06:37 AM, Richard Earnshaw (lists) wrote:

On 19/10/17 14:07, Wilco Dijkstra wrote:

Vladimir wrote:

+# Disable floating-point expression contraction
+LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
+

It looks like this disables fp-contract in all of libgcc...
What is the the number of FMAs in libgcc before/after?


  How can I find this number ?
dis  | grep  | wc

Or are there the other simple ways ?



If it disables anything other than the ones in complex division, it
would have an unintended performance impact. It's best to do
this just for complex division.

Wilco


It's probably better to do this with an attribute

__attribute__((optimize("fp-contract=off")))

on the affected functions.


  I like your suggestion.

-Vladimir



R.





[PATCH] PR libgcc/59714 complex division is surprising on aarch64

2017-10-18 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

FMA (floating-point multiply-add) instructions are supported on aarch64.
These instructions can produce different result if two operations executed 
separately.
-ffp-contract=off doesn't allow the FMA instructions.

Tested on two platforms:
  aarch64-unknown-linux-gnu: No regression. Two failed tests now passed.
  sparc64-unknown-linux-gnu: No regression.

ChangeLog:
2017-10-18  Vladimir Mezentsev  

PR libgcc/59714
* libgcc/Makefile.in: Add -ffp-contract=off
---
 libgcc/Makefile.in |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index a1a392d..b771875 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -237,12 +237,16 @@ else
 DECNUMINC =
 endif
 
+# Disable floating-point expression contraction
+LIBGCC2_FFP_CONTRAST_CFLAGS = -ffp-contract=off
+
 # Options to use when compiling libgcc2.a.
 #
 LIBGCC2_DEBUG_CFLAGS = -g
 LIBGCC2_CFLAGS = -O2 $(LIBGCC2_INCLUDES) $(GCC_CFLAGS) $(HOST_LIBGCC2_CFLAGS) \
 $(LIBGCC2_DEBUG_CFLAGS) -DIN_LIBGCC2 \
 -fbuilding-libgcc -fno-stack-protector \
+$(LIBGCC2_FFP_CONTRAST_CFLAGS) \
 $(INHIBIT_LIBC_CFLAGS)
 
 # Additional options to use when compiling libgcc2.a.
-- 
1.7.1



Re: [PATCH 2/2] PR libgcc/59714 complex division is surprising on aarch64

2017-10-13 Thread vladimir . mezentsev
On 10/12/2017 03:40 AM, Richard Earnshaw wrote:
> On 12/10/17 06:21, vladimir.mezent...@oracle.com wrote:
>> From: Vladimir Mezentsev 
>>
>> FMA (floating-point multiply-add) instructions are supported on aarch64.
>> These instructions can produce different result if two operations executed 
>> separately.
>> -ffp-contract=off doesn't allow the FMA instructions.
>>
>> Tested on aarch64-linux-gnu.
>> No regression. Two failed tests now passed.
>>
>> ChangeLog:
>> 2017-10-11  Vladimir Mezentsev  
>>
>> PR libgcc/59714
>> * libgcc/config/aarch64/t-aarch64 (HOST_LIBGCC2_CFLAGS): Add 
>> -ffp-contract=off
>> ---
>>  libgcc/config/aarch64/t-aarch64 | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/libgcc/config/aarch64/t-aarch64 
>> b/libgcc/config/aarch64/t-aarch64
>> index 3af933c..e33bef0 100644
>> --- a/libgcc/config/aarch64/t-aarch64
>> +++ b/libgcc/config/aarch64/t-aarch64
>> @@ -18,4 +18,5 @@
>>  # along with GCC; see the file COPYING3.  If not see
>>  # <http://www.gnu.org/licenses/>.
>>  
>> +HOST_LIBGCC2_CFLAGS += -ffp-contract=off
>>  LIB2ADD += $(srcdir)/config/aarch64/sync-cache.c
>>
> Why would we want to do this on AArch64 only?  If it's right for us,
> then surely it would be right for everyone and the option should be
> applied at the top level.

  It is a machine dependent option.
We don't need this option, for example, on sparc or intel machines.

-Vladimir
>
> Hint: I'm not convinced on the evidence here that it is right across the
> whole of libgcc.  Before moving forward on this particular PR I think we
> need to understand what behaviour we do want out of the compiler.
>
> R.



[PATCH 2/2] PR libgcc/59714 complex division is surprising on aarch64

2017-10-11 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

FMA (floating-point multiply-add) instructions are supported on aarch64.
These instructions can produce different result if two operations executed 
separately.
-ffp-contract=off doesn't allow the FMA instructions.

Tested on aarch64-linux-gnu.
No regression. Two failed tests now passed.

ChangeLog:
2017-10-11  Vladimir Mezentsev  

PR libgcc/59714
* libgcc/config/aarch64/t-aarch64 (HOST_LIBGCC2_CFLAGS): Add -ffp-contract=off
---
 libgcc/config/aarch64/t-aarch64 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libgcc/config/aarch64/t-aarch64 b/libgcc/config/aarch64/t-aarch64
index 3af933c..e33bef0 100644
--- a/libgcc/config/aarch64/t-aarch64
+++ b/libgcc/config/aarch64/t-aarch64
@@ -18,4 +18,5 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.
 
+HOST_LIBGCC2_CFLAGS += -ffp-contract=off
 LIB2ADD += $(srcdir)/config/aarch64/sync-cache.c
-- 
1.8.3.1



Re: [PATCH] PR target/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64

2017-10-05 Thread vladimir . mezentsev


On 10/05/2017 07:09 AM, Richard Earnshaw (lists) wrote:
> On 04/10/17 00:50, vladimir.mezent...@oracle.com wrote:
>> From: Vladimir Mezentsev 
>>
>> Tested on aarch64-linux-gnu.
>> No regression.
>> No bootstrap failure.
>>
>> gcc/ChangeLog:
>> 2017-09-26  Vladimir Mezentsev  
>>
>> * gcc/config/aarch64/aarch64.c: restore fix in 
>> aarch64_use_blocks_for_constant_p
> Vladimir,
>
> Thanks for the patch, but I can't accept this as it stands.  The patch
> lacks a description of what is motivating the change, so I can't really
> review it sensibly.  What's the problem you are trying to address?  Why
> do you believe this is the right fix?  Do you have a test case?

 Hello Richard,
 I wanted to fix *Bug 68256*
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256> - Defining
TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

Ramana Radhakrishnan  ( )  made a
workaround for this bug:
commit bc443a71dafa2e707bae4b2fa74f83b05dea37ab
Author: ramana 
Date:   Tue Nov 10 08:35:21 2015 +
Workaround PR68256 on AArch64

I removed Raman's fixes andI cannot reproduce the problem.
I used the gcc116.fsffrance.org machine:
% uname -a Linux gcc116 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28
20:45:34 UTC 2016 aarch64 aarch64 aarch64 GNU/Linux

I run
% ../gcc/configure --enable-languages=c,c++,go --enable-bootstrap
--enable-multilib
% make -j8 bootstrap

  Is it correct steps to reproduce the go bootstrap failure ?
If yes:
   There were no bootstrap comparison failure and no regression in testing.
   My suggestion is to remove Raman's workaround and close PR68256.

If no:
   What is wrong in my steps?

Thank you,
-Vladimir

>
> The ChangeLog entry doesn't help me either: restore what fix?
>
> Is the patch associated with a bugzilla report?  If so, then please
> include that in the ChangeLog in the form "PR /.
>
> Incidentally, ChangeLog entries should be complete sentences (begin with
> a capital letter and end with a full stop).
>
> R.
>
>
>> ---
>>  gcc/config/aarch64/aarch64.c | 8 +++-
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 1c14008..b377bc7 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -6193,11 +6193,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
>>  static bool
>>  aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
>>  {
>> -  /* Fixme:: In an ideal world this would work similar
>> - to the logic in aarch64_select_rtx_section but this
>> - breaks bootstrap in gcc go.  For now we workaround
>> - this by returning false here.  */
>> -  return false;
>> +   /* We can't use blocks for constants when we're using a per-function
>> +  constant pool.  */
>> +  return !aarch64_can_use_per_function_literal_pools_p ();
>>  }
>>  
>>  /* Select appropriate section for constants depending
>>



[PATCH] PR target/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64

2017-10-03 Thread vladimir . mezentsev
From: Vladimir Mezentsev 

Tested on aarch64-linux-gnu.
No regression.
No bootstrap failure.

gcc/ChangeLog:
2017-09-26  Vladimir Mezentsev  

* gcc/config/aarch64/aarch64.c: restore fix in aarch64_use_blocks_for_constant_p
---
 gcc/config/aarch64/aarch64.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c14008..b377bc7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6193,11 +6193,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
 static bool
 aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
 {
-  /* Fixme:: In an ideal world this would work similar
- to the logic in aarch64_select_rtx_section but this
- breaks bootstrap in gcc go.  For now we workaround
- this by returning false here.  */
-  return false;
+   /* We can't use blocks for constants when we're using a per-function
+  constant pool.  */
+  return !aarch64_can_use_per_function_literal_pools_p ();
 }
 
 /* Select appropriate section for constants depending
-- 
1.8.3.1