Re: [PATCH] Use the middle-end boolean_type_node

2017-01-04 Thread Janne Blomqvist
On Tue, Jan 3, 2017 at 9:20 PM, FX  wrote:
>> The regression with 8 bit boolean types surfaced with the z10 machine. The 
>> ABI is much older. Since
>> we cannot change it anymore we should rather talk to the hardware guys to 
>> add the instruction we
>> need. So for now we probably have to live with the regression in the Fortran 
>> cases since as I
>> understand it your change fixes an actual problem.
>
> As far as I understand (and Jane will correct me if I am wrong),

Sure, allow me to correct you: It's Janne, with two n in the middle. :)

> the patch does not fix anything in particular. The idea was that, by 
> transitioning from having all boolean expressions from “int” to “bool” (in C 
> terms), and thus from 32-bit to 8-bit on “typical” targets, the optimizer 
> might be able to emit more compact code.

My motivations were:

- As you say, hopefully more compact code.

- Fixing the co-array intrinsics which pass arguments as
boolean_type_node and on the library side the corresponding arguments
are C _Bool's. I think this has worked previously accidentally, as
arguments are passed in registers anyway, and stack slots are 8
bytes(?) on x86-64. But of course, we shouldn't let this excuse us
from actually fixing it.

- Redefining an ABI type specified by the middle end is bad form, IMHO.

> I am not sure this was tested.
>
> So: maybe it is a case of "Profile. Don't speculate.”

I don't have access to spec2k6, but I just checked the polyhedron 2011
benchmark suite. Target x86_64-pc-linux-gnu, compile options -O3
-funroll-loops -ffast-math -ftree-loop-linear -march=native (native ==
amdfam10 here). The line counts of the .s files (~number of
instructions) are (working directory is current trunk,
gf_trunk_boolean_type_node is trunk with my patch reverted):

  6856 ac.s
  6856 ../gf_trunk_boolean_type_node/ac.s
 235206 aermod.s
 235337 ../gf_trunk_boolean_type_node/aermod.s
 22310 air.s
 22477 ../gf_trunk_boolean_type_node/air.s
 15221 capacita.s
 15252 ../gf_trunk_boolean_type_node/capacita.s
  4905 channel2.s
  4859 ../gf_trunk_boolean_type_node/channel2.s
  34269 doduc.s
  33610 ../gf_trunk_boolean_type_node/doduc.s
 11963 fatigue2.s
 11959 ../gf_trunk_boolean_type_node/fatigue2.s
 17369 gas_dyn2.s
 17304 ../gf_trunk_boolean_type_node/gas_dyn2.s
  31098 induct2.s
  31098 ../gf_trunk_boolean_type_node/induct2.s
  8366 linpk.s
  8367 ../gf_trunk_boolean_type_node/linpk.s
 15139 mdbx.s
 15155 ../gf_trunk_boolean_type_node/mdbx.s
  5302 mp_prop_design.s
  5314 ../gf_trunk_boolean_type_node/mp_prop_design.s
 17420 nf.s
 17457 ../gf_trunk_boolean_type_node/nf.s
 20804 protein.s
 20713 ../gf_trunk_boolean_type_node/protein.s
  52476 rnflow.s
  52449 ../gf_trunk_boolean_type_node/rnflow.s
  37339 test_fpu2.s
  37377 ../gf_trunk_boolean_type_node/test_fpu2.s
  8466 tfft2.s
  8448 ../gf_trunk_boolean_type_node/tfft2.s


So 7 of 17 benchmarks are slightly bigger after the patch (at least
for channel2 it seems due to trunk unrolling a loop which didn't
happen with the patch reverted, haven't checked others), 8 benchmarks
are slightly smaller after the patch, and 2 are equally long. Runtime
differences seem to be a wash, though I didn't test that very
carefully.


-- 
Janne Blomqvist


Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Jakub Jelinek
On Tue, Jan 03, 2017 at 09:40:02PM +0200, Janne Blomqvist wrote:
> I do think this is fixable without waiting for potential improved
> hardware.  The vast majority of uses of boolean_type_node in the
> Fortran frontend is code like
> 
>   tmp = fold_build2_loc (input_location, GT_EXPR,
>  boolean_type_node, from_len,
>  build_zero_cst (TREE_TYPE (from_len)));
>   tmp = fold_build3_loc (input_location, COND_EXPR,
>  void_type_node, tmp, extcopy, stdcopy);
> 
> where the result of a boolean expression is returned as a
> boolean_type_node.  This is a more like high-level language semantics,
> whereas on the asm level boolean expressions are of course evaluated
> with integer arithmetic (or are there targets where this is not
> true?).
> 
> So while the Fortran frontend shouldn't redefine the ABI specified
> boolean_type_node, for situations like the above we could instead use
> some fast_scalar_non_abi_bool_type_node (better name suggestions
> welcome?), if we had some mechanism for returning such information
> from the backend? Or is there some universal best choice here?
> 
> Or since boolean types are not actually present at the asm level, what
> is the best choice in cases like the above in order to generate code
> that can be lowered to as efficient code as possible? Should the
> result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
> of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
> whatever we come up with)?
> 
> 
> (Repost without html formatting, sorry for the spam)

Well, generally if it helps Fortran on z10, it will likely help C and C++
too, so you probably instead want some type promotion pass which will know
that on the particular target it is beneficial to promote booleans to
32-bits and know under what circumstances (the loads and stores of course
need to remain the same size for ABI reasons, but as soon as you have it in
SSA_NAME, it depends on what you use it for, sometimes it will be beneficial
to promote it, sometimes not.  Of course, that is something that is too late
for GCC 7.

Jakub


Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Janne Blomqvist
On Tue, Jan 3, 2017 at 6:50 PM, Andreas Krebbel
 wrote:
> The regression with 8 bit boolean types surfaced with the z10 machine. The 
> ABI is much older. Since
> we cannot change it anymore we should rather talk to the hardware guys to add 
> the instruction we
> need. So for now we probably have to live with the regression in the Fortran 
> cases since as I
> understand it your change fixes an actual problem.

I do think this is fixable without waiting for potential improved
hardware.  The vast majority of uses of boolean_type_node in the
Fortran frontend is code like

  tmp = fold_build2_loc (input_location, GT_EXPR,
 boolean_type_node, from_len,
 build_zero_cst (TREE_TYPE (from_len)));
  tmp = fold_build3_loc (input_location, COND_EXPR,
 void_type_node, tmp, extcopy, stdcopy);

where the result of a boolean expression is returned as a
boolean_type_node.  This is a more like high-level language semantics,
whereas on the asm level boolean expressions are of course evaluated
with integer arithmetic (or are there targets where this is not
true?).

So while the Fortran frontend shouldn't redefine the ABI specified
boolean_type_node, for situations like the above we could instead use
some fast_scalar_non_abi_bool_type_node (better name suggestions
welcome?), if we had some mechanism for returning such information
from the backend? Or is there some universal best choice here?

Or since boolean types are not actually present at the asm level, what
is the best choice in cases like the above in order to generate code
that can be lowered to as efficient code as possible? Should the
result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
whatever we come up with)?


(Repost without html formatting, sorry for the spam)

-- 
Janne Blomqvist


Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread FX
> The regression with 8 bit boolean types surfaced with the z10 machine. The 
> ABI is much older. Since
> we cannot change it anymore we should rather talk to the hardware guys to add 
> the instruction we
> need. So for now we probably have to live with the regression in the Fortran 
> cases since as I
> understand it your change fixes an actual problem.

As far as I understand (and Jane will correct me if I am wrong), the patch does 
not fix anything in particular. The idea was that, by transitioning from having 
all boolean expressions from “int” to “bool” (in C terms), and thus from 32-bit 
to 8-bit on “typical” targets, the optimizer might be able to emit more compact 
code. I am not sure this was tested.

So: maybe it is a case of "Profile. Don't speculate.”

FX

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread FX
> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferrable.

The patch committed doesn’t change the way Fortran LOGICAL types are emitted. 
The fact that the default LOGICAL kind is different (on most platforms) from 
C/C++ boolean is due to the standard, and not our own choice of ABI.

As Jane says, boolean_type_node in the Fortran front-end is only used for 
intermediate values and temporaries; it is used every time we build a 
COND_EXPR. It is not part of the ABI.

FX

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Andreas Krebbel
On 01/03/2017 03:31 PM, Janne Blomqvist wrote:
> On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek  wrote:
>> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>>> This patch costs several thousand additional instructions in
>>> Spec2006 on s390x ("lines" = instructions):
>>>
>>>   410.bwaves: +28 lines (2 funcs bigger)
>>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>>   454.calculix:  +474 lines (23 funcs bigger)
>>>   465.tonto:+2182 lines (221 funcs bigger)
>>>   481.wrf:  +4988 lines (117 funcs bigger)
>>>   416.gamess:   +3723 lines (466 funcs bigger)
>>>
>>> s390x has a "compare with immediate and jump relative" instruction
>>> for 32 bit, but for an 8 bit quantities it needs separate compare
>>> and jump instructions, e.g.
>>>
>>>   cijne   %r1,0,... 
>>>
>>> ->
>>>
>>>   tmll%r1,1
>>>   jne ... 
>>>
>>> Instead of hard coding a specific type, should one ask the backend
>>> for the preferred type?
> 
> Hmm, that's sort of the opposite of what I had hoped for.. :-/
> 
> Is there some way to ask the backend what the preferred type is, then?
> 
> (The snide answer, why didn't the s390 ABi define
> bool/_Bool/boolean_type_node to be a 32 bit type if 8 bit types are
> problematic? But that's of course water under the bridge by now...)

The regression with 8 bit boolean types surfaced with the z10 machine. The ABI 
is much older. Since
we cannot change it anymore we should rather talk to the hardware guys to add 
the instruction we
need. So for now we probably have to live with the regression in the Fortran 
cases since as I
understand it your change fixes an actual problem.

-Andreas-

> 
>> The gfc_init_types change is an ABI change, at least if the fortran FE
>> bool type is ever stored in memory and accessed by multiple TUs, or
>> passed as argument etc.
> 
> Based on the quick audit I did when I wrote the patch, the only time
> it's used except as a local temp variable, is for a couple of the
> co-array intrinsics, where the corresponding library implementation
> actually uses C _Bool (I suspect it has worked by accident if the args
> are passed in registers).
> 
>>  And the difference between the C/C++ _Bool/bool
>> and fortran FE bool has caused lots of issues in the past, so if it can be
>> the same type, it is preferrable.
>>
>> Jakub
> 
> 
> 



Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Janne Blomqvist
On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek  wrote:
> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>> This patch costs several thousand additional instructions in
>> Spec2006 on s390x ("lines" = instructions):
>>
>>   410.bwaves: +28 lines (2 funcs bigger)
>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>   454.calculix:  +474 lines (23 funcs bigger)
>>   465.tonto:+2182 lines (221 funcs bigger)
>>   481.wrf:  +4988 lines (117 funcs bigger)
>>   416.gamess:   +3723 lines (466 funcs bigger)
>>
>> s390x has a "compare with immediate and jump relative" instruction
>> for 32 bit, but for an 8 bit quantities it needs separate compare
>> and jump instructions, e.g.
>>
>>   cijne   %r1,0,... 
>>
>> ->
>>
>>   tmll%r1,1
>>   jne ... 
>>
>> Instead of hard coding a specific type, should one ask the backend
>> for the preferred type?

Hmm, that's sort of the opposite of what I had hoped for.. :-/

Is there some way to ask the backend what the preferred type is, then?

(The snide answer, why didn't the s390 ABi define
bool/_Bool/boolean_type_node to be a 32 bit type if 8 bit types are
problematic? But that's of course water under the bridge by now...)

> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.

Based on the quick audit I did when I wrote the patch, the only time
it's used except as a local temp variable, is for a couple of the
co-array intrinsics, where the corresponding library implementation
actually uses C _Bool (I suspect it has worked by accident if the args
are passed in registers).

>  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferrable.
>
> Jakub



-- 
Janne Blomqvist


Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Jakub Jelinek
On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
> This patch costs several thousand additional instructions in
> Spec2006 on s390x ("lines" = instructions):
> 
>   410.bwaves: +28 lines (2 funcs bigger)
>   437.leslie3d:   +43 lines (5 funcs bigger)
>   434.zeusmp:   +2650 lines (15 funcs bigger)
>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>   454.calculix:  +474 lines (23 funcs bigger)
>   465.tonto:+2182 lines (221 funcs bigger)
>   481.wrf:  +4988 lines (117 funcs bigger)
>   416.gamess:   +3723 lines (466 funcs bigger)
> 
> s390x has a "compare with immediate and jump relative" instruction
> for 32 bit, but for an 8 bit quantities it needs separate compare
> and jump instructions, e.g.
> 
>   cijne   %r1,0,... 
> 
> ->
> 
>   tmll%r1,1
>   jne ... 
> 
> Instead of hard coding a specific type, should one ask the backend
> for the preferred type?

The gfc_init_types change is an ABI change, at least if the fortran FE
bool type is ever stored in memory and accessed by multiple TUs, or
passed as argument etc.  And the difference between the C/C++ _Bool/bool
and fortran FE bool has caused lots of issues in the past, so if it can be
the same type, it is preferrable.

Jakub


Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Dominik Vogt
On Tue, Dec 13, 2016 at 10:59:09PM +0200, Janne Blomqvist wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of a argument mismatch that hasn't
> affected anything so far.
> 
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.

This patch costs several thousand additional instructions in
Spec2006 on s390x ("lines" = instructions):

  410.bwaves: +28 lines (2 funcs bigger)
  437.leslie3d:   +43 lines (5 funcs bigger)
  434.zeusmp:   +2650 lines (15 funcs bigger)
  459.GemsFDTD:   +65 lines (7 funcs bigger)
  454.calculix:  +474 lines (23 funcs bigger)
  465.tonto:+2182 lines (221 funcs bigger)
  481.wrf:  +4988 lines (117 funcs bigger)
  416.gamess:   +3723 lines (466 funcs bigger)

s390x has a "compare with immediate and jump relative" instruction
for 32 bit, but for an 8 bit quantities it needs separate compare
and jump instructions, e.g.

  cijne   %r1,0,... 

->

  tmll%r1,1
  jne ... 

Instead of hard coding a specific type, should one ask the backend
for the preferred type?

> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
> 
> 2016-12-13  Janne Blomqvist  
> 
>   * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
>   wi::mask (n, UNSIGNED,
> TYPE_PRECISION (size_type_node)));
>  
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>gfc_charlen_int_kind = 4;
>gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> -- 
> 2.7.4


Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] Use the middle-end boolean_type_node

2016-12-20 Thread Steve Kargl
Thought I gave an 'ok', but apparently never sent email.
Sorry about that.  Yes, ok to commit.

-- 
steve

On Tue, Dec 20, 2016 at 04:56:51PM +0200, Janne Blomqvist wrote:
> PING!
> 
> On Tue, Dec 13, 2016 at 10:59 PM, Janne Blomqvist
>  wrote:
> > Use the boolean_type_node setup by the middle-end instead of
> > redefining it. boolean_type_node is not used in GFortran for any
> > ABI-visible stuff, only internally as the type of boolean
> > expressions. There appears to be one exception to this, namely the
> > caf_get* and caf_send* calls which have boolean_type_node
> > arguments. However, on the library side they seem to use C _Bool, so I
> > suspect this might be a case of a argument mismatch that hasn't
> > affected anything so far.
> >
> > The practical effect of this is that the size of such variables will
> > be the same as a C _Bool or C++ bool, that is, on most targets a
> > single byte. Previously we redefined boolean_type_node to be a Fortran
> > default logical kind sized variable, that is 4 or 8 bytes depending on
> > compile options. This might enable slightly more compact code, in case
> > the optimizer determines that the result of such a generated
> > comparison expression needs to be stored in some temporary location
> > rather than being used immediately.
> >
> > Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
> >
> > 2016-12-13  Janne Blomqvist  
> >
> > * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> > ---
> >  gcc/fortran/trans-types.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> > index 354308f..e8dafa0 100644
> > --- a/gcc/fortran/trans-types.c
> > +++ b/gcc/fortran/trans-types.c
> > @@ -961,10 +961,6 @@ gfc_init_types (void)
> > wi::mask (n, UNSIGNED,
> >   TYPE_PRECISION (size_type_node)));
> >
> > -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> > -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> > -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> > -
> >/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
> >gfc_charlen_int_kind = 4;
> >gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> > --
> > 2.7.4
> >
> 
> 
> 
> -- 
> Janne Blomqvist

-- 
Steve


Re: [PATCH] Use the middle-end boolean_type_node

2016-12-20 Thread Janne Blomqvist
PING!

On Tue, Dec 13, 2016 at 10:59 PM, Janne Blomqvist
 wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of a argument mismatch that hasn't
> affected anything so far.
>
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.
>
> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
>
> 2016-12-13  Janne Blomqvist  
>
> * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
> wi::mask (n, UNSIGNED,
>   TYPE_PRECISION (size_type_node)));
>
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>gfc_charlen_int_kind = 4;
>gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> --
> 2.7.4
>



-- 
Janne Blomqvist