Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Biener
On Tue, Apr 22, 2014 at 11:51 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
  wrote:
> Richard Biener  writes:
>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>>  wrote:
>>> wide-int fails to build libitm because of a bad interaction between:
>>>
>>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>>that these modes could actually be used for computation.  They are
>>>only holders for vectors during data movement.  */
>>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>>
>>> and the memcpy folding code:
>>>
>>>   /* Make sure we are not copying using a floating-point mode or
>>>  a type whose size possibly does not match its precision.  */
>>>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>>>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
>>> {
>>>   /* A more suitable int_mode_for_mode would return a vector
>>>  integer mode for a vector float mode or a integer complex
>>>  mode for a float complex mode if there isn't a regular
>>>  integer mode covering the mode of desttype.  */
>>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
>>> (desttype));
>>>   if (mode == BLKmode)
>>> desttype = NULL_TREE;
>>>   else
>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>1);
>>> }
>>>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>>>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
>>> {
>>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
>>> (srctype));
>>>   if (mode == BLKmode)
>>> srctype = NULL_TREE;
>>>   else
>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>   1);
>>> }
>>>
>>> The failure occurs for complex long double, which we try to copy as
>>> a 256-bit integer type (OImode).
>>>
>>> This patch tries to do what the comment suggests by introducing a new
>>> form of int_mode_for_mode that replaces vector modes with vector modes
>>> and complex modes with complex modes.  The fallback case of using a
>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>>> 128 bits on x86_64.
>>>
>>> The question then is what to do about 128-bit types for i386.
>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>>> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
>>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>>
>>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>>> even if the original type wasn't a vector.
>>
>> Hmm.  Sounds reasonable unless there are very weird targets that
>> cannot efficiently load/store vectors unaligned but can handle
>> efficient load/store of unaligned scalars.
>
> Yeah, in general there's no guarantee that even int_mode_for_mode
> will return a mode with the same alignment as the original.  Callers
> need to check that (like the memcpy folder does).
>
>>> It might be that other callers to int_mode_for_mode should use
>>> the new function too, but I'll look at that separately.
>>>
>>> I used the attached testcase (with printfs added to gcc) to check that
>>> the right modes and types were being chosen.  The patch fixes the
>>> complex float and complex double cases, since the integer type that we
>>> previously picked had a larger alignment than the original complex type.
>>
>> As of complex int modes - are we positively sure that targets even
>> try to do sth "optimal" for loads/stores of those?
>
> Complex modes usually aren't handled directly by .md patterns,
> either int or float.  They're really treated as a pair of values.
> So IMO it still makes sense to fold this case.
>
>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>>> is that vectors are copied as integer vectors if the target supports
>>> them directly but are copied as float vectors otherwise, since in the
>>> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
>>> test are copied as vector floats and vector doubles both before and
>>> after the patch.
>>
>> That wasn't intended ... the fo

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
>>>  wrote:
 Richard Biener  writes:
> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>  wrote:
>> wide-int fails to build libitm because of a bad interaction between:
>>
>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>that these modes could actually be used for computation.  They are
>>only holders for vectors during data movement.  */
>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>
>> and the memcpy folding code:
>>
>>   /* Make sure we are not copying using a floating-point mode or
>>  a type whose size possibly does not match its precision.  */
>>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
>> {
>>   /* A more suitable int_mode_for_mode would return a vector
>>  integer mode for a vector float mode or a integer complex
>>  mode for a float complex mode if there isn't a regular
>>  integer mode covering the mode of desttype.  */
>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
>> (desttype));
>>   if (mode == BLKmode)
>> desttype = NULL_TREE;
>>   else
>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>> (mode),
>>1);
>> }
>>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
>> {
>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
>> (srctype));
>>   if (mode == BLKmode)
>> srctype = NULL_TREE;
>>   else
>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>> (mode),
>>   1);
>> }
>>
>> The failure occurs for complex long double, which we try to copy as
>> a 256-bit integer type (OImode).
>>
>> This patch tries to do what the comment suggests by introducing a new
>> form of int_mode_for_mode that replaces vector modes with vector modes
>> and complex modes with complex modes.  The fallback case of using a
>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>> 128 bits on x86_64.
>>
>> The question then is what to do about 128-bit types for i386.
>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>
>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>> even if the original type wasn't a vector.
>
> Hmm.  Sounds reasonable unless there are very weird targets that
> cannot efficiently load/store vectors unaligned but can handle
> efficient load/store of unaligned scalars.

 Yeah, in general there's no guarantee that even int_mode_for_mode
 will return a mode with the same alignment as the original.  Callers
 need to check that (like the memcpy folder does).

>> It might be that other callers to int_mode_for_mode should use
>> the new function too, but I'll look at that separately.
>>
>> I used the attached testcase (with printfs added to gcc) to check that
>> the right modes and types were being chosen.  The patch fixes the
>> complex float and complex double cases, since the integer type that we
>> previously picked had a larger alignment than the original complex type.
>
> As of complex int modes - are we positively sure that targets even
> try to do sth "optimal" for loads/stores of those?

 Complex modes usually aren't handled directly by .md patterns,
 either int or float.  They're really treated as a pair of values.
 So IMO it still makes sense to fold this case.

>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>> is that vectors are copied as integer vectors if the target supports
>> them directly but are copied as float vectors otherwise, since in the
>> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
>> test are copied as vector floats and vector doubles both before and
>> after the patch.
>
> That wasn't intended ... the folding should have failed if we can't
> copy using an integer mode ...

 Does that mean that the fold give up if TYPE_MODE is BLKmode?
 I can do t

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Biener
On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
  wrote:
> wide-int fails to build libitm because of a bad interaction between:
>
> /* Keep the OI and XI modes from confusing the compiler into thinking
>that these modes could actually be used for computation.  They are
>only holders for vectors during data movement.  */
> #define MAX_BITSIZE_MODE_ANY_INT (128)
>
> and the memcpy folding code:
>
>   /* Make sure we are not copying using a floating-point mode or
>  a type whose size possibly does not match its precision.  */
>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
> {
>   /* A more suitable int_mode_for_mode would return a vector
>  integer mode for a vector float mode or a integer complex
>  mode for a float complex mode if there isn't a regular
>  integer mode covering the mode of desttype.  */
>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
> (desttype));
>   if (mode == BLKmode)
> desttype = NULL_TREE;
>   else
> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
> (mode),
>1);
> }
>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
> {
>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
> (srctype));
>   if (mode == BLKmode)
> srctype = NULL_TREE;
>   else
> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
> (mode),
>   1);
> }
>
> The failure occurs for complex long double, which we try to copy as
> a 256-bit integer type (OImode).
>
> This patch tries to do what the comment suggests by introducing a new
> form of int_mode_for_mode that replaces vector modes with vector modes
> and complex modes with complex modes.  The fallback case of using a
> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
> 128 bits on x86_64.
>
> The question then is what to do about 128-bit types for i386.
> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
> for -m32 -msse2 because we use int128_t to copy a float128_t.
>
> I handled that by allowing MODE_VECTOR_INT to be used instead of
> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
> even if the original type wasn't a vector.

 Hmm.  Sounds reasonable unless there are very weird targets that
 cannot efficiently load/store vectors unaligned but can handle
 efficient load/store of unaligned scalars.
>>>
>>> Yeah, in general there's no guarantee that even int_mode_for_mode
>>> will return a mode with the same alignment as the original.  Callers
>>> need to check that (like the memcpy folder does).
>>>
> It might be that other callers to int_mode_for_mode should use
> the new function too, but I'll look at that separately.
>
> I used the attached testcase (with printfs added to gcc) to check that
> the right modes and types were being chosen.  The patch fixes the
> complex float and complex double cases, since the integer type that we
> previously picked had a larger alignment than the original complex type.

 As of complex int modes - are we positively sure that targets even
 try to do sth "optimal" for loads/stores of those?
>>>
>>> Complex modes usually aren't handled directly by .md patterns,
>>> either int or float.  They're really treated as a pair of values.
>>> So IMO it still makes sense to fold this case.
>>>
> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
> is that vectors are copied as integer vectors if the target supports
> them directly but are copied as float vectors otherwise, since in the
> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
> test are copied as vector floats and vector doubles both before and
> after the patch.

 That wasn't intended ... the folding should have failed if we can't
 copy using an integer mode ...
>>>
>>> Does that mean that the fold give up if TYPE_MODE is BLKmode?
>>> I can do that as a separate patch if so.
>>
>> Looking at the code again it should always choose an integer mode/type
>> via setting destty

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>>>  wrote:
 wide-int fails to build libitm because of a bad interaction between:

 /* Keep the OI and XI modes from confusing the compiler into thinking
that these modes could actually be used for computation.  They are
only holders for vectors during data movement.  */
 #define MAX_BITSIZE_MODE_ANY_INT (128)

 and the memcpy folding code:

   /* Make sure we are not copying using a floating-point mode or
  a type whose size possibly does not match its precision.  */
   if (FLOAT_MODE_P (TYPE_MODE (desttype))
   || TREE_CODE (desttype) == BOOLEAN_TYPE
   || TREE_CODE (desttype) == ENUMERAL_TYPE)
 {
   /* A more suitable int_mode_for_mode would return a vector
  integer mode for a vector float mode or a integer complex
  mode for a float complex mode if there isn't a regular
  integer mode covering the mode of desttype.  */
   enum machine_mode mode = int_mode_for_mode (TYPE_MODE 
 (desttype));
   if (mode == BLKmode)
 desttype = NULL_TREE;
   else
 desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
 (mode),
1);
 }
   if (FLOAT_MODE_P (TYPE_MODE (srctype))
   || TREE_CODE (srctype) == BOOLEAN_TYPE
   || TREE_CODE (srctype) == ENUMERAL_TYPE)
 {
   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
   if (mode == BLKmode)
 srctype = NULL_TREE;
   else
 srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
 (mode),
   1);
 }

 The failure occurs for complex long double, which we try to copy as
 a 256-bit integer type (OImode).

 This patch tries to do what the comment suggests by introducing a new
 form of int_mode_for_mode that replaces vector modes with vector modes
 and complex modes with complex modes.  The fallback case of using a
 MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
 128 bits on x86_64.

 The question then is what to do about 128-bit types for i386.
 MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
 used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
 for -m32 -msse2 because we use int128_t to copy a float128_t.

 I handled that by allowing MODE_VECTOR_INT to be used instead of
 MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
 even if the original type wasn't a vector.
>>>
>>> Hmm.  Sounds reasonable unless there are very weird targets that
>>> cannot efficiently load/store vectors unaligned but can handle
>>> efficient load/store of unaligned scalars.
>>
>> Yeah, in general there's no guarantee that even int_mode_for_mode
>> will return a mode with the same alignment as the original.  Callers
>> need to check that (like the memcpy folder does).
>>
 It might be that other callers to int_mode_for_mode should use
 the new function too, but I'll look at that separately.

 I used the attached testcase (with printfs added to gcc) to check that
 the right modes and types were being chosen.  The patch fixes the
 complex float and complex double cases, since the integer type that we
 previously picked had a larger alignment than the original complex type.
>>>
>>> As of complex int modes - are we positively sure that targets even
>>> try to do sth "optimal" for loads/stores of those?
>>
>> Complex modes usually aren't handled directly by .md patterns,
>> either int or float.  They're really treated as a pair of values.
>> So IMO it still makes sense to fold this case.
>>
 One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
 is that vectors are copied as integer vectors if the target supports
 them directly but are copied as float vectors otherwise, since in the
 latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
 test are copied as vector floats and vector doubles both before and
 after the patch.
>>>
>>> That wasn't intended ... the folding should have failed if we can't
>>> copy using an integer mode ...
>>
>> Does that mean that the fold give up if TYPE_MODE is BLKmode?
>> I can do that as a separate patch if so.
>
> Looking at the code again it should always choose an integer mode/type
> via setting desttype/srctype to NULL for BLKmode and
>
>   if (!srctype)
> srctype = desttype;
>   if (!desttype)
> desttype = srctype;
>   if (!srctype)
>  

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Biener
On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>>  wrote:
>>> wide-int fails to build libitm because of a bad interaction between:
>>>
>>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>>that these modes could actually be used for computation.  They are
>>>only holders for vectors during data movement.  */
>>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>>
>>> and the memcpy folding code:
>>>
>>>   /* Make sure we are not copying using a floating-point mode or
>>>  a type whose size possibly does not match its precision.  */
>>>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>>>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
>>> {
>>>   /* A more suitable int_mode_for_mode would return a vector
>>>  integer mode for a vector float mode or a integer complex
>>>  mode for a float complex mode if there isn't a regular
>>>  integer mode covering the mode of desttype.  */
>>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>>>   if (mode == BLKmode)
>>> desttype = NULL_TREE;
>>>   else
>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>1);
>>> }
>>>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>>>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
>>> {
>>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>>>   if (mode == BLKmode)
>>> srctype = NULL_TREE;
>>>   else
>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>   1);
>>> }
>>>
>>> The failure occurs for complex long double, which we try to copy as
>>> a 256-bit integer type (OImode).
>>>
>>> This patch tries to do what the comment suggests by introducing a new
>>> form of int_mode_for_mode that replaces vector modes with vector modes
>>> and complex modes with complex modes.  The fallback case of using a
>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>>> 128 bits on x86_64.
>>>
>>> The question then is what to do about 128-bit types for i386.
>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>>> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
>>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>>
>>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>>> even if the original type wasn't a vector.
>>
>> Hmm.  Sounds reasonable unless there are very weird targets that
>> cannot efficiently load/store vectors unaligned but can handle
>> efficient load/store of unaligned scalars.
>
> Yeah, in general there's no guarantee that even int_mode_for_mode
> will return a mode with the same alignment as the original.  Callers
> need to check that (like the memcpy folder does).
>
>>> It might be that other callers to int_mode_for_mode should use
>>> the new function too, but I'll look at that separately.
>>>
>>> I used the attached testcase (with printfs added to gcc) to check that
>>> the right modes and types were being chosen.  The patch fixes the
>>> complex float and complex double cases, since the integer type that we
>>> previously picked had a larger alignment than the original complex type.
>>
>> As of complex int modes - are we positively sure that targets even
>> try to do sth "optimal" for loads/stores of those?
>
> Complex modes usually aren't handled directly by .md patterns,
> either int or float.  They're really treated as a pair of values.
> So IMO it still makes sense to fold this case.
>
>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>>> is that vectors are copied as integer vectors if the target supports
>>> them directly but are copied as float vectors otherwise, since in the
>>> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
>>> test are copied as vector floats and vector doubles both before and
>>> after the patch.
>>
>> That wasn't intended ... the folding should have failed if we can't
>> copy using an integer mode ...
>
> Does that mean that the fold give up if TYPE_MODE is BLKmode?
> I can do that as a separate patch if so.

Looking at the code again it should always choose an integer mode/type
via setting desttype/srctype to NULL for BLKmode and

  if (!srctype)
srctype = desttype;
  if (!desttype)
desttype = srctype;
  if (!srctype)
return NULL_TREE;

no?  Thus if we can't get a integer type for either src or dest then
we fail.  But we should never end up with srctype or dest

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Sandiford
Richard Biener  writes:
> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>  wrote:
>> wide-int fails to build libitm because of a bad interaction between:
>>
>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>that these modes could actually be used for computation.  They are
>>only holders for vectors during data movement.  */
>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>
>> and the memcpy folding code:
>>
>>   /* Make sure we are not copying using a floating-point mode or
>>  a type whose size possibly does not match its precision.  */
>>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
>> {
>>   /* A more suitable int_mode_for_mode would return a vector
>>  integer mode for a vector float mode or a integer complex
>>  mode for a float complex mode if there isn't a regular
>>  integer mode covering the mode of desttype.  */
>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>>   if (mode == BLKmode)
>> desttype = NULL_TREE;
>>   else
>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>> (mode),
>>1);
>> }
>>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
>> {
>>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>>   if (mode == BLKmode)
>> srctype = NULL_TREE;
>>   else
>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>> (mode),
>>   1);
>> }
>>
>> The failure occurs for complex long double, which we try to copy as
>> a 256-bit integer type (OImode).
>>
>> This patch tries to do what the comment suggests by introducing a new
>> form of int_mode_for_mode that replaces vector modes with vector modes
>> and complex modes with complex modes.  The fallback case of using a
>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>> 128 bits on x86_64.
>>
>> The question then is what to do about 128-bit types for i386.
>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>
>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>> even if the original type wasn't a vector.
>
> Hmm.  Sounds reasonable unless there are very weird targets that
> cannot efficiently load/store vectors unaligned but can handle
> efficient load/store of unaligned scalars.

Yeah, in general there's no guarantee that even int_mode_for_mode
will return a mode with the same alignment as the original.  Callers
need to check that (like the memcpy folder does).

>> It might be that other callers to int_mode_for_mode should use
>> the new function too, but I'll look at that separately.
>>
>> I used the attached testcase (with printfs added to gcc) to check that
>> the right modes and types were being chosen.  The patch fixes the
>> complex float and complex double cases, since the integer type that we
>> previously picked had a larger alignment than the original complex type.
>
> As of complex int modes - are we positively sure that targets even
> try to do sth "optimal" for loads/stores of those?

Complex modes usually aren't handled directly by .md patterns,
either int or float.  They're really treated as a pair of values.
So IMO it still makes sense to fold this case.

>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>> is that vectors are copied as integer vectors if the target supports
>> them directly but are copied as float vectors otherwise, since in the
>> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
>> test are copied as vector floats and vector doubles both before and
>> after the patch.
>
> That wasn't intended ... the folding should have failed if we can't
> copy using an integer mode ...

Does that mean that the fold give up if TYPE_MODE is BLKmode?
I can do that as a separate patch if so.

Thanks,
Richard


Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Biener
On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
 wrote:
> wide-int fails to build libitm because of a bad interaction between:
>
> /* Keep the OI and XI modes from confusing the compiler into thinking
>that these modes could actually be used for computation.  They are
>only holders for vectors during data movement.  */
> #define MAX_BITSIZE_MODE_ANY_INT (128)
>
> and the memcpy folding code:
>
>   /* Make sure we are not copying using a floating-point mode or
>  a type whose size possibly does not match its precision.  */
>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
>   || TREE_CODE (desttype) == BOOLEAN_TYPE
>   || TREE_CODE (desttype) == ENUMERAL_TYPE)
> {
>   /* A more suitable int_mode_for_mode would return a vector
>  integer mode for a vector float mode or a integer complex
>  mode for a float complex mode if there isn't a regular
>  integer mode covering the mode of desttype.  */
>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>   if (mode == BLKmode)
> desttype = NULL_TREE;
>   else
> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
> (mode),
>1);
> }
>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
>   || TREE_CODE (srctype) == BOOLEAN_TYPE
>   || TREE_CODE (srctype) == ENUMERAL_TYPE)
> {
>   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>   if (mode == BLKmode)
> srctype = NULL_TREE;
>   else
> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>   1);
> }
>
> The failure occurs for complex long double, which we try to copy as
> a 256-bit integer type (OImode).
>
> This patch tries to do what the comment suggests by introducing a new
> form of int_mode_for_mode that replaces vector modes with vector modes
> and complex modes with complex modes.  The fallback case of using a
> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
> 128 bits on x86_64.
>
> The question then is what to do about 128-bit types for i386.
> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
> for -m32 -msse2 because we use int128_t to copy a float128_t.
>
> I handled that by allowing MODE_VECTOR_INT to be used instead of
> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
> even if the original type wasn't a vector.

Hmm.  Sounds reasonable unless there are very weird targets that
cannot efficiently load/store vectors unaligned but can handle
efficient load/store of unaligned scalars.

> It might be that other callers to int_mode_for_mode should use
> the new function too, but I'll look at that separately.
>
> I used the attached testcase (with printfs added to gcc) to check that
> the right modes and types were being chosen.  The patch fixes the
> complex float and complex double cases, since the integer type that we
> previously picked had a larger alignment than the original complex type.

As of complex int modes - are we positively sure that targets even
try to do sth "optimal" for loads/stores of those?

> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
> is that vectors are copied as integer vectors if the target supports
> them directly but are copied as float vectors otherwise, since in the
> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
> test are copied as vector floats and vector doubles both before and
> after the patch.

That wasn't intended ... the folding should have failed if we can't
copy using an integer mode ...

Richard.

>
> Tested against trunk with x86_64-linux-gnu {,-m32}.  OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
> * machmode.h (bitwise_mode_for_size): Declare.
> * stor-layout.h (bitwise_type_for_mode): Likewise.
> * stor-layout.c (bitwise_mode_for_size): New function.
> (bitwise_type_for_mode): Likewise.
> * builtins.c (fold_builtin_memory_op): Use it instead of
> int_mode_for_mode and build_nonstandard_integer_type.
>
> gcc/testsuite/
> * gcc.dg/memcpy-5.c: New test.
>
> Index: gcc/machmode.h
> ===
> --- gcc/machmode.h  2014-04-18 11:16:12.706092658 +0100
> +++ gcc/machmode.h  2014-04-18 11:16:38.179299261 +0100
> @@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f
>
>  extern enum machine_mode int_mode_for_mode (enum machine_mode);
>
> +extern enum machine_mode bitwise_mode_for_size (enum machine_mode);
> +
>  /* Return a mode that is suitable for representing a vector,
> or BLKmode on failure.  */
>
> Index: gcc/stor-layout.h
> ==

Re: RFA: tweak integer type used for memcpy folding

2014-04-22 Thread Richard Sandiford
Richard Sandiford  writes:
> wide-int fails to build libitm because of a bad interaction between:
>
> /* Keep the OI and XI modes from confusing the compiler into thinking
>that these modes could actually be used for computation.  They are
>only holders for vectors during data movement.  */
> #define MAX_BITSIZE_MODE_ANY_INT (128)
>
> and the memcpy folding code:
>
>   /* Make sure we are not copying using a floating-point mode or
>  a type whose size possibly does not match its precision.  */
>   if (FLOAT_MODE_P (TYPE_MODE (desttype))
> || TREE_CODE (desttype) == BOOLEAN_TYPE
> || TREE_CODE (desttype) == ENUMERAL_TYPE)
>   {
> /* A more suitable int_mode_for_mode would return a vector
>integer mode for a vector float mode or a integer complex
>mode for a float complex mode if there isn't a regular
>integer mode covering the mode of desttype.  */
> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
> if (mode == BLKmode)
>   desttype = NULL_TREE;
> else
>   desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>  1);
>   }
>   if (FLOAT_MODE_P (TYPE_MODE (srctype))
> || TREE_CODE (srctype) == BOOLEAN_TYPE
> || TREE_CODE (srctype) == ENUMERAL_TYPE)
>   {
> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
> if (mode == BLKmode)
>   srctype = NULL_TREE;
> else
>   srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
> 1);
>   }
>
> The failure occurs for complex long double, which we try to copy as
> a 256-bit integer type (OImode).
>
> This patch tries to do what the comment suggests by introducing a new
> form of int_mode_for_mode that replaces vector modes with vector modes
> and complex modes with complex modes.  The fallback case of using a
> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
> 128 bits on x86_64.
>
> The question then is what to do about 128-bit types for i386.
> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
> for -m32 -msse2 because we use int128_t to copy a float128_t.
>
> I handled that by allowing MODE_VECTOR_INT to be used instead of
> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
> even if the original type wasn't a vector.
>
> It might be that other callers to int_mode_for_mode should use
> the new function too, but I'll look at that separately.
>
> I used the attached testcase (with printfs added to gcc) to check that
> the right modes and types were being chosen.  The patch fixes the
> complex float and complex double cases, since the integer type that we
> previously picked had a larger alignment than the original complex type.
>
> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
> is that vectors are copied as integer vectors if the target supports
> them directly but are copied as float vectors otherwise, since in the
> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
> test are copied as vector floats and vector doubles both before and
> after the patch.
>
> Tested against trunk with x86_64-linux-gnu {,-m32}.  OK to install?

There was a typo in the declaration of the mode->mode function,
should have been as follows.

Thanks,
Richard


gcc/
* machmode.h (bitwise_mode_for_mode): Declare.
* stor-layout.h (bitwise_type_for_mode): Likewise.
* stor-layout.c (bitwise_mode_for_mode): New function.
(bitwise_type_for_mode): Likewise.
* builtins.c (fold_builtin_memory_op): Use it instead of
int_mode_for_mode and build_nonstandard_integer_type.

gcc/testsuite/
* gcc.dg/memcpy-5.c: New test.

Index: gcc/machmode.h
===
--- gcc/machmode.h  2014-04-21 10:35:17.611603989 +0100
+++ gcc/machmode.h  2014-04-21 13:58:59.403884452 +0100
@@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f
 
 extern enum machine_mode int_mode_for_mode (enum machine_mode);
 
+extern enum machine_mode bitwise_mode_for_mode (enum machine_mode);
+
 /* Return a mode that is suitable for representing a vector,
or BLKmode on failure.  */
 
Index: gcc/stor-layout.h
===
--- gcc/stor-layout.h   2014-04-21 10:35:17.611603989 +0100
+++ gcc/stor-layout.h   2014-04-21 13:58:59.405878960 +0100
@@ -98,6 +98,8 @@ extern tree make_unsigned_type (int);
mode_for_size, but is passed a tree.  */
 extern enum machine_mode mode_for_size_tree (const_tree, enum mode_class, int);
 
+extern tree bitwise_type_for_mode (enum machine_mode);
+
 /* Given a VAR_DECL, PARM_DECL or RESULT_DECL, clears the resul

RFA: tweak integer type used for memcpy folding

2014-04-19 Thread Richard Sandiford
wide-int fails to build libitm because of a bad interaction between:

/* Keep the OI and XI modes from confusing the compiler into thinking
   that these modes could actually be used for computation.  They are
   only holders for vectors during data movement.  */
#define MAX_BITSIZE_MODE_ANY_INT (128)

and the memcpy folding code:

  /* Make sure we are not copying using a floating-point mode or
 a type whose size possibly does not match its precision.  */
  if (FLOAT_MODE_P (TYPE_MODE (desttype))
  || TREE_CODE (desttype) == BOOLEAN_TYPE
  || TREE_CODE (desttype) == ENUMERAL_TYPE)
{
  /* A more suitable int_mode_for_mode would return a vector
 integer mode for a vector float mode or a integer complex
 mode for a float complex mode if there isn't a regular
 integer mode covering the mode of desttype.  */
  enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
  if (mode == BLKmode)
desttype = NULL_TREE;
  else
desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
   1);
}
  if (FLOAT_MODE_P (TYPE_MODE (srctype))
  || TREE_CODE (srctype) == BOOLEAN_TYPE
  || TREE_CODE (srctype) == ENUMERAL_TYPE)
{
  enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
  if (mode == BLKmode)
srctype = NULL_TREE;
  else
srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
  1);
}

The failure occurs for complex long double, which we try to copy as
a 256-bit integer type (OImode).

This patch tries to do what the comment suggests by introducing a new
form of int_mode_for_mode that replaces vector modes with vector modes
and complex modes with complex modes.  The fallback case of using a
MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
128 bits on x86_64.

The question then is what to do about 128-bit types for i386.
MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
for -m32 -msse2 because we use int128_t to copy a float128_t.

I handled that by allowing MODE_VECTOR_INT to be used instead of
MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
even if the original type wasn't a vector.

It might be that other callers to int_mode_for_mode should use
the new function too, but I'll look at that separately.

I used the attached testcase (with printfs added to gcc) to check that
the right modes and types were being chosen.  The patch fixes the
complex float and complex double cases, since the integer type that we
previously picked had a larger alignment than the original complex type.

One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
is that vectors are copied as integer vectors if the target supports
them directly but are copied as float vectors otherwise, since in the
latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
test are copied as vector floats and vector doubles both before and
after the patch.

Tested against trunk with x86_64-linux-gnu {,-m32}.  OK to install?

Thanks,
Richard


gcc/
* machmode.h (bitwise_mode_for_size): Declare.
* stor-layout.h (bitwise_type_for_mode): Likewise.
* stor-layout.c (bitwise_mode_for_size): New function.
(bitwise_type_for_mode): Likewise.
* builtins.c (fold_builtin_memory_op): Use it instead of
int_mode_for_mode and build_nonstandard_integer_type.

gcc/testsuite/
* gcc.dg/memcpy-5.c: New test.

Index: gcc/machmode.h
===
--- gcc/machmode.h  2014-04-18 11:16:12.706092658 +0100
+++ gcc/machmode.h  2014-04-18 11:16:38.179299261 +0100
@@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f
 
 extern enum machine_mode int_mode_for_mode (enum machine_mode);
 
+extern enum machine_mode bitwise_mode_for_size (enum machine_mode);
+
 /* Return a mode that is suitable for representing a vector,
or BLKmode on failure.  */
 
Index: gcc/stor-layout.h
===
--- gcc/stor-layout.h   2014-04-18 11:16:12.707092667 +0100
+++ gcc/stor-layout.h   2014-04-18 11:16:12.830093665 +0100
@@ -98,6 +98,8 @@ extern tree make_unsigned_type (int);
mode_for_size, but is passed a tree.  */
 extern enum machine_mode mode_for_size_tree (const_tree, enum mode_class, int);
 
+extern tree bitwise_type_for_mode (enum machine_mode);
+
 /* Given a VAR_DECL, PARM_DECL or RESULT_DECL, clears the results of
a previous call to layout_decl and calls it again.  */
 extern void relayout_decl (tree);
Index: gcc/stor-layout.c
===
--- gcc/stor-