Richard Biener <richard.guent...@gmail.com> writes: > On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford > <rdsandif...@googlemail.com> wrote: >> Richard Biener <richard.guent...@gmail.com> writes: >>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford >>> <rdsandif...@googlemail.com> wrote: >>>> Richard Biener <richard.guent...@gmail.com> writes: >>>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >>>>> <rdsandif...@googlemail.com> wrote: >>>>>> wide-int fails to build libitm because of a bad interaction between: >>>>>> >>>>>> /* Keep the OI and XI modes from confusing the compiler into thinking >>>>>> that these modes could actually be used for computation. They are >>>>>> only holders for vectors during data movement. */ >>>>>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>>>>> >>>>>> and the memcpy folding code: >>>>>> >>>>>> /* Make sure we are not copying using a floating-point mode or >>>>>> a type whose size possibly does not match its precision. */ >>>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>>>>> { >>>>>> /* A more suitable int_mode_for_mode would return a vector >>>>>> integer mode for a vector float mode or a integer complex >>>>>> mode for a float complex mode if there isn't a regular >>>>>> integer mode covering the mode of desttype. */ >>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE >>>>>> (desttype)); >>>>>> if (mode == BLKmode) >>>>>> desttype = NULL_TREE; >>>>>> else >>>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE >>>>>> (mode), >>>>>> 1); >>>>>> } >>>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>>>>> { >>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE >>>>>> (srctype)); >>>>>> if (mode == BLKmode) >>>>>> srctype = NULL_TREE; >>>>>> else >>>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE >>>>>> (mode), >>>>>> 1); >>>>>> } >>>>>> >>>>>> The failure occurs for complex long double, which we try to copy as >>>>>> a 256-bit integer type (OImode). >>>>>> >>>>>> This patch tries to do what the comment suggests by introducing a new >>>>>> form of int_mode_for_mode that replaces vector modes with vector modes >>>>>> and complex modes with complex modes. The fallback case of using a >>>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>>>>> 128 bits on x86_64. >>>>>> >>>>>> The question then is what to do about 128-bit types for i386. >>>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>>>>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>>>>> >>>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>>>>> even if the original type wasn't a vector. >>>>> >>>>> Hmm. Sounds reasonable unless there are very weird targets that >>>>> cannot efficiently load/store vectors unaligned but can handle >>>>> efficient load/store of unaligned scalars. >>>> >>>> Yeah, in general there's no guarantee that even int_mode_for_mode >>>> will return a mode with the same alignment as the original. Callers >>>> need to check that (like the memcpy folder does). >>>> >>>>>> It might be that other callers to int_mode_for_mode should use >>>>>> the new function too, but I'll look at that separately. >>>>>> >>>>>> I used the attached testcase (with printfs added to gcc) to check that >>>>>> the right modes and types were being chosen. The patch fixes the >>>>>> complex float and complex double cases, since the integer type that we >>>>>> previously picked had a larger alignment than the original complex type. >>>>> >>>>> As of complex int modes - are we positively sure that targets even >>>>> try to do sth "optimal" for loads/stores of those? >>>> >>>> Complex modes usually aren't handled directly by .md patterns, >>>> either int or float. They're really treated as a pair of values. >>>> So IMO it still makes sense to fold this case. >>>> >>>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>> is that vectors are copied as integer vectors if the target supports >>>>>> them directly but are copied as float vectors otherwise, since in the >>>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>>>>> test are copied as vector floats and vector doubles both before and >>>>>> after the patch. >>>>> >>>>> That wasn't intended ... the folding should have failed if we can't >>>>> copy using an integer mode ... >>>> >>>> Does that mean that the fold give up if TYPE_MODE is BLKmode? >>>> I can do that as a separate patch if so. >>> >>> Looking at the code again it should always choose an integer mode/type >>> via setting desttype/srctype to NULL for BLKmode and >>> >>> if (!srctype) >>> srctype = desttype; >>> if (!desttype) >>> desttype = srctype; >>> if (!srctype) >>> return NULL_TREE; >>> >>> no? Thus if we can't get a integer type for either src or dest then >>> we fail. But we should never end up with srctype or desttype >>> being a float mode. No? >> >> Right, they don't have a float _mode_, because the target doesn't >> support 1024-bit modes. Like I say, they have BLKmode instead. >> But they have a float type. > > Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because > I assumed that only float-modes will eventually end up in FP > registers (that's what the bug was about - we load into a FP register > and that load can trigger a FP exception).
Yeah. I just flagged it up because it wasn't obvious that native vector float types need to be copied as vector ints while non-native ones don't. A non-native vector float might be scalarised into individual float operations and thus might end up using an FP register load. Thanks, Richard