On Tue, 20 Sep 2016, Jeff Law wrote:
> On 09/20/2016 06:00 AM, Tamar Christina wrote:
> > On 16/09/16 20:49, Jeff Law wrote:
> > > On 09/12/2016 10:19 AM, Tamar Christina wrote:
> > > > Hi All,
> > > > +
> > > > + /* Re-interpret the float as an unsigned integer type
> > > > + with equal precision. */
> > > > + int_arg_type = build_nonstandard_integer_type (TYPE_PRECISION (type), 0);
> > > > + int_arg = fold_build1_loc (loc, INDIRECT_REF, int_arg_type,
> > > > + fold_build1_loc (loc, NOP_EXPR,
> > > > + build_pointer_type (int_arg_type),
> > > > + fold_build1_loc (loc, ADDR_EXPR,
> > > > + build_pointer_type (type), arg)));
> > > Doesn't this make ARG addressable? Which in turn means ARG won't be
> > > exposed to the gimple/ssa optimizers. Or is it the case that when
> > > fpclassify is used its argument is already in memory (and thus
> > > addressable?)
> > >
> > I believe that it is the case that when fpclassify is used the argument
> > is already addressable, but I am not 100% certain. I may be able to do
> > this differently so I'll come back to you on this one.
> The more I think about it, the more I suspect ARG is only going to already be
> marked as addressable if it has already had its address taken.
Sure, if it has its address taken ... but I don't see how
fpclassify requires the arg to be address taken.
> But I think we can look at this as an opportunity. If ARG is already
> addressable, then it's most likely going to be living in memory (there are
> exceptions). If ARG is most likely going to be living in memory, then we
> clearly want to use your fast integer path, regardless of the target.
> If ARG is not addressable, then it's not as clear as the object is likely
> going to be assigned into an FP register. Integer operations on an FP
> register likely will force a sequence where we dump the register into memory,
> load from memory into a GPR, then bit test on the GPR. That gets very
> expensive on some architectures.
> Could we defer lowering in the case where the object is not addressable until
> gimple->rtl expansion time? That's the best time to introduce target
> dependencies into the code we generate.
Note that GIMPLE doesn't require something to be addressable just because
you access random pieces of it. The IL has tricks like allowing
MEM[&decl + CST] w/o actually marking decl TREE_ADDRESSABLE (and the
expanders trying to cope with that) and there is of course
BIT_FIELD_REF which you can use to extract arbitrary bits off any
entity without it living in memory (and again the expanders trying to
cope with that).
So may I suggest moving the "folding" from builtins.c to gimplify.c
and simply emitting GIMPLE directly there? That would also make it clearer
that we are dealing with a lowering process rather than a "folding".
Doing it in GIMPLE lowering is another possibility - we lower things
like posix_memalign and setjmp there as well.
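
For illustration only, here is a minimal source-level sketch of what the
integer path computes. It is not the patch and not GCC-internal code: the
helper name classify_float_bits is hypothetical, and it assumes float is
IEEE 754 binary32. The memcpy is a value-level reinterpretation, which GCC
normally folds into a plain bit copy without forcing the argument into
memory, much like the BIT_FIELD_REF approach mentioned above.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): classify a float using only
   integer operations on its bit pattern.  Assumes IEEE 754 binary32.  */
static int
classify_float_bits (float x)
{
  uint32_t bits;

  /* The reinterpretation below only makes sense if the sizes match.  */
  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);

  uint32_t exponent = (bits >> 23) & 0xff;   /* 8 exponent bits */
  uint32_t mantissa = bits & 0x7fffff;       /* 23 mantissa bits */

  if (exponent == 0xff)
    return mantissa ? FP_NAN : FP_INFINITE;
  if (exponent == 0)
    return mantissa ? FP_SUBNORMAL : FP_ZERO;
  return FP_NORMAL;
}
```

Whether this is cheaper than the FP comparison sequence depends on how
expensive it is to move the value from an FP register to a GPR on the
target, which is the reason to make that choice late.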