Hi,
(Forwarded from cfe-commits, where some backend issues have come up.)
This is an issue I've been thinking about quite a bit recently, and I agree
that the biggest problem is the one below:
> * The big thing still missing here is that there is no logic to check how
> many VFP registers have already been used for other arguments. When deciding
> whether to pass an argument as a homogeneous aggregate, one of the criteria
> is that the entire aggregate has to fit into the remaining unused argument
> registers, right?
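For concreteness, here is a rough sketch of the "does it fit in the remaining
registers" check as I understand the VFP variant of the PCS. It is a toy model,
not LLVM code: VfpState and allocateHfa are names made up for illustration, the
sixteen argument registers s0-s15 are modelled as a bitmask, and vector base
types are ignored. An HFA either gets the lowest free run of consecutive
registers or, if none exists, goes to the stack and makes all remaining VFP
registers unavailable.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of the VFP argument registers s0..s15.
// Bit i set means s<i> is already allocated or has been marked unavailable.
struct VfpState {
    std::uint16_t used = 0;
};

// Try to place an HFA of `count` members, each `regsPerElt` S-registers wide
// (1 for float, 2 for double). Returns the index of the first S-register on
// success. On failure every remaining register is marked unavailable, so no
// later argument can be placed in VFP registers, and std::nullopt is returned
// (the caller would pass the aggregate on the stack).
std::optional<int> allocateHfa(VfpState &st, int count, int regsPerElt) {
    assert(count >= 1 && count <= 4);
    assert(regsPerElt == 1 || regsPerElt == 2);
    const int need = count * regsPerElt;
    // Stepping by regsPerElt keeps doubles aligned to a D-register boundary.
    for (int start = 0; start + need <= 16; start += regsPerElt) {
        const auto mask =
            static_cast<std::uint16_t>(((1u << need) - 1u) << start);
        if ((st.used & mask) == 0) {
            st.used |= mask;
            return start;
        }
    }
    st.used = 0xFFFF;
    return std::nullopt;
}
```

Every front-end that wants to classify an HFA today needs to carry state like
VfpState across the whole argument list, which is exactly the duplication I
would like to avoid.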
I tend to think that if every front-end has to implement the entire VFP PCS to
decide how to pass an HFA, something has gone wrong. So I've come to the
conclusion that the real flaw is LLVM not exposing enough information to the
target-dependent backend code for it to do the right thing. By the time the
target is involved, all that remains of any composite type is:
* If it was passed naturally by value: the fields, completely separated. For
example, {float, float} just gives you two "float" parameters.
* If it was a byval pointer (e.g. "{float, float}* byval"): an i32, plus the
ByValSize and ByValAlign.
Even in the first case there's no indication of where a composite type begins
and ends. The latter could be bludgeoned to mean "this is an HFA, put it in VFP
regs", but it would be unspeakably ugly.
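To illustrate the first case, here is a small toy model (not LLVM API; ToyArg
and flattenArgs are invented for this example) of by-value aggregates being
split into their fields before the target sees them. Two different signatures
end up as the same flat list, so the target cannot tell where the struct was.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a source-level argument: a scalar has exactly one
// field, a struct passed by value has one entry per member.
struct ToyArg {
    std::vector<std::string> fields;
};

// Flatten the argument list by concatenating every argument's fields, which
// is all that survives of a by-value aggregate in this toy model.
std::vector<std::string> flattenArgs(const std::vector<ToyArg> &args) {
    std::vector<std::string> flat;
    for (const ToyArg &a : args) {
        assert(!a.fields.empty());
        flat.insert(flat.end(), a.fields.begin(), a.fields.end());
    }
    return flat;
}
```

In this model, foo({float, float}, float) and foo(float, {float, float}) both
flatten to three "float" entries, even though an HFA-aware convention has to
keep each aggregate together when deciding between registers and the stack.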
I believe that if the original LLVM Type* is exposed to TargetLowering
(perhaps as part of InputArg/OutputArg), then LLVM itself can decide what to do
with both Small Structures and HFAs in a sane manner: writing a front-end which
adheres to the PCS would be much easier for any source language. The worry is
the apparent layering violation by passing a Type* further down. But I'd argue
that the TargetLowering functions involved are constructing a DAG from nothing
rather than transforming an existing DAG; giving them LLVM source-level
information is justifiable.
Given that, the simpler implementation is via byval pointers, but they have
some issues with efficiency (phases like ScalarRepl can't get to work replacing
getelementptrs with extracts since the implicit alloca happens during DAG
construction -- just look at what happens to MIPS small structs now). With more
work, the truly natural equivalence would be possible and a front-end could
simply "call void @foo({float, float} %val)" and everything would work.
Of course, while the second approach (passing aggregates naturally by value) is
nice in isolation, it may not exactly fit in with what other backends do.
Any thoughts?
Tim.