Hi,

(Forwarded from cfe-commits, where some backend issues have come up.)

This is an issue I've been thinking about quite a bit recently, and I agree 
that the biggest problem is the one below:

> * The big thing still missing here is that there is no logic to check how 
> many VFP registers have already been used for other arguments.  When deciding 
> whether to pass an argument as a homogeneous aggregate, one of the criteria 
> is that the entire aggregate has to fit into the remaining unused argument 
> registers, right?

I tend to think that if every front-end has to implement the entire VFP PCS to 
decide how to pass an HFA, something has gone wrong. So I've come to the 
conclusion that the real flaw is LLVM not exposing enough information to the 
target-dependent backend code for it to do the right thing. By the time the 
target is involved, all that remains of any composite type is:
  * The individual fields, completely separated, if it was passed naturally by 
value: {float, float} just gives you two "float" parameters, for example.
  * An i32 plus the ByValSize and ByValAlign if it was passed as a byval 
pointer, e.g. "{float, float}* byval".

Even in the first case there's no indication of where a composite type begins 
and ends. The byval case could be bludgeoned to mean "this is an HFA, put it in 
VFP regs", but it would be unspeakably ugly.
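
To illustrate the first case, here is a toy model (not LLVM code; Ty, 
makeScalar, makeStruct and flattenArgs are names made up for this sketch) of 
what flattening does to the argument list. Three different signatures end up 
indistinguishable once only the leaf fields remain:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a source-level type: either a named scalar
// or a struct with at least one member. This is not LLVM's Type class.
struct Ty {
  std::string scalar;       // non-empty for a scalar such as "float"
  std::vector<Ty> members;  // non-empty for a struct
};

Ty makeScalar(std::string name) { return Ty{std::move(name), {}}; }
Ty makeStruct(std::vector<Ty> members) { return Ty{"", std::move(members)}; }

// Append the leaf scalars of t to out, discarding all struct boundaries.
void flattenInto(const Ty &t, std::vector<std::string> &out) {
  if (t.members.empty()) {
    out.push_back(t.scalar);
    return;
  }
  for (const Ty &m : t.members)
    flattenInto(m, out);
}

// Model of what the target sees: one flat list of scalar parameters.
std::vector<std::string> flattenArgs(const std::vector<Ty> &args) {
  std::vector<std::string> out;
  for (const Ty &a : args)
    flattenInto(a, out);
  return out;
}
```

With f = makeScalar("float") and pair = makeStruct({f, f}), the calls 
flattenArgs({pair, f}), flattenArgs({f, pair}) and flattenArgs({f, f, f}) all 
return the same three-element list, so nothing downstream can tell which 
floats belonged to an HFA.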

I believe that if the original LLVM Type* pointer is exposed to TargetLowering 
(perhaps as part of InputArg/OutputArg), then LLVM itself can decide what to do 
with both Small Structures and HFAs in a sane manner: writing a front-end which 
adheres to the PCS would be much easier for any source language. The worry is 
the apparent layering violation by passing a Type* further down. But I'd argue 
that the TargetLowering functions involved are constructing a DAG from nothing 
rather than transforming an existing DAG; giving them LLVM source-level 
information is justifiable.
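
As a rough sketch of the decision the backend could then make by itself, the 
following models the "does the whole aggregate fit in the remaining VFP 
registers" check from the quoted question. It is a simplification under my own 
assumptions: Scalar, Aggregate, hfaBaseType and VfpAllocator are hypothetical 
names, not LLVM API, and it covers only float/double members in s0-s15, 
ignoring vector types, nested aggregates and variadic calls.

```cpp
#include <bitset>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified type model (not LLVM's).
enum class Scalar { Float, Double, Int32 };

struct Aggregate {
  std::vector<Scalar> fields;  // leaf fields, in order
};

// Treat an aggregate as an HFA when it has 1 to 4 fields that all share
// one floating-point type; returns that base type, or nullopt otherwise.
std::optional<Scalar> hfaBaseType(const Aggregate &a) {
  if (a.fields.empty() || a.fields.size() > 4)
    return std::nullopt;
  Scalar base = a.fields.front();
  if (base == Scalar::Int32)
    return std::nullopt;
  for (Scalar s : a.fields)
    if (s != base)
      return std::nullopt;
  return base;
}

// Tracks s0..s15; a double occupies an even-aligned pair of S registers.
class VfpAllocator {
  std::bitset<16> used;
  bool stackOnly = false;  // set once any VFP candidate has gone to the stack

public:
  // Returns the first S-register index of a block holding `count` values
  // of type `base`, or -1 if the argument must go on the stack.
  int allocate(Scalar base, unsigned count) {
    assert(base != Scalar::Int32 && count >= 1 && count <= 4);
    unsigned width = (base == Scalar::Double) ? 2 : 1;
    unsigned need = width * count;
    if (!stackOnly) {
      // Scan from s0 so that earlier gaps can be back-filled.
      for (unsigned start = 0; start + need <= 16; start += width) {
        bool allFree = true;
        for (unsigned i = 0; i < need; ++i)
          if (used[start + i]) { allFree = false; break; }
        if (allFree) {
          for (unsigned i = 0; i < need; ++i)
            used.set(start + i);
          return static_cast<int>(start);
        }
      }
      // No block fits: every later VFP candidate also goes to the stack.
      stackOnly = true;
    }
    return -1;
  }
};
```

The point is that this bookkeeping needs to know an argument is one aggregate 
of `count` members, which is exactly the information that is lost today; with 
the Type* available it could live in the backend once instead of in every 
front-end.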

Given that, the simpler implementation is via byval pointers, but they have 
some issues with efficiency (phases like ScalarRepl can't get to work replacing 
getelementptrs with extracts since the implicit alloca happens during DAG 
construction -- just look at what happens to MIPS small structs now). With more 
work, the truly natural equivalence would be possible and a front-end could 
simply "call void @foo({float, float} %val)" and everything would work.

Of course, while the second approach (passing aggregates naturally by value) is 
nice in isolation, it may not exactly fit in with what other backends do.

Any thoughts?

Tim.


