On 18.08.2017 12:01, Richard Biener wrote:
On Wed, Aug 16, 2017 at 3:32 PM, Georg-Johann Lay <a...@gjlay.de> wrote:
On 28.07.2017 09:34, Richard Biener wrote:
On Thu, Jul 27, 2017 at 3:32 PM, Georg-Johann Lay <a...@gjlay.de> wrote:
On 27.07.2017 14:34, Richard Biener wrote:
On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay <a...@gjlay.de> wrote:

For some targets, the best place to put read-only lookup tables as
generated by -ftree-switch-conversion is not the generic address space
but some target specific address space.

This is the case for AVR, where for most devices .rodata must be
located in RAM.

Part #1 adds a new, optional target hook that queries the back-end
about the desired address space.  There is currently only one user:
tree-switch-conversion.c::build_one_array() which re-builds value_type
and array_type if the address space returned by the backend is not
the generic one.

Part #2 is the AVR part that implements the new hook and adds some
sugar around it.

Given that switch-conversion just creates a constant initializer doesn't
AVR
benefit from handling those uniformly (at RTL expansion?).  Not sure but
I think it goes through the regular constant pool handling.

Richard.

avr doesn't use constant pools because they are not profitable for
several reasons.

Moreover, it's not sufficient to just patch the pool's section, you'd
also have to patch *all* accesses to also use the exact same address
space so that they use the correct instruction (LPM instead of LD).

Sure.  So then not handle it in constant pool handling but in expanding
the initializers of global vars.  I think the entry for this is
varasm.c:assemble_variable - that sets DECL_RTL for the variable.

That cannot work, and here is why; just for completeness:

cgraphunit.c::symbol_table::compile()

runs
   ...
   expand_all_functions ();
   output_variables (); // runs assemble_variable
   ...

So any patching of DECL_RTL will result in wrong code: The address
space of the decl won't match the address space used by the code
accessing decl.

Ok, so it's more tricky to hack it that way (you'd need to catch the
time the decl gets its DECL_RTL set, not sure where that happens
for globals).

Too late, IMO.  Problem is that any reference must also have the
same AS.  Hence if somewhere, before patching decl_rtl, some code
uses TREE_TYPE on respective decl, that type will miss the AS,
same for all variables / pointers created with that type.

Been there, seen it...

That leaves doing a more broad transform.  One convenient place
to hook in would be the ipa_single_use pass which computes
the varpool_node.used_by_single_function flag.  All such variables
would be suitable for promotion (with some additional constraints
I suppose).  You'd add a transform phase to the pass that rewrites
the decls and performs GIMPLE IL adjustments (I think you need
to patch memory references to use the address-space).

Rewriting DECLs is not enough.  Only the place that cooks up
the decl (implicitly) knows whether it's appropriate to use
different AS and whether is has control over diffusion into
TREE_TYPEs. And as we have strong filter (see below) which
includes DECL_ARTIFICIAL, almost nothing will remain to be
transformed anyway...

Of course there would need to be a target hook determining
if/what section/address-space a variable should be promoted to
which somehow would allow switch-conversion to use that
as well.  Ho humm.

That said, do you think the switch-conversion created decl is
the only one that benefits from compiler-directed promotion
to different address-space?

Yes. Only compiler-generated lookup-tables may be transformed,
and the only such tables I know of are CSWTCH and vtables.

The conditions so far are:

* TREE_STATIC
* !TREE_PUBLIC
* TREE_READONLY (at least for the case this PR is after)
* DECL_ARTIFICIAL (or otherwise exclude inline asm)
* decl must not be cooked up by non-C FE (i.e. vtables are out).
* Not weak, comdat or linkonce (again, vtables are out).
* Not for debug purpose (like what dwarf2asm does).
* No section attribute (actually also CSWTCH could be mentioned
  in linker script, but we can argue that artificials are
  managed by the compiler + user has option to control CSWTCH).

What remains is CSWTCH.

Johann

Reply via email to