http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020

--- Comment #41 from Gary Funck <gary at intrepid dot com> 2012-08-15 13:47:37 UTC ---
(In reply to comment #38)
> What are the code generation deficiencies you are targeting with this?  For
> testcase #1 I get
> 
> sptr_result:
> .LFB0:
>         .cfi_startproc
>         movq    S+8(%rip), %rdx
>         movq    S(%rip), %rax
>         ret
> 
> what would you expect instead?
> 
> I don't think we should change MAX_FIXED_MODE_SIZE, nor make use of
> TImode unconditionally.

All three test cases were designed simply to be easy to check via an RTL scan
for the presence/use of TImode.  I chose three arbitrary small test cases, with
the only criterion being that they use the structs differently.
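
For readers without the attachments at hand, the following is a minimal sketch
of what a test of this shape might look like.  It is a reconstruction, not the
actual testcase: only the names sptr_result and S come from the assembly quoted
above; the struct tag and its two 64-bit fields are assumptions chosen so that
the object is 128 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit struct; the field names are illustrative only. */
struct sptr
{
  uint64_t lo;
  uint64_t hi;
};

/* C11 check that the struct really is 16 bytes (128 bits). */
static_assert (sizeof (struct sptr) == 16, "struct sptr must be 128 bits");

/* Global object, named S to match the quoted assembly. */
struct sptr S;

/* Return the 128-bit struct by value.  On x86_64 this is the function
   whose RTL one would scan for TImode.  */
struct sptr
sptr_result (void)
{
  return S;
}
```

One way to inspect the RTL for such a function would be to compile with
something like -O2 -fdump-rtl-expand and search the dump for TI.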

As for the motivation for suggesting the change, I noted that several other
targets assign 128-bit structs to TImode values.  Therefore, I assumed that
there must be some benefit, and that this was an oversight in the x86_64
implementation.

In the GUPC implementation of the UPC programming language, a pointer into
shared space has three components (virtual offset, thread, phase).  This
pointer-to-shared (PTS) can be represented in a "packed" mode, where it uses 64
bits but gives up some range in each of the fields.  The more general form is
the "struct" representation, which provides full range for the fields and is
128 bits wide -- as the name implies, the underlying PTS representation
manipulated by the compiler is a struct.  Note that the packed representation
could also be a struct (with bit-fields), but back when this project was
started, code generation for structs and bit-fields wasn't very good; the code
quality was better if the compiler hand-crafted the necessary bit-field
manipulations.
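
To make the two representations concrete, here is a rough sketch in C.  It is
not the GUPC source: the type names, the field order, and in particular the
34/10/20 bit split used for the packed form are assumptions picked only so that
the widths add up to 64.

```c
#include <assert.h>
#include <stdint.h>

/* "struct" representation: full range for each field, 128 bits total.
   Field types and order are assumptions.  */
typedef struct
{
  uint64_t vaddr;   /* virtual offset into shared space */
  uint32_t thread;  /* owning thread */
  uint32_t phase;   /* phase within the block */
} pts_struct_t;

static_assert (sizeof (pts_struct_t) == 16, "struct PTS must be 128 bits");

/* "packed" representation: all three fields in one 64-bit word.
   The widths below are hypothetical; they sum to 64.  */
typedef uint64_t pts_packed_t;

enum { PTS_VADDR_BITS = 34, PTS_THREAD_BITS = 10, PTS_PHASE_BITS = 20 };

/* Hand-written shifts and masks, rather than C bit-fields.  */
static pts_packed_t
pts_pack (pts_struct_t p)
{
  /* The packed form gives up range, so each field must fit its width.  */
  assert (p.vaddr < (UINT64_C (1) << PTS_VADDR_BITS));
  assert (p.thread < (UINT32_C (1) << PTS_THREAD_BITS));
  assert (p.phase < (UINT32_C (1) << PTS_PHASE_BITS));
  return ((uint64_t) p.phase << (PTS_THREAD_BITS + PTS_VADDR_BITS))
	 | ((uint64_t) p.thread << PTS_VADDR_BITS)
	 | p.vaddr;
}

static pts_struct_t
pts_unpack (pts_packed_t w)
{
  pts_struct_t p;
  p.vaddr = w & ((UINT64_C (1) << PTS_VADDR_BITS) - 1);
  p.thread = (uint32_t) ((w >> PTS_VADDR_BITS)
			 & ((UINT64_C (1) << PTS_THREAD_BITS) - 1));
  p.phase = (uint32_t) (w >> (PTS_THREAD_BITS + PTS_VADDR_BITS));
  return p;
}
```

The 128-bit pts_struct_t is the kind of object that the proposed change would
place in a TImode container; the packed form already fits in a single 64-bit
register.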

Since UPC programs use PTSs frequently, we found that targeting them to TImode
containers improved various micro-benchmarks.  We noted that other targets like
MIPS and PPC did this and assumed it would be a good idea for x86_64 to follow
suit.  We don't have any hard data on the level of performance improvement,
though as Chip noted, some modest space savings were gained in libstdc++.

I can re-run some UPC tests, if that helps, but they have a rather specific
usage pattern.  Maybe something like the SPEC benchmarks would be more
compelling, but I don't have access to them.
