Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Quoting Ian Lance Taylor : The scheme that Paolo describes avoids virtual functions. But for this usage I personally would prefer virtual functions, since there is no efficiency cost compared to a target hook. Well, actually, there is: you first fetch the object pointer, then you find the vtable pointer, and then you load the function pointer. With the target hook, you load the function pointer. And with the function-name-valued macro, you directly call the function. Does it matter? I don't know, but I would guess it doesn't.
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Nathan Froyd writes: > On Wed, Nov 17, 2010 at 03:40:39AM +0100, Paolo Bonzini wrote: >> True, but you can hide that cast in a base class. For example you >> can use a hierarchy >> >> Target // abstract base >> TargetImplBase // provides strong typing >> TargetI386 // actual implementation >> >> The Target class would indeed take a void *, but the middle class >> would let TargetI386 think in terms of TargetI386::CumulativeArgs >> with something like >> >> void f(void *x) { >> // T needs to provide void T::f(T::CumulativeArgs *) >> f(static_cast<T::CumulativeArgs *> (x)); >> } >> >> The most similar thing in C (though not suitable for multitarget) is >> a struct, which is why I suggest using that now rather than void * >> (which would be an implementation detail). > > I am admittedly a C++ newbie; the first thing I thought of was: > > class gcc::cumulative_args { > virtual void advance (...) = 0; > virtual rtx arg (...) = 0; > virtual rtx incoming_arg (...) { return this->arg (...); }; > virtual int arg_partial_bytes (...) = 0; > // ...and so on for many of the hooks that take CUMULATIVE_ARGS * > // possibly with default implementations instead of pure virtual > // functions. > }; > > class i386::cumulative_args : gcc::cumulative_args { > // concrete implementations of virtual functions > }; > > // the hook interface is then solely for the backend to return > // `cumulative_args *' things (the current INIT_*_ARGS macros), which > // are then manipulated via the virtual functions above. > > AFAICS, this eliminates the casting issues Joern described. What are > the advantages of the scheme you describe above? (Honest question.) Or > are we talking about the same thing in slightly different terms? The scheme that Paolo describes avoids virtual functions. But for this usage I personally would prefer virtual functions, since there is no efficiency cost compared to a target hook. Ian
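[Editorial note: filled out into a self-contained toy, the virtual-function scheme sketched above looks like this. The class and member names, and the plain int payload standing in for the real argument-passing state, are invented for illustration; the real hooks take machine modes and trees.]

```cpp
#include <cassert>

// Toy stand-in for the middle end's view: an abstract interface that
// replaces the CUMULATIVE_ARGS * target hooks with virtual functions.
// The payload (a running byte offset) is invented for illustration.
struct cumulative_args {
  virtual ~cumulative_args() {}
  virtual void advance(int bytes) = 0;
  virtual int arg_offset() const = 0;
  // A default implementation, as suggested for incoming_arg above.
  virtual int incoming_arg_offset() const { return arg_offset(); }
};

// A hypothetical backend's concrete implementation.
struct i386_cumulative_args : cumulative_args {
  int offset;
  i386_cumulative_args() : offset(0) {}
  void advance(int bytes) { offset += bytes; }
  int arg_offset() const { return offset; }
};

// Middle-end code manipulates the state only through the base class;
// no casts appear on this side of the interface.
int scan_args(cumulative_args *ca, int nargs, int size) {
  for (int i = 0; i < nargs; ++i)
    ca->advance(size);
  return ca->arg_offset();
}
```

The cost per call is one vtable load plus an indirect call, which is the indirection being weighed against a plain target-hook function pointer in the discussion above.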
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Quoting Nathan Froyd : I am admittedly a C++ newbie; the first thing I thought of was: class gcc::cumulative_args { virtual void advance (...) = 0; virtual rtx arg (...) = 0; virtual rtx incoming_arg (...) { return this->arg (...); }; virtual int arg_partial_bytes (...) = 0; // ...and so on for many of the hooks that take CUMULATIVE_ARGS * // possibly with default implementations instead of pure virtual // functions. }; Trying to put a target-derived object of that into struct rtl_data would be nonsensical. You might store a pointer, of course. But at any rate, the member function implementations would not be part of the globally-visible target vector. They would be in a smaller vector, and only the pieces of the middle end that deal with argument passing would get to see them. Does that mean you acknowledge that we shouldn't have CUMULATIVE_ARGS-taking hooks in the global target vector?
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
On Wed, Nov 17, 2010 at 03:40:39AM +0100, Paolo Bonzini wrote: > True, but you can hide that cast in a base class. For example you > can use a hierarchy > > Target // abstract base > TargetImplBase // provides strong typing > TargetI386 // actual implementation > > The Target class would indeed take a void *, but the middle class > would let TargetI386 think in terms of TargetI386::CumulativeArgs > with something like > > void f(void *x) { > // T needs to provide void T::f(T::CumulativeArgs *) > f(static_cast<T::CumulativeArgs *> (x)); > } > > The most similar thing in C (though not suitable for multitarget) is > a struct, which is why I suggest using that now rather than void * > (which would be an implementation detail). I am admittedly a C++ newbie; the first thing I thought of was: class gcc::cumulative_args { virtual void advance (...) = 0; virtual rtx arg (...) = 0; virtual rtx incoming_arg (...) { return this->arg (...); }; virtual int arg_partial_bytes (...) = 0; // ...and so on for many of the hooks that take CUMULATIVE_ARGS * // possibly with default implementations instead of pure virtual // functions. }; class i386::cumulative_args : gcc::cumulative_args { // concrete implementations of virtual functions }; // the hook interface is then solely for the backend to return // `cumulative_args *' things (the current INIT_*_ARGS macros), which // are then manipulated via the virtual functions above. AFAICS, this eliminates the casting issues Joern described. What are the advantages of the scheme you describe above? (Honest question.) Or are we talking about the same thing in slightly different terms? -Nathan
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
On 11/17/2010 03:10 AM, Ian Lance Taylor wrote: Joern Rennecke writes: I don't see how going to a struct cumulative_args gets us closer to a viable solution for a multi-target executable, even if you threw in C++. Having the target describe a type, and shoe-horning this through a target hook interface that is described in supposedly target-independent terms will require a cast at some point. [...] Converting an empty base class to a derived class is not really safer than converting a void * to a struct pointer. True, but you can hide that cast in a base class. For example you can use a hierarchy Target // abstract base TargetImplBase // provides strong typing TargetI386 // actual implementation The Target class would indeed take a void *, but the middle class would let TargetI386 think in terms of TargetI386::CumulativeArgs with something like void f(void *x) { // T needs to provide void T::f(T::CumulativeArgs *) f(static_cast<T::CumulativeArgs *> (x)); } The most similar thing in C (though not suitable for multitarget) is a struct, which is why I suggest using that now rather than void * (which would be an implementation detail). Paolo
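[Editorial note: Paolo's hierarchy can be written out as a compilable toy. The middle class is a template over the concrete target (the curiously recurring template pattern), so the single cast from void * is confined to one place; all names and the int payload are invented for illustration.]

```cpp
#include <cassert>

// Abstract base: the target-independent interface takes a void *,
// exactly as a target vector entry would.
struct Target {
  virtual ~Target() {}
  virtual int arg_size(void *cum) = 0;
};

// The middle layer hides the cast: the derived target only ever sees
// its own strongly typed CumulativeArgs.
template <typename T>
struct TargetImplBase : Target {
  int arg_size(void *cum) {
    // T must provide int T::arg_size(typename T::CumulativeArgs *).
    return static_cast<T *>(this)->arg_size(
        static_cast<typename T::CumulativeArgs *>(cum));
  }
};

// A hypothetical concrete target; the payload is invented.
struct TargetI386 : TargetImplBase<TargetI386> {
  struct CumulativeArgs { int bytes; };
  int arg_size(CumulativeArgs *ca) { return ca->bytes; }
};
```

The middle end calls through `Target *` with a void *; the strongly typed overload in TargetI386 hides the inherited one, so target code never mentions void * at all.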
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Joern Rennecke writes: > I don't see how going to a struct cumulative_args gets us closer to > a viable solution for a multi-target executable, even if you threw in > C++. Having the target describe a type, and shoe-horning this through > a target > hook interface that is described in supposedly target-independent terms > will require a cast at some point - either of the hook argument that > describes the cumulative args, the hook pointer (not valid C / C++), or > a pointer to the target vector, or a pointer to some factored-out part of > the target vector. Converting an empty base class to a derived class > is not really safer than converting a void * to a struct pointer. > And switching to a dynamically typed language is not really on the > agenda... In C++ we would use a pure abstract base class in the target hooks and the targets would have to provide an implementation for the base class. Ian
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Quoting Paolo Bonzini : I think a multi-target executable would be just too ugly in C due to issues such as this. I don't think it's worthwhile to sacrifice type safety now, so a struct cumulative_args is preferable. I don't see how going to a struct cumulative_args gets us closer to a viable solution for a multi-target executable, even if you threw in C++. Having the target describe a type, and shoe-horning this through a target hook interface that is described in supposedly target-independent terms will require a cast at some point - either of the hook argument that describes the cumulative args, the hook pointer (not valid C / C++), or a pointer to the target vector, or a pointer to some factored-out part of the target vector. Converting an empty base class to a derived class is not really safer than converting a void * to a struct pointer. And switching to a dynamically typed language is not really on the agenda... Fully hookizing the CUMULATIVE_ARGS-taking macros has really landed us with this typing mess. If we had only used targhooks.c wrappers around the original macros, we could still enjoy type safety for the targhooks.c / target interface, a sane include hierarchy, and easy extension to a multi-target compiler. I'm afraid the only sane way to have these hooks is changing the CUMULATIVE_ARGS pointers into void pointers. As I said before, we can make this more readable by using a typedef cumulative_args_t; but there has to be a cast in every CUMULATIVE_ARGS-taking target hook implementation, or in a helper function which the hook uses (unless the argument is unused). All in all it's a 136 KB patch; I'm currently writing the ChangeLog and running 38 builds. I've tried auto-generating a union before, and for some targets there are macros that cause conflicts. To get a cumulative_args union reliably would require separate header files for each target's definition.
And you'd still have to select the target's field inside of each hook implementation - that is a direct consequence of an interface that connects not the target-specific middle-end to one target, but all parts of the compiler to potentially every target. The alternative would be to undo the hookization of the CUMULATIVE_ARGS-taking hooks. That would tie the middle-end code that deals with calls a bit closer to the target again, but allow all the other parts of the compiler to be blissfully ignorant of these interfaces. In C++, you could make the middle-end a template that takes the target as a parameter, including a CUMULATIVE_ARGS type. But that's not much more than syntactic sugar for having the targets set different macros and compiling the middle-end accordingly.
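[Editorial note: the template alternative mentioned above might look like this minimal sketch. The toy target, its hooks, and the int payload are invented; a real middle end would be parameterized over far more than argument passing.]

```cpp
#include <cassert>

// A toy "target description": the target supplies its CUMULATIVE_ARGS
// type and the hooks that operate on it, all statically typed.
struct toy_target {
  struct cumulative_args { int next_reg; };
  static void init_cumulative_args(cumulative_args *ca) { ca->next_reg = 0; }
  static void function_arg_advance(cumulative_args *ca) { ++ca->next_reg; }
};

// The "middle end" is compiled against a target parameter; every use of
// the cumulative-args state is checked against the target's own type,
// so no void * or casts appear anywhere.
template <typename TARGET>
int count_args_passed(int nargs) {
  typename TARGET::cumulative_args ca;
  TARGET::init_cumulative_args(&ca);
  for (int i = 0; i < nargs; ++i)
    TARGET::function_arg_advance(&ca);
  return ca.next_reg;
}
```

As the message says, this trades the single shared middle end for one instantiation per target, which is essentially what per-target macro configuration already does.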
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
On 11/16/2010 10:17 PM, Ian Lance Taylor wrote: I don't know how we want to get there, but it seems to me that the place we want to end up is with the target hooks defined to take an argument of type struct cumulative_args * (or a better name if we can think of one). Actually, this doesn't work, because then different target vectors have different types. You might get away with it now, but LTO on a multi-target compiler would fail. Good point. I think we should just typedef void *cumulative_args_t; and use that for our hooks. Another area where we can do something much nicer when we move to C++. This something could be something like target_i386::cumulative_args, implemented e.g. using the curiously recurring template pattern (http://en.wikipedia.org/wiki/Curiously_recurring_template_pattern). I think a multi-target executable would be just too ugly in C due to issues such as this. I don't think it's worthwhile to sacrifice type safety now, so a struct cumulative_args is preferable. Paolo
RE: __gthread_recursive_mutex_destroy missing
> The gthreads portability layer is missing a function for destroying a > __gthread_recursive_mutex object. > > For pthreads-based models the recursive mutex type is the same as the > normal mutex type so __gthread_mutex_destroy handles both, but they're > distinct types for (at least) gthr-win32.h, so we can't properly > clean up recursive mutexes in libstdc++. > > Any objections if I prepare a patch to add > __gthread_recursive_mutex_destroy to each gthr header? It makes sense. libobjc could use these as well; all mutexes in libobjc are recursive mutexes. At the moment libobjc uses __gthread_objc_mutex_xxx and similar, but it should probably move to use __gthread_recursive_mutex_xxx. Thanks
gcc-4.4-20101116 is now available
Snapshot gcc-4.4-20101116 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20101116/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.4 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch revision 166830

You'll find:

gcc-4.4-20101116.tar.bz2           Complete GCC (includes all of below)
  MD5=97ccc9bf753f6de5efed685b93a9b49c
  SHA1=0df0f8f102ee40f05fa5141805c36cee712d448c
gcc-core-4.4-20101116.tar.bz2      C front end and core compiler
  MD5=9cee461ea45ad893964e5b3ce8ae0c15
  SHA1=961c76a219af48778e72d472e55ce73cf03e1292
gcc-ada-4.4-20101116.tar.bz2       Ada front end and runtime
  MD5=0cf9434083986e61d0a5db6ab07b330b
  SHA1=6ec6e632b612bf1c9ae03af0d3829b0d2d19a840
gcc-fortran-4.4-20101116.tar.bz2   Fortran front end and runtime
  MD5=02b1543c6c9a0906757ed63dfe1ed9cc
  SHA1=f38c7ddfb83a50ef6171da06933be1001ad28f13
gcc-g++-4.4-20101116.tar.bz2       C++ front end and runtime
  MD5=a44b703fc3b75265095ca8b17d8e9733
  SHA1=31ac3d39a382e90516c643c9589fe80b306bafa7
gcc-java-4.4-20101116.tar.bz2      Java front end and runtime
  MD5=0c398e643705f2bc5f31c6f5ebf203ef
  SHA1=d31c275e188c5e22ad395be88d720ca95e50e72f
gcc-objc-4.4-20101116.tar.bz2      Objective-C front end and runtime
  MD5=a5fd4c4adc4c0e825163b0df2813b02c
  SHA1=23d03525473e2b11f63afdb757d77b7d0be5db43
gcc-testsuite-4.4-20101116.tar.bz2 The GCC testsuite
  MD5=433962a9cfbcd076fb0dfd381aaeca66
  SHA1=ac5af875f8d3ea4443ed8a10221db73cd9eefcc3

Diffs from 4.4-20101109 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.4 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Joern Rennecke writes: > Quoting Ian Lance Taylor : > >> Joern Rennecke writes: >> >>> Before I go and make all these target changes & test them, is there at >>> least agreement that this is the right approach, i.e. replacing >>> CUMULATIVE_ARGS * >>> with void *, and splitting up x_rtl into two variables. >> >> I don't know how we want to get there, but it seems to me that the place >> we want to end up is with the target hooks defined to take an argument >> of type struct cumulative_args * (or a better name if we can think of >> one). > > Actually, this doesn't work, because then different target vectors have > different types. You might get away with it now, but LTO on a multi-target > compiler would fail. Good point. > I think we should just > typedef void *cumulative_args_t; > > and use that for our hooks. Another area where we can do something much nicer when we move to C++. Ian
Re: Mailing lists for back-end development?
On 11/16/2010 11:24 AM, Dave Korn wrote: > I think it's probably an over-engineered solution to a problem we could > really address best by remembering to use []-tags in the subject lines. OK, that seems to be as close to consensus as we're probably going to get. Let's try and do that. Thank you, -- Mark Mitchell CodeSourcery m...@codesourcery.com (650) 331-3385 x713
CUMULATIVE_ARGS in hooks (Was: RFC: semi-automatic hookization)
Quoting Ian Lance Taylor : Joern Rennecke writes: Before I go and make all these target changes & test them, is there at least agreement that this is the right approach, i.e. replacing CUMULATIVE_ARGS * with void *, and splitting up x_rtl into two variables. I don't know how we want to get there, but it seems to me that the place we want to end up is with the target hooks defined to take an argument of type struct cumulative_args * (or a better name if we can think of one). Actually, this doesn't work, because then different target vectors have different types. You might get away with it now, but LTO on a multi-target compiler would fail. I think we should just typedef void *cumulative_args_t; and use that for our hooks.
Re: RFC: semi-automatic hookization
Quoting Ian Lance Taylor : Joern Rennecke writes: Before I go and make all these target changes & test them, is there at least agreement that this is the right approach, i.e. replacing CUMULATIVE_ARGS * with void *, and splitting up x_rtl into two variables. I don't know how we want to get there, but it seems to me that the place we want to end up is with the target hooks defined to take an argument of type struct cumulative_args * (or a better name if we can think of one). We could consider moving the struct definition into CPU.c, and having the target structure just report the size, or perhaps a combined allocation/INIT_CUMULATIVE_ARGS function. Ian
Re: Mailing lists for back-end development?
On 16/11/2010 17:29, Mark Mitchell wrote: > I spoke with a partner today who suggested that perhaps it would be a > bit easier to follow the voluminous GCC mailing list if we had separate (Do you mean "the voluminous gcc-patches mailing list" perhaps?) > lists for patches related to particular back-ends (e.g., ARM, MIPS, > Power, SuperH, x86, etc.). I think it's probably an over-engineered solution to a problem we could really address best by remembering to use []-tags in the subject lines. If usenet taught us anything, it's that you can't solve real problems just by renaming (or subdividing) groups. I think it would also be more-or-less counter-productive; as all the back-ends share a common interface, I think most backend maintainers need to keep an eye on what's going on with other backends anyway, even when not directly involved. So I think we'd all just end up subscribed to a dozen-plus mailing lists instead of one and still have pretty much the same amount of incoming mail to sift through anyway. That being so, doing it at our clients by filtering on tags makes as much sense as anything else. cheers, DaveK
Re: Mailing lists for back-end development?
On Tue, Nov 16, 2010 at 09:57, Richard Henderson wrote: > I think that splitting things all the way down to $arch is probably > not useful in that things that affect all backends will not get > addressed promptly if backend reviewers are so narrowly focused. Agreed. A backend specific list may work, but I don't think it would be useful to make it too specific. I'm not too sanguine about the whole idea, though. Perhaps encourage the use of [prefix] tags like we do for branches and large modules? I would rather use tagging than a fixed taxonomy. It's more flexible and easier to change if our needs change. Diego.
Re: Mailing lists for back-end development?
On 11/16/2010 09:29 AM, Mark Mitchell wrote: > The idea here is that (as with libstdc++), we'd send patches to > gcc-patches@ and gcc-$arch@, but that reviewers for a particular > back-end would find it easier to keep track of things on the > architecture-specific lists, and also that this would make it easier > when trying to track down patches to backport to distribution versions > of the compiler. > > What do people think about this idea? I think that splitting things all the way down to $arch is probably not useful in that things that affect all backends will not get addressed promptly if backend reviewers are so narrowly focused. I would, however, be amenable to a gcc-backend list, and let's say a strong suggestion that all messages to that list have [$arch] or [all] as a subject line prefix. Unless I miss the purpose of these lists? r~
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
On Tue, Nov 16, 2010 at 6:35 AM, Jan Hubicka wrote: >> More FDO related performance numbers >> >> Experiment 1: trunk gcc O2 + FDO vs O2: FDO improves performance >> by 5% geomean >> Experiment 2: our internal gcc compiler (4.4.3 based with many local >> patches) O2 + FDO vs O2 (trunk gcc): FDO improves perf by 6.6% >> geomean >> Experiment 3: our internal gcc (4.4.3 with local patches) O2 + LIPO vs >> O2 (trunk gcc): LIPO improves by 12% >> Experiment 4: trunk gcc O2 + LTO + -fwhole-program + FDO vs O2: LTO + >> FDO improves by 10.8% >> >> >> 1. Trunk gcc FDO vs O2 (5%) >> >> 164.gzip 1324 1302 -1.64% >> 175.vpr 1694 1725 1.84% >> 176.gcc 2293 2387 4.07% >> 181.mcf 1772 1756 -0.88% >> 186.crafty 2320 2280 -1.75% >> 197.parser 1166 1556 33.42% >> 252.eon 2443 2552 4.45% >> 253.perlbmk 2410 2586 7.28% >> 254.gap 1987 2021 1.71% >> 255.vortex 2392 2720 13.71% >> 256.bzip2 1719 1717 -0.12% >> 300.twolf 2288 2331 1.86% >> >> 2. 4.4.3 gcc with local patches FDO vs trunk O2 (6.6%) > > Interesting, any idea where this 1.6% is coming from? Probably due to local patches (inliner, lrs, etc.) we have, but I have not studied it. > I guess this might > also be the reason for that 2% difference in LIPO results (in general LTO > -fwhole-program + FDO should be stronger, but it is not tuned at all yet). > > Since the LIPO branch was updated to mainline some time ago, it would be nice > to compare the LIPO from the branch with mainline LTO. I guess a fairer > comparison > would be O2+FDO+LTO WRT O2+LIPO, as LIPO makes no whole-program assumptions > at all, right? Yes. Raksit maintains the upstream LIPO branch, but it has not been tuned for performance yet. We have open-sourced our compiler changes via Android. It is better to use that if anyone is interested. Thanks, David > > Honza >
Re: Mailing lists for back-end development?
On Tue, Nov 16, 2010 at 9:29 AM, Mark Mitchell wrote: > What do people think about this idea? I think this is a really bad idea. A lot of the time, back-end patches for one target inspire some folks to do patches for another target. For an example, look at how FMA has been done recently. Those patches would have a lot of overlap. More mailing lists would also signal that GCC development is splitting up. -- Pinski
Mailing lists for back-end development?
I spoke with a partner today who suggested that perhaps it would be a bit easier to follow the voluminous GCC mailing list if we had separate lists for patches related to particular back-ends (e.g., ARM, MIPS, Power, SuperH, x86, etc.). The idea here is that (as with libstdc++), we'd send patches to gcc-patches@ and gcc-$arch@, but that reviewers for a particular back-end would find it easier to keep track of things on the architecture-specific lists, and also that this would make it easier when trying to track down patches to backport to distribution versions of the compiler. What do people think about this idea? Thank you, -- Mark Mitchell CodeSourcery m...@codesourcery.com (650) 331-3385 x713
Invoking atomic functions from a C++ shared lib (or should I force linking with -lgcc?)
Hi, I have been investigating a problem I have while building Qt-embedded with GCC-4.5.0 for ARM/Linux, and managed to produce the reduced test case as follows. Consider this shared library (C++): atomic.cxx int atomicIncrement(int volatile* addend) { return __sync_fetch_and_add(addend, 1) + 1; } Compiled with: $ arm-linux-g++ atomic.cxx -fPIC -shared -o libatomic.so Now the main program: atomain.cxx extern int atomicIncrement(int volatile* addend); volatile int myvar; int main() { return atomicIncrement(&myvar); } Compiled & linked with: $ arm-linux-g++ atomain.cxx -o atomain -L. -latomic .../ld: atomain: hidden symbol `__sync_fetch_and_add_4' in /.../libgcc.a(linux-atomic.o) is referenced by DSO What I have found is that g++ (unlike gcc) links with -lgcc_s instead of -lgcc, and that the atomic functions are present in libgcc.a and not in libgcc_s.so. If I create libatomic.so with -lgcc, it works. What I don't understand is whether this is the intended behaviour, and whether adding -lgcc is the right fix, or not. [This surprises me, because as I said, I faced this problem when compiling Qt-embedded for ARM/Linux and I don't think I am the only one doing that, so I expected it to just work ;-)] Thanks, Christophe.
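[Editorial note: as background, __sync_fetch_and_add returns the value the operand had *before* the addition, which is why the wrapper in the reduced test case adds 1. On targets without a native instruction, GCC emits a call to an out-of-line helper such as __sync_fetch_and_add_4, which is the libgcc symbol surfacing in the link error above. The wrapper itself can be checked natively:]

```cpp
#include <cassert>

// Same wrapper as in the reduced test case: returns the *incremented*
// value, since the builtin returns the value before the addition.
int atomicIncrement(int volatile *addend) {
  return __sync_fetch_and_add(addend, 1) + 1;
}
```

On a host with native atomics this inlines to a single locked add and no libgcc helper is referenced, which is why the problem only shows up on targets like older ARM that fall back to linux-atomic.o.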
Re: decimal float, LIBGCC2_FLOAT_WORDS_BIG_ENDIAN, and ARM ABI issues
On Tue, 16 Nov 2010, Nathan Froyd wrote: > The saving grace here is that decimal float is not enabled by default > for arm platforms, so there are likely very few, if any, users of > decimal float on ARM; it might be worthwhile to go ahead and fix things, > ignoring the fallout from earlier versions. Not enabled by default generally implies not usable at all for decimal floating point; you generally need at least some support for the ABI, saying what modes are allowed in what registers, etc. - and the current revision of AAPCS doesn't include decimal floating point at all so there is no ABI to follow at present. It seems likely to me that if you enabled decimal floating point on ARM you'd get ICEs. -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: semi-automatic hookization
Quoting Ian Lance Taylor : Joern Rennecke writes: Before I go and make all these target changes & test them, is there at least agreemwent that this is the right approach, i.e replacing CUMULATIVE_ARG * with void *, and splitting up x_rtl into two variables. I don't know how we want to get there, but it seems to me that the place we want to end up is with the target hooks defined to take an argument of type struct cumulative_args * (or a better name if we can think of one). We could consider moving the struct definition into CPU.c, and having the target structure just report the size, or perhaps a combined allocation/INIT_CUMULATIVE_ARGS function. If every target defines struct cumulative_args, allocation is straightforward. ctmrtl (or if you think a better name, propose one) is a macro for the global variable x_tm_rtl, which is defined in target-oriented middle-end code that includes tm.h . What is not quite clear is what is to happen with the args member of x_rtl. Should I remove the info member from struct incoming_args, and shift that to x_tm_rtl, or should I rather move the entire args member of x_rtl to x_tm_rtl? The latter would mean that struct incoming_args would remain intact - but OTOH more churn in config/*/*, because every access to crtl->args will have to be changed. Or maybe we should leave the target-specific stuff in x_rtl / crtl and instead move out the stuff that emit-rtl.h makes visible to non-rtl code, e.g. x_first_insn, x_last_insn ...
decimal float, LIBGCC2_FLOAT_WORDS_BIG_ENDIAN, and ARM ABI issues
The easiest way to deal with the use of LIBGCC2_FLOAT_WORDS_BIG_ENDIAN in libgcc is to define a preprocessor macro __FLOAT_WORD_ORDER__ similar to how WORDS_BIG_ENDIAN was converted. That is, cppbuiltin.c will do: cpp_define_formatted (FOO, "__FLOAT_WORD_ORDER__=%s", (FLOAT_WORDS_BIG_ENDIAN ? "__ORDER_BIG_ENDIAN__" : "__ORDER_LITTLE_ENDIAN__")); and change any uses of LIBGCC2_FLOAT_WORDS_BIG_ENDIAN to consult __FLOAT_WORD_ORDER__ instead. A grep reveals that there are no target definitions of LIBGCC2_FLOAT_WORDS_BIG_ENDIAN, so we should be OK with the straightforward conversion, right? This runs into a curious case in the arm backend, though, which has: #define FLOAT_WORDS_BIG_ENDIAN (arm_float_words_big_endian ()) with no corresponding LIBGCC2_FLOAT_WORDS_BIG_ENDIAN. I think what this means is that the places that care about the order of float words (currently libdecnumber, libbid, and dfp-bit.h) will always use the order indicated by __BYTE_ORDER__/WORDS_BIG_ENDIAN, even when the backend is secretly using a different order. ARM has probably gotten lucky wrt dfp-bit.h because it has its own assembler fp routines that presumably DTRT for unusual float word orderings. (dfp-bit.h also does not *use* the setting of LIBGCC2_FLOAT_WORDS_BIG_ENDIAN, so that helps.) But IIUC, using __FLOAT_WORD_ORDER__ in the relevant libraries will break pre-existing code that used libdecnumber and/or libbid. I am not conversant enough with ARM ABIs and/or targets to know which ones would break. The saving grace here is that decimal float is not enabled by default for arm platforms, so there are likely very few, if any, users of decimal float on ARM; it might be worthwhile to go ahead and fix things, ignoring the fallout from earlier versions. What do the ARM maintainers think? Should I prepare a patch for getting rid of LIBGCC2_FLOAT_WORDS_BIG_ENDIAN and we'll declare decimal float horribly broken pre-4.6? Or is there a better way forward? -Nathan
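[Editorial note: what __FLOAT_WORD_ORDER__ would encode can be illustrated with a runtime probe; the function names here are invented, and real library code would of course test the predefined macro at compile time. On ordinary targets the two probes below agree; the old mixed-endian ARM FPA double format, where the high word comes first even though integers are little-endian, is exactly the case where they would differ.]

```cpp
#include <cassert>
#include <cstring>

// 1.0 has the bit pattern 0x3FF0000000000000: its high 32-bit word is
// nonzero and its low word is all zero.  So if the first four bytes in
// memory contain any set bits, the high-order word is stored first,
// i.e. the float word order is big-endian.
bool float_words_big_endian() {
  double d = 1.0;
  unsigned char b[sizeof(double)];
  std::memcpy(b, &d, sizeof d);
  return b[0] | b[1] | b[2] | b[3];
}

// Ordinary integer byte order, for comparison.
bool integer_big_endian() {
  unsigned u = 1;
  unsigned char b[sizeof(unsigned)];
  std::memcpy(b, &u, sizeof u);
  return b[0] == 0;  // LSB stored last means big-endian
}
```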
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
> More FDO related performance numbers
> 
> Experiment 1: trunk gcc O2 + FDO vs O2: FDO improves performance
> by 5% geomean
> Experiment 2: our internal gcc compiler (4.4.3 based with many local
> patches) O2 + FDO vs O2 (trunk gcc): FDO improves perf by 6.6%
> geomean
> Experiment 3: our internal gcc (4.4.3 with local patches) O2 + LIPO vs
> O2 (trunk gcc): LIPO improves by 12%
> Experiment 4: trunk gcc O2 + LTO + -fwhole-program + FDO vs O2: LTO +
> FDO improves by 10.8%
> 
> 1. Trunk gcc FDO vs O2 (5%)
> 
> 164.gzip    1324  1302  -1.64%
> 175.vpr     1694  1725   1.84%
> 176.gcc     2293  2387   4.07%
> 181.mcf     1772  1756  -0.88%
> 186.crafty  2320  2280  -1.75%
> 197.parser  1166  1556  33.42%
> 252.eon     2443  2552   4.45%
> 253.perlbmk 2410  2586   7.28%
> 254.gap     1987  2021   1.71%
> 255.vortex  2392  2720  13.71%
> 256.bzip2   1719  1717  -0.12%
> 300.twolf   2288  2331   1.86%
> 
> 2. 4.4.3 gcc with local patches FDO vs trunk O2 (6.6%)

Interesting, any idea where this 1.6% is coming from? I guess this might also be the reason for that 2% difference in LIPO results (in general LTO -fwhole-program + FDO should be stronger, but it is not tuned at all yet). Since the LIPO branch was updated to mainline some time ago, it would be nice to compare the LIPO from the branch with mainline LTO. I guess a fairer comparison would be O2+FDO+LTO WRT O2+LIPO, as LIPO makes no whole-program assumptions at all, right? Honza
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
2010/11/16 Jan Hubicka : >> On Mon, Nov 15, 2010 at 5:39 PM, Jan Hubicka wrote: >> >> > Fortunately linker plugin solves the problem here and this is why I >> >> > want to >> >> > have it by default. GCC then can do effectively -fwhole-program for >> >> > binaries >> >> > (since linker knows what will be bound elsewhere) and take advantage of >> >> > visibility((hidden)) hints for shared libraries same way. Most of >> >> > important >> >> > shared libraries get visibility ((hidden)) right. >> >> > >> >> > It is sad that LTO w/o linker plugin doesn't give that much benefit. >> >> > Ideas are welcome here. >> >> >> >> Linker feedback will be limited here -- mostly global variable >> >> aliasing (as I remember only 2/3 spec programs benefit from it), it >> >> helps. You don't get whole program points-to, whole program mod-ref >> >> (with context sensitivity), whole program structure layout. The latter >> >> are the real kickers (in terms of SPEC performance), but promoting LTO >> >> with those numbers can be misleading as many programs won't get it. >> > >> > Well, I am speaking of our linker plugin here. What it does is to pass GCC >> > resolution information so it knows what symbols are bound externally. Since >> > typically you link LTO alone or with a small non-LTO part, most symbols >> > are >> > not bound and thus effectively you get -fwhole-program (-fwhole-program >> > just >> > declares everything static except for main ()) >> > >> > We don't really do whole program points-to or structure layout. >> >> gcc will eventually, right? > > Sure hope so ;) > We really need to solve scalability with our IPA points-to and make it > compatible with WHOPR. >> >> > Mod-ref is just >> > simple ipa-reference code. How do you get context sensitivity on mod/ref? >> >> mod-ref relies on points-to. With context-sensitive points-to, you can >> also get CS mod-ref -- basically mod-ref info per call site. > > Ah sure, I was too focused on our current "mod/ref" :) Btw, IPA-PTA also performs mod/ref analysis (but of course it is context insensitive). Richard. > Honza >
__gthread_recursive_mutex_destroy missing
The gthreads portability layer is missing a function for destroying a __gthread_recursive_mutex object. For pthreads-based models the recursive mutex type is the same as the normal mutex type, so __gthread_mutex_destroy handles both; but they're distinct types for (at least) gthr-win32.h, so we can't properly clean up recursive mutexes in libstdc++. Any objections if I prepare a patch to add __gthread_recursive_mutex_destroy to each gthr header?
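[Editorial note: for the pthreads-based models the new function would be trivial, since the two mutex types coincide. A sketch of what the addition might look like, modelled on (but not copied from) the existing gthr-posix.h wrappers; the init helper is shown alongside so the pair is self-contained:]

```cpp
#include <pthread.h>

// In the pthreads model a recursive mutex is just a pthread_mutex_t
// initialized with the PTHREAD_MUTEX_RECURSIVE attribute (sketch of the
// existing init wrapper, for context).
static inline int
__gthread_recursive_mutex_init_function(pthread_mutex_t *mutex) {
  pthread_mutexattr_t attr;
  int r = pthread_mutexattr_init(&attr);
  if (!r) r = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
  if (!r) r = pthread_mutex_init(mutex, &attr);
  if (!r) r = pthread_mutexattr_destroy(&attr);
  return r;
}

// The proposed destroy function: since recursive and plain mutexes
// share a type here, destruction is the same call.  gthr-win32.h is
// where a genuinely different implementation would be needed.
static inline int
__gthread_recursive_mutex_destroy(pthread_mutex_t *mutex) {
  return pthread_mutex_destroy(mutex);
}
```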
Re: extern "C" applied liberally?
On Mon, Nov 15, 2010 at 7:19 PM, Jay K wrote: > > I know it is debatable and I could be convinced otherwise, but I would > suggest: > > > > #ifdef __cplusplus > extern "C" { > #endif > > ... > > > #ifdef __cplusplus > } /* extern "C" */ > #endif > > > be applied liberally in gcc. > Not "around" #includes, it is the job of each .h file, and mindful of #ifdefs > (ie: correctly). > > > Rationale: > Any folks that get to see the mangled names, debugging, working on binutils, > whatever, are saved from them. > They are generally believed to be ugly, right? Yeah yeah, not a technical > argument. binutils is good at handling that stuff these days. In the long term, that change looks counterproductive. [...] > I think it is a good idea for any C or historically C code when moving to a > C++ compiler. It may or may not be. In this case, I don't think it is. The transition is complete now. > They could/would be removed as templates/function overloads/operator > overloading are introduced. Why introduce a kludge that we may have to remove later, when the kludge fixes no glaring problem?
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
More FDO related performance numbers

Experiment 1: trunk gcc O2 + FDO vs O2: FDO improves performance by 5% geomean
Experiment 2: our internal gcc compiler (4.4.3 based with many local patches) O2 + FDO vs O2 (trunk gcc): FDO improves perf by 6.6% geomean
Experiment 3: our internal gcc (4.4.3 with local patches) O2 + LIPO vs O2 (trunk gcc): LIPO improves by 12%
Experiment 4: trunk gcc O2 + LTO + -fwhole-program + FDO vs O2: LTO + FDO improves by 10.8%

1. Trunk gcc FDO vs O2 (5%)

164.gzip    1324  1302  -1.64%
175.vpr     1694  1725   1.84%
176.gcc     2293  2387   4.07%
181.mcf     1772  1756  -0.88%
186.crafty  2320  2280  -1.75%
197.parser  1166  1556  33.42%
252.eon     2443  2552   4.45%
253.perlbmk 2410  2586   7.28%
254.gap     1987  2021   1.71%
255.vortex  2392  2720  13.71%
256.bzip2   1719  1717  -0.12%
300.twolf   2288  2331   1.86%

2. 4.4.3 gcc with local patches FDO vs trunk O2 (6.6%)

164.gzip    1324  1317  -0.48%
175.vpr     1694  1758   3.76%
176.gcc     2293  2472   7.79%
181.mcf     1772  1730  -2.35%
186.crafty  2320  2353   1.40%
197.parser  1166  1652  41.70%
252.eon     2443  2610   6.82%
253.perlbmk 2410  2561   6.23%
254.gap     1987  1987  -0.04%
255.vortex  2392  2801  17.09%
256.bzip2   1719  1748   1.68%
300.twolf   2288  2335   2.04%

3. LIPO vs trunk O2 (12%)

164.gzip    1324  1350   1.99%
175.vpr     1694  1758   3.77%
176.gcc     2293  2519   9.83%
181.mcf     1772  1766  -0.33%
186.crafty  2320  2394   3.16%
197.parser  1166  1683  44.32%
252.eon     2443  2879  17.80%
253.perlbmk 2410  2556   6.04%
254.gap     1987  2139   7.61%
255.vortex  2392  3669  53.40%
256.bzip2   1719  1824   6.09%
300.twolf   2288  2345   2.49%

4. LTO + -fwhole-program + O2 + FDO vs O2 (10.8%)

164.gzip    1324  1340   1.25%
175.vpr     1694  1709   0.87%
176.gcc     2293  2411   5.13%
181.mcf     1772  1757  -0.80%
186.crafty  2320  2566  10.59%
197.parser  1166  1614  38.44%
252.eon     2443  2785  13.98%
253.perlbmk 2410  2618   8.61%
254.gap     1987  2063   3.81%
255.vortex  2392  3294  37.69%
256.bzip2   1719  1956  13.77%
300.twolf   2288  2404   5.07%

David

On Mon, Nov 15, 2010 at 6:18 PM, Xinliang David Li wrote: > More performance data: > > -O2 -funroll-all-loops vs O2: +1.1% geomean > > O2 O2 unroll-all-loops > 164.gzip 1324 1336 0.94% > 175.vpr 1694 1670 -1.44% > 176.gcc 2293 2353 2.60% > 181.mcf 1772 1793 1.20% > 186.crafty 2320 2300 -0.86% > 197.parser 1166 1171 0.39% > 252.eon 2443 2515 2.93% > 253.perlbmk 2410 2250 -6.66% > 254.gap 1987 2041 2.68% > 255.vortex