Re: Expansion of narrowing math built-ins into power instructions

2019-08-21 Thread Segher Boessenkool
Hi Tejas,

[ Please do not top-post. ]

On Thu, Aug 22, 2019 at 09:09:37AM +0530, Tejas Joshi wrote:
> Yes, I tried basically every combination I could think of, just not
> with the "isa" attr.  Now I have the following code, and it still
> does not seem to work.  Am I missing any options to pass?
> 
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%d,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" "d,wa")]
>                    UNSPEC_ADD_NARROWING))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])
> 
> With this code, I compile foo.c with -O2:
> float
> foo (double x, double y)
> {
>   return __builtin_fadd (x, y);
> }
> }

What happens then?  "It does not work" is very very vague.  At least it
seems the compiler does build now?


Segher


Re: Expansion of narrowing math built-ins into power instructions

2019-08-21 Thread Tejas Joshi
> This does almost exactly the same as what the proposed float_narrow
> would do.  Instead, write it as
>
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
>                    UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])

Yes, I tried basically every combination I could think of, just not
with the "isa" attr.  Now I have the following code, and it still
does not seem to work.  Am I missing any options to pass?

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa")
        (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%d,wa")
                    (match_operand:DF 2 "gpc_reg_operand" "d,wa")]
                   UNSPEC_ADD_NARROWING))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")
   (set_attr "isa" "*,p8v")])

With this code, I compile foo.c with -O2:
float
foo (double x, double y)
{
  return __builtin_fadd (x, y);
}

Thanks,
Tejas


On Thu, 22 Aug 2019 at 00:47, Segher Boessenkool wrote:
>
> On Wed, Aug 21, 2019 at 01:28:52PM -0500, Segher Boessenkool wrote:
> > (define_insn "add_truncdfsf3"
> >   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
> >         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
> >                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
> >                    UNSPEC_ADD_TRUNCATE))]
> >   "TARGET_HARD_FLOAT"
> >   "@
> >    fadds %0,%1,%2
> >    xsaddsp %x0,%x1,%x2"
> >   [(set_attr "type" "fp")
> >    (set_attr "isa" "*,p8v")])
>
> And not ...  f, d, d respectively (f for SF, d for DF).
>
>
> Segher


How does one traverse all the global decls

2019-08-21 Thread Gary Oblock
I'm trying to write analysis code for an optimization that needs to
look at all the declarations, and the types thereof, during link-time
optimization.

Note, doing this for the local variables seems to be trivial
because of FOR_EACH_LOCAL_DECL, and there are also
obvious ways of getting at the type information once I have
a decl. However, I can't seem to find any similar way of
getting at the global-level decls.

I'd appreciate your help on this.

Thanks,

Gary Oblock


Re: Expansion of narrowing math built-ins into power instructions

2019-08-21 Thread Segher Boessenkool
On Wed, Aug 21, 2019 at 01:28:52PM -0500, Segher Boessenkool wrote:
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
>                    UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])

And not ...  f, d, d respectively (f for SF, d for DF).


Segher


Re: Expansion of narrowing math built-ins into power instructions

2019-08-21 Thread Segher Boessenkool
Hi Tejas,

On Wed, Aug 21, 2019 at 10:56:51PM +0530, Tejas Joshi wrote:
> I have the following code, which uses unspec, but I am really missing
> something here.  Does unspec not work when it encapsulates plus?  Or do
> I have more places to change?
> 
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF
>           [(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa"))]
>           UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")])

This does almost exactly the same as what the proposed float_narrow
would do.  Instead, write it as

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
        (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
                    (match_operand:DF 2 "gpc_reg_operand" ",wa")]
                   UNSPEC_ADD_TRUNCATE))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")
   (set_attr "isa" "*,p8v")])

(note the "isa" attribute)


to prevent any folding etc. from happening to it.
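
As background on why folding matters for a narrowing add: the operation is
meant to round the exact sum once, to float, whereas adding in double and
then converting rounds twice. Below is a minimal C sketch of the two-step
version only. The function names and the inputs are made up for
illustration and are not taken from the patch; it assumes IEEE binary32 and
binary64 with round-to-nearest.

```c
#include <assert.h>

/* Two-step add: the sum is rounded to double first, then to float.
   A narrowing add would instead round the exact sum a single time.  */
float
add_then_truncate (double x, double y)
{
  volatile double sum = x + y;  /* first rounding, to double */
  return (float) sum;           /* second rounding, to float */
}

/* tie_x lies exactly halfway between the floats 1.0 and 1.0 + 2^-23.
   tie_y pushes the exact sum just above that midpoint, but is too small
   to survive the rounding to double.  (Hypothetical inputs.)  */
static const double tie_x = 0x1.000001p0;   /* 1 + 2^-24 */
static const double tie_y = 0x1p-60;

/* Returns 1 if the two-step result is 1.0f, i.e. the tie was resolved
   to even although the exact sum lies above the midpoint.  */
int
two_step_rounds_down (void)
{
  volatile double sum = tie_x + tie_y;
  assert (sum == tie_x);        /* tie_y was absorbed in double */
  return add_then_truncate (tie_x, tie_y) == 1.0f;
}
```

For these inputs a single correctly rounded narrowing would be expected to
give 0x1.000002p0f, since the exact sum is above the midpoint, so the
two-step form is not a safe replacement for the instruction.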

> and an UNSPEC_ADD_TRUNCATE in unspec enum.

UNSPEC_ADD_NARROWING?


Segher


Re: asking for __attribute__((aligned()) clarification

2019-08-21 Thread Jonathan Wakely
On Wed, 21 Aug 2019 at 17:50, Paul Koning  wrote:
>
>
>
> > On Aug 21, 2019, at 10:57 AM, Alexander Monakov  wrote:
> >
> > On Wed, 21 Aug 2019, Paul Koning wrote:
> >
> >> I agree, but if the new approach generates a warning for code that was
> >> written to the old rules, that would be unfortunate.
> >
> > FWIW I don't know which GCC versions accepted 'packed' on a scalar type.
>
> That wasn't what I meant; I was talking about the packed and aligned 
> attributes on struct members.  I thought you were saying that 
> ((packed,aligned(2))) is now a warning.  That doesn't appear to be the case, 
> though; it's accepted without complaint as it always was.

Right, nobody's suggesting that should be a warning. The warning is
for trying to pack a scalar variable, which is (and always was)
meaningless.
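
A small sketch of the struct-member case, which is the one that stays
accepted. It assumes GCC or a compatible compiler and an ABI where
uint32_t is normally 4-byte aligned; the struct and function names are
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ordinary layout: value gets its natural alignment.  */
struct plain_rec
{
  uint8_t tag;
  uint32_t value;
};

/* packed lowers the member's alignment to 1, and aligned (2) then
   raises it back up to 2.  */
struct packed_rec
{
  uint8_t tag;
  uint32_t value __attribute__ ((packed, aligned (2)));
};

size_t
plain_value_offset (void)
{
  return offsetof (struct plain_rec, value);
}

size_t
packed_value_offset (void)
{
  return offsetof (struct packed_rec, value);
}

/* The attribute pair should never push the member further out than
   the ordinary layout does, and should keep it 2-byte aligned.  */
void
check_layout (void)
{
  assert (packed_value_offset () % 2 == 0);
  assert (packed_value_offset () <= plain_value_offset ());
}
```

On x86-64 the offsets would be expected to come out as 4 for the plain
struct and 2 for the packed one; other ABIs may differ.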


Re: Expansion of narrowing math built-ins into power instructions

2019-08-21 Thread Tejas Joshi
Hello.
I have the following code, which uses unspec, but I am really missing
something here.  Does unspec not work when it encapsulates plus?  Or do
I have more places to change?

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
        (unspec:SF
          [(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%,wa")
                    (match_operand:DF 2 "gpc_reg_operand" ",wa"))]
          UNSPEC_ADD_TRUNCATE))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")])

and an UNSPEC_ADD_TRUNCATE in unspec enum.

Thanks,
Tejas

On Wed, 21 Aug 2019 at 01:12, Segher Boessenkool wrote:
>
> On Tue, Aug 20, 2019 at 03:43:43PM +0100, Richard Sandiford wrote:
> > Segher Boessenkool  writes:
> > > On Tue, Aug 20, 2019 at 01:59:06PM +0100, Richard Sandiford wrote:
> > >> Segher Boessenkool  writes:
> > >> >>   [(set (match_operand:SI 0 "register_operand" "=d")
> > >> >> (truncate:SI
> > >> >>  (lshiftrt:DI
> > >> >
> > >> > (this is optimised to a subreg, in many cases, for example).
> > >>
> > >> Right.  MIPS avoids that one thanks to TARGET_TRULY_NOOP_TRUNCATION.
> > >
> > > Trying 10 -> 18:
> > >10: r200:TI=zero_extend(r204:DI)*zero_extend(r205:DI)
> > >   REG_DEAD r205:DI
> > >   REG_DEAD r204:DI
> > >18: $2:DI=r200:TI#0
> > >   REG_DEAD r200:TI
> > > Failed to match this instruction:
> > > (set (reg/i:DI 2 $2)
> > > (subreg:DI (mult:TI (zero_extend:TI (reg:DI 204))
> > > (zero_extend:TI (reg:DI 205))) 0))
> > >
> > > I'm afraid not.
> >
> > That's TI->DI though, whereas the pattern above is DI->SI.  The modes
> > matter :-)  There'd also need to be a shift to match a highpart pattern.
>
> It's the same for 32-bit:
>
> mips-linux-gcc -Wall -W -O2 -S mulh.c -mips32 -mabi=32
> (I hope these options are reasonable?  I don't know MIPS well at all).
>
> Trying 12 -> 20:
>12: r200:DI=zero_extend(r204:SI)*zero_extend(r205:SI)
>   REG_DEAD r205:SI
>   REG_DEAD r204:SI
>20: $2:SI=r200:DI#0
>   REG_DEAD r200:DI
> Failed to match this instruction:
> (set (reg/i:SI 2 $2)
> (subreg:SI (mult:DI (zero_extend:DI (reg:SI 204))
> (zero_extend:DI (reg:SI 205))) 0))
>
> The point is that this is the form that this insn is simplified to.  If
> that form is not recognised by your backend, various optimisation
> opportunities are missed.
>
> > I wouldn't say it knows nothing about rounding.  It doesn't know
> > what the runtime rounding mode is, but that isn't the same thing.
> > (Just like not knowing what (mem:SI (sp)) contains isn't the same
> > thing as not knowing anything about stack memory.)
>
> Does it even know if the rounding mode is one of the IEEE FP rounding
> modes?
>
>
> Segher


Re: asking for __attribute__((aligned()) clarification

2019-08-21 Thread Paul Koning



> On Aug 21, 2019, at 10:57 AM, Alexander Monakov  wrote:
> 
> On Wed, 21 Aug 2019, Paul Koning wrote:
> 
>> I agree, but if the new approach generates a warning for code that was
>> written to the old rules, that would be unfortunate.
> 
> FWIW I don't know which GCC versions accepted 'packed' on a scalar type.

That wasn't what I meant; I was talking about the packed and aligned attributes 
on struct members.  I thought you were saying that ((packed,aligned(2))) is now 
a warning.  That doesn't appear to be the case, though; it's accepted without 
complaint as it always was.

paul



Re: asking for __attribute__((aligned()) clarification

2019-08-21 Thread Alexander Monakov
On Wed, 21 Aug 2019, Paul Koning wrote:

> I agree, but if the new approach generates a warning for code that was written
> to the old rules, that would be unfortunate.

FWIW I don't know which GCC versions accepted 'packed' on a scalar type.
Already in 2006 GCC 3.4 would issue a warning:

$ echo 'typedef int ui __attribute__((packed));' | gcc34 -xc - -S -o-
        .file   "<stdin>"
<stdin>:1: warning: `packed' attribute ignored
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-4)"

> Yes.  But last I tried, optimizing that for > 1 alignment is problematic
> because that information often doesn't make it down to the target code even
> though it is documented to do so.

Thanks, indeed this memcpy solution is not so well suited for that.

Alexander


Re: asking for __attribute__((aligned()) clarification

2019-08-21 Thread Paul Koning



> On Aug 21, 2019, at 10:28 AM, Alexander Monakov  wrote:
> 
> On Tue, 20 Aug 2019, "Markus Fröschle" wrote:
> 
>> Thank you (and others) for your answers. Now I'm just as smart as before, 
>> however.
>> 
>> Is it a supported, documented, 'long term' feature we can rely on or not?
>> 
>> If yes, I would expect it to be properly documented. If not, never mind.
> 
> I think it's properly documented in gcc-9:
> 
>  https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Type-Attributes.html
> 
> (the "old" behavior where the compiler would neither honor reduced alignment
> nor issue a warning seems questionable, the new documentation promises a more
> sensible approach)

I agree, but if the new approach generates a warning for code that was written 
to the old rules, that would be unfortunate.

> In portable code one can also use memcpy to move unaligned data, the compiler
> should translate it like an unaligned load/store when size is a suitable
> constant:
> 
>  int val;
>  memcpy(&val, ptr, sizeof val);
> 
> (or __builtin_memcpy when -ffreestanding is in effect)

Yes.  But last I tried, optimizing that for > 1 alignment is problematic 
because that information often doesn't make it down to the target code even 
though it is documented to do so.

paul



Re: Aw: Re: asking for __attribute__((aligned()) clarification

2019-08-21 Thread Alexander Monakov
On Tue, 20 Aug 2019, "Markus Fröschle" wrote:

> Thank you (and others) for your answers. Now I'm just as smart as before, 
> however.
> 
> Is it a supported, documented, 'long term' feature we can rely on or not?
> 
> If yes, I would expect it to be properly documented. If not, never mind.

I think it's properly documented in gcc-9:

  https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Type-Attributes.html

(the "old" behavior where the compiler would neither honor reduced alignment
nor issue a warning seems questionable, the new documentation promises a more
sensible approach)

In portable code one can also use memcpy to move unaligned data, the compiler
should translate it like an unaligned load/store when size is a suitable
constant:

  int val;
  memcpy(&val, ptr, sizeof val);

(or __builtin_memcpy when -ffreestanding is in effect)
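
A slightly fuller sketch of the same idiom, with made-up helper names.
It assumes a hosted C99 environment; under -ffreestanding the memcpy
calls would become __builtin_memcpy as noted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address that need not be 4-byte aligned.
   The size is a compile-time constant, so the compiler is free to expand
   the copy inline rather than call memcpy.  */
uint32_t
load_u32 (const void *ptr)
{
  uint32_t val;
  memcpy (&val, ptr, sizeof val);
  return val;
}

/* Store counterpart of load_u32.  */
void
store_u32 (void *ptr, uint32_t val)
{
  memcpy (ptr, &val, sizeof val);
}

/* Round-trip a value through offset 1 of a byte buffer, an address
   with no alignment guarantee for uint32_t.  */
uint32_t
roundtrip_at_offset_one (uint32_t val)
{
  unsigned char buf[sizeof (uint32_t) + 1];
  store_u32 (buf + 1, val);
  return load_u32 (buf + 1);
}

void
check_roundtrip (void)
{
  assert (roundtrip_at_offset_one (0xdeadbeefu) == 0xdeadbeefu);
}
```

Unlike a cast to uint32_t * followed by a dereference, this form does not
depend on the pointer's alignment, so it stays valid on strict-alignment
targets.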

Alexander

Re: For which gcc release is going to be foreseen the support for the Coroutines TS extension?

2019-08-21 Thread Iain Sandoe


> On 20 Aug 2019, at 17:15, Florian Weimer  wrote:
> 
> * Richard Biener:
> 
>> On August 20, 2019 5:19:33 PM GMT+02:00, Nathan Sidwell wrote:
>>> On 7/26/19 6:03 AM, Iain Sandoe wrote:
>>>> Hello Sebastian,
>>>>
>>>>> On 26 Jul 2019, at 10:19, Florian Weimer wrote:
>>>>>
>>>>> C++ coroutines are stackless.  I don't think any new low-level
>>>>> run-time support will be needed.
>>>>
>>>> correct, C++20 coroutines and threading mechanisms are orthogonal
>>>> facilities; one can use (IS C++20) coroutines on top of a threaded
>>>> system or in a single-threaded environment.
>>>>
>>>> Two places I see them as being a go-to facility in embedded systems
>>>> are:
>>>>  * co-operative multi-tasking UIs on single-threaded platforms.
>>>>  * async I/O completion by continuations, rather than callbacks.
>>>
>>> There are cases where the overhead of threads is too expensive.  For
>>> instance hiding (cache-missing) load latencies by doing other work
>>> while waiting -- a context switch at that point is far too expensive.
>> 
>> But are coroutines so much lower latency (and a context switch does
>> not involve cache misses on its own?). For doing useful work in this
>> context CPU designers invented SMT...
> 
> I think the idea is that you don't have to worry about synchronizing
> multiple threads to reap the benefits from hardware parallelism.  For
> hiding memory access latency, that could be important because the
> synchronization overhead could easily eat up any potential benefits.

It seems to me that this is another “tool” in the “toolbox”, and as such
will find use in more cases than the performance example given above.

Some uses might have more to do with code clarity than performance,
per se.

* e.g. any code that’s written in terms of a state machine, or as a large
body of callbacks, might prove to be a candidate for clearer representation
as coroutines.

* There’s a potentially large reduction in held state for cases where the
processing is organised as a series of transformations (the original
motivating example was a compiler, where it was inconvenient to produce
and consume that state between passes).

* There’s clearly an industry quest for a lighter weight representation than
threads and this is being driven (amongst others) by some of the largest
users of massively parallel systems - so SMT is not meeting all their needs.
Representatives from this group are saying that coroutines are part of the
solution that they want (and backing that up by sponsoring the language
development work and several compiler implementations).  I don’t have
visibility of their specific internal use-cases, of course, unfortunately.

* Avoiding the need to recast problems into a form suitable for overt
multi-threading.

* I expect (and already have one such person reporting bugs against my
branch) that the embedded (no threads) space will find this a clearer way
to implement some functionality (co-operative multitasking user interfaces
spring to mind, as noted in a previous post).

I suppose that in the end we will not know how this tool is going to be used
until it gets wider exposure (although there are some production uses of
the LLVM implementation, it’s likely that people won’t commit heavily until
it’s in the standard).

0.02GBP only ..

Iain