Re: Expansion of narrowing math built-ins into power instructions
Hi Tejas,

[ Please do not top-post. ]

On Thu, Aug 22, 2019 at 09:09:37AM +0530, Tejas Joshi wrote:
> Yes, I tried basically every combination I could think of, just not
> with the "isa attr". Now, I have the following code and it still
> seems not to be working. Am I missing any options to pass?
>
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%d,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" "d,wa")]
>                    UNSPEC_ADD_NARROWING))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])
>
> with the code, I pass -O2 foo.c :
> float
> foo (double x, double y)
> {
>   return __builtin_fadd (x, y);
> }

What happens then?  "It does not work" is very very vague.  At least it
seems the compiler does build now?


Segher
Re: Expansion of narrowing math built-ins into power instructions
> This does almost exactly the same as what the proposed float_narrow
> would do.  Instead, write it as
>
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
>                    UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])

Yes, I tried basically every combination I could think of, just not
with the "isa attr". Now, I have the following code and it still
seems not to be working. Am I missing any options to pass?

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa")
        (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%d,wa")
                    (match_operand:DF 2 "gpc_reg_operand" "d,wa")]
                   UNSPEC_ADD_NARROWING))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")
   (set_attr "isa" "*,p8v")])

with the code, I pass -O2 foo.c :

float
foo (double x, double y)
{
  return __builtin_fadd (x, y);
}

Thanks,
Tejas

On Thu, 22 Aug 2019 at 00:47, Segher Boessenkool wrote:
>
> On Wed, Aug 21, 2019 at 01:28:52PM -0500, Segher Boessenkool wrote:
> > (define_insn "add_truncdfsf3"
> >   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
> >         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
> >                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
> >                    UNSPEC_ADD_TRUNCATE))]
> >   "TARGET_HARD_FLOAT"
> >   "@
> >    fadds %0,%1,%2
> >    xsaddsp %x0,%x1,%x2"
> >   [(set_attr "type" "fp")
> >    (set_attr "isa" "*,p8v")])
>
> And not ... f, d, d respectively (f for SF, d for DF).
>
>
> Segher
How does one traverse all the global decls
I'm trying to write some analysis code for an optimization that needs
to look at all the declarations, and their types, during link-time
optimization. Doing this for local variables seems to be trivial
because of FOR_EACH_LOCAL_DECL, and there are also obvious ways of
getting at the type information once I have a decl. However, I can't
find any similar way of getting at the global-level decls.

I'd appreciate your help on this.

Thanks,
Gary Oblock
Re: Expansion of narrowing math built-ins into power instructions
On Wed, Aug 21, 2019 at 01:28:52PM -0500, Segher Boessenkool wrote:
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa")]
>                    UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")
>    (set_attr "isa" "*,p8v")])

And not ... f, d, d respectively (f for SF, d for DF).


Segher
Re: Expansion of narrowing math built-ins into power instructions
Hi Tejas,

On Wed, Aug 21, 2019 at 10:56:51PM +0530, Tejas Joshi wrote:
> I have the following code which uses unspec but I am really missing
> something here. Does unspec not work encapsulating plus? Or I have
> some more places to make changes to?
>
> (define_insn "add_truncdfsf3"
>   [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
>         (unspec:SF
>           [(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%,wa")
>                     (match_operand:DF 2 "gpc_reg_operand" ",wa"))]
>           UNSPEC_ADD_TRUNCATE))]
>   "TARGET_HARD_FLOAT"
>   "@
>    fadds %0,%1,%2
>    xsaddsp %x0,%x1,%x2"
>   [(set_attr "type" "fp")])

This does almost exactly the same as what the proposed float_narrow
would do.  Instead, write it as

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
        (unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%,wa")
                    (match_operand:DF 2 "gpc_reg_operand" ",wa")]
                   UNSPEC_ADD_TRUNCATE))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")
   (set_attr "isa" "*,p8v")])

(note the "isa" attribute) to prevent any folding etc. from happening to it.

> and an UNSPEC_ADD_TRUNCATE in unspec enum.

UNSPEC_ADD_NARROWING?


Segher
Re: asking for __attribute__((aligned()) clarification
On Wed, 21 Aug 2019 at 17:50, Paul Koning wrote:
>
> > On Aug 21, 2019, at 10:57 AM, Alexander Monakov wrote:
> >
> > On Wed, 21 Aug 2019, Paul Koning wrote:
> >
> >> I agree, but if the new approach generates a warning for code that was
> >> written to the old rules, that would be unfortunate.
> >
> > FWIW I don't know which GCC versions accepted 'packed' on a scalar type.
>
> That wasn't what I meant; I was talking about the packed and aligned
> attributes on struct members.  I thought you were saying that
> ((packed,aligned(2))) is now a warning.  That doesn't appear to be the case,
> though; it's accepted without complaint as it always was.

Right, nobody's suggesting that should be a warning. The warning is
for trying to pack a scalar variable, which is (and always was)
meaningless.
Re: Expansion of narrowing math built-ins into power instructions
Hello.
I have the following code which uses unspec but I am really missing
something here. Does unspec not work encapsulating plus? Or I have
some more places to make changes to?

(define_insn "add_truncdfsf3"
  [(set (match_operand:SF 0 "gpc_reg_operand" "=,wa")
        (unspec:SF
          [(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%,wa")
                    (match_operand:DF 2 "gpc_reg_operand" ",wa"))]
          UNSPEC_ADD_TRUNCATE))]
  "TARGET_HARD_FLOAT"
  "@
   fadds %0,%1,%2
   xsaddsp %x0,%x1,%x2"
  [(set_attr "type" "fp")])

and an UNSPEC_ADD_TRUNCATE in unspec enum.

Thanks,
Tejas

On Wed, 21 Aug 2019 at 01:12, Segher Boessenkool wrote:
>
> On Tue, Aug 20, 2019 at 03:43:43PM +0100, Richard Sandiford wrote:
> > Segher Boessenkool writes:
> > > On Tue, Aug 20, 2019 at 01:59:06PM +0100, Richard Sandiford wrote:
> > >> Segher Boessenkool writes:
> > >> >>   [(set (match_operand:SI 0 "register_operand" "=d")
> > >> >>         (truncate:SI
> > >> >>          (lshiftrt:DI
> > >> >
> > >> > (this is optimised to a subreg, in many cases, for example).
> > >>
> > >> Right.  MIPS avoids that one thanks to TARGET_TRULY_NOOP_TRUNCATION.
> > >
> > > Trying 10 -> 18:
> > >    10: r200:TI=zero_extend(r204:DI)*zero_extend(r205:DI)
> > >       REG_DEAD r205:DI
> > >       REG_DEAD r204:DI
> > >    18: $2:DI=r200:TI#0
> > >       REG_DEAD r200:TI
> > > Failed to match this instruction:
> > > (set (reg/i:DI 2 $2)
> > >     (subreg:DI (mult:TI (zero_extend:TI (reg:DI 204))
> > >             (zero_extend:TI (reg:DI 205))) 0))
> > >
> > > I'm afraid not.
> >
> > That's TI->DI though, whereas the pattern above is DI->SI.  The modes
> > matter :-)  There'd also need to be a shift to match a highpart pattern.
>
> It's the same for 32-bit:
>
> mips-linux-gcc -Wall -W -O2 -S mulh.c -mips32 -mabi=32
> (I hope these options are reasonable?  I don't know MIPS well at all).
>
> Trying 12 -> 20:
>    12: r200:DI=zero_extend(r204:SI)*zero_extend(r205:SI)
>       REG_DEAD r205:SI
>       REG_DEAD r204:SI
>    20: $2:SI=r200:DI#0
>       REG_DEAD r200:DI
> Failed to match this instruction:
> (set (reg/i:SI 2 $2)
>     (subreg:SI (mult:DI (zero_extend:DI (reg:SI 204))
>             (zero_extend:DI (reg:SI 205))) 0))
>
> The point is that this is the form that this insn is simplified to.  If
> that form is not recognised by your backend, various optimisation
> opportunities are missed.
>
> > I wouldn't say it knows nothing about rounding.  It doesn't know
> > what the runtime rounding mode is, but that isn't the same thing.
> > (Just like not knowing what (mem:SI (sp)) contains isn't the same
> > thing as not knowing anything about stack memory.)
>
> Does it even know if the rounding mode is one of the IEEE FP rounding
> modes?
>
>
> Segher
Re: asking for __attribute__((aligned()) clarification
> On Aug 21, 2019, at 10:57 AM, Alexander Monakov wrote:
>
> On Wed, 21 Aug 2019, Paul Koning wrote:
>
>> I agree, but if the new approach generates a warning for code that was
>> written to the old rules, that would be unfortunate.
>
> FWIW I don't know which GCC versions accepted 'packed' on a scalar type.

That wasn't what I meant; I was talking about the packed and aligned
attributes on struct members.  I thought you were saying that
((packed,aligned(2))) is now a warning.  That doesn't appear to be the case,
though; it's accepted without complaint as it always was.

	paul
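The struct-member case discussed above can be sketched as follows. This is only an illustration, assuming GCC (or a compiler that accepts the same attribute syntax) and a target where int is 4 bytes with 4-byte natural alignment; the struct and function names are hypothetical and do not come from the thread.

```c
#include <stddef.h>

/* Hypothetical record with no attributes: on the assumed target the
   compiler pads after 'tag' so that 'value' is 4-byte aligned.  */
struct plain_record {
  char tag;
  int value;
};

/* Same members, but 'value' carries both attributes: 'packed' lowers the
   member's alignment to 1 and 'aligned(2)' raises it back to 2, so only
   one padding byte is needed after 'tag'.  */
struct packed_record {
  char tag;
  int value __attribute__((packed, aligned(2)));
};

/* Offset of 'value' in the unattributed layout.  */
size_t plain_value_offset(void)
{
  return offsetof(struct plain_record, value);
}

/* Offset of 'value' when it is declared packed and aligned(2).  */
size_t packed_value_offset(void)
{
  return offsetof(struct packed_record, value);
}
```

Comparing the two offsets shows the effect on layout. Applying 'packed' to a scalar typedef, by contrast, is the case that draws the "attribute ignored" warning.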
Re: asking for __attribute__((aligned()) clarification
On Wed, 21 Aug 2019, Paul Koning wrote:

> I agree, but if the new approach generates a warning for code that was written
> to the old rules, that would be unfortunate.

FWIW I don't know which GCC versions accepted 'packed' on a scalar type.
Already in 2006 GCC 3.4 would issue a warning:

$ echo 'typedef int ui __attribute__((packed));' | gcc34 -xc - -S -o-
        .file ""
:1: warning: `packed' attribute ignored
        .section .note.GNU-stack,"",@progbits
        .ident "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-4)"

> Yes.  But last I tried, optimizing that for > 1 alignment is problematic
> because that information often doesn't make it down to the target code even
> though it is documented to do so.

Thanks, indeed this memcpy solution is not so well suited for that.

Alexander
Re: asking for __attribute__((aligned()) clarification
> On Aug 21, 2019, at 10:28 AM, Alexander Monakov wrote:
>
> On Tue, 20 Aug 2019, "Markus Fröschle" wrote:
>
>> Thank you (and others) for your answers. Now I'm just as smart as before,
>> however.
>>
>> Is it a supported, documented, 'long term' feature we can rely on or not?
>>
>> If yes, I would expect it to be properly documented. If not, never mind.
>
> I think it's properly documented in gcc-9:
>
>   https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Type-Attributes.html
>
> (the "old" behavior where the compiler would neither honor reduced alignment
> nor issue a warning seems questionable, the new documentation promises a more
> sensible approach)

I agree, but if the new approach generates a warning for code that was written
to the old rules, that would be unfortunate.

> In portable code one can also use memcpy to move unaligned data, the compiler
> should translate it like an unaligned load/store when size is a suitable
> constant:
>
>   int val;
>   memcpy(&val, ptr, sizeof val);
>
> (or __builtin_memcpy when -ffreestanding is in effect)

Yes.  But last I tried, optimizing that for > 1 alignment is problematic
because that information often doesn't make it down to the target code even
though it is documented to do so.

	paul
Re: Aw: Re: asking for __attribute__((aligned()) clarification
On Tue, 20 Aug 2019, "Markus Fröschle" wrote:

> Thank you (and others) for your answers. Now I'm just as smart as before,
> however.
>
> Is it a supported, documented, 'long term' feature we can rely on or not?
>
> If yes, I would expect it to be properly documented. If not, never mind.

I think it's properly documented in gcc-9:

  https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Type-Attributes.html

(the "old" behavior where the compiler would neither honor reduced alignment
nor issue a warning seems questionable, the new documentation promises a more
sensible approach)

In portable code one can also use memcpy to move unaligned data, the compiler
should translate it like an unaligned load/store when size is a suitable
constant:

  int val;
  memcpy(&val, ptr, sizeof val);

(or __builtin_memcpy when -ffreestanding is in effect)

Alexander
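The memcpy idiom mentioned above can be wrapped in small helpers. The following is a minimal sketch; the function names are hypothetical and the fixed-width type is an assumption made for the example (the message itself uses a plain int).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value that may start at any byte
   offset.  The size is a compile-time constant, so the compiler is free
   to turn the memcpy into a single (possibly unaligned) load.  */
uint32_t load_u32_unaligned(const unsigned char *ptr)
{
  uint32_t val;
  memcpy(&val, ptr, sizeof val);
  return val;
}

/* Hypothetical helper: the matching store, again through memcpy so that
   no misaligned pointer is ever dereferenced directly.  */
void store_u32_unaligned(unsigned char *ptr, uint32_t val)
{
  memcpy(ptr, &val, sizeof val);
}
```

A caller could, for example, store a value at `buf + 1` inside a byte buffer and read it back with the load helper. Whether this compiles down to a single instruction depends on the target and optimization level.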
Re: For which gcc release is going to be foreseen the support for the Coroutines TS extension?
> On 20 Aug 2019, at 17:15, Florian Weimer wrote:
>
> * Richard Biener:
>
>> On August 20, 2019 5:19:33 PM GMT+02:00, Nathan Sidwell wrote:
>>> On 7/26/19 6:03 AM, Iain Sandoe wrote:
>>>> Hello Sebastian,
>>>>
>>>>> On 26 Jul 2019, at 10:19, Florian Weimer wrote:
>>>>>
>>>>> C++ coroutines are stackless.  I don't think any new low-level run-time
>>>>> support will be needed.
>>>>
>>>> correct, C++20 coroutines and threading mechanisms are orthogonal
>>>> facilities; one can use (IS C++20) coroutines on top of a threaded
>>>> system or in a single-threaded environment.
>>>>
>>>> Two places I see them as being a go-to facility in embedded systems are:
>>>>
>>>> * co-operative multi-tasking UIs on single-threaded platforms.
>>>> * async I/O completion by continuations, rather than callbacks.
>>>
>>> There are cases where the overhead of threads is too expensive.  For
>>> instance hiding (cache-missing) load latencies by doing other work while
>>> waiting -- a context switch at that point is far too expensive.
>>
>> But are coroutines so much lower latency (and a context switch does
>> not involve cache misses on its own?).  For doing useful work in this
>> context CPU designers invented SMT...
>
> I think the idea is that you don't have to worry about synchronizing
> multiple threads to reap the benefits from hardware parallelism.  For
> hiding memory access latency, that could be important because the
> synchronization overhead could easily eat up any potential benefits.

It seems to me that this is another “tool” in the “toolbox”, and as such will
find use in more cases than the performance example given above.  Some uses
might have more to do with code clarity than performance, per se.

* e.g. any code that’s written in terms of a state machine, or as a large
  body of callbacks, might prove to be a candidate for clearer representation
  as coroutines.

* There’s a potentially large reduction in held state for cases where the
  processing is organised as a series of transformations (as the original
  motivating example was a compiler where it was inconvenient to produce and
  consume that state between passes).

* There’s clearly an industry quest for a lighter weight representation than
  threads and this is being driven (amongst others) by some of the largest
  users of massively parallel systems - so SMT is not meeting all their needs.
  Representatives from this group are saying that coroutines are part of the
  solution that they want (and backing that up by sponsoring the language
  development work and several compiler implementations).  I don’t have
  visibility of their specific internal use-cases, of course, unfortunately.

* Avoiding recasting problems into a form suitable for overt multi-threading

* I expect (and already have one such person reporting bugs against my
  branch) that the embedded (no threads) space will find this a clearer way
  to implement some functionality (co-operative multitasking user interfaces
  spring to mind as noted in a previous post).

I suppose that in the end we will not know how this tool is going to be used
until it gets wider exposure (although there are some production uses of the
LLVM implementation, it’s likely that people won’t commit heavily until it’s
in the standard).

0.02GBP only ..
Iain