Re: [x86-64] RFC: Add nosse abi attribute

2023-07-17 Thread Richard Sandiford via Gcc-patches
Michael Matz via Gcc-patches writes:

> Hello,
>
> the ELF psABI for x86-64 doesn't have any callee-saved SSE
> registers (there were actual reasons for that, but those don't
> matter anymore). This starts to hurt some uses, as it means that
> as soon as you have a call (say to memmove/memcpy,

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Alexander Monakov via Gcc-patches
On Tue, 11 Jul 2023, Michael Matz wrote:

> Hey,
>
> On Tue, 11 Jul 2023, Alexander Monakov via Gcc-patches wrote:
>
> > > > > * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
> > > >
> > > > This is the weak/active form; I'd suggest "preserve_high_sse".
> > >
> > > But it

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Michael Matz via Gcc-patches
Hey,

On Tue, 11 Jul 2023, Alexander Monakov via Gcc-patches wrote:

> > > > * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
> > >
> > > This is the weak/active form; I'd suggest "preserve_high_sse".
> >
> > But it preserves only the low parts :-) You swapped the two in your

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Alexander Monakov via Gcc-patches
On Tue, 11 Jul 2023, Michael Matz wrote:

> > > To that end I introduce actually two related attributes (for naming
> > > see below):
> > > * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
> >
> > This is the weak/active form; I'd suggest "preserve_high_sse".
>
> But it

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Michael Matz via Gcc-patches
Hello,

On Mon, 10 Jul 2023, Alexander Monakov wrote:

> I think the main question is why you're going with this (weak) form
> instead of the (strong) form "may only clobber the low XMM regs":

I want to provide both. One of them allows more arbitrary function definitions, the other allows more

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Michael Matz via Gcc-patches
Hello,

On Tue, 11 Jul 2023, Jan Hubicka wrote:

> > > > When a function doesn't contain calls to
> > > > unknown functions we can be a bit more lenient: we can make it so that
> > > > GCC simply doesn't touch xmm8-15 at all, then no save/restore is
> > > > necessary.
>
> One may also take into

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Alexander Monakov via Gcc-patches
On Tue, 11 Jul 2023, Richard Biener wrote:

> > > If a function contains calls then GCC can't know which
> > > parts of the XMM regset is clobbered by that, it may be parts
> > > which don't even exist yet (say until avx2048 comes out), so we must
> > > restrict ourself to only save/restore the

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Jan Hubicka via Gcc-patches
> > > When a function doesn't contain calls to
> > > unknown functions we can be a bit more lenient: we can make it so that
> > > GCC simply doesn't touch xmm8-15 at all, then no save/restore is
> > > necessary.

One may also take into account that first 8 registers are cheaper to encode than the

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Richard Biener via Gcc-patches
On Mon, Jul 10, 2023 at 9:08 PM Alexander Monakov via Gcc-patches wrote:
>
>
> On Mon, 10 Jul 2023, Michael Matz via Gcc-patches wrote:
>
> > Hello,
> >
> > the ELF psABI for x86-64 doesn't have any callee-saved SSE
> > registers (there were actual reasons for that, but those don't
> > matter

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Richard Biener via Gcc-patches
On Tue, Jul 11, 2023 at 10:53 AM Jan Hubicka wrote:
>
> > > > FWIW, this particular patch was regstrapped on x86-64-linux
> > > > with trunk from a week ago (and sniff-tested on current trunk).
> > >
> > > This looks really cool.
> >
> > The biggest benefit might be from IPA with LTO where we'd

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Jan Hubicka via Gcc-patches
> > > FWIW, this particular patch was regstrapped on x86-64-linux
> > > with trunk from a week ago (and sniff-tested on current trunk).
> >
> > This looks really cool.
>
> The biggest benefit might be from IPA with LTO where we'd carefully place
> those
> attributes at WPA time (at that time

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-11 Thread Richard Biener via Gcc-patches
On Mon, Jul 10, 2023 at 9:08 PM Alexander Monakov via Gcc-patches wrote:
>
>
> On Mon, 10 Jul 2023, Michael Matz via Gcc-patches wrote:
>
> > Hello,
> >
> > the ELF psABI for x86-64 doesn't have any callee-saved SSE
> > registers (there were actual reasons for that, but those don't
> > matter

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-10 Thread Alexander Monakov via Gcc-patches
On Mon, 10 Jul 2023, Alexander Monakov wrote:

> > I chose to make it possible to write function definitions with that
> > attribute with GCC adding the necessary callee save/restore code in
> > the xlogue itself.
>
> But you can't trivially restore if the callee is sibcalling — what
> happens

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-10 Thread Alexander Monakov via Gcc-patches
On Mon, 10 Jul 2023, Michael Matz via Gcc-patches wrote:

> Hello,
>
> the ELF psABI for x86-64 doesn't have any callee-saved SSE
> registers (there were actual reasons for that, but those don't
> matter anymore). This starts to hurt some uses, as it means that
> as soon as you have a call

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-10 Thread Richard Biener via Gcc-patches
> On 10.07.2023 at 17:56, Michael Matz via Gcc-patches wrote:
>
> Hello,
>
> the ELF psABI for x86-64 doesn't have any callee-saved SSE
> registers (there were actual reasons for that, but those don't
> matter anymore). This starts to hurt some uses, as it means that
> as soon as you

[x86-64] RFC: Add nosse abi attribute

2023-07-10 Thread Michael Matz via Gcc-patches
Hello,

the ELF psABI for x86-64 doesn't have any callee-saved SSE registers (there were actual reasons for that, but those don't matter anymore). This starts to hurt some uses, as it means that as soon as you have a call (say to memmove/memcpy, even if implicit as libcall) in a loop that
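
For readers skimming the thread, here is a minimal sketch of the kind of code the RFC is concerned with. It is an illustration only, not taken from the patch. The function names copy_block and scale_and_sum are made up, and the NOSSECLOBBER macro is a hypothetical placeholder: the attribute name "nosseclobber" comes from the thread, but the spelling in the comment is an assumption and the attribute may not exist in your compiler, so the macro expands to nothing. Under the standard x86-64 psABI every xmm register is call-clobbered, so the floating-point values that are live across the call in the loop (a, b and sum) cannot stay in SSE registers over the call and typically have to be spilled and reloaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical placeholder: the thread proposes an attribute named
   "nosseclobber". The spelling below is an assumption, so the macro is
   left empty and the sketch builds with any C compiler. */
#define NOSSECLOBBER /* e.g. __attribute__((nosseclobber)), if supported */

/* Helper that hides a memcpy behind a call. With the standard psABI the
   caller must assume this call clobbers all xmm registers. */
NOSSECLOBBER void copy_block(double *dst, const double *src, size_t n);

/* Copies `blocks` chunks of `n` doubles into dst, one at a time, and
   accumulates a * x + b over every copied element. The values a, b and
   sum are live across each call to copy_block. */
double scale_and_sum(double *dst, const double *src,
                     size_t blocks, size_t n, double a, double b)
{
    assert(dst != NULL && src != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < blocks; i++) {
        copy_block(dst, src + i * n, n);   /* call inside the loop */
        for (size_t j = 0; j < n; j++)
            sum += a * dst[j] + b;
    }
    return sum;
}

void copy_block(double *dst, const double *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}
```

Compiling this with optimization and inspecting the assembly around the call is one way to see whether the floating-point values are saved and restored; the exact code depends on the compiler and on inlining decisions.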