RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
Richard, It's been 6 months so I thought I'd check in again. Do you have an estimate of when this will go in? Michael -Original Message- From: Michael Morrell Sent: Wednesday, July 14, 2021 10:50 AM To: 'Richard Henderson' ; qemu-devel@nongnu.org Subject: RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal OK, thanks for the update. I also appreciate you looking at the NaN issue. Michael -Original Message- From: Richard Henderson Sent: Wednesday, July 14, 2021 9:57 AM To: Michael Morrell ; qemu-devel@nongnu.org Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal On 7/14/21 9:44 AM, Michael Morrell wrote: > Just curious. What's the expected timeline to get these denormal patches in > the tree? Next development cycle, at minimum. I need to fix the problems vs NaNs that you identified. r~
RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
OK, thanks for the update. I also appreciate you looking at the NaN issue. Michael -Original Message- From: Richard Henderson Sent: Wednesday, July 14, 2021 9:57 AM To: Michael Morrell ; qemu-devel@nongnu.org Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal On 7/14/21 9:44 AM, Michael Morrell wrote: > Just curious. What's the expected timeline to get these denormal patches in > the tree? Next development cycle, at minimum. I need to fix the problems vs NaNs that you identified. r~
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
On 7/14/21 9:44 AM, Michael Morrell wrote: Just curious. What's the expected timeline to get these denormal patches in the tree? Next development cycle, at minimum. I need to fix the problems vs NaNs that you identified. r~
RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
Just curious. What's the expected timeline to get these denormal patches in the tree? -Original Message- From: Richard Henderson Sent: Saturday, May 29, 2021 8:21 AM To: Michael Morrell ; qemu-devel@nongnu.org Cc: alex.ben...@linaro.org Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal On 5/28/21 10:41 AM, Michael Morrell wrote: > I'm probably missing something, but why do we need both > "float_flag_inorm_denormal" and "float_flag_iflush_denormal"? > > Couldn't the code that sets these flags set just a single flag for all > denormal inputs and the code that checks these flags check that single flag > combined with the "flush_inputs_to_zero" flag to accomplish what the two > separate "input denormal" flags do? The thing that you're missing is that many guests leave the accumulated softfloat exceptions in the float_status structure until the guest FPSCR register is read. Unless the guest needs to raise an exception immediately, there's no reason to do otherwise. With this setup, you have no temporal connection between "any denormal" and "flush-to-zero is set", thus two flags. r~
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
On 6/7/21 10:19 AM, Alex Bennée wrote: If you've got a better ordering of operations for this, do tell. What I really want is to know which instructions translate into the if (s->flush_inputs_to_zero) and verifying that is only checked once. Maybe I'm just suspicious of compilers ability to optimise things away... Dump of assembler code for function float32_mul: 0x00895d60 <+0>: movzbl 0x1(%rdx),%eax 0x00895d64 <+4>: test $0x10,%al 0x00895d66 <+6>: je 0x895e30 s->float_exception_flags & float_flag_inexact 0x00895d6c <+12>:cmpb $0x0,(%rdx) 0x00895d6f <+15>:jne0x895e30 s->float_rounding_mode == float_round_nearest_even 0x00895d75 <+21>:test $0x7f80,%edi 0x00895d7b <+27>:jne0x895da0 0x00895d7d <+29>:test $0x7fff,%edi 0x00895d83 <+35>:je 0x895da0 float32_is_denormal 0x00895d85 <+37>:cmpb $0x0,0x5(%rdx) 0x00895d89 <+41>:je 0x895e60 s->flush_inputs_to_zero 0x00895d8f <+47>:or $0x40,%eax 0x00895d92 <+50>:and$0x8000,%edi 0x00895d98 <+56>:mov%al,0x1(%rdx) flush-to-zero and set iflush_denormal 0x00895da0 <+64>:test $0x7f80,%esi 0x00895da6 <+70>:jne0x895dd0 0x00895da8 <+72>:test $0x7fff,%esi 0x00895dae <+78>:je 0x895dd0 float32_is_denormal (second operand) 0x00895db0 <+80>:cmpb $0x0,0x5(%rdx) 0x00895db4 <+84>:movzbl 0x1(%rdx),%eax 0x00895db8 <+88>:je 0x895e50 0x00895dbe <+94>:or $0x40,%eax 0x00895dc1 <+97>:and$0x8000,%esi s->flush_inputs_to_zero, flush-to-zero, set iflush_denormal. ... 0x00895e50 <+240>: or $0x20,%eax 0x00895e53 <+243>: mov%al,0x1(%rdx) 0x00895e56 <+246>: jmpq 0x895dd0 set inorm_denormal (second operand) 0x00895e60 <+256>: or $0x20,%eax 0x00895e63 <+259>: mov%al,0x1(%rdx) 0x00895e66 <+262>: jmpq 0x895da0 set inorm_denormal (first operand) There do seem to be 3 reads/writes to exception_flags for float_raise. r~
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
Richard Henderson writes: > On 6/7/21 8:35 AM, Alex Bennée wrote: >> So I'm guessing Emilio had the original flush code split was to avoid >> multiple checks against s->flush_inputs_to_zero in the code. The was >> possibly a good reason, comparing the before/after of float32_mul: > > I assumed that the most important thing now is that we test > floatN_is_denormal only once -- the test for flush_inputs_to_zero is > fairly trivial. > > If you've got a better ordering of operations for this, do tell. What I really want is to know which instructions translate into the if (s->flush_inputs_to_zero) and verifying that is only checked once. Maybe I'm just suspicious of compilers ability to optimise things away... -- Alex Bennée
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
On 6/7/21 8:35 AM, Alex Bennée wrote: So I'm guessing Emilio had the original flush code split was to avoid multiple checks against s->flush_inputs_to_zero in the code. The was possibly a good reason, comparing the before/after of float32_mul: I assumed that the most important thing now is that we test floatN_is_denormal only once -- the test for flush_inputs_to_zero is fairly trivial. If you've got a better ordering of operations for this, do tell. r~
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
Alex Bennée writes: > Richard Henderson writes: > >> Create a new exception flag for reporting input denormals that are not >> flushed to zero, they are normalized and treated as normal numbers. >> >> Signed-off-by: Richard Henderson > > Anyway other than that observation seems OK to me: > > Signed-off-by: Alex Bennée I of course meant: Reviewed-by: Alex Bennée -- Alex Bennée
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
Richard Henderson writes: > Create a new exception flag for reporting input denormals that are not > flushed to zero, they are normalized and treated as normal numbers. > > Signed-off-by: Richard Henderson > --- > include/fpu/softfloat-types.h | 15 --- > fpu/softfloat.c | 84 +++ > fpu/softfloat-parts.c.inc | 1 + > 3 files changed, 36 insertions(+), 64 deletions(-) > > diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h > index e2d70ff556..174100e50e 100644 > --- a/include/fpu/softfloat-types.h > +++ b/include/fpu/softfloat-types.h > @@ -143,13 +143,14 @@ typedef enum __attribute__((__packed__)) { > */ > > enum { > -float_flag_invalid = 1, > -float_flag_divbyzero = 4, > -float_flag_overflow = 8, > -float_flag_underflow = 16, > -float_flag_inexact = 32, > -float_flag_iflush_denormal = 64, > -float_flag_oflush_denormal = 128 > +float_flag_invalid = 0x0001, > +float_flag_divbyzero = 0x0002, > +float_flag_overflow= 0x0004, > +float_flag_underflow = 0x0008, > +float_flag_inexact = 0x0010, > +float_flag_inorm_denormal = 0x0020, /* denormal input, normalized */ > +float_flag_iflush_denormal = 0x0040, /* denormal input, flushed to zero > */ > +float_flag_oflush_denormal = 0x0080, /* denormal result, flushed to > zero */ > }; > > /* > diff --git a/fpu/softfloat.c b/fpu/softfloat.c > index cb077cf111..e54cdb274d 100644 > --- a/fpu/softfloat.c > +++ b/fpu/softfloat.c > @@ -126,61 +126,23 @@ this code that are retained. > * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 > result > * and the result is < the minimum normal. > */ > -#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t) \ > + > +#define GEN_INPUT_FLUSH(name, soft_t) \ > static inline void name(soft_t *a, float_status *s) \ > { \ > if (unlikely(soft_t ## _is_denormal(*a))) { \ > -*a = soft_t ## _set_sign(soft_t ## _zero, \ > - soft_t ## _is_neg(*a));\ > -float_raise(float_flag_iflush_denormal, s); \ > +if (s->flush_inputs_to_zero) { \ > +*a = soft_t ## _set_sign(0, soft_t ## _is_neg(*a)); \ > +float_raise(float_flag_iflush_denormal, s); \ > +} else {\ > +float_raise(float_flag_inorm_denormal, s); \ > +} \ > } \ > } So I'm guessing Emilio had the original flush code split was to avoid multiple checks against s->flush_inputs_to_zero in the code. The was possibly a good reason, comparing the before/after of float32_mul: Dump of assembler code for function float32_mul: 0x00934240 <+0>: movzbl 0x1(%rdx),%eax 0x00934244 <+4>: test $0x20,%al 0x00934246 <+6>: je 0x9342b0 0x00934248 <+8>: cmpb $0x0,(%rdx) 0x0093424b <+11>: jne0x9342b0 0x0093424d <+13>: cmpb $0x0,0x5(%rdx) 0x00934251 <+17>: jne0x9342d0 0x00934253 <+19>: mov%edi,%eax 0x00934255 <+21>: shr$0x17,%eax 0x00934258 <+24>: add$0x1,%eax 0x0093425b <+27>: test $0xfe,%al 0x0093425d <+29>: je 0x9342a8 0x0093425f <+31>: mov%esi,%eax 0x00934261 <+33>: shr$0x17,%eax 0x00934264 <+36>: add$0x1,%eax 0x00934267 <+39>: test $0xfe,%al 0x00934269 <+41>: jne0x934273 0x0093426b <+43>: test $0x7fff,%esi 0x00934271 <+49>: jne0x9342b0 0x00934273 <+51>: mov%esi,-0xc(%rsp) 0x00934277 <+55>: movss -0xc(%rsp),%xmm0 0x0093427d <+61>: mov%edi,-0xc(%rsp) 0x00934281 <+65>: movss -0xc(%rsp),%xmm2 0x00934287 <+71>: mulss %xmm2,%xmm0 0x0093428b <+75>: movd %xmm0,%eax 0x0093428f <+79>: andps 0x3b805a(%rip),%xmm0# 0xcec2f0 0x00934296 <+86>: ucomiss 0x3b8047(%rip),%xmm0# 0xcec2e4 0x0093429d <+93>: jbe0x9342b8 0x0093429f <+95>: orb$0x8,0x1(%rdx) 0x009342a3 <+99>: retq 0x009342a4 <+100>: nopl 0x0(%rax) 0x009342a8 <+104>: test $0x7fff,%edi 0x009342ae <+110>: je 0x93425f 0x009342b0 <+112>: jmpq 0x9290d0
Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
On 5/28/21 10:41 AM, Michael Morrell wrote: I'm probably missing something, but why do we need both "float_flag_inorm_denormal" and "float_flag_iflush_denormal"? Couldn't the code that sets these flags set just a single flag for all denormal inputs and the code that checks these flags check that single flag combined with the "flush_inputs_to_zero" flag to accomplish what the two separate "input denormal" flags do? The thing that you're missing is that many guests leave the accumulated softfloat exceptions in the float_status structure until the guest FPSCR register is read. Unless the guest needs to raise an exception immediately, there's no reason to do otherwise. With this setup, you have no temporal connection between "any denormal" and "flush-to-zero is set", thus two flags. r~
RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal
I'm probably missing something, but why do we need both "float_flag_inorm_denormal" and "float_flag_iflush_denormal"? Couldn't the code that sets these flags set just a single flag for all denormal inputs and the code that checks these flags check that single flag combined with the "flush_inputs_to_zero" flag to accomplish what the two separate "input denormal" flags do? Michael -Original Message- From: Richard Henderson Sent: Wednesday, May 26, 2021 9:14 PM To: qemu-devel@nongnu.org Cc: Michael Morrell ; alex.ben...@linaro.org Subject: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal Create a new exception flag for reporting input denormals that are not flushed to zero, they are normalized and treated as normal numbers. Signed-off-by: Richard Henderson --- include/fpu/softfloat-types.h | 15 --- fpu/softfloat.c | 84 +++ fpu/softfloat-parts.c.inc | 1 + 3 files changed, 36 insertions(+), 64 deletions(-) diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h index e2d70ff556..174100e50e 100644 --- a/include/fpu/softfloat-types.h +++ b/include/fpu/softfloat-types.h @@ -143,13 +143,14 @@ typedef enum __attribute__((__packed__)) { */ enum { -float_flag_invalid = 1, -float_flag_divbyzero = 4, -float_flag_overflow = 8, -float_flag_underflow = 16, -float_flag_inexact = 32, -float_flag_iflush_denormal = 64, -float_flag_oflush_denormal = 128 +float_flag_invalid = 0x0001, +float_flag_divbyzero = 0x0002, +float_flag_overflow= 0x0004, +float_flag_underflow = 0x0008, +float_flag_inexact = 0x0010, +float_flag_inorm_denormal = 0x0020, /* denormal input, normalized */ +float_flag_iflush_denormal = 0x0040, /* denormal input, flushed to zero */ +float_flag_oflush_denormal = 0x0080, /* denormal result, flushed + to zero */ }; /* diff --git a/fpu/softfloat.c b/fpu/softfloat.c index cb077cf111..e54cdb274d 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -126,61 +126,23 @@ this code that are retained. * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 result * and the result is < the minimum normal. */ -#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t) \ + +#define GEN_INPUT_FLUSH(name, soft_t) \ static inline void name(soft_t *a, float_status *s) \ { \ if (unlikely(soft_t ## _is_denormal(*a))) { \ -*a = soft_t ## _set_sign(soft_t ## _zero, \ - soft_t ## _is_neg(*a));\ -float_raise(float_flag_iflush_denormal, s); \ +if (s->flush_inputs_to_zero) { \ +*a = soft_t ## _set_sign(0, soft_t ## _is_neg(*a)); \ +float_raise(float_flag_iflush_denormal, s); \ +} else {\ +float_raise(float_flag_inorm_denormal, s); \ +} \ } \ } -GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32) -GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64) -#undef GEN_INPUT_FLUSH__NOCHECK - -#define GEN_INPUT_FLUSH1(name, soft_t) \ -static inline void name(soft_t *a, float_status *s) \ -{ \ -if (likely(!s->flush_inputs_to_zero)) { \ -return; \ -} \ -soft_t ## _input_flush__nocheck(a, s); \ -} - -GEN_INPUT_FLUSH1(float32_input_flush1, float32) -GEN_INPUT_FLUSH1(float64_input_flush1, float64) -#undef GEN_INPUT_FLUSH1 - -#define GEN_INPUT_FLUSH2(name, soft_t) \ -static inline void name(soft_t *a, soft_t *b, float_status *s) \ -{ \ -if (likely(!s->flush_inputs_to_zero)) { \ -return; \ -} \ -soft_t ## _input_flush__nocheck(a, s); \ -soft_t ## _input_flush__nocheck(b, s); \ -} - -GEN_INPUT_FLUSH2(float32_input_flush2, float32) -GEN_INPUT_FLUSH2(float64_input_flush2, float64) -#undef GEN_INPUT_FLUSH2 - -#define GEN_INPUT_FLUSH3(name, soft_t) \ -static inline void name(soft_t *a, soft_t *b,