RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2022-01-11 Thread Michael Morrell
Richard,

It's been 6 months so I thought I'd check in again.   Do you have an estimate 
of when this will go in?

   Michael

-Original Message-
From: Michael Morrell 
Sent: Wednesday, July 14, 2021 10:50 AM
To: 'Richard Henderson' ; qemu-devel@nongnu.org
Subject: RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

OK, thanks for the update.  I also appreciate you looking at the NaN issue.

   Michael

-Original Message-
From: Richard Henderson  
Sent: Wednesday, July 14, 2021 9:57 AM
To: Michael Morrell ; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

On 7/14/21 9:44 AM, Michael Morrell wrote:
> Just curious.  What's the expected timeline to get these denormal patches in 
> the tree?

Next development cycle, at minimum.  I need to fix the problems vs NaNs that 
you identified.


r~


RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-07-14 Thread Michael Morrell
OK, thanks for the update.  I also appreciate you looking at the NaN issue.

   Michael

-Original Message-
From: Richard Henderson  
Sent: Wednesday, July 14, 2021 9:57 AM
To: Michael Morrell ; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

On 7/14/21 9:44 AM, Michael Morrell wrote:
> Just curious.  What's the expected timeline to get these denormal patches in 
> the tree?

Next development cycle, at minimum.  I need to fix the problems vs NaNs that 
you identified.


r~


Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-07-14 Thread Richard Henderson

On 7/14/21 9:44 AM, Michael Morrell wrote:

Just curious.  What's the expected timeline to get these denormal patches in 
the tree?


Next development cycle, at minimum.  I need to fix the problems vs NaNs that 
you identified.


r~



RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-07-14 Thread Michael Morrell
Just curious.  What's the expected timeline to get these denormal patches in 
the tree?

-Original Message-
From: Richard Henderson  
Sent: Saturday, May 29, 2021 8:21 AM
To: Michael Morrell ; qemu-devel@nongnu.org
Cc: alex.ben...@linaro.org
Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

On 5/28/21 10:41 AM, Michael Morrell wrote:
> I'm probably missing something, but why do we need both 
> "float_flag_inorm_denormal" and "float_flag_iflush_denormal"?
> 
> Couldn't the code that sets these flags set just a single flag for all 
> denormal inputs and the code that checks these flags check that single flag 
> combined with the "flush_inputs_to_zero" flag to accomplish what the two 
> separate "input denormal" flags do?

The thing that you're missing is that many guests leave the accumulated 
softfloat exceptions in the float_status structure until the guest FPSCR 
register is read. Unless the guest needs to raise an exception immediately, 
there's no reason to do otherwise.

With this setup, you have no temporal connection between "any denormal" and 
"flush-to-zero is set", thus two flags.


r~


Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-06-07 Thread Richard Henderson

On 6/7/21 10:19 AM, Alex Bennée wrote:

If you've got a better ordering of operations for this, do tell.


What I really want is to know which instructions translate into the if
(s->flush_inputs_to_zero) and verifying that is only checked once. Maybe
I'm just suspicious of compilers ability to optimise things away...






  Dump of assembler code for function float32_mul:
 0x00895d60 <+0>: movzbl 0x1(%rdx),%eax
 0x00895d64 <+4>: test   $0x10,%al
 0x00895d66 <+6>: je 0x895e30 


s->float_exception_flags & float_flag_inexact


 0x00895d6c <+12>:cmpb   $0x0,(%rdx)
 0x00895d6f <+15>:jne0x895e30 


s->float_rounding_mode == float_round_nearest_even


 0x00895d75 <+21>:test   $0x7f80,%edi
 0x00895d7b <+27>:jne0x895da0 
 0x00895d7d <+29>:test   $0x7fff,%edi
 0x00895d83 <+35>:je 0x895da0 


float32_is_denormal


 0x00895d85 <+37>:cmpb   $0x0,0x5(%rdx)
 0x00895d89 <+41>:je 0x895e60 


s->flush_inputs_to_zero


 0x00895d8f <+47>:or $0x40,%eax
 0x00895d92 <+50>:and$0x8000,%edi
 0x00895d98 <+56>:mov%al,0x1(%rdx)


flush-to-zero and set iflush_denormal


 0x00895da0 <+64>:test   $0x7f80,%esi
 0x00895da6 <+70>:jne0x895dd0 
 0x00895da8 <+72>:test   $0x7fff,%esi
 0x00895dae <+78>:je 0x895dd0 


float32_is_denormal (second operand)


 0x00895db0 <+80>:cmpb   $0x0,0x5(%rdx)
 0x00895db4 <+84>:movzbl 0x1(%rdx),%eax
 0x00895db8 <+88>:je 0x895e50 
 0x00895dbe <+94>:or $0x40,%eax
 0x00895dc1 <+97>:and$0x8000,%esi


s->flush_inputs_to_zero,
flush-to-zero,
set iflush_denormal.

...


 0x00895e50 <+240>:   or $0x20,%eax
 0x00895e53 <+243>:   mov%al,0x1(%rdx)
 0x00895e56 <+246>:   jmpq   0x895dd0 


set inorm_denormal (second operand)


 0x00895e60 <+256>:   or $0x20,%eax
 0x00895e63 <+259>:   mov%al,0x1(%rdx)
 0x00895e66 <+262>:   jmpq   0x895da0 


set inorm_denormal (first operand)

There do seem to be 3 reads/writes to exception_flags for float_raise.


r~



Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-06-07 Thread Alex Bennée


Richard Henderson  writes:

> On 6/7/21 8:35 AM, Alex Bennée wrote:
>> So I'm guessing Emilio had the original flush code split was to avoid
>> multiple checks against s->flush_inputs_to_zero in the code. The was
>> possibly a good reason, comparing the before/after of float32_mul:
>
> I assumed that the most important thing now is that we test
> floatN_is_denormal only once -- the test for flush_inputs_to_zero is
> fairly trivial.
>
> If you've got a better ordering of operations for this, do tell.

What I really want is to know which instructions translate into the if
(s->flush_inputs_to_zero) and verifying that is only checked once. Maybe
I'm just suspicious of compilers ability to optimise things away...

-- 
Alex Bennée



Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-06-07 Thread Richard Henderson

On 6/7/21 8:35 AM, Alex Bennée wrote:

So I'm guessing Emilio had the original flush code split was to avoid
multiple checks against s->flush_inputs_to_zero in the code. The was
possibly a good reason, comparing the before/after of float32_mul:


I assumed that the most important thing now is that we test floatN_is_denormal 
only once -- the test for flush_inputs_to_zero is fairly trivial.


If you've got a better ordering of operations for this, do tell.


r~



Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-06-07 Thread Alex Bennée


Alex Bennée  writes:

> Richard Henderson  writes:
>
>> Create a new exception flag for reporting input denormals that are not
>> flushed to zero, they are normalized and treated as normal numbers.
>>
>> Signed-off-by: Richard Henderson 

>
> Anyway other than that observation seems OK to me:
>
> Signed-off-by: Alex Bennée 

I of course meant:

Reviewed-by: Alex Bennée 

-- 
Alex Bennée



Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-06-07 Thread Alex Bennée


Richard Henderson  writes:

> Create a new exception flag for reporting input denormals that are not
> flushed to zero, they are normalized and treated as normal numbers.
>
> Signed-off-by: Richard Henderson 
> ---
>  include/fpu/softfloat-types.h | 15 ---
>  fpu/softfloat.c   | 84 +++
>  fpu/softfloat-parts.c.inc |  1 +
>  3 files changed, 36 insertions(+), 64 deletions(-)
>
> diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
> index e2d70ff556..174100e50e 100644
> --- a/include/fpu/softfloat-types.h
> +++ b/include/fpu/softfloat-types.h
> @@ -143,13 +143,14 @@ typedef enum __attribute__((__packed__)) {
>   */
>  
>  enum {
> -float_flag_invalid   =  1,
> -float_flag_divbyzero =  4,
> -float_flag_overflow  =  8,
> -float_flag_underflow = 16,
> -float_flag_inexact   = 32,
> -float_flag_iflush_denormal = 64,
> -float_flag_oflush_denormal = 128
> +float_flag_invalid = 0x0001,
> +float_flag_divbyzero   = 0x0002,
> +float_flag_overflow= 0x0004,
> +float_flag_underflow   = 0x0008,
> +float_flag_inexact = 0x0010,
> +float_flag_inorm_denormal  = 0x0020,  /* denormal input, normalized */
> +float_flag_iflush_denormal = 0x0040,  /* denormal input, flushed to zero 
> */
> +float_flag_oflush_denormal = 0x0080,  /* denormal result, flushed to 
> zero */
>  };
>  
>  /*
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index cb077cf111..e54cdb274d 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -126,61 +126,23 @@ this code that are retained.
>   * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 
> result
>   * and the result is < the minimum normal.
>   */
> -#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t)  \
> +
> +#define GEN_INPUT_FLUSH(name, soft_t)   \
>  static inline void name(soft_t *a, float_status *s) \
>  {   \
>  if (unlikely(soft_t ## _is_denormal(*a))) { \
> -*a = soft_t ## _set_sign(soft_t ## _zero,   \
> - soft_t ## _is_neg(*a));\
> -float_raise(float_flag_iflush_denormal, s);  \
> +if (s->flush_inputs_to_zero) {  \
> +*a = soft_t ## _set_sign(0, soft_t ## _is_neg(*a)); \
> +float_raise(float_flag_iflush_denormal, s); \
> +} else {\
> +float_raise(float_flag_inorm_denormal, s);  \
> +}   \
>  }   \
>  }

So I'm guessing Emilio had the original flush code split was to avoid
multiple checks against s->flush_inputs_to_zero in the code. The was
possibly a good reason, comparing the before/after of float32_mul:

  Dump of assembler code for function float32_mul:
 0x00934240 <+0>:   movzbl 0x1(%rdx),%eax
 0x00934244 <+4>:   test   $0x20,%al
 0x00934246 <+6>:   je 0x9342b0 
 0x00934248 <+8>:   cmpb   $0x0,(%rdx)
 0x0093424b <+11>:  jne0x9342b0 
 0x0093424d <+13>:  cmpb   $0x0,0x5(%rdx)
 0x00934251 <+17>:  jne0x9342d0 
 0x00934253 <+19>:  mov%edi,%eax
 0x00934255 <+21>:  shr$0x17,%eax
 0x00934258 <+24>:  add$0x1,%eax
 0x0093425b <+27>:  test   $0xfe,%al
 0x0093425d <+29>:  je 0x9342a8 
 0x0093425f <+31>:  mov%esi,%eax
 0x00934261 <+33>:  shr$0x17,%eax
 0x00934264 <+36>:  add$0x1,%eax
 0x00934267 <+39>:  test   $0xfe,%al
 0x00934269 <+41>:  jne0x934273 
 0x0093426b <+43>:  test   $0x7fff,%esi
 0x00934271 <+49>:  jne0x9342b0 
 0x00934273 <+51>:  mov%esi,-0xc(%rsp)
 0x00934277 <+55>:  movss  -0xc(%rsp),%xmm0
 0x0093427d <+61>:  mov%edi,-0xc(%rsp)
 0x00934281 <+65>:  movss  -0xc(%rsp),%xmm2
 0x00934287 <+71>:  mulss  %xmm2,%xmm0
 0x0093428b <+75>:  movd   %xmm0,%eax
 0x0093428f <+79>:  andps  0x3b805a(%rip),%xmm0# 0xcec2f0
 0x00934296 <+86>:  ucomiss 0x3b8047(%rip),%xmm0# 0xcec2e4
 0x0093429d <+93>:  jbe0x9342b8 
 0x0093429f <+95>:  orb$0x8,0x1(%rdx)
 0x009342a3 <+99>:  retq   
 0x009342a4 <+100>: nopl   0x0(%rax)
 0x009342a8 <+104>: test   $0x7fff,%edi
 0x009342ae <+110>: je 0x93425f 
 0x009342b0 <+112>: jmpq   0x9290d0 
 

Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-05-29 Thread Richard Henderson

On 5/28/21 10:41 AM, Michael Morrell wrote:

I'm probably missing something, but why do we need both "float_flag_inorm_denormal" and 
"float_flag_iflush_denormal"?

Couldn't the code that sets these flags set just a single flag for all denormal 
inputs and the code that checks these flags check that single flag
combined with the "flush_inputs_to_zero" flag to accomplish what the two separate 
"input denormal" flags do?


The thing that you're missing is that many guests leave the accumulated 
softfloat exceptions in the float_status structure until the guest FPSCR 
register is read. Unless the guest needs to raise an exception immediately, 
there's no reason to do otherwise.


With this setup, you have no temporal connection between "any denormal" and 
"flush-to-zero is set", thus two flags.



r~



RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2021-05-28 Thread Michael Morrell
I'm probably missing something, but why do we need both 
"float_flag_inorm_denormal" and "float_flag_iflush_denormal"?

Couldn't the code that sets these flags set just a single flag for all denormal 
inputs and the code that checks these flags check that single flag
combined with the "flush_inputs_to_zero" flag to accomplish what the two 
separate "input denormal" flags do?

   Michael

-Original Message-
From: Richard Henderson  
Sent: Wednesday, May 26, 2021 9:14 PM
To: qemu-devel@nongnu.org
Cc: Michael Morrell ; alex.ben...@linaro.org
Subject: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

Create a new exception flag for reporting input denormals that are not flushed 
to zero, they are normalized and treated as normal numbers.

Signed-off-by: Richard Henderson 
---
 include/fpu/softfloat-types.h | 15 ---
 fpu/softfloat.c   | 84 +++
 fpu/softfloat-parts.c.inc |  1 +
 3 files changed, 36 insertions(+), 64 deletions(-)

diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h 
index e2d70ff556..174100e50e 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -143,13 +143,14 @@ typedef enum __attribute__((__packed__)) {
  */
 
 enum {
-float_flag_invalid   =  1,
-float_flag_divbyzero =  4,
-float_flag_overflow  =  8,
-float_flag_underflow = 16,
-float_flag_inexact   = 32,
-float_flag_iflush_denormal = 64,
-float_flag_oflush_denormal = 128
+float_flag_invalid = 0x0001,
+float_flag_divbyzero   = 0x0002,
+float_flag_overflow= 0x0004,
+float_flag_underflow   = 0x0008,
+float_flag_inexact = 0x0010,
+float_flag_inorm_denormal  = 0x0020,  /* denormal input, normalized */
+float_flag_iflush_denormal = 0x0040,  /* denormal input, flushed to zero */
+float_flag_oflush_denormal = 0x0080,  /* denormal result, flushed 
+ to zero */
 };
 
 /*
diff --git a/fpu/softfloat.c b/fpu/softfloat.c index cb077cf111..e54cdb274d 
100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -126,61 +126,23 @@ this code that are retained.
  * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 result
  * and the result is < the minimum normal.
  */
-#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t)  \
+
+#define GEN_INPUT_FLUSH(name, soft_t)   \
 static inline void name(soft_t *a, float_status *s) \
 {   \
 if (unlikely(soft_t ## _is_denormal(*a))) { \
-*a = soft_t ## _set_sign(soft_t ## _zero,   \
- soft_t ## _is_neg(*a));\
-float_raise(float_flag_iflush_denormal, s);  \
+if (s->flush_inputs_to_zero) {  \
+*a = soft_t ## _set_sign(0, soft_t ## _is_neg(*a)); \
+float_raise(float_flag_iflush_denormal, s); \
+} else {\
+float_raise(float_flag_inorm_denormal, s);  \
+}   \
 }   \
 }
 
-GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32) 
-GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64) -#undef 
GEN_INPUT_FLUSH__NOCHECK
-
-#define GEN_INPUT_FLUSH1(name, soft_t)  \
-static inline void name(soft_t *a, float_status *s) \
-{   \
-if (likely(!s->flush_inputs_to_zero)) { \
-return; \
-}   \
-soft_t ## _input_flush__nocheck(a, s);  \
-}
-
-GEN_INPUT_FLUSH1(float32_input_flush1, float32) 
-GEN_INPUT_FLUSH1(float64_input_flush1, float64) -#undef GEN_INPUT_FLUSH1
-
-#define GEN_INPUT_FLUSH2(name, soft_t)  \
-static inline void name(soft_t *a, soft_t *b, float_status *s)  \
-{   \
-if (likely(!s->flush_inputs_to_zero)) { \
-return; \
-}   \
-soft_t ## _input_flush__nocheck(a, s);  \
-soft_t ## _input_flush__nocheck(b, s);  \
-}
-
-GEN_INPUT_FLUSH2(float32_input_flush2, float32) 
-GEN_INPUT_FLUSH2(float64_input_flush2, float64) -#undef GEN_INPUT_FLUSH2
-
-#define GEN_INPUT_FLUSH3(name, soft_t)  \
-static inline void name(soft_t *a, soft_t *b,