RE: Denormal input handling
OK, I've done more testing. I'm not sure if we need any specialization, but the setting for float_flag_input_denormal isn't right for x86. It is set unconditionally when flush_inputs_to_zero is false, but it needs to take into account the other operand(s). Given "denorm / 0" or any instruction with a NaN operand, float_flag_input_denormal should not be set (and that way, the DE bit in MXCSR won't be set when it shouldn't be).

Michael

-----Original Message-----
From: Michael Morrell
Sent: Monday, June 21, 2021 5:57 PM
To: 'Richard Henderson'; qemu-devel@nongnu.org
Subject: RE: Denormal input handling

Richard,

I was under the mistaken impression that your changes in this area (splitting float_flag_input_denormal into 2 flags) were already checked in, but I see now that is not the case. I should probably wait until that is done before I try to claim there are additional issues here.

Michael

-----Original Message-----
From: Richard Henderson
Sent: Monday, June 21, 2021 4:30 PM
To: Michael Morrell; qemu-devel@nongnu.org
Subject: Re: Denormal input handling

On 6/21/21 4:13 PM, Michael Morrell wrote:
> I have another couple of thoughts around input denormal handling.
>
> The first is straightforward. I noticed that the Aarch64 port doesn't
> report input denormals (I could not find any code which sets the IDC
> bit in the FPSR). I found code in the arm (not aarch64) port that
> sets other bits like IXC, but nothing for IDC. Is that simply because
> no one has bothered to add this support?

It's because we failed to use symbolic constants. See vfp_exceptbits_from_host. Which *is* used for both aarch64 and arm.

> The second concerns support for cases where multiple exception
> conditions occur. I had originally thought that denormal input
> handling would be orthogonal to everything else, and so a case like
> "sNaN + denorm" would set both the invalid and input denormal flags,
> or "denorm / 0" would set both divide by zero and input denormal,
> but I don't think that is true for at least some architectures.
> Perhaps some specialization is needed here?

If you've got a specific example, we can look at it. There's no point adding specialization that isn't going to be used.

r~
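The priority rule Michael describes, where a NaN operand or a divide-by-zero condition suppresses the input-denormal report, can be sketched as follows. This is a minimal illustration, not QEMU's softfloat code; every name below is invented for the sketch.

```c
/* Sketch: decide the exception flags for a division a / b so that a
 * higher-priority condition (NaN operand, divide by zero) suppresses
 * the input-denormal report, matching observed x86/SSE behaviour.
 * All names here are illustrative, not QEMU's actual API. */
#include <assert.h>

enum fp_class { CLS_NORMAL, CLS_DENORMAL, CLS_ZERO, CLS_NAN };

enum {
    FLAG_INVALID        = 1 << 0,
    FLAG_DIV_BY_ZERO    = 1 << 1,
    FLAG_INPUT_DENORMAL = 1 << 2,
};

static int div_flags(enum fp_class a, enum fp_class b)
{
    if (a == CLS_NAN || b == CLS_NAN) {
        /* A NaN operand suppresses the denormal-operand report.
         * (Assuming a signaling NaN here, so invalid is raised.) */
        return FLAG_INVALID;
    }
    if (b == CLS_ZERO && a != CLS_ZERO) {
        /* "denorm / 0": divide by zero wins; DE stays clear. */
        return FLAG_DIV_BY_ZERO;
    }
    if (a == CLS_DENORMAL || b == CLS_DENORMAL) {
        return FLAG_INPUT_DENORMAL;
    }
    return 0;
}
```

The point of the sketch is only the ordering of the checks: the denormal test must come after the tests for the conditions that mask it.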
RE: Denormal input handling
Richard,

I was under the mistaken impression that your changes in this area (splitting float_flag_input_denormal into 2 flags) were already checked in, but I see now that is not the case. I should probably wait until that is done before I try to claim there are additional issues here.

Michael

-----Original Message-----
From: Richard Henderson
Sent: Monday, June 21, 2021 4:30 PM
To: Michael Morrell; qemu-devel@nongnu.org
Subject: Re: Denormal input handling

On 6/21/21 4:13 PM, Michael Morrell wrote:
> I have another couple of thoughts around input denormal handling.
>
> The first is straightforward. I noticed that the Aarch64 port doesn't
> report input denormals (I could not find any code which sets the IDC
> bit in the FPSR). I found code in the arm (not aarch64) port that
> sets other bits like IXC, but nothing for IDC. Is that simply because
> no one has bothered to add this support?

It's because we failed to use symbolic constants. See vfp_exceptbits_from_host. Which *is* used for both aarch64 and arm.

> The second concerns support for cases where multiple exception
> conditions occur. I had originally thought that denormal input
> handling would be orthogonal to everything else, and so a case like
> "sNaN + denorm" would set both the invalid and input denormal flags,
> or "denorm / 0" would set both divide by zero and input denormal,
> but I don't think that is true for at least some architectures.
> Perhaps some specialization is needed here?

If you've got a specific example, we can look at it. There's no point adding specialization that isn't going to be used.

r~
RE: Denormal input handling
Thanks for pointing me to vfp_exceptbits_from_host. I see it setting bit 7 if flag_input_denormal is set, but it does not seem to work. I never see that bit set.

As for the specialization, perhaps we don't need it if all architectures are the same, but the current code doesn't match any architecture I'm aware of. I gave the example of "sNaN + denorm". The HW will not set the "input denormal" flag, but qemu will. If you want to see my test code, I can send that.

Michael

-----Original Message-----
From: Richard Henderson
Sent: Monday, June 21, 2021 4:30 PM
To: Michael Morrell; qemu-devel@nongnu.org
Subject: Re: Denormal input handling

On 6/21/21 4:13 PM, Michael Morrell wrote:
> I have another couple of thoughts around input denormal handling.
>
> The first is straightforward. I noticed that the Aarch64 port doesn't
> report input denormals (I could not find any code which sets the IDC
> bit in the FPSR). I found code in the arm (not aarch64) port that
> sets other bits like IXC, but nothing for IDC. Is that simply because
> no one has bothered to add this support?

It's because we failed to use symbolic constants. See vfp_exceptbits_from_host. Which *is* used for both aarch64 and arm.

> The second concerns support for cases where multiple exception
> conditions occur. I had originally thought that denormal input
> handling would be orthogonal to everything else, and so a case like
> "sNaN + denorm" would set both the invalid and input denormal flags,
> or "denorm / 0" would set both divide by zero and input denormal,
> but I don't think that is true for at least some architectures.
> Perhaps some specialization is needed here?

If you've got a specific example, we can look at it. There's no point adding specialization that isn't going to be used.

r~
Re: Denormal input handling
On 6/21/21 4:13 PM, Michael Morrell wrote:
> I have another couple of thoughts around input denormal handling.
>
> The first is straightforward. I noticed that the Aarch64 port doesn't
> report input denormals (I could not find any code which sets the IDC
> bit in the FPSR). I found code in the arm (not aarch64) port that
> sets other bits like IXC, but nothing for IDC. Is that simply because
> no one has bothered to add this support?

It's because we failed to use symbolic constants. See vfp_exceptbits_from_host. Which *is* used for both aarch64 and arm.

> The second concerns support for cases where multiple exception
> conditions occur. I had originally thought that denormal input
> handling would be orthogonal to everything else, and so a case like
> "sNaN + denorm" would set both the invalid and input denormal flags,
> or "denorm / 0" would set both divide by zero and input denormal,
> but I don't think that is true for at least some architectures.
> Perhaps some specialization is needed here?

If you've got a specific example, we can look at it. There's no point adding specialization that isn't going to be used.

r~
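The "symbolic constants" point can be made concrete with a sketch of what a constant-based vfp_exceptbits_from_host might look like. The FPSR bit positions below are architectural (IOC=0, DZC=1, OFC=2, UFC=3, IXC=4, IDC=7), but the softfloat flag names and the function body are illustrative stand-ins, not QEMU's exact code.

```c
/* Sketch: map accumulated softfloat exception flags to Arm FPSR
 * cumulative status bits using named constants instead of magic
 * numbers.  Flag names are placeholders for illustration. */
#include <assert.h>

enum {
    FLOAT_FLAG_INVALID        = 1 << 0,
    FLOAT_FLAG_DIVBYZERO      = 1 << 1,
    FLOAT_FLAG_OVERFLOW       = 1 << 2,
    FLOAT_FLAG_UNDERFLOW      = 1 << 3,
    FLOAT_FLAG_INEXACT        = 1 << 4,
    FLOAT_FLAG_INPUT_DENORMAL = 1 << 5,
};

#define FPSR_IOC (1 << 0)   /* invalid operation */
#define FPSR_DZC (1 << 1)   /* divide by zero */
#define FPSR_OFC (1 << 2)   /* overflow */
#define FPSR_UFC (1 << 3)   /* underflow */
#define FPSR_IXC (1 << 4)   /* inexact */
#define FPSR_IDC (1 << 7)   /* input denormal */

static int vfp_exceptbits_from_host_sketch(int host_bits)
{
    int target_bits = 0;

    if (host_bits & FLOAT_FLAG_INVALID)        target_bits |= FPSR_IOC;
    if (host_bits & FLOAT_FLAG_DIVBYZERO)      target_bits |= FPSR_DZC;
    if (host_bits & FLOAT_FLAG_OVERFLOW)       target_bits |= FPSR_OFC;
    if (host_bits & FLOAT_FLAG_UNDERFLOW)      target_bits |= FPSR_UFC;
    if (host_bits & FLOAT_FLAG_INEXACT)        target_bits |= FPSR_IXC;
    if (host_bits & FLOAT_FLAG_INPUT_DENORMAL) target_bits |= FPSR_IDC;
    return target_bits;
}
```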
RE: Denormal input handling
I have another couple of thoughts around input denormal handling.

The first is straightforward. I noticed that the Aarch64 port doesn't report input denormals (I could not find any code which sets the IDC bit in the FPSR). I found code in the arm (not aarch64) port that sets other bits like IXC, but nothing for IDC. Is that simply because no one has bothered to add this support?

The second concerns support for cases where multiple exception conditions occur. I had originally thought that denormal input handling would be orthogonal to everything else, and so a case like "sNaN + denorm" would set both the invalid and input denormal flags, or "denorm / 0" would set both divide by zero and input denormal, but I don't think that is true for at least some architectures. Perhaps some specialization is needed here?

Michael
Re: Denormal input handling
On 5/26/21 2:59 PM, Michael Morrell wrote:
> First, I apologize for the duplicate thread. I thought the first
> attempt didn't go through.
>
> I agree with Richard that we need an extra flag bit. The current
> behavior is not right for SSE on x86 (it looks like x87 might be
> different still).
>
> For ARM, setting FPCR.FZ to 1 will result in FPSR.IDC being set for a
> denormal input (and that input will be flushed to 0), whereas for
> x86/SSE, setting MXCSR.DAZ to 0 will result in MXCSR.DE being set for
> a denormal input (and MXCSR.DAZ = 1 flushes that input to 0).

It seems the language for x87 is different because there is no DAZ bit for x87, only SSE. Thus the x87 DE bit is set for a denormal input, which will never be flushed to zero.

> I'm a little surprised there are no x86 test cases that cover this.

The status of x86 is fairly shaky when it comes to tcg. Most of the work for x86 is all about native virtualization. The last person to touch this code, Joseph Myers, kindly left us a big fat FIXME comment in update_mxcsr_from_sse_status.

> Richard, are you willing to make the change or do you want me to try?

I'll do it. I've got some outstanding floatx80 cleanups that would be handy to build upon while doing this.

r~
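For context, the job of a routine like update_mxcsr_from_sse_status is to fold softfloat's accumulated exception flags into the MXCSR status bits. A rough sketch of that direction is below; the softfloat flag names are placeholders, and only the MXCSR status-bit positions (IE=0, DE=1, ZE=2, OE=3, UE=4, PE=5) follow the SDM.

```c
/* Sketch: fold softfloat exception flags into MXCSR status bits.
 * SF_* names are invented; MXCSR bit positions match the Intel SDM. */
#include <assert.h>

enum {
    SF_INVALID        = 1 << 0,
    SF_DIVBYZERO      = 1 << 1,
    SF_OVERFLOW       = 1 << 2,
    SF_UNDERFLOW      = 1 << 3,
    SF_INEXACT        = 1 << 4,
    SF_INPUT_DENORMAL = 1 << 5,
};

#define MXCSR_IE (1 << 0)   /* invalid operation */
#define MXCSR_DE (1 << 1)   /* denormal operand */
#define MXCSR_ZE (1 << 2)   /* divide by zero */
#define MXCSR_OE (1 << 3)   /* overflow */
#define MXCSR_UE (1 << 4)   /* underflow */
#define MXCSR_PE (1 << 5)   /* precision (inexact) */

static unsigned update_mxcsr_sketch(unsigned mxcsr, int sf_flags)
{
    if (sf_flags & SF_INVALID)        mxcsr |= MXCSR_IE;
    if (sf_flags & SF_INPUT_DENORMAL) mxcsr |= MXCSR_DE;
    if (sf_flags & SF_DIVBYZERO)      mxcsr |= MXCSR_ZE;
    if (sf_flags & SF_OVERFLOW)       mxcsr |= MXCSR_OE;
    if (sf_flags & SF_UNDERFLOW)      mxcsr |= MXCSR_UE;
    if (sf_flags & SF_INEXACT)        mxcsr |= MXCSR_PE;
    return mxcsr;
}
```

The open question in the thread is precisely which softfloat flag should feed MXCSR_DE; the single SF_INPUT_DENORMAL used here is what the two-flag split is meant to replace.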
RE: Denormal input handling
First, I apologize for the duplicate thread. I thought the first attempt didn't go through.

I agree with Richard that we need an extra flag bit. The current behavior is not right for SSE on x86 (it looks like x87 might be different still).

For ARM, setting FPCR.FZ to 1 will result in FPSR.IDC being set for a denormal input (and that input will be flushed to 0), whereas for x86/SSE, setting MXCSR.DAZ to 0 will result in MXCSR.DE being set for a denormal input (and MXCSR.DAZ = 1 flushes that input to 0).

I'm a little surprised there are no x86 test cases that cover this.

Richard, are you willing to make the change or do you want me to try?

Thanks,
Michael

-----Original Message-----
From: Peter Maydell
Sent: Wednesday, May 26, 2021 1:19 PM
To: Michael Morrell
Cc: qemu-devel@nongnu.org; Richard Henderson
Subject: Re: Denormal input handling

On Wed, 26 May 2021 at 20:07, Michael Morrell wrote:
> I see support in QEMU for architectures which have a denormal input
> flag bit and those that have a "flush inputs to zero" control bit,
> but the implementation is not specializable and seems wrong for x86
> at least.
>
> For example, in sf_canonicalize, if the input is denormal and
> "flush_inputs_to_zero" is true, the "input denormal" flag is set and
> then the value is set to a zero value, and if the input is denormal
> and "flush_inputs_to_zero" is false, then the input is simply
> normalized.

This is the intended behaviour -- if a target arch needs "denormalized inputs should be flushed to zero", it sets the float_status flush_inputs_to_zero flag. If it also wants to be able to detect when this has happened, it can then look at the input_denormal flag. This matches the behaviour that Arm needs, and it is for Arm that the flush_inputs_to_zero and input_denormal flags were introduced.

> I think the behavior should be for denormal inputs that if
> "flush_inputs_to_zero" is true, then the value is set to zero; and if
> "flush_inputs_to_zero" is false, set the "input denormal" flag and
> normalize the input.
>
> This matches what x86 does (I'm not sure about other architectures).

What in particular does x86 want that it isn't getting at the moment? If it needs some additional variation of behaviour we can look at adding a new status flag to control that.

thanks
-- PMM
Re: Denormal input handling
> So, DE should not be set for a denormal input if DAZ is set (it is
> set only when DAZ is 0 -- the default "IEEE mode").

It's not my day, is it? Clearly the bit that I quoted is wrt to the output denormal flushing. But my suggested fix is still not wrong -- have two bits that cover both alternatives.

> Can you point me to the ARM documentation?

https://developer.arm.com/documentation/ddi0487/latest

Section D1.12.6 Floating point exceptions and exception traps, subsection Input Denormal exceptions.

r~
Re: Denormal input handling
On Wed, 26 May 2021 at 20:07, Michael Morrell wrote:
> I see support in QEMU for architectures which have a denormal input
> flag bit and those that have a "flush inputs to zero" control bit,
> but the implementation is not specializable and seems wrong for x86
> at least.
>
> For example, in sf_canonicalize, if the input is denormal and
> "flush_inputs_to_zero" is true, the "input denormal" flag is set and
> then the value is set to a zero value, and if the input is denormal
> and "flush_inputs_to_zero" is false, then the input is simply
> normalized.

This is the intended behaviour -- if a target arch needs "denormalized inputs should be flushed to zero", it sets the float_status flush_inputs_to_zero flag. If it also wants to be able to detect when this has happened, it can then look at the input_denormal flag. This matches the behaviour that Arm needs, and it is for Arm that the flush_inputs_to_zero and input_denormal flags were introduced.

> I think the behavior should be for denormal inputs that if
> "flush_inputs_to_zero" is true, then the value is set to zero; and if
> "flush_inputs_to_zero" is false, set the "input denormal" flag and
> normalize the input.
>
> This matches what x86 does (I'm not sure about other architectures).

What in particular does x86 want that it isn't getting at the moment? If it needs some additional variation of behaviour we can look at adding a new status flag to control that.

thanks
-- PMM
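The sf_canonicalize behaviour described above can be sketched in simplified form. QEMU's real code operates on a decomposed floating-point representation; the structure and names here are illustrative only.

```c
/* Sketch of the canonicalize step under discussion: a denormal input
 * is either flushed to zero (with the input-denormal flag raised) or
 * silently normalized.  Simplified stand-in for QEMU's FloatParts
 * handling; not the actual code. */
#include <assert.h>
#include <stdbool.h>

enum { FLAG_INPUT_DENORMAL = 1 << 0 };

struct parts {
    bool denormal;
    bool zero;
    int exp;
};

static void canonicalize_sketch(struct parts *p, bool flush_inputs_to_zero,
                                int *flags)
{
    if (!p->denormal) {
        return;
    }
    if (flush_inputs_to_zero) {
        /* Flush the operand to zero and record that it happened. */
        *flags |= FLAG_INPUT_DENORMAL;
        p->denormal = false;
        p->zero = true;
    } else {
        /* Simply normalize; no flag is raised.  This is the branch
         * the thread argues is wrong for x86, where DE should be set
         * when DAZ is clear. */
        p->denormal = false;
        p->exp -= 1;  /* stand-in for the real normalization shift */
    }
}
```

This makes visible why only Arm's FZ/IDC semantics fall out of the current scheme: the flag is tied to the flush, not to the mere presence of a denormal operand.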
Re: Denormal input handling
Richard,

How is what I suggested wrong for x86? It matches the spec and actual behavior:

== Start quote ==

10.2.3.4 Denormals-Are-Zeros

Bit 6 (DAZ) of the MXCSR register enables the denormals-are-zeros mode, which controls the processor's response to a SIMD floating-point denormal operand condition. When the denormals-are-zeros flag is set, the processor converts all denormal source operands to a zero with the sign of the original operand before performing any computations on them. The processor does not set the denormal-operand exception flag (DE), regardless of the setting of the denormal-operand exception mask bit (DM); and it does not generate a denormal-operand exception if the exception is unmasked.

== End quote ==

So, DE should not be set for a denormal input if DAZ is set (it is set only when DAZ is 0 -- the default "IEEE mode").

Can you point me to the ARM documentation?

Michael

On Wednesday, May 26, 2021, 12:28:38 PM PDT, Richard Henderson wrote:

On 5/26/21 12:23 PM, Richard Henderson wrote:
> On 5/26/21 12:02 PM, Michael Morrell via wrote:
>> I think the behavior should be for denormal inputs that if
>> "flush_inputs_to_zero" is true, then set the value to zero (without
>> setting the "input denormal" flag); and if "flush_inputs_to_zero" is
>> false, set the "input denormal" flag and normalize the input.
>>
>> This matches what x86 does (I'm not sure about other architectures).
>
> It is not.
>
> Intel Architectures SDM Vol 1, Section 11.5.2.2:
>     The denormal operand exception is not affected by
>     flush-to-zero mode.

Ho hum, I misread what you wrote outside the quoted context. Both your suggestion and the current behaviour are wrong for x86. The current behavior is correct for arm.

What we need are two separate softfloat flags, one for "any denormal seen on input" and another for "denormal flushed to zero on input". The target emulation code then chooses what set of bits needs exporting to target architecture state.

r~
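The DAZ behaviour quoted from the SDM can be demonstrated in isolation: with DAZ set, a denormal source operand is replaced by a zero of the same sign before the operation and DE stays clear; with DAZ clear, the operand is used as-is and DE is reported. The helper name below is invented for illustration.

```c
/* Sketch of SSE DAZ semantics for a single-precision operand, per the
 * SDM text quoted above.  daz_convert() is a made-up helper, not a
 * QEMU or hardware interface. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static float daz_convert(float x, bool daz, bool *de)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof(bits));

    /* Denormal: zero exponent field, nonzero fraction field. */
    bool is_denormal = (bits & 0x7f800000u) == 0 && (bits & 0x007fffffu) != 0;
    if (!is_denormal) {
        return x;
    }
    if (daz) {
        /* Convert to a zero with the sign of the original operand;
         * DE is not set, regardless of the DM mask. */
        bits &= 0x80000000u;
        memcpy(&x, &bits, sizeof(x));
        return x;
    }
    /* "IEEE mode" (DAZ = 0): report the denormal operand via DE. */
    *de = true;
    return x;
}
```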
Re: Denormal input handling
On 5/26/21 12:23 PM, Richard Henderson wrote:
> On 5/26/21 12:02 PM, Michael Morrell via wrote:
>> I think the behavior should be for denormal inputs that if
>> "flush_inputs_to_zero" is true, then set the value to zero (without
>> setting the "input denormal" flag); and if "flush_inputs_to_zero" is
>> false, set the "input denormal" flag and normalize the input.
>>
>> This matches what x86 does (I'm not sure about other architectures).
>
> It is not.
>
> Intel Architectures SDM Vol 1, Section 11.5.2.2:
>     The denormal operand exception is not affected by
>     flush-to-zero mode.

Ho hum, I misread what you wrote outside the quoted context. Both your suggestion and the current behaviour are wrong for x86. The current behavior is correct for arm.

What we need are two separate softfloat flags, one for "any denormal seen on input" and another for "denormal flushed to zero on input". The target emulation code then chooses what set of bits needs exporting to target architecture state.

r~
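The two-flag split proposed here can be sketched end to end: softfloat records which of the two events happened, and each target maps its own architectural bit from the flag it cares about. Flag and macro names below are invented for the sketch, not the eventual QEMU names.

```c
/* Sketch of the proposed two-flag scheme: one flag for a denormal
 * consumed as-is, one for a denormal flushed to zero on input, with
 * per-target mapping.  All names are illustrative. */
#include <assert.h>

enum {
    FLAG_INPUT_DENORMAL_USED    = 1 << 0, /* denormal operand consumed */
    FLAG_INPUT_DENORMAL_FLUSHED = 1 << 1, /* denormal flushed to zero */
};

#define MXCSR_DE (1 << 1)   /* x86 denormal-operand status bit */
#define FPSR_IDC (1 << 7)   /* Arm input-denormal cumulative bit */

/* x86/SSE: DE is set only when a denormal operand was actually used,
 * i.e. DAZ was clear; a DAZ-flushed operand never sets DE. */
static int mxcsr_de_from_flags(int sf_flags)
{
    return (sf_flags & FLAG_INPUT_DENORMAL_USED) ? MXCSR_DE : 0;
}

/* Arm, by contrast, sets FPSR.IDC when FPCR.FZ caused the flush. */
static int fpsr_idc_from_flags(int sf_flags)
{
    return (sf_flags & FLAG_INPUT_DENORMAL_FLUSHED) ? FPSR_IDC : 0;
}
```

With this split, neither target has to second-guess the other's semantics: the flag-raising logic stays generic and the architecture-specific choice lives entirely in the export step.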
Re: Denormal input handling
On 5/26/21 12:02 PM, Michael Morrell via wrote:
> I think the behavior should be for denormal inputs that if
> "flush_inputs_to_zero" is true, then set the value to zero (without
> setting the "input denormal" flag); and if "flush_inputs_to_zero" is
> false, set the "input denormal" flag and normalize the input.
>
> This matches what x86 does (I'm not sure about other architectures).

It is not.

Intel Architectures SDM Vol 1, Section 11.5.2.2:

    The denormal operand exception is not affected by flush-to-zero mode.

r~