from:"lhmouse"

Re: [Mingw-w64-public] patch to add htonll/ntohll

2021-12-17 Thread lhmouse

On 2021-12-17 02:13, Michel Zou wrote:
> Hi,
> It turns out that these are inline functions, here is a new patch.
> xan
> 
Thanks. Pushed to master.

Next time, please send a patch created by `git format-patch`, and please sign 
off the commit with `git commit -s`.


-- 
Best regards,
LIU Hao

___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Re: [Mingw-w64-public] wine FTBFS with mingw64 gcc 11: undefined reference to `sincos'

2021-05-14 Thread lhmouse

On 5/15/21 1:27 AM, Jacek Caban wrote:
> 
> I think that the decision was unfortunate on GCC side, but there is little we 
> can do. We will 
> probably need to provide it in msvcrt importlibs. Please try the attached 
> patch, it should help.
> 
> 
Doesn't GCC transform such a pair of calls into `sincos()` again, resulting in 
infinite recursion?



-- 
Best regards,
Liu Hao


Re: [Mingw-w64-public] [ucrt]missing quick_exit in stdlib.h/cstdlib

2021-04-22 Thread lhmouse

On 4/20/21 9:31 PM, yume todo wrote:
> 
> In ucrt64, quick_exit is not found.
> 
> 
This has been fixed on master now:  
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/7dda261


-- 
Best regards,
Liu Hao


Re: [Mingw-w64-public] [PATCH 2/5] corecrt_startup.h: Added _onexit_table_t and related functions declarations.

2018-01-11 Thread lhmouse

On 2018/1/12 4:30, Jacek Caban wrote:
> Signed-off-by: Jacek Caban

Re: [Mingw-w64-public] [PATCH 3/5] libmsvcr*.a: Added compatibility implementation of onexit table functions.

2018-01-11 Thread lhmouse

On 2018/1/12 4:31, Jacek Caban wrote:
> Signed-off-by: Jacek Caban

Re: [Mingw-w64-public] [PATCH] intrin-impl.h: Added missing volatile to _interlockedbittestandset and _interlockedbittestandset64 declarations.

2018-01-11 Thread lhmouse

On 2018/1/12 4:28, Jacek Caban wrote:
>
> Fixes compilation with clang.
>
> Signed-off-by: Jacek Caban

Re: [Mingw-w64-public] [PATCHv2] crt: Add an ldexpl function for arm and arm64

2017-12-19 Thread lhmouse

On 2017/12/18 3:43, Martin Storsjö wrote:
> Since long double just is normal double on arm and arm64, just
> call the normal ldexp function.
>
> Signed-off-by: Martin Storsjö
> ---
> Fixed the parameters in the C wrappers.
> ---
>  mingw-w64-crt/Makefile.am         |  3 ++-
>  mingw-w64-crt/math/arm/ldexpl.c   | 16
>  mingw-w64-crt/math/arm64/ldexpl.c | 16
>  3 files changed, 34 insertions(+), 1 deletion(-)
>  create mode 100644 mingw-w64-crt/math/arm/ldexpl.c
>  create mode 100644 mingw-w64-crt/math/arm64/ldexpl.c
>
> diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
> index 6812a5e..7d6c395 100644
> --- a/mingw-w64-crt/Makefile.am
> +++ b/mingw-w64-crt/Makefile.am
> @@ -390,13 +390,14 @@ src_libmingwexarm32+=\
>    math/softmath/sinf.c  math/softmath/sinl.c  math/softmath/tanf.c  math/softmath/tanl.c
>  else
>  src_libmingwexarm32+=\
> -  math/arm/exp2.c   math/arm/log2.c   math/arm/scalbn.c math/arm/sincos.c
> +  math/arm/exp2.c   math/arm/ldexpl.c math/arm/log2.c   math/arm/scalbn.c math/arm/sincos.c
>  endif
>
>  # these only go into the ARM64 version:
>  src_libmingwexarm64=\
>    math/arm64/_chgsignl.S  math/arm64/ceil.S  math/arm64/ceilf.S  math/arm64/ceill.S  math/arm64/copysignl.c \
>    math/arm64/exp2.S  math/arm64/exp2f.S  math/arm64/floor.S  math/arm64/floorf.S  math/arm64/floorl.S \
> +  math/arm64/ldexpl.c \
>    math/arm64/log2.c  math/arm64/nearbyint.S  math/arm64/nearbyintf.S  math/arm64/nearbyintl.S  math/arm64/scalbn.c \
>    math/arm64/sincos.c  math/arm64/trunc.S  math/arm64/truncf.S
>
> diff --git a/mingw-w64-crt/math/arm/ldexpl.c b/mingw-w64-crt/math/arm/ldexpl.c
> new file mode 100644
> index 000..7d3bffc
> --- /dev/null
> +++ b/mingw-w64-crt/math/arm/ldexpl.c
> @@ -0,0 +1,16 @@
> +/**
> + * This file has no copyright assigned and is placed in the Public Domain.
> + * This file is part of the mingw-w64 runtime package.
> + * No warranty is given; refer to the file DISCLAIMER.PD within this package.
> + */
> +
> +#include <math.h>
> +
> +long double ldexpl(long double x, int n)
> +{
> +#if defined(__arm__) || defined(_ARM_)
> +    return ldexp(x, n);
> +#else
> +#error Not supported on your platform yet
> +#endif
> +}
> diff --git a/mingw-w64-crt/math/arm64/ldexpl.c b/mingw-w64-crt/math/arm64/ldexpl.c
> new file mode 100644
> index 000..bfa3287
> --- /dev/null
> +++ b/mingw-w64-crt/math/arm64/ldexpl.c
> @@ -0,0 +1,16 @@
> +/**
> + * This file has no copyright assigned and is placed in the Public Domain.
> + * This file is part of the mingw-w64 runtime package.
> + * No warranty is given; refer to the file DISCLAIMER.PD within this package.
> + */
> +
> +#include

Re: [Mingw-w64-public] [PATCH] remove cast to int from mantissa for __mingw_printf

2017-06-12 Thread lhmouse

On 2017/6/12 23:24, Martell Malone wrote:
> In this thread https://sourceforge.net/p/mingw-w64/bugs/459/
> there is a suggested fix for print with whole numbers
> 
> The builtin __mingw_printf is inconsistent with printf on %a format.
>> I think __mingw_printf is wrong, because obviously 1.0 != 0x0p-63.
> 
> 
> vacaboja opened an issue on msys2 for this
> https://github.com/msys2/msys2/issues/35
> and suggested a fix of removing the case to int
> here is a patch that does just that.
> According to the discussion on that ticket, this patch looks correct.

But why was there such a suspicious cast? It might be there to silence a 
warning about comparison between signed and unsigned integers, which is enabled 
by `-Wsign-compare` or `-Wall` in C++ or `-Wextra` in C. If you do see such a 
warning, I suggest you 1) redeclare `c` as `unsigned` instead of `int`, or 2) 
cast `c` back to `unsigned` before the comparison.
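
For illustration, the two suggestions might look like the sketch below. The 
names `limit` and `c` are hypothetical stand-ins (the actual declarations in 
`__mingw_printf` are not quoted in this thread); the point is only that doing 
the comparison in the unsigned domain avoids both the warning and a lossy cast 
to `int`.

```c
#include <limits.h>

/* Hypothetical stand-ins: `limit` is the unsigned quantity and `c` the
   counter from the discussion above; neither is taken from the real source. */

/* Option 1: declare the counter as unsigned from the start. */
static int below_unsigned_counter(unsigned limit, unsigned c) {
  return c < limit;
}

/* Option 2: keep `c` as int, but cast it to unsigned for the comparison.
   This assumes `c` is never negative at this point. */
static int below_cast_counter(unsigned limit, int c) {
  return (unsigned)c < limit;
}

/* A value that does not fit in int, where a cast of `limit` to int
   would change the meaning of the comparison. */
static unsigned big_limit(void) {
  return (unsigned)INT_MAX + 1u;
}
```

Both helpers still compare correctly when `limit` exceeds `INT_MAX`, which is 
exactly the case a cast of the unsigned side to `int` would get wrong.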


-- 
Best regards,
LH_Mouse

Re: [Mingw-w64-public] delay loading of dll

2017-02-06 Thread lhmouse

On 2017/2/7 0:26, Hannes Domani wrote:
> Hello
>
>
> Does delay-loading work with 32bit executables?
>
> In the following example it crashes for me on the dll_function() call.
> I've used i686-6.3.0-release-win32-dwarf-rt_v5-rev1.7z for my tests.
I compiled the program and it did crash. The assembly code generated 
looks like this:

 00401570 | push ebp                       | int main(){
 00401571 | mov ebp, esp                   |
 00401573 | and esp, FFFFFFF0              |
 00401576 | call app.401700                |   __main();
 0040157B | mov eax, dword ptr ds:[407204] |
 00401580 | call eax                       |
 00401582 | mov eax, 0                     |   return 0;
 00401587 | leave                          |
 00401588 | ret                            | }

 0040158C | push ecx                       |
 0040158D | push edx                       |
 0040158E | push eax                       |
 0040158F | push                           |
 00401594 | call app.4026A0                |
 00401599 | pop edx                        |
 0040159A | pop ecx                        |
 0040159B | jmp eax                        |

The pointer at address 407204 should be a pointer to the DLL loader 
function initially, which is located at 0040158C. The pointer here is 
initially null and results in jumping to address zero, hence the crash.

In addition to that, the assembly code of the DLL loader function is 
incorrect. The DLL loader function requires the caller to pass the 
address of the function pointer above (which is 407204) via the EAX 
register. That is, the first instruction at 0040158C should have been 
`lea eax, dword ptr ds:[407204]`.
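
To make the expected mechanism concrete, here is a minimal sketch in plain C 
that simulates it. All names (`import_slot`, `loader_stub`, 
`real_dll_function`) are hypothetical, and no real DLL is loaded; this is not 
the code dlltool emits, only an illustration of why the slot must not start 
out null.

```c
typedef int (*dll_fn)(void);

/* Stand-in for the function that would live in the DLL. */
static int real_dll_function(void) { return 42; }

static int loader_stub(void);

/* The import slot. It has to start out pointing at the stub; if it started
   as a null pointer, the first call would jump to address zero. */
static dll_fn import_slot = loader_stub;

/* Counts how often the stub ran, to show it is only needed once. */
static int resolve_count = 0;

/* Stand-in for the loader: a real one would load the DLL and look up the
   symbol; here the slot is simply patched with the known address. */
static int loader_stub(void) {
  ++resolve_count;
  import_slot = real_dll_function;
  return import_slot();
}

/* What the call site does: always an indirect call through the slot. */
static int call_dll_function(void) { return import_slot(); }
```

The first call goes through the stub and patches the slot; later calls reach 
the target directly.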

Compiling app.c with `-S -masm=intel` produces the following assembly 
code, with directives removed:

 _main:
        push    ebp
        mov     ebp, esp
        and     esp, -16
        call    ___main
        mov     eax, DWORD PTR __imp__dll_function
        call    eax
        mov     eax, 0
        leave
        ret

The DLL loader function behind `__imp__dll_function` does not seem to be 
generated by the compiler. So it seems that dlltool for i686 isn't generating 
correct machine code for delay-loaded functions.

-- 
Best regards,
LH_Mouse


Re: [Mingw-w64-public] Issue with headers.

2017-01-26 Thread lhmouse


On 2017/1/26 19:00, Petri Hodju wrote:
> Hi!
> I ran to the same problem earlier and I posted patches here on the
> list on 2nd December 2016 for this problem.
> In short, the CompPtr has specialized constructors that can't access
> the protected members as they are not working on class level. The fix
> was trivial, changing the direct member variable access to use the
> already available accessor methods.
> Other problem I encountered was a missing BitmapBrushProperties1 in
> the d2d1_1helper1.h, for which I also posted a patch.
> With these patches I'm able to build the Qt-5.8.0 just fine : )
> Have I not followed some step of providing patches as these have not 
> been commented at all so far... ?
I am afraid [1] isn't the correct way to fix it. For consistency, the 
correct solution is to add a `friend` declaration, as Microsoft did.


Patch attached, please test.

[1] https://sourceforge.net/p/mingw-w64/mailman/message/35527066/

--
Best regards,
LH_Mouse





From 2ab50e9a9b1d3a8d8c6e33d1e2e9077a872166a8 Mon Sep 17 00:00:00 2001
From: LH_Mouse 
Date: Thu, 26 Jan 2017 19:28:51 +0800
Subject: [PATCH] mingw-w64-headers/include/wrl/client.h: Fix error: 'ptr_' is
 protected within this context.

---
 mingw-w64-headers/include/wrl/client.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mingw-w64-headers/include/wrl/client.h b/mingw-w64-headers/include/wrl/client.h
index 83b4cb3..448c7a2 100644
--- a/mingw-w64-headers/include/wrl/client.h
+++ b/mingw-w64-headers/include/wrl/client.h
@@ -252,6 +252,7 @@ namespace Microsoft {
 */
 protected:
 InterfaceType *ptr_;
+template friend class ComPtr;
 
 void InternalAddRef() const throw() {
 if(ptr_)
-- 
2.10.2


Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-22 Thread lhmouse

On 2017/1/23 9:08, David Wohlferd wrote:
> Hmm.
>
> It seems a bit backwards to have the function that takes a 'long double'
> calling the function that takes a 'double.'  Yes, they are both the same
> size on ARM, but I think I would have gone the other way.  Plus I kinda
> like having all the implementations in one file (fmal.c).
I prefer that too. At the moment I have to follow what mingw-w64 has 
been doing, that is, keeping the {f,,l} variants as separate functions in 
different files.

> Other than that, this looks ok to me.  Building for ARM with clang seems
> to work (although I have no way to run it).
Thanks for testing.

-- 
Best regards,
LH_Mouse



Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-20 Thread lhmouse

This mail has been rejected as spam for the past few hours.
Hopefully it gets through this time.


-- 
Best regards,
lh_mouse





From 82fd24e992a402ff2f7c55780fd76945ef83e094 Mon Sep 17 00:00:00 2001
From: LH_Mouse 
Date: Wed, 18 Jan 2017 19:35:43 +0800
Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add
 (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am:
 Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with
 the same names, respectively.

---
 mingw-w64-crt/Makefile.am |   4 +-
 mingw-w64-crt/math/fma.S  |  42 --
 mingw-w64-crt/math/fma.c  |  29 ++
 mingw-w64-crt/math/fmaf.S |  43 --
 mingw-w64-crt/math/fmaf.c |  29 ++
 mingw-w64-crt/math/fmal.c | 143 --
 6 files changed, 198 insertions(+), 92 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 44360db..5eba234 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -227,7 +227,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -235,7 +234,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c  math/fdimf.c  math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c  math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c  math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c  math/frexpf.c \
   math/hypotf.c  math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 000..645a3d1
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,29 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+double fma(double x, double y, double z);
+
+#if defined(_ARM_) || defined(__arm__)
+
+/* Use hardware FMA on ARM. */
+double fma(double x, double y, double z){
+  __asm__ (
+"fmacd %0, %1, %2 \n"
+: "+w"(z)
+: "w"(x), "w"(y)
+  );
+  return z;
+}
+
+#else
+
+long double fmal(long double x, long double y, long double z);
+
+/* For platforms that don't have hardware FMA, emulate it. */
+double fma(double x, double y, double z){
+  return (double)fmal(x, y, z);
+}
+
+#endif
diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S
deleted file mode 100644
index 6bc7ef0..000
--- a/mingw-w64-crt/math/fmaf.S
+++ /dev/null
@@ -1,43 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime

Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-19 Thread lhmouse

> So you have decided that __builtins can't be used then?  That's too bad.
Yes. It results in a call to `fma()` on x64. I can't test it on ARM, though.

> I know almost nothing about the guts of floating point, so I'm prepared
> to defer to your judgement, but here's what I think:
> 
> Let me propose an alternative for fma.c:
> ... ...
> In other words, remove all the platform specific code.  This (greatly)
> simplifies this file.  You were already using fmal for x86.  And it
> doesn't lose anything for ARM, since both fma() and fmal() use the exact
> same inline asm.  Why have the exact same (hard to maintain) code in 2
> places?
Keeping asm code in fmaf.c but not in fma.c seems stylistically inconsistent.
However, the converse is doable: on ARM, call `fma()` from `fmal()`.

> As for fmaf, what about:
> ... ...
> The case here is less compelling, but I assert that if fmal is
> supported, it can always be used to calculate fmaf.  If there is a
> shorter/more efficient method (such as there is with ARM), it can be
> added here.
Fair enough. Updated.

> As for fmal, I have a question about your code.  Not the implementation,
> but the design.  Looking at https://en.wikipedia.org/wiki/Long_double,
> it says "Microsoft Windows with Visual C++ also sets the processor in
> double-precision mode by default."  Since (it appears?) you aren't
> following _controlfp_s, won't this give use a different answer than fmal
> from msvcr120.dll?
MSVC doesn't support 80-bit `long double` (it is 64 bits there), so
the results can't be equal unless the exact result fits into 64 bits.
My FMA algorithm basically splits each operand into two 32-bit halves,
multiplies them using elementary arithmetic, then adds the four 64-bit
partial products together: (a+b)(c+d) = ac+(bc+ad)+bd. So the precision of
the x87 does affect the result.
I doubt whether it is necessary to save the x87 control word, set it to
64-bit precision before the calculation, and restore it afterwards. MinGW-w64
already sets it to 64-bit precision during CRT initialization, and if people
set it lower they aren't going to need `fma()` either.
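
The splitting identity can be illustrated with plain integers. The sketch
below is not the code from fmal.c (which is not shown in full here); it is a
hypothetical helper, `mul_64x64`, that multiplies two 64-bit values exactly by
splitting each into 32-bit halves and summing the four partial products, as
in (a+b)(c+d) = ac+(bc+ad)+bd.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type: value = hi * 2^64 + lo. */
typedef struct { uint64_t hi, lo; } u128_sketch;

/* Exact 64x64 -> 128-bit multiplication via four 32x32 -> 64 products. */
static u128_sketch mul_64x64(uint64_t x, uint64_t y) {
  uint64_t a = x >> 32, b = x & 0xFFFFFFFFu;  /* x = a * 2^32 + b */
  uint64_t c = y >> 32, d = y & 0xFFFFFFFFu;  /* y = c * 2^32 + d */

  uint64_t ac = a * c;  /* weight 2^64 */
  uint64_t bc = b * c;  /* weight 2^32 */
  uint64_t ad = a * d;  /* weight 2^32 */
  uint64_t bd = b * d;  /* weight 2^0  */

  /* Sum everything that lands in bits 32..63; at most 3 * (2^32 - 1),
     so it cannot overflow 64 bits. Its upper half carries into `hi`. */
  uint64_t mid = (bd >> 32) + (bc & 0xFFFFFFFFu) + (ad & 0xFFFFFFFFu);

  u128_sketch r;
  r.lo = (mid << 32) | (bd & 0xFFFFFFFFu);
  r.hi = ac + (bc >> 32) + (ad >> 32) + (mid >> 32);
  return r;
}
```

Because no partial product is rounded, the low bits of the full product
survive until the addend is applied, which is what distinguishes a fused
multiply-add from a multiply followed by an add.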

A look at https://msdn.microsoft.com/en-us/library/c9676k6h.aspx
reminds me that _PC_64 isn't supported on x64. Sounds incredible, no? Does
`_controlfp_s()` return an error if we try to set _PC_64 on x64? I have no
idea. Nevertheless, the precision flags can be set and restored using inline
assembly, which is yet another dirty solution.

> More nits:
>
> s/whecher/whether
> s/#x86_Extended_Precision_Format/#x86_extended_precision_format
Fixed. The Wikipedia bookmark was copied from my browser at least half a year
ago, and the anchor has probably been changed since.

--   
Best regards,
lh_mouse
2017-01-20

Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-19 Thread lhmouse

New patch attached.
This patch fixes ARM functions and adds a check in `fpu_fma()` for potential 
NaN or INF results.


--   
Best regards,
lh_mouse
2017-01-19






From 3c55daec84dac190b9e3cb032371960e1acbc38f Mon Sep 17 00:00:00 2001
From: LH_Mouse 
Date: Wed, 18 Jan 2017 19:35:43 +0800
Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add
 (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am:
 Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with
 the same names, respectively.

---
 mingw-w64-crt/Makefile.am |   4 +-
 mingw-w64-crt/math/fma.S  |  42 -
 mingw-w64-crt/math/fma.c  |  31 ++
 mingw-w64-crt/math/fmaf.S |  43 --
 mingw-w64-crt/math/fmaf.c |  31 ++
 mingw-w64-crt/math/fmal.c | 146 --
 6 files changed, 205 insertions(+), 92 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 44360db..5eba234 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -227,7 +227,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -235,7 +234,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c  math/fdimf.c  math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c  math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c  math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c  math/frexpf.c \
   math/hypotf.c  math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 000..00f100c
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,31 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+double fma(double x, double y, double z);
+
+#if defined(_AMD64_) || defined(__x86_64__) || defined(_X86_) || defined(__i386__)
+
+long double fmal(long double x, long double y, long double z);
+
+double fma(double x, double y, double z){
+  return (double)fmal(x, y, z);
+}
+
+#elif defined(_ARM_) || defined(__arm__)
+
+double fma(double x, double y, double z){
+  __asm__ (
+"fmacd %0, %1, %2 \n"
+: "+w"(z)
+: "w"(x), "w"(y)
+  );
+  return z;
+}
+
+#else
+
+#error This platform is not supported.
+
+#endif
diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S
deleted file mode 100644
index 6bc7ef0..000
--- a/mingw-w64-crt/math/fmaf.S
+++ /dev/null
@@ -1,43 +0,0 @@
-/**
- * This

Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-18 Thread lhmouse

> I see that you have replaced the x86 parts for fma and fmaf with C 
> code.  That seems like a good thing.  Is there some reason you can't do 
> that with the ARM versions too?
ARM has hardware FMA and software emulation is not optimal.

> Reducing the amount of platform-specific code also seems like a good thing.
The x87 80-bit floating point format is already platform-specific.

> There are a number of reasons not to use inline asm (for example 
> https://gcc.gnu.org/wiki/DontUseInlineAsm ).  Are you sure this is a 
> good idea?
I am not sure about the inline asm itself. The primary reason I did that
is because, if we have `fma.S` and `fma.c` in the same directory they will
compile to the same file `fma.o`, and `make` complains about that.

Inline asm is indeed hard to maintain and I am aware of it. Personally
I only write asm statements that contain very few instructions, simulating
builtin functions or intrinsics for use in C code.

> Yup, that's one of the downsides to using inline asm.
> 
> I'm no ARM expert, but I'm not sure about this ARM code for fmal:
> 
> +long double fmal(long double x, long double y, long double z){
> +  __asm__ (
> +"fmacd %2, %0, %1 \n"
> +"fcpyd %0, %2 \n"
> +: "+&w"(z)
> +: "w"(x), "w"(y)
> +  );
> 
> Doesn't fmacd modify %2?  That would be (y), which is listed as an input 
> parameter (and therefore is read-only).  What's more, I thought fmacd 
> was calculating "Fd + Fn * Fm" where the parameters were "fmacd Fd, Fn, 
> Fm".  Such being the case, I would have expected "fmacd %0, %1 %2"?  I 
> don't have a way to run this either, but this looks wrong.
Thanks for pointing it out. That is a mistake. I forgot to fix it after
copying it from the asm code. The `fma()` function was the correct one.

> Under the nit-picky heading:
> 
> +double fma(double x, double y, double z){
> +  __asm__ (
> +"fmacd %0, %1, %2 \n"
> +: "+&w"(z)
> +: "w"(x), "w"(y)
> +  );
> 
> The \n is redundant.  And doesn't the + make the & redundant as well?
I just prefer to terminate every line of asm code with \n.

I believe the & is redundant not only because of the +, but also because
there is only one instruction, so no output can be written before
the inputs are read.

> Lastly I gotta ask: Can we use __builtin_fmal?  Or is mingw-w64 the one 
> providing the implementations for these?
We would have to ask a GCC developer to be sure. In my experience, this
function is guaranteed to be semantically equivalent to the standard library
function without the `__builtin_` prefix. Sometimes the compiler cannot
assume that all functions from the standard C library are available and
behave as specified, e.g. when compiling the Linux kernel. The
`__builtin_fmal()` function is then still treated as a standard FMA, suitable
for constant folding. It may compile to an inline instruction where possible,
but could also compile to a call to the external `fmal()` function, which
would cause infinite recursion if used inside `fmal()` itself.

--   
Best regards,
lh_mouse
2017-01-19




Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-18 Thread lhmouse

The correctness of the `fma()` function can be verified using the following
program:
---
#include <stdio.h>
#include <math.h>

volatile double x = 0x1.0000000000003p52;
volatile double y = 0x1.0000000000005p52;
volatile double z = -0x1.0000000000008p104;

int main(){
    printf("x * y + z    = %f\n", x * y + z);
    printf("fma(x, y, z) = %f\n", fma(x, y, z));
}
---
A naive multiply-then-add loses some LSBs when the product is rounded and
yields zero once the MSBs are cancelled by the negative addend.
A true FMA function yields 15 in this example.


--   
Best regards,
lh_mouse
2017-01-18



[Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly

2017-01-18 Thread lhmouse

Patch is attached.
This patch removes assembly files that implement FMA on ARM and merges
them into the corresponding C files with the same name using inline assembly.
A re-generation of Makefile.in is required.

I don't have any knowledge about ARM assembly. Those functions for ARM
were created using my x86 assembly knowledge and the actual instructions
are copy-n-paste'd from old .S files. I don't have an ARM compiler to test
those functions. Please fix them should they be broken.
  
-- 
Best regards, 
lh_mouse 
2017-01-18 


From 0534577644f12e94cc408d37083277f133d1ca47 Mon Sep 17 00:00:00 2001
From: LH_Mouse 
Date: Wed, 18 Jan 2017 19:35:43 +0800
Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add
 (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am:
 Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with
 the same names, respectively.

---
 mingw-w64-crt/Makefile.am |   4 +-
 mingw-w64-crt/math/fma.S  |  42 ---
 mingw-w64-crt/math/fma.c  |  31 +++
 mingw-w64-crt/math/fmaf.S |  43 ---
 mingw-w64-crt/math/fmaf.c |  31 +++
 mingw-w64-crt/math/fmal.c | 135 --
 6 files changed, 194 insertions(+), 92 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 44360db..5eba234 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -227,7 +227,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -235,7 +234,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c  math/fdimf.c  math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c  math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c  math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c  math/frexpf.c \
   math/hypotf.c  math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..0000000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 0000000..98249aa
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,31 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+double fma(double x, double y, double z);
+
+#if defined(_AMD64_) || defined(__x86_64__) || defined(_X86_) || defined(__i386__)
+
+long double fmal(long double x, long double y, long double z);
+
+double fma(double x, double y, double z){
+  return (double)fmal(x, y, z);
+}
+
+#elif defined(_ARM_) || defined(__arm__)
+
+double fma(double x, double y, double

Re: [Mingw-w64-public] FLT_EPSILON missing

2016-12-20 Thread lhmouse

That might be because gcc has its own float.h:

/mingw32/lib/gcc/i686-w64-mingw32/6.2.1/include/float.h:113:#define FLT_EPSILON 
__FLT_EPSILON__

--   
Best regards,
lh_mouse
2016-12-20

-
From: niXman
Date: 2016-12-20 19:38
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] FLT_EPSILON missing

Vincent Torri 2016-12-20 09:04:
> Hello
Hi,

> it seems that  FLT_EPSILON and DBL_EPSILON are missing in float.h. at
> least, i can't find it here :
> 
> https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/crt/float.h
> 
> for reference, see :
> 
> https://msdn.microsoft.com/fr-fr/library/k15zsh48.aspx
> 
> can this be added ?

This is strange because just now I test it using i686-6.2.0-posix-dwarf 
and all works fine.

My example:

#include <stdio.h>
#include <float.h>

int main() {
printf("%g\n", FLT_EPSILON);
printf("%g\n", DBL_EPSILON);
}

--
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today.http://sdm.link/intel
___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public




Re: [Mingw-w64-public] What is _pei386_runtime_relocator?

2016-12-05 Thread lhmouse

See comments about --enable-auto-import on
https://www.sourceware.org/binutils/docs/ld/Options.html .

When you refer to a member of a struct or array with static storage duration,
the compiler may generate instructions that read from or write to a constant
address which compares unequal to the address of the struct or array itself.
Such an address cannot be resolved by the DLL loader (because it is not
exported), nor can LD fix it up at link time, so it has to be resolved at
run time.
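
A minimal sketch of the address arithmetic involved (the struct, the variable
and both helper names are hypothetical and only serve as illustration). In this
single-file form no DLL is involved, so nothing is actually relocated; it only
shows that the address of a member is the object's address plus a constant
offset, which is the kind of address that, as far as I understand the ld
documentation, would need fixing at run time if `shared_settings` were
auto-imported from a DLL:

```c
#include <stddef.h>

/* Hypothetical data object; imagine it is defined in another DLL. */
struct settings {
    int width;
    int height;
};

struct settings shared_settings = { 640, 480 };

/* The address of a member is the object's address plus a constant
   offset, so it differs from the address of the object itself. */
int *address_of_height(void)
{
    return &shared_settings.height;
}

/* Byte distance between the member and the start of the object. */
size_t height_offset(void)
{
    return offsetof(struct settings, height);
}
```

Comparing `address_of_height()` with `&shared_settings` shows the two
addresses differ by `height_offset()` bytes.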

--   
Best regards,
lh_mouse
2016-12-05

-
From: Иван Иванов
Date: 2016-11-29 16:47
To: Mingw W64 Public
Cc:
Subject: [Mingw-w64-public] What is _pei386_runtime_relocator?

Could you please tell me what operation the _pei386_runtime_relocator
function performs exactly? In which cases does the compiler generate calls to
this function? How can I get rid of this function when doing "-nostdlib"
development (think of it like bare metal development)?




Re: [Mingw-w64-public] Make GCC emit ASM instructions in 'gcc/except.c' for i686 MinGW targets ?

2016-10-17 Thread lhmouse

> I'd probably create a new exception handling model and conditionalize 
> whatever code you need based on that. 

That would require copy-n-paste of tons of code...
All this remains contingent on Microsoft's generosity because
they don't provide APIs for SEH on x86, unlike on x64.
So I have to reuse stack unwinding code from SJLJ at the moment.

> Emission of code for that new 
> exception model would likely require some amount of target specific code 
> called via target hooks.

Hooks... Er, are you talking about those global pointer-to-functions?
There are a lot, indeed.

--   
Best regards,
lh_mouse
2016-10-17




[Mingw-w64-public] Make GCC emit ASM instructions in 'gcc/except.c' for i686 MinGW targets ?

2016-10-16 Thread lhmouse

Hi there,

I came up with an idea about implementing stack unwinding for
the i686-w64-mingw32 target using native Windows Structured
Exception Handling (a.k.a. SEH) for efficiency reasons.

Unlike DWARF and SEH for x64, SEH for x86 is stack-based
and works like the SJLJ exception model: The operating system
keeps a thread specific pointer to an SEH node on the stack
that must be installed/uninstalled during run time.

The SEH-head pointer is stored in `fs:[0]`.
Typically, an SEH handler is installed like this, in Intel syntax:

# typedef EXCEPTION_DISPOSITION
#   filter_function(
# EXCEPTION_RECORD *record, void *establisher_frame,
# CONTEXT *machine_context, void *dispatcher_context)
#   __attribute__((__cdecl__));
# struct x86_seh_node_header {
#   struct x86_seh_node_header *next;
#   filter_function *filter;
#   char extra_data[];
# };

sub esp, 8                  # struct x86_seh_node_header this_node;
mov ecx, dword ptr fs:[0]   # ecx = get_thread_seh_head();
mov dword ptr [esp], ecx    # this_node.next = ecx;
mov dword ptr [esp + 4], offset my_seh_filter
                            # this_node.filter = &my_seh_filter;
mov dword ptr fs:[0], esp   # set_thread_seh_head(&this_node);

Before the function exits and its frame is destroyed, the node
must be uninstalled like this:

mov ecx, dword ptr [esp]    # ecx = this_node.next;
mov dword ptr fs:[0], ecx   # set_thread_seh_head(this_node.next);
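
For readers who prefer C, here is a portable model of the same install and
uninstall sequence. It is only a sketch: an ordinary static variable stands in
for the per-thread pointer at `fs:[0]`, all `model_*` names are made up for
illustration, and nothing here touches real Windows SEH.

```c
#include <stddef.h>

/* Portable stand-in for the node layout sketched above. */
struct model_seh_node {
    struct model_seh_node *next;
    int (*filter)(int code);
};

/* Stand-in for the per-thread pointer that Windows keeps at fs:[0]. */
static struct model_seh_node *model_seh_head = NULL;

/* Install: link the new node in front of the current head. */
void model_install(struct model_seh_node *node, int (*filter)(int code))
{
    node->next = model_seh_head;
    node->filter = filter;
    model_seh_head = node;
}

/* Uninstall: restore the head that was saved in the node. */
void model_uninstall(const struct model_seh_node *node)
{
    model_seh_head = node->next;
}

/* Number of nodes currently linked, counted from the innermost one. */
size_t model_depth(void)
{
    size_t depth = 0;
    const struct model_seh_node *node;

    for (node = model_seh_head; node != NULL; node = node->next)
        ++depth;
    return depth;
}
```

Nodes must be uninstalled in the reverse order of installation, which mirrors
the requirement that a frame removes its node before it is destroyed.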

I have been looking at the SJLJ exception model, and it seems to use
a slim, inlined version of `setjmp()` with `__builtin_longjmp()`
that only stores 3 or 4 pointers, so extending that structure should be
a simple matter. The problem is that installation and uninstallation
of SEH nodes require target-specific ASM code generation.

Is it possible to do in 'gcc/except.c' ?


--
Best regards,
lh_mouse
2016-10-17



[Mingw-w64-public] Bootstrap gcc for i686 with SJLJ exception model in MSYS2 ?

2016-10-13 Thread lhmouse

Today I tried bootstrapping GCC 6.2.1 using PKGBUILD modified from
the MSYS2 one for gcc-git package.
I changed the line 
https://github.com/lhmouse/MINGW-packages/blob/master/mingw-w64-gcc-git/PKGBUILD#L148
from `local _conf="--disable-sjlj-exceptions --with-dwarf2"`
to `local _conf="--enable-sjlj-exceptions"`,
and the 3-stage bootstrap failed at the end of stage 1 with thousands of
undefined references to _Unwind_* functions.

I tried both MSYS2 toolchains (with DWARF) and mingw-builds toolchains (with
SJLJ), and the latter failed with fewer errors of the same kind.

Do you have any ideas why this error happened?

--
Best regards,
lh_mouse
2016-10-14



Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

2016-09-12 Thread lhmouse

Reliable results could usually be generated using GCC's constant folding.
For example, the following program:

#include <stdio.h>
int main(){
    const double x = 10.001000, y = -1.299000;
    int quo;
    double rem = __builtin_remquo(x, y, &quo);
    printf("rem = %f, quo = %d\n", rem, quo);
}

when compiled with `gcc test.c -O3 -masm=intel -S`, produces the following
assembly:

movsd   xmm0, QWORD PTR .LC0[rip]
lea     rcx, .LC1[rip]
mov     r8d, -8        # <== folded constant goes here
movapd  xmm1, xmm0
movq    rdx, xmm0
call    printf

And yes, the result -8 is correct. 0 isn't.

It doesn't matter whether we return -16 or -8 here, as both ISO C and POSIX
only require three bits at least. Here they are both conforming as long as
we document our `remquo()` as returning only three bits into `quo`.
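
To make the three-bit rule concrete, here is a small sketch of how an exact
integral quotient could be reduced to the value stored in `quo`. The helper
name `encode_quo` is hypothetical and this is not the mingw-w64 code; it
assumes n = 3 and picks -8 (rather than -16) for negative multiples of 8 so
that the sign survives.

```c
/* Hypothetical helper: reduce an exact integral quotient to the value
   stored through `quo`, keeping three low bits and the sign. */
int encode_quo(long long quotient)
{
    /* Take the magnitude in unsigned arithmetic to avoid overflow. */
    unsigned long long magnitude = quotient < 0
        ? 0ULL - (unsigned long long)quotient
        : (unsigned long long)quotient;
    int low_bits = (int)(magnitude % 8u);

    if (quotient >= 0)
        return low_bits;
    /* 8 is congruent to 0 modulo 8, so -8 keeps the sign visible. */
    return low_bits != 0 ? -low_bits : -8;
}
```

With this rule, a quotient of -8 is reported as -8 instead of 0, while -7 is
still reported as -7.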

--   
Best regards,
lh_mouse
2016-09-12

-
From: "K. Frank" <kfrank2...@gmail.com>
Date: 2016-09-12 22:23
To: mingw64
Cc:
Subject: Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

Hello Lefty!

I do think you have found a bug here, and it does appear to
be in the mingw-w64 code.  Disclaimer: I don't understand
this completely.

Further comments in line, below.

On Tue, Sep 6, 2016 at 11:52 PM, lhmouse <lh_mo...@126.com> wrote:
> More likely a bug in mingw-w64:
>
> #include <stdio.h>
> #include <math.h>
> volatile double x = 10.001000, y = -1.299000;
> int main(){
> int quo;
> double rem = remquo(x, y, &quo);
> printf("rem = %f, quo = %d\n", rem, quo);
> }
>
> With mingw-w64 this program gives the following output:
>
> E:\Desktop>gcc test.c
>
> E:\Desktop>a
> rem = -0.391000, quo = 0

I get the same result as you in a c++ test program using:

   g++ (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.2

> However, according to ISO C11 draft:
>
>> 7.12.10.3 The remquo functions
>> 2 The remquo functions compute the same remainder as the remainder
>> functions. In the object pointed to by quo they store a value whose sign
>> is the sign of x/y and whose magnitude is congruent modulo 2^n to the
>> magnitude of the integral quotient of x/y, where n is an
>> implementation-defined integer greater than or equal to 3.
>
> the value stored in `quo` must have the same sign as `x/y`.
>
> In the above example, since `x/y`, which is about -7.699, is negative,
> returning a non-negative value (zero in the above example) should be a bug.

I agree with lh_mouse's reading of the standard, and that
quo should be negative to match the sign of x / y.

Here is my imperfect analysis of what is going on.

First, I found a copy of remquo.S here:

https://sourceforge.net/p/mingw-w64/code/6570/tree//stable/v1.x/mingw-w64-crt/math/remquo.S

(but I don't understand it).

I also found a "softmath" copy of remquo.c here:

https://github.com/msys2/mingw-w64/blob/master/mingw-w64-crt/math/softmath/remquo.c

(I have no idea whether remquo.c is equivalent in detail to
remquo.S.)

   #include "softmath_private.h"

   double remquo(double x, double y, int *quo)
   {
   double r;

   if (isnan(x) || isnan(y)) return NAN;
   if (y == 0.0)
   {
   errno = EDOM;
   __mingw_raise_matherr (_DOMAIN, "remquo", x, y, NAN);
   return NAN;
   }

   r = remainder(x, y);
   if (quo) *quo = (int)((x - r) / y) % 8;
   return r;
   }

First, the expression "(int)((x - r) / y)" is undefined
behavior when (x - r) / y is too large for an int.  (This
can easily happen with floats and doubles.)  (remquo.S
uses the intel floating-point instruction fprem1, and
therefore -- if written correctly -- should not have this
problem.)

But ignoring the possible integer overflow, the error here,
which is the result lh_mouse gets in his test, is that if

   (int)((x - r) / y)

is a multiple of 8, then

  (int)((x - r) / y) % 8

will evaluate to zero, losing the sign information.

In lh_mouse's test case

   x / y = -7.699

which rounds-to-nearest to -8, which equals 0 mod 8.

How might one fix this (in c code)?  My reading of the
standard says that quo doesn't have to be exactly the
last three bits -- or even the last n bits -- of

   (int)((x - r) / y)

Rather, it only has to be congruent to this mod 8.

So (ignoring overflow in the integer conversion), one
could do something like this:

r = remainder(x, y);
if (quo) *quo = (int)((x - r) / y) % 8;
if (quo  &&  *quo == 0  &&  x/y < 0.0)  *quo = -16;
return r;

(Here, we are deeming the sign of 0 to be positive.  I don't
know whether this would be language-lawyer consistent with
t

Re: [Mingw-w64-public] [PATCH] [FIXED_UP] Added standard-conforming fmaf(), fma() and fmal() functions.

2016-09-09 Thread lhmouse

Please discard the previous patch as it does not handle denormal numbers 
correctly.
Sorry for that.
---

From 890aab4fc63264074b6e1e16f5bd64e3c4a6f795 Mon Sep 17 00:00:00 2001
From: lhmouse <lh_mo...@126.com>
Date: Fri, 9 Sep 2016 22:52:26 +0800
Subject: [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.

Signed-off-by: lhmouse <lh_mo...@126.com>
---
 mingw-w64-crt/Makefile.am |  4 +--
 mingw-w64-crt/math/fma.S  | 42 
 mingw-w64-crt/math/fma.c  | 12 +++
 mingw-w64-crt/math/fmaf.S | 43 
 mingw-w64-crt/math/fmaf.c | 10 ++
 mingw-w64-crt/math/fmal.c | 84 ++-
 6 files changed, 107 insertions(+), 88 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 886fcf0..1c6e534 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -244,7 +244,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -252,7 +251,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c   math/fdimf.c math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c   math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c   math/frexpf.c \
   math/hypotf.c math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..0000000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 0000000..3703e00
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,12 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+long double fmal ( long double _x,  long double _y,  long double _z);
+
+double
+fma ( double _x,  double _y,  double _z)
+{
+  return (double)fmal(_x, _y, _z);
+}
diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S
deleted file mode 100644
index 6bc7ef0..0000000
--- a/mingw-w64-crt/math/fmaf.S
+++ /dev/null
@@ -1,43 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fmaf.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 2
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fmaf)
-   .def    __MINGW_USYMBOL(fmaf);  .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fmaf):
-#if defined(_AMD64_) || defined(__x8

[Mingw-w64-public] [PATCH] [FIXED_UP] Added standard-conforming fmaf(), fma() and fmal() functions.

2016-09-09 Thread lhmouse

From 52dd6b38d01e1f30bf1821a2621d707d07ec8f15 Mon Sep 17 00:00:00 2001
From: lhmouse <lh_mo...@126.com>
Date: Fri, 9 Sep 2016 22:52:26 +0800
Subject: [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.

Signed-off-by: lhmouse <lh_mo...@126.com>
---
 mingw-w64-crt/Makefile.am |  4 +--
 mingw-w64-crt/math/fma.S  | 42 -
 mingw-w64-crt/math/fma.c  | 12 +++
 mingw-w64-crt/math/fmaf.S | 43 -
 mingw-w64-crt/math/fmaf.c | 10 ++
 mingw-w64-crt/math/fmal.c | 80 ++-
 6 files changed, 103 insertions(+), 88 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 886fcf0..1c6e534 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -244,7 +244,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -252,7 +251,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c   math/fdimf.c math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c   math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c   math/frexpf.c \
   math/hypotf.c math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..0000000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 0000000..3703e00
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,12 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+long double fmal ( long double _x,  long double _y,  long double _z);
+
+double
+fma ( double _x,  double _y,  double _z)
+{
+  return (double)fmal(_x, _y, _z);
+}
diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S
deleted file mode 100644
index 6bc7ef0..0000000
--- a/mingw-w64-crt/math/fmaf.S
+++ /dev/null
@@ -1,43 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fmaf.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 2
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fmaf)
-   .def    __MINGW_USYMBOL(fmaf);  .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fmaf):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movss   %xmm0,(%rsp)
-   movss   %xmm1,16(%rsp)
-

[Mingw-w64-public] Wrong output from %La specifier in printf()?

2016-09-08 Thread lhmouse

When the `%La` specifier is used in `printf()` to format
the C99 hexadecimal floating point value `0x5p-80l`,
a wrong result is generated, as shown in this example:

E:\Desktop>cat test.c
extern int __mingw_printf(const char *, ...);
int main(){
__mingw_printf("%La\n", 0x5p-80l);
}

E:\Desktop>gcc test.c -std=c99

E:\Desktop>a.exe
0x0p-141

Removing the `__mingw_` prefix and testing the same program
on Linux gives the correct result:

lh_mouse@lhmouse-dev:~$ uname -a
Linux lhmouse-dev 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) x86_64 GNU/Linux
lh_mouse@lhmouse-dev:~$ gcc test.c -std=c99
lh_mouse@lhmouse-dev:~$ ./a.out
0xap-81

Is this a bug in `printf()`?
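
One way to check a `%La` implementation without depending on the exact digits
it prints (the standard allows several equivalent spellings, such as `0xap-81`
or `0x5p-80`) is to parse the text back and compare values. The sketch below
uses only standard `snprintf()` and `strtold()`; the function name is made up
for illustration, and which `printf` family mingw-w64 routes `snprintf()` to
depends on the build configuration, so treat it as a test idea rather than a
diagnosis.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if formatting `value` with %La and parsing the text back
   yields exactly the same long double, 0 otherwise. */
int hexfloat_round_trips(long double value)
{
    char text[64];
    int length = snprintf(text, sizeof text, "%La", value);

    if (length < 0 || (size_t)length >= sizeof text)
        return 0;
    return strtold(text, NULL) == value;
}
```

A correct implementation should return 1 for `0x5p-80L`; an output such as
`0x0p-141` would parse back to zero and make the check fail.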

--
Best regards,
lh_mouse
2016-09-09



Re: [Mingw-w64-public] [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.

2016-09-08 Thread lhmouse

Oops. Are there any volunteers to implement `fma()` functions for ARM ?

--   
Best regards,
lh_mouse
2016-09-08



[Mingw-w64-public] [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.

2016-09-08 Thread lhmouse

---
 mingw-w64-crt/Makefile.am |  4 ++--
 mingw-w64-crt/math/fma.S  | 42 
 mingw-w64-crt/math/fma.c  | 12 ++
 mingw-w64-crt/math/fmaf.S | 43 -
 mingw-w64-crt/math/fmaf.c | 10 
 mingw-w64-crt/math/fmal.c | 61 ++-
 6 files changed, 84 insertions(+), 88 deletions(-)
 delete mode 100644 mingw-w64-crt/math/fma.S
 create mode 100644 mingw-w64-crt/math/fma.c
 delete mode 100644 mingw-w64-crt/math/fmaf.S
 create mode 100644 mingw-w64-crt/math/fmaf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index 886fcf0..1c6e534 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -244,7 +244,6 @@ src_libmingwex=\
   \
   math/_chgsignl.S  math/ceil.S  math/ceilf.S  math/ceill.S  math/copysignl.S \
   math/floor.S  math/floorf.S  math/floorl.S \
-  math/fma.S  math/fmaf.S \
   math/nearbyint.S  math/nearbyintf.S  math/nearbyintl.S \
   math/trunc.S  math/truncf.S  \
   math/cbrt.c   \
@@ -252,7 +251,8 @@ src_libmingwex=\
   math/coshf.c  math/coshl.c   math/erfl.c   \
   math/expf.c   \
   math/fabs.c   math/fabsf.c   math/fabsl.c  math/fdim.c   math/fdimf.c math/fdiml.c \
-  math/fmal.c   math/fmax.c  math/fmaxf.c  math/fmaxl.c  math/fmin.c  math/fminf.c \
+  math/fma.c  math/fmaf.c  math/fmal.c   \
+  math/fmax.c   math/fmaxf.c   math/fmaxl.c  math/fmin.c   math/fminf.c \
   math/fminl.c  math/fp_consts.c   math/fp_constsf.c \
   math/fp_constsl.c math/fpclassify.c  math/fpclassifyf.c  math/fpclassifyl.c   math/frexpf.c \
   math/hypotf.c math/hypot.c  math/hypotl.c  math/isnan.c  math/isnanf.c  math/isnanl.c \
diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S
deleted file mode 100644
index 74becde..0000000
--- a/mingw-w64-crt/math/fma.S
+++ /dev/null
@@ -1,42 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fma.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 4
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fma)
-   .def    __MINGW_USYMBOL(fma);   .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fma):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movsd   %xmm0,(%rsp)
-   movsd   %xmm1,16(%rsp)
-   movsd   %xmm2,32(%rsp)
-   fldl    (%rsp)
-   fmull   16(%rsp)
-   fldl    32(%rsp)
-   faddp
-   fstpl   (%rsp)
-   movsd   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacd d2, d0, d1
-   fcpyd d0, d2
-   bx  lr
-#elif defined(_X86_) || defined(__i386__)
-   fldl    4(%esp)
-   fmull   12(%esp)
-   fldl    20(%esp)
-   faddp
-   ret
-#endif
diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c
new file mode 100644
index 0000000..3703e00
--- /dev/null
+++ b/mingw-w64-crt/math/fma.c
@@ -0,0 +1,12 @@
+/**
+ * This file has no copyright assigned and is placed in the Public Domain.
+ * This file is part of the mingw-w64 runtime package.
+ * No warranty is given; refer to the file DISCLAIMER.PD within this package.
+ */
+long double fmal ( long double _x,  long double _y,  long double _z);
+
+double
+fma ( double _x,  double _y,  double _z)
+{
+  return (double)fmal(_x, _y, _z);
+}
diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S
deleted file mode 100644
index 6bc7ef0..0000000
--- a/mingw-w64-crt/math/fmaf.S
+++ /dev/null
@@ -1,43 +0,0 @@
-/**
- * This file has no copyright assigned and is placed in the Public Domain.
- * This file is part of the mingw-w64 runtime package.
- * No warranty is given; refer to the file DISCLAIMER.PD within this package.
- */
-#include <_mingw_mac.h>
-
-   .file   "fmaf.S"
-   .text
-#ifdef __x86_64__
-   .align 8
-#else
-   .align 2
-#endif
-   .p2align 4,,15
-   .globl __MINGW_USYMBOL(fmaf)
-   .def    __MINGW_USYMBOL(fmaf);  .scl    2;  .type   32; .endef
-__MINGW_USYMBOL(fmaf):
-#if defined(_AMD64_) || defined(__x86_64__)
-   subq    $56, %rsp
-   movss   %xmm0,(%rsp)
-   movss   %xmm1,16(%rsp)
-   movss   %xmm2,32(%rsp)
-   flds    (%rsp)
-   fmuls   16(%rsp)
-   flds    32(%rsp)
-   faddp
-   fstps   (%rsp)
-   movss   (%rsp),%xmm0
-   addq    $56, %rsp
-   ret
-#elif defined(_ARM_) || defined(__arm__)
-   fmacs s2, s0, s1
-   fcpys s0, s2
-   bx

Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

2016-09-08 Thread lhmouse

Thanks for such nice work!
I hope someone would accept it. Kai has been away for days.

--   
Best regards,
lh_mouse
2016-09-08

-
From: Thomas Bickel <tmb...@gmail.com>
Date: 2016-09-08 21:01
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem



On 07.09.2016 17:21, lhmouse wrote:
> (I don't write AT&T assembly so I am unable to make a patch.
> Nevertheless I hope someone who writes AT&T assembly could fix it.)

I don't have time to write a patch but I can donate some code that AFAIK does
what you need for the sin functions.

 >gcc -m32 sinl32.s sin.c
 >a
sinl = -5.421010862427522170037264e-020
 my_sinl = -0.e+000
sin = 1.224606353822377258211418e-016
 my_sin = 1.225148454908620010428422e-016
sinf = -8.7422776573475858e-008
 my_sinf = -8.7422780003674585e-008

 >gcc -m64 sinl64.s sin.c
 >a
sinl = -5.421010862427522170037264e-020
 my_sinl = -0.e+000
sin = 1.224606353822377258211418e-016
 my_sin = 1.225148454908620010428422e-016
sinf = -8.7422776573475858e-008
 my_sinf = -8.7422776573475858e-008



Regards
Thomas






[Mingw-w64-public] `fma()` functions are completely wrong in mingw-w64

2016-09-08 Thread lhmouse

Reading `mingw-w64/mingw-w64-crt/math/fmal.c`:

long double
fmal ( long double _x,  long double _y,  long double _z)
{
  return ((_x * _y) + _z);
}

This implementation is completely wrong.

https://en.wikipedia.org/wiki/Multiply–accumulate_operation#Fused_multiply.E2.80.93add
The multiplication in a single FMA operation must behave as if
the result had infinite precision. That is, multiplying
two x87-extended-precision floating point numbers
(1 sign + 15 exp + 64 frac = 80 bits) yields a result of 144 bits
(1 sign + 15 exp + 64 frac * 2 = 144 bits).

For example, with a conforming `fmal()`, the expression
`fmal(1.2l, 3.4l, -3.00010l)` shall yield
approximately `8e-18`, because `1.2l * 3.4l` yields
`3.000108l`. But in mingw-w64, this intermediate result
is truncated when converted to `long double`, yielding `3.0001l`,
and adding `-3.00010l` to it yields zero.

Since x87 does not have 128-bit registers, FMA must be done in software:
1. Split both factors into higher and lower parts. Since a `long double`
has 64 significant bits (it does not have a hidden bit), each of the two
parts has to have 32 bits so that no precision is lost when multiplying them.
2. Keeping in mind that `(a+b)(c+d)=ac+ad+bc+bd`, calculate the sum
IN THE FOLLOWING ORDER:

long double ret = z;
ret += xhi * yhi;
ret += xhi * ylo + xlo * yhi;
ret += xlo * ylo;
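
The splitting step can be modelled on plain integers. The sketch below is not
the floating-point implementation linked below; it only treats the two 64-bit
significands as unsigned integers (an assumption made for illustration, as are
the names `wide_product` and `mul_significands`) and shows that four 32x32-bit
partial products reproduce the full 128-bit product without loss:

```c
#include <stdint.h>

/* 128-bit product of two 64-bit significands, as two 64-bit halves. */
struct wide_product {
    uint64_t hi;
    uint64_t lo;
};

struct wide_product mul_significands(uint64_t x, uint64_t y)
{
    /* Split each factor into 32-bit higher and lower parts. */
    uint64_t xhi = x >> 32, xlo = x & 0xFFFFFFFFu;
    uint64_t yhi = y >> 32, ylo = y & 0xFFFFFFFFu;

    /* (xhi + xlo)(yhi + ylo) = hh + hl + lh + ll; each fits in 64 bits. */
    uint64_t hh = xhi * yhi;
    uint64_t hl = xhi * ylo;
    uint64_t lh = xlo * yhi;
    uint64_t ll = xlo * ylo;

    /* Sum the middle column; at most three 32-bit values, so no overflow. */
    uint64_t mid = (ll >> 32) + (hl & 0xFFFFFFFFu) + (lh & 0xFFFFFFFFu);

    struct wide_product result;
    result.lo = (mid << 32) | (ll & 0xFFFFFFFFu);
    result.hi = hh + (hl >> 32) + (lh >> 32) + (mid >> 32);
    return result;
}
```

A real `fmal()` additionally has to track exponents and signs and round once
at the end, which this integer model leaves out.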


A conforming implementation can be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/fma.c

--
Best regards,
lh_mouse
2016-09-08



Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

2016-09-08 Thread lhmouse

It is merely a function guaranteed to be declared implicitly
(thus requires no <math.h>) and has the same semantics as
the standard function `sinl()`. The GCC optimizer can perform
certain types of optimization such as constant folding and inlining
only if `sinl()` is supposed to do the same thing as specified
by the C standard, an assumption which can be explicitly disabled using
`-fno-builtin` or `-ffreestanding`. AFAICS there is otherwise
no difference. `__builtin_sinl()` may result in a call to `sinl()`.

--   
Best regards,
lh_mouse
2016-09-08

-
From: NightStrike
Date: 2016-09-08 15:06
To: mingw-w64-public@lists.sourceforge.net
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

What does gcc's __builtin_sinl() do?



Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

2016-09-07 Thread lhmouse

If performance is the problem there are a number of solutions such as
inline assembly, static lookup tables, etc. `fsinl()` is apparently not one
of them.
But yes, I am all ears.

--   
Best regards,
lh_mouse
2016-09-08

-
From: Riot
Date: 2016-09-08 04:00
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

Some of us (game developers especially) greatly prefer a minor inaccuracy
to a potentially major slowdown; I would personally oppose this change, as
you're noticeably increasing the cost of something that's used heavily in
tightly looped code.  Perhaps an appropriately named #ifdef switch would be
a way to please everyone here?

Regards,
Riot



[Mingw-w64-public] sinl/cosl/tanl accuracy problem

2016-09-07 Thread lhmouse

(I don't write AT&T assembly so I am unable to make a patch.
Nevertheless I hope someone who writes AT&T assembly could fix it.)

The x87 `fsin` instruction has suffered from an accuracy problem
for decades, which is described in this article:
https://software.intel.com/blogs/2014/10/09/fsin-documentation-improvements-in-the-intel-64-and-ia-32-architectures-software

Long story short: Before we can calculate `sin(x)`, we have to reduce `x`
such that it falls in (-π/2,π/2]. This can be easily done by dividing `x`
by π and taking the remainder. The problem is that, instead of
a reasonably accurate value of π, `FSIN` uses a 66-bit approximation
as the divisor, which makes the result very inaccurate if `x` is close to
some multiple of π, because the remainder ends up with most of
its upper bits being zero and very few significant bits left.

To work around this, as the article suggests, it is essential
to reduce `x` before executing the `fsin` instruction. This is done as follows:

1. Use the `FLDPI` instruction to get an accurate value of π.
2. Run the `FPREM1` instruction repeatedly until the _C2_ bit in the FPU status word
is cleared. The resulting remainder will be in (-π/2,π/2], and
the _C0_,_C3_,_C1_ bits are the three least significant bits of
the quotient, from left to right.
3. Calculate the sine value using the `FSIN` instruction. This never fails.
4. Since `sin(x) = -sin(x+kπ)` when `k` is odd and
`sin(x) = sin(x+kπ)` when `k` is even, and the parity bit of the quotient
is the _C1_ bit in the FPU status word, negate the result with the
`FCHS` instruction if that bit is set. We now have the sine value.

The above process is the same for cosine.
In the case of tangent, step 4 should be removed.


The following code fragment compares `sinl` from mingw-w64 and
my own implementation:

volatile auto one = 1.0l;
auto theta = atanl(one) * 4; // This function is from mingw-w64.
std::printf("sinl   (theta) = %.16Le\n", sinl   (theta));
std::printf("my_sinl(theta) = %.16Le\n", my_sinl(theta));

It produces the following result:

sinl   (theta) = 1.6263032587282567e-019
my_sinl(theta) = -0.e+000

My implementation can be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/sin.c#L12

static inline long double fpu_sin(long double x){
    unsigned fsw;
    const long double reduced = __MCFCRT_fremainder(&fsw, x, __MCFCRT_fldpi());
    long double ret = __MCFCRT_fsin_unsafe(reduced);
    if(fsw & 0x0200){ // C1 bit: the quotient is odd
        ret = __MCFCRT_fneg(ret);
    }
    return ret;
}

--
Best regards,
lh_mouse
2016-09-07



Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

2016-09-06 Thread lhmouse

More likely a bug in mingw-w64:

#include <math.h>
#include <stdio.h>
volatile double x = 10.001000, y = -1.299000;
int main(){
    int quo;
    double rem = remquo(x, y, &quo);
    printf("rem = %f, quo = %d\n", rem, quo);
}

With mingw-w64 this program gives the following output:

E:\Desktop>gcc test.c

E:\Desktop>a
rem = -0.391000, quo = 0

However, according to ISO C11 draft:

> 7.12.10.3 The remquo functions
> 2 The remquo functions compute the same remainder as the remainder functions. In
> the object pointed to by quo they store a value whose sign is the sign of x/y and whose
> magnitude is congruent modulo 2^n to the magnitude of the integral quotient of x/y, where
> n is an implementation-defined integer greater than or equal to 3.

the value stored in `quo` must have the same sign as `x/y`.

In the above example, since `x/y`, which is about -7.699, is negative,
returning a non-negative value (zero in the above example) should be a bug.

--   
Best regards,
lh_mouse
2016-09-07





Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

2016-09-05 Thread lhmouse

The disagreement between glibc and mingw-w64 is (in my opinion) definitely glibc's bug:

lh_mouse@lhmouse-dev:~$ cat test3.c
#include <math.h>
#include <stdio.h>

int main(){
    double x = 10.001000;
    double y = 0.701000;
    int quo;
    double rem = remquo(x, y, &quo);
    printf("%f %f %d %f\n", x, y, quo, rem);
}
lh_mouse@lhmouse-dev:~$ gcc test3.c -lm -O0 && ./a.out # use glibc
10.001000 0.701000 8 4.393000
lh_mouse@lhmouse-dev:~$ gcc test3.c -lm -O2 && ./a.out # performs constant folding
10.001000 0.701000 14 0.187000
lh_mouse@lhmouse-dev:~$

The remainder of `remquo` from mingw-w64 seems all right.
However the value (or rather, the 3 least significant bits) returned in the third parameter
still seems problematic.

--   
Best regards,
lh_mouse
2016-09-06


Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

2016-09-05 Thread lhmouse

Found an example on cppreference:
http://en.cppreference.com/w/cpp/numeric/math/remquo

The example shows that, since `cos()` is periodic,
adding 1 * PI to its parameter doesn't change the result.

But, we can also say that, subtracting 1 * PI
from its parameter should not change the result either. However,
with mingw-w64 and MSVCRT, it DOES change the result,
as shown on the last line:

E:\Desktop>g++ test.cpp -std=c++14

E:\Desktop>a.exe
cos(pi * -0.25) = 0.707107
cos(pi * -1.25) = -0.707107
cos(pi * -1.25) = 0.707123
cos(pi * -10001.25) = -0.707117
cos(pi * -1.25) = 0.707107
cos(pi * -10001.25) = 0.707107

This could be a potential bug.

--   
Best regards,
lh_mouse
2016-09-05


[Mingw-w64-public] Wrong quotient results of `remquo()`?

2016-09-05 Thread lhmouse

Hello guys,
I was testing my `remquo()` implementation when I found that `remquo`
on Linux (using glibc) and on Windows (using mingw-w64) generate
different results. I don't think this is the correct behavior. Any ideas?

The test cases in the file `remquo.txt` in the attached zip file were generated
on my VPS running Debian. MinGW-w64 is failing some of them:

E:\Desktop\remquo_test>gcc test.c -std=c99 && a.exe > nul
passed: 37864
failed: 2537

--
Best regards,
lh_mouse
2016-09-05

Re: [Mingw-w64-public] Building mingw-w64 and include paths

2016-08-26 Thread lhmouse

I haven't had a deep look at the patch, but AFAICS the problem is a simple matter
that can be solved by adding `-nostdinc -I"path/to/the/directory/of/new/headers"`
to `CPPFLAGS` when building the CRT.

--   
Best regards,
lh_mouse
2016-08-26

-
From: David Wohlferd
Date: 2016-08-23 07:34
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] Building mingw-w64 and include paths

Let's try this again.  This time the proposed patch is attached which 
may help.  For 'ease of review,' this patch does not include the 
'generated' makefile.in files.

The problem I am trying to fix is that when building mingw-w64, the 
compiler will often use headers from the Tools Directories instead of 
the mingw-w64 source directory.  This is due to the fact that while some 
mingw-w64 SourceDir paths are passed to the compiler, several are not.  
This causes 3 problems:

1) In order for me to make and test changes to intrin.h (for example), I 
have to remember to copy it to ToolsDir after every change.

2) Users who want to build mingw-w64 have to *know* that they need to 
copy certain files (from a variety of locations) to their ToolsDir 
before trying to build mingw-w64.  While the build may succeed without 
this, the results will be uncertain.

3) Header files that are in ToolsDir are usually treated as 'System 
Headers' (https://gcc.gnu.org/onlinedocs/cpp/System-Headers.html). Among 
other things, this means that warnings in these files tend to get 
suppressed rather than found and fixed.

While there are multiple ways to fix this, I have chosen to add the 
necessary paths to Makefile.am.  I also needed to duplicate 2 files 
(_mingw_directx.h _mingw_ddk.h) into appropriately named directories so 
that _mingw.h's #include "sdks/_mingw_directx.h" would work (Is there a 
better way to resolve this?  I can't just generate them into the sdks 
directory, since 'distribution' needs them where they are.).

Kai has suggested that we modify the configure script to copy all the 
header files to a single directory (a la 'make install') and use that 
single directory instead of the 9 above.  My scripting skills are not 
sufficient to do this.  If this is how we want to proceed, someone else 
will need to write it.

dw



Re: [Mingw-w64-public] Does MinGW support Signals and sigset_t ?

2016-08-01 Thread lhmouse

SSE2 is mandatory on x64.

On x86 you probably want to take a look at Vectored Exception Handling
which needs no compiler specific magic. However the canonical way to 
test a CPU for SSE support would be using the CPUID instruction.
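
A minimal sketch of the CPUID approach, assuming GCC or Clang and their
`<cpuid.h>` helper header. The function name `cpu_has_sse2` is made up for
this example. Note that CPUID reports what the CPU advertises; whether the
OS has enabled the SSE state is a separate question.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed available */
#endif

/* Returns 1 if CPUID reports SSE2, 0 if it does not, and -1 if the
   query is not possible (non-x86 target or CPUID leaf 1 unavailable). */
static int cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return (edx >> 26) & 1;   /* CPUID.01H:EDX bit 26 = SSE2 */
#else
    return -1;
#endif
}
```

On x64 this is expected to return 1, since SSE2 is mandatory there.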

--   
Best regards,
lh_mouse
2016-08-01

-
From: Jeffrey Walton
Date: 2016-08-01 19:32
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] Does MinGW support Signals and sigset_t ?

On Mon, Aug 1, 2016 at 5:30 AM, LRN  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> On 01.08.2016 11:25, Jeffrey Walton wrote:
>
>> My question is, does MinGW support Signals and sigset_t ?
>
> No.
>
> Or does MinGW use SEH and try/except/finally?
>
> No.

Perfect, thanks.

Our problem is runtime feature detection. X86/X64 can result in an
illegal instruction for SSE2 *if* the OS does not support it. We use
SEH on Windows and SigIll handler on Unix/Linux to guard the code that
performs the probe.

I realize the OSes that will trap are basically extinct. However, our
governance dictates we still support them. It's not too difficult in
practice.

My next question is, what are the MinGW alternatives to guard the code
if neither SEH nor Signals are available?

Jeff
