Re: [HACKERS] Crash with old Windows on new CPU

2016-03-10 Thread Christian Ullrich

* Magnus Hagander wrote:


I did notice the #ifdef's are actually different in the header and body
section of the patch, which seems wrong. I used the one from the actual
implementation (_M_AMD64) for the header includes as, and also merged the
#ifdef's together to a single #if in each section. Please verify!


Should be fine. I remember I had a reason for doing it this way, but it 
can't have been a very good one. Thanks for cleaning it up.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-10 Thread Magnus Hagander
On Wed, Mar 9, 2016 at 5:48 AM, Christian Ullrich 
wrote:

> * Peter Eisentraut wrote:
>
> On 2/12/16 11:24 AM, Christian Ullrich wrote:
>>
>
> Otherwise, it may be time to update the manual (15.6 Supported
>>> Platforms) where it says PostgreSQL "can be expected to work on these
>>> operating systems: [...] Windows (Win2000 SP4 and later), [...]".
>>> Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
>>> running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
>>> later)"?
>>>
>>
> Wouldn't the fix be for users to upgrade their service packs?
>>
>
> Windows 2008 and 2008R2 are entirely different things: 2008 is the server
> sibling of Vista (internal version 6.0), 2008R2 is that of Windows 7 (6.1).
> There is no version of 2008 that supports AVX2.
>
>
Windows 2008 went out of mainstream support in January last year, but is on
extended support until 2020. Extended support means both paid support and
security support is still there (what you don't get is new hardware support
and generic non-security related updates). That means we're going to see
these versions around for a long time.  (And Vista is in extended support
until 2017).

And exactly the type of upgrade scenario outlined in the OP is only going
to be come more common as old hardware gets replaced. If we didn't patch
this, the reasonable thing would be to say we don't support Visual Studio
2013, rather than a specific version of Windows, I think.

Given the small and localized change of this (and hey it even goes inside a
function labeled hacks), I definitely think it's worth doing.

I've commited this one, including a reference to the MS bug report where
they say they're not going to fix it in existing versions. I didn't
actually test it on mingw, but as it doesn't define MSC_VER it shouldn't be
affected (by the bug or by the patch).

I did notice the #ifdef's are actually different in the header and body
section of the patch, which seems wrong. I used the one from the actual
implementation (_M_AMD64) for the header includes as, and also merged the
#ifdef's together to a single #if in each section. Please verify!

Thanks for a very good analysis and patch, and for good explanations of the
details! :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Christian Ullrich

* Magnus Hagander wrote:


On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
wrote:


* Magnus Hagander wrote:



How does this work wrt mingw, though? Do we have the same problem there?
AIUI this code can never run on mingw, correct?


Not unless mingw defines _MSC_VER.


The question is then - should it, under some conditions?


I have no clue about MinGW, so all I can say is that msvcrt.dll (even in 
Windows 10) does not use the problematic instructions, nor does it 
export anything with "FMA" in its name. If there is nothing to turn off, 
and nothing to turn it off with, then I suspect we are safe there. (This 
is based on a minimal test program, just to see where MinGW takes its 
log() from; it comes from msvcrt.)


Also, we have confirmation, in the link I sent when I started this 
thread, that the bug is only and specifically in the VS2013 CRT, not 
anywhere else before or since.



I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
be a separate check for server-windows?



No, that is fine. I think it's just to keep the function name from getting
too ridiculously long. The functions in  are all named
for the client versions only, and only check the version number, not the
client/server capability flags. Or, rather, there is a separate function to
determine that.



Presumably the link is documented somewhere (the docs don't seem to say
anything about it).


 
("IsWindows7SP1OrGreater function"):


> This function does not differentiate between client and server
> releases. It will return true if the current OS version number is
> equal to or higher than the version of the client named in the call.
> For example, a call to IsWindowsXPSP3OrGreater will return true on
> Windows Server 2008. Applications that need to distinguish between
> server and client versions of Windows should call IsWindowsServer.

The internal version numbers (since NT4) are:

 Version | Client | Server
-++
 4.0 | NT4| NT4
 5.0 | 2000   | 2000
 5.1 | XP |
 5.2 || 2003
 6.0 | Vista  | 2008
 6.1 | 7  | 2008R2
 6.2 | 8  | 2012
 6.3 | 8.1| 2012R2
10.0 | 10 | [2016]

The relevant SDK header (), where constants for the version 
numbers are defined, also only has entries named for the client 
versions, with the sole exception of 2008, with the same value as Vista, 
and the special case of 2003 with no equivalent client version (unless 
you count the second attempt at an XP x64 ... but I digress again).


If Microsoft wanted the added complexity of separate symbols for what is 
basically the same code, I rather think they would have them.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Alvaro Herrera
Magnus Hagander wrote:
> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
> wrote:

> > And apparently not a single one with VS 2013. OK, I'll see what I can do
> > about setting some up soonish, at least with (server) 2008 and (client) 7.
> > FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
> > now, with no complaints.
> 
> Having that added to the buildfarm would definitely be very useful!

Yeah, ideally we would have one buildfarm member for each MSVC compiler
supported by each Windows version.  AFAICS we only have

Windows 8 8.1 Pro Visual Studio 2012 x86_64
Windows Server 2008 R2 (64bit) Visual C++ 2005 AMD64
Windows Server 2003 R2 5.2.3790 MSVC 2005 Express 8.0.50727.42 x86
Vista Ultimate 6.0.6000 MSVC 2005 Pro 8.0.50727.867 x86
Windows XP-PRO SP3 MSVC++ 2008 Express i686

which is pretty unfulfilling.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Magnus Hagander
On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
wrote:

> * Magnus Hagander wrote:
>
> On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
>> wrote:
>>
>
> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
>>>
>>
> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
 the code compiles on *exactly one* MSVC version.

>>>
>>> The bug exists in only that compiler version's CRT, also, that is not the
>>> complete version number. There may be different builds somewhere, but
>>> they
>>> all start with 18.0.
>>>
>>
> IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
>> they don't actually bump that number in different build numbers.
>>
>> How does this work wrt mingw, though? Do we have the same problem there?
>> AIUI this code can never run on mingw, correct?
>>
>
> Not unless mingw defines _MSC_VER.
>

The question is then - should it, under some conditions?



> (If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support
> instead. IMHO everyone should boycott them anyway until they come up with a
> working libc of their own instead of doing unspeakable things to a helpless
> msvcrt.dll that is not intended for use by non-system components at all.
> But I digress.)



Haha, well, at least they're not cygwin...



> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
>> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
>> be a separate check for server-windows?
>>
>
> No, that is fine. I think it's just to keep the function name from getting
> too ridiculously long. The functions in  are all named
> for the client versions only, and only check the version number, not the
> client/server capability flags. Or, rather, there is a separate function to
> determine that.


Presumably the link is documented somewhere (the docs don't seem to say
anything about it).


> That would
 give us some context to estimate the risks of this code executing
 when it's not really needed.

>>>
>>> Hence all the conditions. The problem is *certain* to occur under these
>>> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
>>> when built with VS2013), and under no others, and these conditions flip
>>> the
>>> switch exactly then.
>>>
>>
> Well, it doesn't flip it based on the specifics "running on a CPU with
>> AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a
>> no-op.
>>
>
> Precisely.
>
> Isn't that what the buildfarm is (among other things) for?
>>>
>>
>> The buildfarm doesn't really have a big spread of Windows animals,
>> unfortunately.
>>
>
> And apparently not a single one with VS 2013. OK, I'll see what I can do
> about setting some up soonish, at least with (server) 2008 and (client) 7.
> FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
> now, with no complaints.


Having that added to the buildfarm would definitely be very useful!


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Christian Ullrich

* Magnus Hagander wrote:


On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
wrote:



On February 13, 2016 4:10:34 PM Tom Lane  wrote:



I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
the code compiles on *exactly one* MSVC version.


The bug exists in only that compiler version's CRT, also, that is not the
complete version number. There may be different builds somewhere, but they
all start with 18.0.



IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
they don't actually bump that number in different build numbers.

How does this work wrt mingw, though? Do we have the same problem there?
AIUI this code can never run on mingw, correct?


Not unless mingw defines _MSC_VER.

(If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support 
instead. IMHO everyone should boycott them anyway until they come up 
with a working libc of their own instead of doing unspeakable things to 
a helpless msvcrt.dll that is not intended for use by non-system 
components at all. But I digress.)



I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
be a separate check for server-windows?


No, that is fine. I think it's just to keep the function name from 
getting too ridiculously long. The functions in  are 
all named for the client versions only, and only check the version 
number, not the client/server capability flags. Or, rather, there is a 
separate function to determine that.



That would
give us some context to estimate the risks of this code executing
when it's not really needed.


Hence all the conditions. The problem is *certain* to occur under these
specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
when built with VS2013), and under no others, and these conditions flip the
switch exactly then.



Well, it doesn't flip it based on the specifics "running on a CPU with
AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a
no-op.


Precisely.


Isn't that what the buildfarm is (among other things) for?


The buildfarm doesn't really have a big spread of Windows animals,
unfortunately.


And apparently not a single one with VS 2013. OK, I'll see what I can do 
about setting some up soonish, at least with (server) 2008 and (client) 
7. FWIW, I have a local build of 9.5.1 with this patch in daily use on 
2008 now, with no complaints.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Magnus Hagander
On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
wrote:

> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
>
> > Christian Ullrich  writes:
> >> * Robert Haas wrote:
> >>> Thanks for the report and patch.  Regrettably I haven't the Windows
> >>> knowledge to have any idea whether it's right or wrong, but hopefully
> >>> someone who knows Windows will jump in here.
> >
> >> In commitfest now.
> >
> > FWIW, I'm a tad suspicious of the notion that it's our job to make this
> > case work.  How practical is it really to run a Windows release on
> > unsupported-by-Microsoft hardware --- aren't dozens of other programs
> > going to have the same issue?
>
> Why would the hardware be unsupported? The problem occurs on new CPUs, not
> old ones, and even the OS (2008) is still in extended support until next
> year, IIRC.
>
> > I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> > the code compiles on *exactly one* MSVC version.
>
> The bug exists in only that compiler version's CRT, also, that is not the
> complete version number. There may be different builds somewhere, but they
> all start with 18.0.
>


IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
they don't actually bump that number in different build numbers.


How does this work wrt mingw, though? Do we have the same problem there?
AIUI this code can never run on mingw, correct?


> Lastly, I'd like to see some discussion of what side effects
> > "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> > a magic-elixir-against-crashes-with-no-downsides.
>
> It tells the math library (in the CRT, no separate libm on Windows) not to
> use the AVX2-based implementations of log() and possibly other functions.
> AIUI, FMA means "fused multiply-add" and is apparently something that
> increases performance and accuracy in transcendental functions.
>
> I can check the CRT source later today and figure out exactly what it does.
>
> Also, if you look at the link I sent, you will find that a member of the
> Visual C++ Libraries team at MS is the source for the workaround. They
> probably know what they are doing, present circumstances excepted.
>

I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
be a separate check for server-windows?




> > That would
> > give us some context to estimate the risks of this code executing
> > when it's not really needed.
>
> Hence all the conditions. The problem is *certain* to occur under these
> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
> when built with VS2013), and under no others, and these conditions flip the
> switch exactly then.
>


Well, it doesn't flip it based on the specifics "running on a CPU with
AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a
no-op.



> Without that, I'd be worried that
> > this cure is worse than the disease because it breaks cases that
> > weren't broken before.
>
> Isn't that what the buildfarm is (among other things) for?
>

The buildfarm doesn't really have a big spread of Windows animals,
unfortunately.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-08 Thread Christian Ullrich

* Peter Eisentraut wrote:


On 2/12/16 11:24 AM, Christian Ullrich wrote:



Otherwise, it may be time to update the manual (15.6 Supported
Platforms) where it says PostgreSQL "can be expected to work on these
operating systems: [...] Windows (Win2000 SP4 and later), [...]".
Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
later)"?



Wouldn't the fix be for users to upgrade their service packs?


Windows 2008 and 2008R2 are entirely different things: 2008 is the 
server sibling of Vista (internal version 6.0), 2008R2 is that of 
Windows 7 (6.1). There is no version of 2008 that supports AVX2.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-08 Thread Peter Eisentraut
On 2/12/16 11:24 AM, Christian Ullrich wrote:
> Otherwise, it may be time to update the manual (15.6 Supported
> Platforms) where it says PostgreSQL "can be expected to work on these
> operating systems: [...] Windows (Win2000 SP4 and later), [...]".
> Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
> running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
> later)"?

Wouldn't the fix be for users to upgrade their service packs?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-13 Thread Tom Lane
Christian Ullrich  writes:
> * Robert Haas wrote:
>> Thanks for the report and patch.  Regrettably I haven't the Windows
>> knowledge to have any idea whether it's right or wrong, but hopefully
>> someone who knows Windows will jump in here.

> In commitfest now.

FWIW, I'm a tad suspicious of the notion that it's our job to make this
case work.  How practical is it really to run a Windows release on
unsupported-by-Microsoft hardware --- aren't dozens of other programs
going to have the same issue?

I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
the code compiles on *exactly one* MSVC version.  Maybe that's actually
what's needed, but it sure looks fishy.  And what connection does the
build toolchain version have to the runtime environment anyway?

Likewise, how can we know that !IsWindows7SP1OrGreater() is the exactly
right runtime test?

Lastly, I'd like to see some discussion of what side effects
"_set_FMA3_enable(0);" has ... I rather doubt that it's really
a magic-elixir-against-crashes-with-no-downsides.  That would
give us some context to estimate the risks of this code executing
when it's not really needed.  Without that, I'd be worried that
this cure is worse than the disease because it breaks cases that
weren't broken before.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-13 Thread Christian Ullrich
On February 13, 2016 4:10:34 PM Tom Lane  wrote:

> Christian Ullrich  writes:
>> * Robert Haas wrote:
>>> Thanks for the report and patch.  Regrettably I haven't the Windows
>>> knowledge to have any idea whether it's right or wrong, but hopefully
>>> someone who knows Windows will jump in here.
>
>> In commitfest now.
>
> FWIW, I'm a tad suspicious of the notion that it's our job to make this
> case work.  How practical is it really to run a Windows release on
> unsupported-by-Microsoft hardware --- aren't dozens of other programs
> going to have the same issue?

Why would the hardware be unsupported? The problem occurs on new CPUs, not old 
ones, and even the OS (2008) is still in extended support until next year, IIRC.

> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> the code compiles on *exactly one* MSVC version.  

The bug exists in only that compiler version's CRT, also, that is not the 
complete version number. There may be different builds somewhere, but they all 
start with 18.0. 

After all, MS is in the business of selling new compilers, not maintaining the 
old ones. 

> Maybe that's actually
> what's needed, but it sure looks fishy.  And what connection does the
> build toolchain version have to the runtime environment anyway?

The CRT version is tied to the compiler version. It has mainly to do with 
matching allocators.

> Likewise, how can we know that !IsWindows7SP1OrGreater() is the exactly
> right runtime test?

Because all sources, including Microsoft, say that AVX2 support was added in 
7SP1.

> Lastly, I'd like to see some discussion of what side effects
> "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> a magic-elixir-against-crashes-with-no-downsides.  

It tells the math library (in the CRT, no separate libm on Windows) not to use 
the AVX2-based implementations of log() and possibly other functions. AIUI, FMA 
means "fused multiply-add" and is apparently something that increases 
performance and accuracy in transcendental functions.

I can check the CRT source later today and figure out exactly what it does. 

Also, if you look at the link I sent, you will find that a member of the Visual 
C++ Libraries team at MS is the source for the workaround. They probably know 
what they are doing, present circumstances excepted. 

> That would
> give us some context to estimate the risks of this code executing
> when it's not really needed. 

Hence all the conditions. The problem is *certain* to occur under these 
specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2 when 
built with VS2013), and under no others, and these conditions flip the switch 
exactly then. 

> Without that, I'd be worried that
> this cure is worse than the disease because it breaks cases that
> weren't broken before.

Isn't that what the buildfarm is (among other things) for?

-- 
Christian Ullrich 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-13 Thread Yury Zhuravlev

Tom Lane wrote:

Lastly, I'd like to see some discussion of what side effects
"_set_FMA3_enable(0);" has ... I rather doubt that it's really
a magic-elixir-against-crashes-with-no-downsides.  That would
give us some context to estimate the risks of this code executing
when it's not really needed.  Without that, I'd be worried that
this cure is worse than the disease because it breaks cases that
weren't broken befor


I think I have already proposed to drop MSVC support? :)
Too many problems due to of one compiler...

PS Cry from the heart. I work too much with him. Never mind.

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-13 Thread Christian Ullrich
* From: Christian Ullrich

> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
> 
> > Christian Ullrich  writes:

> > Lastly, I'd like to see some discussion of what side effects
> > "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> > a magic-elixir-against-crashes-with-no-downsides.
> 
> It tells the math library (in the CRT, no separate libm on Windows)
> not to use the AVX2-based implementations of log() and possibly
> other functions. AIUI, FMA means "fused multiply-add" and is
> apparently something that increases performance and accuracy in
> transcendental functions.
> 
> I can check the CRT source later today and figure out exactly what
> it does.

OK, it turns out that the CRT source MS ships is not quite as complete as I 
thought it was (up until 2013, at least), so I had a look at the disassembly. 
When the library initializes, it checks whether the CPU supports the FMA 
instructions by looking at a certain bit in the CPUID result. If that is set, 
it sets a flag to use the FMA instructions. Later, in exp(), log(), pow() and 
the trigonometrical functions, it first checks whether that flag is set, and if 
so, uses the AVX-based implementation. If the flag is not set, it falls back to 
an SSE2-based one. So, yes, that function only and specifically disables the 
use of instructions that do not work in the problematic case.

The bug appears to be that it uses all manner of AVX and AVX2 instructions 
based only on the FMA support flag in CPUID, even though AVX2 has its own bit 
there.

To reiterate: The problem occurs because the library only asks the CPU whether 
it is *able* to perform the AVX instructions, but not whether it is *willing* 
to do so. In this particular situation, the former applies but not the latter, 
because the CPU needs OS support (saving the XMM/YMM registers across context 
switches), and the OS has not declared its support for that.

The downside to disabling the AVX implementations is a performance loss 
compared to using it. I ran a microbenchmark (avg(log(x) from 
generate_series(1,1e8))), and the result was that with FMA enabled, it is ~5.5% 
faster than without.

-- 
Christian


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-13 Thread Craig Ringer
On 13 February 2016 at 23:45, Christian Ullrich 
wrote:


>
> > Maybe that's actually
> > what's needed, but it sure looks fishy.  And what connection does the
> > build toolchain version have to the runtime environment anyway?
>
> The CRT version is tied to the compiler version. It has mainly to do with
> matching allocators.
>

In UNIX terms, the compiler and libc are tightly coupled on Windows and
each executable or shared library links to the C library provided by the
compiler that built the DLL or EXE. It's normal to link a program with
multiple C runtime libraries if that program is composed of one or more
DLLs that were each compiled by a different toolchain and therefore linked
to a different C runtime ... which is common in the Windows binary
distribution model.

"msvcrt.dll" is the best-known C runtime, the one bundled in Windows. It's
ancient and buggy and retained only for backward compatibility. mingw has
to pile hacks upon hacks around it to use it.

Microsoft has been bundling the C runtime libraries with the compiler since
at least Visual Studio 7. Application installers are expected to silently
run an installer for the C runtime library before installing the
application. They do not use the C runtime library from the operating
system.

You might have seen "Visual Studio Redistributable" packages installed by
various app installers. That's what they are - they're making sure the
specific C runtime they are built against is installed.

It's a bit like, on UNIX, each shared library expecting to have a different
libc. One expects libc-2.20.so, one expects libc-2.19.so, etc. Each with
linkage private to the shared library that loaded it. (We have this to a
lesser degree with libgcc, but everyone uses one version of gcc for
building everything on a given system so in practice it makes no
difference.)

This is why applications on Windows must be very careful not to malloc()
memory in one DLL and free() it in another, etc. They're effectively
leaking it and freeing unallocated memory at the same time because the two
C libraries have separate memory management structures.

It lets applications be composed of binary-distributed components from
different vendors, using different toolchains. Want to build new parts of
an app with VS 2017 while compiling your legacy components with VS 7,
linking with a libabc built using mingw32 by a vendor who won't show you
the source, and linking a libxyz built with an ancient version of Borland C
from sources you lost 5 years ago? You can do that.

I don't pretend it's a good idea, but it does have advantages sometimes.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Crash with old Windows on new CPU

2016-02-12 Thread Christian Ullrich

Hello,

I just found a compatibility issue when I was migrating an elderly VM to 
a new host. The VM is running Windows Server 2008 SP2, and it has the 
EDB build of PostgreSQL 9.4.5 on it. (9.4.6 behaves the same.) It is 
also not dependent on running in a VM; it would fail on the hardware as 
well.


Backends (and possibly other processes) crash at the slightest 
provocation, such as "SELECT * FROM pg_stat_activity;" or VACUUM. The 
log says either "exception 0xC005" (segfault) or "exception 
0xC01D" (illegal instruction).


The interesting reason: The old host had a Core-generation CPU, which 
does not support the AVX2 instruction set. The new one has a 
Haswell-generation one, and this one does. The EDB distribution of 9.4 
was built with the Visual Studio 2013 compiler, whose CRT (aka libc) has 
a bug where it uses AVX2 instructions if the *CPU* supports them, but 
does not care whether the *OS* does, and 2008 doesn't. That support was 
added in SP1 for 7/2008R2.


There is a workaround, see 
. 
It consists of a single function call, required only if the OS does not 
support AVX2.


I just tried it, and it appears to work. If there is any interest in 
fixing this, I'll be happy to prepare a patch. (Where would be the best 
place to put a function call from  that has to be done during 
startup of each server process, on Windows only?)


Otherwise, it may be time to update the manual (15.6 Supported 
Platforms) where it says PostgreSQL "can be expected to work on these 
operating systems: [...] Windows (Win2000 SP4 and later), [...]". 
Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when 
running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and 
later)"?


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-12 Thread Christian Ullrich

* Robert Haas wrote:


On Fri, Feb 12, 2016 at 7:26 PM, Christian Ullrich  wrote:



startup_hacks(), I think. Proposed patch attached.


Thanks for the report and patch.  Regrettably I haven't the Windows
knowledge to have any idea whether it's right or wrong, but hopefully
someone who knows Windows will jump in here.


In commitfest now.

--
Christian




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-02-12 Thread Christian Ullrich

* Christian Ullrich wrote:


Backends (and possibly other processes) crash at the slightest
provocation, such as "SELECT * FROM pg_stat_activity;" or VACUUM. The
log says either "exception 0xC005" (segfault) or "exception
0xC01D" (illegal instruction).

The interesting reason: The old host had a Core-generation CPU, which
does not support the AVX2 instruction set. The new one has a
Haswell-generation one, and this one does. The EDB distribution of 9.4
was built with the Visual Studio 2013 compiler, whose CRT (aka libc) has
a bug where it uses AVX2 instructions if the *CPU* supports them, but
does not care whether the *OS* does, and 2008 doesn't. That support was
added in SP1 for 7/2008R2.



I just tried it, and it appears to work. If there is any interest in
fixing this, I'll be happy to prepare a patch. (Where would be the best
place to put a function call from  that has to be done during
startup of each server process, on Windows only?)


startup_hacks(), I think. Proposed patch attached.

--
Christian
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 19176b1..4be21a0 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -26,6 +26,13 @@
 #include 
 #endif
 
+#ifdef _WIN32
+#if _MSC_VER == 1800
+#include 
+#include 
+#endif
+#endif
+
 #include "bootstrap/bootstrap.h"
 #include "common/username.h"
 #include "postmaster/postmaster.h"
@@ -263,6 +270,22 @@ startup_hacks(const char *progname)
 
/* In case of general protection fault, don't show GUI popup 
box */
SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);
+
+#ifdef _M_AMD64
+#if _MSC_VER == 1800
+   /*
+* Avoid crashing in certain floating-point operations if
+* we were compiled for x64 with MS Visual Studio 2013 and
+* are running on Windows prior to 7/2008R2 SP1 on an 
+* AVX2-capable CPU.
+*/
+   if (!IsWindows7SP1OrGreater())
+   {
+   _set_FMA3_enable(0);
+   }
+#endif /* _MSC_VER == 1800 */
+#endif /* _M_AMD64 */
+
}
 #endif   /* WIN32 */
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers