Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-07 Thread Tom Lane
Amit Kapila  writes:
> On Tue, Jun 6, 2017 at 10:14 PM, Tom Lane  wrote:
>> By definition, the address range we're trying to reuse worked successfully
>> in the postmaster process.  I don't see how forcing a specific address
>> could do anything but create an additional risk of postmaster startup
>> failure.

> I think it won't create an additional risk, because the idea is that
> if we fail to map the shm segment at a predefined address, then we
> will allow the system to choose the initial address as we are doing
> now.  So, it can reduce chances of doing retries.

[ shrug... ]  That would just make the patch even more complicated and
hard to test.  And it doesn't do anything to fix the ASLR issue.
Could we get on with trying to test something that *does* fix the
ASLR issue, like the draft patch I posted upthread?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-07 Thread Amit Kapila
On Tue, Jun 6, 2017 at 10:14 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think the idea of retrying process creation (and I definitely agree
>> with Tom and Magnus that we have to retry process creation, not just
>> individual mappings) is a good place to start.  Now if we find that we
>> are having to retry frequently, then I think we might need to try
>> something along the lines of what Andres proposed and what nginx
>> apparently did.  However, any fixed address will be prone to
>> occasional failures (or maybe, on some systems, regular failures) if
>> that particular address happens to get claimed by something.  I don't
>> think we can say that there is any address where that definitely won't
>> happen.  So I would say let's do this retry thing first, and then if
>> that proves inadequate, we can also try moving the mappings to a range
>> where conflicts are less likely.
>
> By definition, the address range we're trying to reuse worked successfully
> in the postmaster process.  I don't see how forcing a specific address
> could do anything but create an additional risk of postmaster startup
> failure.
>

I think it won't create an additional risk, because the idea is that
if we fail to map the shm segment at a predefined address, then we
will allow the system to choose the initial address as we are doing
now.  So, it can reduce chances of doing retries.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-06 Thread Robert Haas
On Tue, Jun 6, 2017 at 1:27 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jun 6, 2017 at 12:44 PM, Tom Lane  wrote:
>>> By definition, the address range we're trying to reuse worked successfully
>>> in the postmaster process.  I don't see how forcing a specific address
>>> could do anything but create an additional risk of postmaster startup
>>> failure.
>
>> If the postmaster picked an address where other things are unlikely to
>> get loaded, then that would increase the chances of child processes
>> finding it available, wouldn't it?
>
> But how would we know that a particular address range is more unlikely
> than others to have a conflict? (And even if we do know that, what
> happens when there is a conflict anyway?)  I sure don't want to be in
> the business of figuring out what to use across all the different Windows
> versions there are, to say nothing of the different antivirus products
> that might be causing the problem.

I'm not quite sure how to respond to this.  How do we know any piece
of information about anything, ever?  Sometimes we figure it out by
looking for relevant sources using, say, Google, and other times we
determine it by experiment or from first principles.  You and Andres
were having a discussion earlier about gathering this exact
information, so apparently you thought it might be possible back then:

https://www.postgresql.org/message-id/4180.1492292046%40sss.pgh.pa.us

Now, that having been said, I agree that no address range is perfectly
safe (and that's why it's good to have retries).  I also agree that
this is likely to be heavily platform-dependent, which is why I wrote
DSM the way that I did instead of (as Noah was advocating) trying to
solve the problem of getting a constant mapping across all processes
in a parallel group.  But since nobody's keen on the idea of trying to
tolerate having the main shared memory segment at different addresses
in different processes, we'll have to come up with some other solution
for that case.  If the retry thing doesn't plug the hole adequately,
trying to put the initial allocation in some place that's less likely
to induce conflicts seems like the next thing to try.

> Also, the big picture here is that we ought to be working towards allowing
> our Windows builds to use ASLR; our inability to support that is not
> something to be proud of in 2017.  No predetermined-address scheme is
> likely to be helpful for that.

Sure, retrying is better for that as well, as I already said upthread,
but that doesn't mean that putting the initial mapping someplace less
likely to conflict couldn't reduce the need for retries.

The even-bigger picture here is that both this issue and the need for
DSA and DSM are due to the fact that we've picked an unpopular
programming model with poor operating system support.   If we switch
from using processes to using threads, we don't have to deal with all
of this nonsense any more, and we'd solve some other problems, too -
e.g. at least on Windows, I think backend startup would get quite a
bit faster.  Obviously, anybody else who is using processes + shared
memory is going to run into the same problems we're hitting, and if
operating system manufacturers wanted to make this kind of programming
easy, they could do it.  We're expending a lot of effort here because
we're swimming against the current.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jun 6, 2017 at 12:44 PM, Tom Lane  wrote:
>> By definition, the address range we're trying to reuse worked successfully
>> in the postmaster process.  I don't see how forcing a specific address
>> could do anything but create an additional risk of postmaster startup
>> failure.

> If the postmaster picked an address where other things are unlikely to
> get loaded, then that would increase the chances of child processes
> finding it available, wouldn't it?

But how would we know that a particular address range is more unlikely
than others to have a conflict?  (And even if we do know that, what
happens when there is a conflict anyway?)  I sure don't want to be in
the business of figuring out what to use across all the different Windows
versions there are, to say nothing of the different antivirus products
that might be causing the problem.

Also, the big picture here is that we ought to be working towards allowing
our Windows builds to use ASLR; our inability to support that is not
something to be proud of in 2017.  No predetermined-address scheme is
likely to be helpful for that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-06 Thread Robert Haas
On Tue, Jun 6, 2017 at 12:44 PM, Tom Lane  wrote:
> By definition, the address range we're trying to reuse worked successfully
> in the postmaster process.  I don't see how forcing a specific address
> could do anything but create an additional risk of postmaster startup
> failure.

If the postmaster picked an address where other things are unlikely to
get loaded, then that would increase the chances of child processes
finding it available, wouldn't it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-06 Thread Tom Lane
Robert Haas  writes:
> I think the idea of retrying process creation (and I definitely agree
> with Tom and Magnus that we have to retry process creation, not just
> individual mappings) is a good place to start.  Now if we find that we
> are having to retry frequently, then I think we might need to try
> something along the lines of what Andres proposed and what nginx
> apparently did.  However, any fixed address will be prone to
> occasional failures (or maybe, on some systems, regular failures) if
> that particular address happens to get claimed by something.  I don't
> think we can say that there is any address where that definitely won't
> happen.  So I would say let's do this retry thing first, and then if
> that proves inadequate, we can also try moving the mappings to a range
> where conflicts are less likely.

By definition, the address range we're trying to reuse worked successfully
in the postmaster process.  I don't see how forcing a specific address
could do anything but create an additional risk of postmaster startup
failure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-06 Thread Robert Haas
On Mon, Jun 5, 2017 at 10:10 PM, Amit Kapila  wrote:
> Agreed.  By the way, while browsing about this problem, I found that
> one other open source (nginx) has used a solution similar to what
> Andres was proposing upthread to solve this problem.  Refer:
> https://github.com/nginx/nginx/commit/2ec8cfcd7d34415a99c3f3db3024ae954c00d0dd
>
> Just to be clear, by above, I don't mean to say that if some other
> open source is using some solution, we should also use it, but I think
> it is worth considering (especially if it is a proven solution - just
> saying based on the time (2015) it has been committed and sustained in
> the code).

I think the idea of retrying process creation (and I definitely agree
with Tom and Magnus that we have to retry process creation, not just
individual mappings) is a good place to start.  Now if we find that we
are having to retry frequently, then I think we might need to try
something along the lines of what Andres proposed and what nginx
apparently did.  However, any fixed address will be prone to
occasional failures (or maybe, on some systems, regular failures) if
that particular address happens to get claimed by something.  I don't
think we can say that there is any address where that definitely won't
happen.  So I would say let's do this retry thing first, and then if
that proves inadequate, we can also try moving the mappings to a range
where conflicts are less likely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Amit Kapila
On Mon, Jun 5, 2017 at 7:26 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Sure.  I think it is slightly tricky because specs don't say clearly
>> how ASLR can impact the behavior of any API and in my last attempt I
>> could not reproduce the issue.
>
>> I can try to do basic verification with the patch you have proposed,
>> but I fear, to do the actual tests as suggested by you is difficult as
>> it is not reproducible on my machine, but I can still try.
>
> Yeah, being able to reproduce the problem reliably enough to say whether
> it's fixed or not is definitely the sticking point here.

Agreed.  By the way, while browsing about this problem, I found that
one other open source (nginx) has used a solution similar to what
Andres was proposing upthread to solve this problem.  Refer:
https://github.com/nginx/nginx/commit/2ec8cfcd7d34415a99c3f3db3024ae954c00d0dd

Just to be clear, by above, I don't mean to say that if some other
open source is using some solution, we should also use it, but I think
it is worth considering (especially if it is a proven solution - just
saying based on the time (2015) it has been committed and sustained in
the code).

>  I have some
> ideas about that:
>
> 1. Be sure to use Win32 not Win64 --- the odds of a failure in the larger
> address space are so small you'd never prove anything.  (And of course
> it has to be a version that has ASLR enabled.)
>

I don't have access to Win32, so I request others who are reading this
and have access to Win32 to see if they can help in reproducing the
problem.

> 2. Revert 7f3e17b48 so that you have an ASLR-enabled build.
>
> 3. Crank shared_buffers to the maximum the machine will allow, reducing
> the amount of free address space and improving the odds of a collision.
>
> 4. Spawn lots of sessions --- pgbench with -C option might be a useful
> testing tool.
>

All are very good points and I think together they will certainly
improve the chances of reproducing this problem.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Tom Lane
Amit Kapila  writes:
> Sure.  I think it is slightly tricky because specs don't say clearly
> how ASLR can impact the behavior of any API and in my last attempt I
> could not reproduce the issue.

> I can try to do basic verification with the patch you have proposed,
> but I fear, to do the actual tests as suggested by you is difficult as
> it is not reproducible on my machine, but I can still try.

Yeah, being able to reproduce the problem reliably enough to say whether
it's fixed or not is definitely the sticking point here.  I have some
ideas about that:

1. Be sure to use Win32 not Win64 --- the odds of a failure in the larger
address space are so small you'd never prove anything.  (And of course
it has to be a version that has ASLR enabled.)

2. Revert 7f3e17b48 so that you have an ASLR-enabled build.

3. Crank shared_buffers to the maximum the machine will allow, reducing
the amount of free address space and improving the odds of a collision.

4. Spawn lots of sessions --- pgbench with -C option might be a useful
testing tool.

With luck, that will get failures often enough that you can be pretty
sure whether a patch improves the situation or not.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Magnus Hagander
On Mon, Jun 5, 2017 at 1:35 PM, Amit Kapila  wrote:

> On Mon, Jun 5, 2017 at 4:58 PM, Magnus Hagander 
> wrote:
> > On Mon, Jun 5, 2017 at 1:16 PM, Amit Kapila 
> wrote:
> >>
> >> On Mon, Jun 5, 2017 at 9:15 AM, Tom Lane  wrote:
> >> > Amit Kapila  writes:
> >> >
> >> >> I think the same problem can happen during reattach as well.
> >> >> Basically, MapViewOfFileEx can fail to load image at predefined
> >> >> address (UsedShmemSegAddr).
> >> >
> >> > Once we've successfully done the VirtualAllocEx call, that should hold
> >> > until the VirtualFree call in PGSharedMemoryReAttach, allowing the
> >> > immediately-following MapViewOfFileEx call to succeed.  Were that not
> >> > the
> >> > case, we'd have had problems even without ASLR.  We did have problems
> >> > exactly like that before the pgwin32_ReserveSharedMemoryRegion code
> was
> >> > added.
> >> >
> >>
> >> I could not find anything directly in specs which could prove the
> >> theory either way.  However, in one of the StackOverflow discussions,
> >> it has been indicated that MapViewOfFile can opt to load the image at
> >> an address different than the predefined address due to ASLR.
> >>
> >> >  So my feeling about this is that retrying the process creation as
> >> > in my prototype patch ought to be sufficient; if you think it isn't,
> the
> >> > burden of proof is on you.
> >> >
> >>
> >> Sure.  I think it is slightly tricky because specs don't say clearly
> >> how ASLR can impact the behavior of any API and in my last attempt I
> >> could not reproduce the issue.
> >>
> >> I can try to do basic verification with the patch you have proposed,
> >> but I fear, to do the actual tests as suggested by you is difficult as
> >> it is not reproducible on my machine, but I can still try.
> >>
> >>
> >> [1] -
> >> https://stackoverflow.com/questions/9718616/what-does-
> mapviewoffile-return/11233456
> >> Refer below text:
> >>
> >> "Yes, MapViewOfFile returns the virtual memory base address where the
> >> image has been loaded. The value (content) of this address depends on
> >> whether the image has been successfully loaded at its predefined
> >> address (which has been setup by the linker) or whether the image has
> >> been relocated (because the desired, predefined address is already
> >> occupied or because the image has opt-in to support ASLR)."
> >>
> >
> > That statements refers to mapping executables though, like DLL and EXE.
> Not
> > mapping of data segments.
> >
> > It does randomize the entire location of the heap, in which case it might
> > also change. But not for the individual block.
> >
> > But in neither of those cases does it help to retry without restarting
> the
> > process,
> >
>
> Okay, the question here is do we need some handling during reattach
> operation where we do MapViewOfFileEx at the predefined location?
>
>
As long as we restart in a new process, I don't think so. ASLR does not
randomize away our addresses *after we've been given them*. It does it at
process (or thread start), so as long as we get it initially, I think we
are safe.

(Obviously nobody can be 100% sure without seeing the source code for it,
but all references I've seen to the implementation seem to indicate that)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Amit Kapila
On Mon, Jun 5, 2017 at 4:58 PM, Magnus Hagander  wrote:
> On Mon, Jun 5, 2017 at 1:16 PM, Amit Kapila  wrote:
>>
>> On Mon, Jun 5, 2017 at 9:15 AM, Tom Lane  wrote:
>> > Amit Kapila  writes:
>> >
>> >> I think the same problem can happen during reattach as well.
>> >> Basically, MapViewOfFileEx can fail to load image at predefined
>> >> address (UsedShmemSegAddr).
>> >
>> > Once we've successfully done the VirtualAllocEx call, that should hold
>> > until the VirtualFree call in PGSharedMemoryReAttach, allowing the
>> > immediately-following MapViewOfFileEx call to succeed.  Were that not
>> > the
>> > case, we'd have had problems even without ASLR.  We did have problems
>> > exactly like that before the pgwin32_ReserveSharedMemoryRegion code was
>> > added.
>> >
>>
>> I could not find anything directly in specs which could prove the
>> theory either way.  However, in one of the StackOverflow discussions,
>> it has been indicated that MapViewOfFile can opt to load the image at
>> an address different than the predefined address due to ASLR.
>>
>> >  So my feeling about this is that retrying the process creation as
>> > in my prototype patch ought to be sufficient; if you think it isn't, the
>> > burden of proof is on you.
>> >
>>
>> Sure.  I think it is slightly tricky because specs don't say clearly
>> how ASLR can impact the behavior of any API and in my last attempt I
>> could not reproduce the issue.
>>
>> I can try to do basic verification with the patch you have proposed,
>> but I fear, to do the actual tests as suggested by you is difficult as
>> it is not reproducible on my machine, but I can still try.
>>
>>
>> [1] -
>> https://stackoverflow.com/questions/9718616/what-does-mapviewoffile-return/11233456
>> Refer below text:
>>
>> "Yes, MapViewOfFile returns the virtual memory base address where the
>> image has been loaded. The value (content) of this address depends on
>> whether the image has been successfully loaded at its predefined
>> address (which has been setup by the linker) or whether the image has
>> been relocated (because the desired, predefined address is already
>> occupied or because the image has opt-in to support ASLR)."
>>
>
> That statements refers to mapping executables though, like DLL and EXE. Not
> mapping of data segments.
>
> It does randomize the entire location of the heap, in which case it might
> also change. But not for the individual block.
>
> But in neither of those cases does it help to retry without restarting the
> process,
>

Okay, the question here is do we need some handling during reattach
operation where we do MapViewOfFileEx at the predefined location?




-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Magnus Hagander
On Mon, Jun 5, 2017 at 1:16 PM, Amit Kapila  wrote:

> On Mon, Jun 5, 2017 at 9:15 AM, Tom Lane  wrote:
> > Amit Kapila  writes:
> >
> >> I think the same problem can happen during reattach as well.
> >> Basically, MapViewOfFileEx can fail to load image at predefined
> >> address (UsedShmemSegAddr).
> >
> > Once we've successfully done the VirtualAllocEx call, that should hold
> > until the VirtualFree call in PGSharedMemoryReAttach, allowing the
> > immediately-following MapViewOfFileEx call to succeed.  Were that not the
> > case, we'd have had problems even without ASLR.  We did have problems
> > exactly like that before the pgwin32_ReserveSharedMemoryRegion code was
> > added.
> >
>
> I could not find anything directly in specs which could prove the
> theory either way.  However, in one of the StackOverflow discussions,
> it has been indicated that MapViewOfFile can opt to load the image at
> an address different than the predefined address due to ASLR.
>
> >  So my feeling about this is that retrying the process creation as
> > in my prototype patch ought to be sufficient; if you think it isn't, the
> > burden of proof is on you.
> >
>
> Sure.  I think it is slightly tricky because specs don't say clearly
> how ASLR can impact the behavior of any API and in my last attempt I
> could not reproduce the issue.
>
> I can try to do basic verification with the patch you have proposed,
> but I fear, to do the actual tests as suggested by you is difficult as
> it is not reproducible on my machine, but I can still try.
>
>
> [1] - https://stackoverflow.com/questions/9718616/what-does-
> mapviewoffile-return/11233456
> Refer below text:
>
> "Yes, MapViewOfFile returns the virtual memory base address where the
> image has been loaded. The value (content) of this address depends on
> whether the image has been successfully loaded at its predefined
> address (which has been setup by the linker) or whether the image has
> been relocated (because the desired, predefined address is already
> occupied or because the image has opt-in to support ASLR)."
>
>
That statements refers to mapping executables though, like DLL and EXE. Not
mapping of data segments.

It does randomize the entire location of the heap, in which case it might
also change. But not for the individual block.

But in neither of those cases does it help to retry without restarting the
process, because the heap will be mapped into the same place, and any DLLs
loading prior to our code will already have been loaded.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-05 Thread Amit Kapila
On Mon, Jun 5, 2017 at 9:15 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>
>> I think the same problem can happen during reattach as well.
>> Basically, MapViewOfFileEx can fail to load image at predefined
>> address (UsedShmemSegAddr).
>
> Once we've successfully done the VirtualAllocEx call, that should hold
> until the VirtualFree call in PGSharedMemoryReAttach, allowing the
> immediately-following MapViewOfFileEx call to succeed.  Were that not the
> case, we'd have had problems even without ASLR.  We did have problems
> exactly like that before the pgwin32_ReserveSharedMemoryRegion code was
> added.
>

I could not find anything directly in specs which could prove the
theory either way.  However, in one of the StackOverflow discussions,
it has been indicated that MapViewOfFile can opt to load the image at
an address different than the predefined address due to ASLR.

>  So my feeling about this is that retrying the process creation as
> in my prototype patch ought to be sufficient; if you think it isn't, the
> burden of proof is on you.
>

Sure.  I think it is slightly tricky because specs don't say clearly
how ASLR can impact the behavior of any API and in my last attempt I
could not reproduce the issue.

I can try to do basic verification with the patch you have proposed,
but I fear, to do the actual tests as suggested by you is difficult as
it is not reproducible on my machine, but I can still try.


[1] - 
https://stackoverflow.com/questions/9718616/what-does-mapviewoffile-return/11233456
Refer below text:

"Yes, MapViewOfFile returns the virtual memory base address where the
image has been loaded. The value (content) of this address depends on
whether the image has been successfully loaded at its predefined
address (which has been setup by the linker) or whether the image has
been relocated (because the desired, predefined address is already
occupied or because the image has opt-in to support ASLR)."



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-04 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Jun 5, 2017 at 4:00 AM, Tom Lane  wrote:
>> I took a quick look at this, and it seems rather beside the point.

> What I understood from the randomization shm allocation behavior due
> to ASLR is that when we try to allocate memory (say using
> VirtualAllocEx as in our case) at a pre-defined address, it will
> instead allocate it at a different address which can lead to a
> collision.

No.  The problem we're concerned about is that by the time we try
pgwin32_ReserveSharedMemoryRegion's call
VirtualAllocEx(hChild, UsedShmemSegAddr, UsedShmemSegSize,
   MEM_RESERVE, PAGE_READWRITE)
in the child process, something might have already taken that address
space in the child process, which would cause the call to fail not just
allocate some randomly different space.  That can't happen unless the base
executable, or ntdll.dll, or possibly something injected by an antivirus
product, has taken up address space different from what it took up in the
postmaster process.  What we are assuming here is that any such variation
is randomized, and so will probably be different in the next child process
launched by the postmaster, allowing a new process launch to possibly fix
the problem.  But once that address space is allocated in a given process,
no amount of merely retrying a new allocation request in that process is
going to make it go away.  You'd have to do something to free the existing
allocation, and we have no way to do that short of starting over.

> I think the same problem can happen during reattach as well.
> Basically, MapViewOfFileEx can fail to load image at predefined
> address (UsedShmemSegAddr).

Once we've successfully done the VirtualAllocEx call, that should hold
until the VirtualFree call in PGSharedMemoryReAttach, allowing the
immediately-following MapViewOfFileEx call to succeed.  Were that not the
case, we'd have had problems even without ASLR.  We did have problems
exactly like that before the pgwin32_ReserveSharedMemoryRegion code was
added.  So my feeling about this is that retrying the process creation as
in my prototype patch ought to be sufficient; if you think it isn't, the
burden of proof is on you.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-04 Thread Amit Kapila
On Mon, Jun 5, 2017 at 4:00 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Okay, I have added the comment to explain the same.  I have also
>> modified the patch to adjust the looping as per your suggestion.
>
> I took a quick look at this, and it seems rather beside the point.
> You can't just loop inside an already-forked process and expect
> that address space that was previously unavailable will suddenly
> become available, when the problem is that the executable itself
> (or some shared library) is mapped into the space you want.
>

What I understood from the randomization shm allocation behavior due
to ASLR is that when we try to allocate memory (say using
VirtualAllocEx as in our case) at a pre-defined address, it will
instead allocate it at a different address which can lead to a
collision.  So, it seems to me the problem to solve here is not to
attempt to allocate the address space which is allocated to another
library,  rather negate the impact of randomized address allocation
behavior.  So I thought retrying to allocate space at predefined
address is sufficient.

> What we have to do in order to retry is to fork an entire new
> process, whose address space will hopefully get laid out differently.
>
> While it's possible we could do that in a platform-independent
> way, it would be pretty invasive and I don't see the value.
> The implementation I'd had in mind originally was to put a loop
> inside postmaster.c's internal_forkexec (win32 version), such
> that if pgwin32_ReserveSharedMemoryRegion failed, it would
> terminate/close up the subprocess as it already does, but then
> go back around and do another CreateProcess.  Perhaps it's as
> simple as the attached, but I'm not in a position to test it.
>

I think the same problem can happen during reattach as well.
Basically, MapViewOfFileEx can fail to load image at predefined
address (
UsedShmemSegAddr).

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-04 Thread Tom Lane
Amit Kapila  writes:
> Okay, I have added the comment to explain the same.  I have also
> modified the patch to adjust the looping as per your suggestion.

I took a quick look at this, and it seems rather beside the point.
You can't just loop inside an already-forked process and expect
that address space that was previously unavailable will suddenly
become available, when the problem is that the executable itself
(or some shared library) is mapped into the space you want.
What we have to do in order to retry is to fork an entire new
process, whose address space will hopefully get laid out differently.

While it's possible we could do that in a platform-independent
way, it would be pretty invasive and I don't see the value.
The implementation I'd had in mind originally was to put a loop
inside postmaster.c's internal_forkexec (win32 version), such
that if pgwin32_ReserveSharedMemoryRegion failed, it would
terminate/close up the subprocess as it already does, but then
go back around and do another CreateProcess.  Perhaps it's as
simple as the attached, but I'm not in a position to test it.

Assuming this passes minimal smoke testing, the way to actually test
it is to undo the hack in the MSVC build scripts that disables ASLR,
and then (on a Windows version where ASLR is active) run it until
you've seen a few occurrences of the retry, which should be detectable
by looking for bleats from pgwin32_ReserveSharedMemoryRegion
in the postmaster log.

regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 35b4ec8..e30b0c6 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*** internal_forkexec(int argc, char *argv[]
*** 4487,4492 
--- 4487,4493 
  static pid_t
  internal_forkexec(int argc, char *argv[], Port *port)
  {
+ 	int			retry_count = 0;
  	STARTUPINFO si;
  	PROCESS_INFORMATION pi;
  	int			i;
*** internal_forkexec(int argc, char *argv[]
*** 4504,4509 
--- 4505,4513 
  	Assert(strncmp(argv[1], "--fork", 6) == 0);
  	Assert(argv[2] == NULL);
  
+ 	/* Resume here if we need to retry */
+ retry:
+ 
  	/* Set up shared memory for parameter passing */
  	ZeroMemory(, sizeof(sa));
  	sa.nLength = sizeof(sa);
*** internal_forkexec(int argc, char *argv[]
*** 4595,4616 
  
  	/*
  	 * Reserve the memory region used by our main shared memory segment before
! 	 * we resume the child process.
  	 */
  	if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
  	{
! 		/*
! 		 * Failed to reserve the memory, so terminate the newly created
! 		 * process and give up.
! 		 */
  		if (!TerminateProcess(pi.hProcess, 255))
  			ereport(LOG,
  	(errmsg_internal("could not terminate process that failed to reserve memory: error code %lu",
  	 GetLastError(;
  		CloseHandle(pi.hProcess);
  		CloseHandle(pi.hThread);
! 		return -1;/* logging done made by
!  * pgwin32_ReserveSharedMemoryRegion() */
  	}
  
  	/*
--- 4599,4624 
  
  	/*
  	 * Reserve the memory region used by our main shared memory segment before
! 	 * we resume the child process.  Normally this should succeed, but if ASLR
! 	 * is active then it might sometimes fail due to the executable having
! 	 * gotten mapped into that range.  In that case, just terminate the
! 	 * process and retry.
  	 */
  	if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
  	{
! 		/* pgwin32_ReserveSharedMemoryRegion already made a log entry */
  		if (!TerminateProcess(pi.hProcess, 255))
  			ereport(LOG,
  	(errmsg_internal("could not terminate process that failed to reserve memory: error code %lu",
  	 GetLastError(;
  		CloseHandle(pi.hProcess);
  		CloseHandle(pi.hThread);
! 		if (++retry_count < 100)
! 			goto retry;
! 		ereport(LOG,
! 		  (errmsg("giving up after too many tries to reserve shared memory"),
! 		   errhint("This might be caused by ASLR or antivirus software.")));
! 		return -1;
  	}
  
  	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-03 Thread Amit Kapila
On Fri, Jun 2, 2017 at 7:20 PM, Petr Jelinek
 wrote:
> On 02/06/17 15:37, Amit Kapila wrote:
>>
>> No, it is to avoid calling free of memory which is not reserved on
>> retry.  See the comment:
>> + * On the first try, release memory region reservation that was made by
>> + * the postmaster.
>>
>> Are you referring to the same function in sysv_shm.c, if so probably I
>> can say refer the same API in win32_shmem.c or maybe add a similar
>> comment there as well?
>>
>
> Yeah something like that would help, but my main confusion comes from
> the fact that there is counter (and even named as such) but only
> relevant difference is 0 and not 0. I'd like mention of that mainly
> since I was confused by that on the first read.
>

Okay, I have added the comment to explain the same.  I have also
modified the patch to adjust the looping as per your suggestion.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


win_shm_retry_reattach_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-02 Thread Petr Jelinek
On 02/06/17 15:37, Amit Kapila wrote:
> On Thu, Jun 1, 2017 at 10:36 PM, Petr Jelinek
>  wrote:
>> On 01/06/17 15:25, Tom Lane wrote:
>>> Robert Haas  writes:
 So, are you going to, perhaps, commit this?  Or who is picking this up?
>>>
 /me knows precious little about Windows.
>>>
>>> I'm not going to be the one to commit this either, but seems like someone
>>> should.
>>>
>>
>> The new code does not use any windows specific APIs or anything, it just
>> adds retry logic for reattaching when we do EXEC_BACKEND which seems to
>> be agreed way of solving this. I do have couple of comments about the
>> code though.
>>
>> The new parameter retry_count in PGSharedMemoryReAttach() seems to be
>> only used to decide if to log reattach issues so that we don't spam log
>> when retrying, but this fact is not mentioned anywhere.
>>
> 
> No, it is to avoid calling free of memory which is not reserved on
> retry.  See the comment:
> + * On the first try, release memory region reservation that was made by
> + * the postmaster.
> 
> Are you referring to the same function in sysv_shm.c, if so probably I
> can say refer the same API in win32_shmem.c or maybe add a similar
> comment there as well?
> 

Yeah something like that would help, but my main confusion comes from
the fact that there is counter (and even named as such) but only
relevant difference is 0 and not 0. I'd like mention of that mainly
since I was confused by that on the first read.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-02 Thread Amit Kapila
On Thu, Jun 1, 2017 at 10:36 PM, Petr Jelinek
 wrote:
> On 01/06/17 15:25, Tom Lane wrote:
>> Robert Haas  writes:
>>> So, are you going to, perhaps, commit this?  Or who is picking this up?
>>
>>> /me knows precious little about Windows.
>>
>> I'm not going to be the one to commit this either, but seems like someone
>> should.
>>
>
> The new code does not use any windows specific APIs or anything, it just
> adds retry logic for reattaching when we do EXEC_BACKEND which seems to
> be agreed way of solving this. I do have couple of comments about the
> code though.
>
> The new parameter retry_count in PGSharedMemoryReAttach() seems to be
> only used to decide if to log reattach issues so that we don't spam log
> when retrying, but this fact is not mentioned anywhere.
>

No, it is to avoid calling free of memory which is not reserved on
retry.  See the comment:
+ * On the first try, release memory region reservation that was made by
+ * the postmaster.

Are you referring to the same function in sysv_shm.c, if so probably I
can say refer the same API in win32_shmem.c or maybe add a similar
comment there as well?


> Also, I am not excited about following coding style:
>> + if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
>> + continue;
>> + else
>> + {
>
> Amit, if you want to avoid having to add the curly braces for single
> line while still having else, I'd invert the expression in the if ()
> statement so that true comes first. It's much less ugly to have curly
> braces part first and the continue statement in the else block IMHO.
>

I felt that it is easier to understand the code in the way it is
currently written, but I can invert the check if you find it is easier
to read and understand that way.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-02 Thread Noah Misch
On Fri, May 26, 2017 at 05:50:45PM +0530, Amit Kapila wrote:
> On Fri, May 26, 2017 at 5:30 AM, Tsunakawa, Takayuki 
>  wrote:
> >  I guessed that the reason Noah suggested 1 - 5 seconds of retry is based 
> > on the expectation that the address space might be freed by the anti-virus 
> > software.

No, I suggested it because I wouldn't seriously consider keeping an
installation where backend start takes 5s.  If the address conflicts are that
persistent, I'd fix the bug or switch operating systems.  Therefore, we may as
well let it fail at that duration, thereby showing the user what to
investigate.  Startup time of 0.2s, on the other hand, is noticeable but
usable; I'd prefer not to fail hard at that duration.

> Noah is also suggesting to have a retry count, read his mail above in
> this thread and refer to his comment ("Thus, measuring time is
> needless complexity; retry count is a suitable proxy.")

Right.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-01 Thread Petr Jelinek
On 01/06/17 15:25, Tom Lane wrote:
> Robert Haas  writes:
>> So, are you going to, perhaps, commit this?  Or who is picking this up?
> 
>> /me knows precious little about Windows.
> 
> I'm not going to be the one to commit this either, but seems like someone
> should.
> 

The new code does not use any windows specific APIs or anything, it just
adds retry logic for reattaching when we do EXEC_BACKEND which seems to
be agreed way of solving this. I do have couple of comments about the
code though.

The new parameter retry_count in PGSharedMemoryReAttach() seems to be
only used to decide if to log reattach issues so that we don't spam log
when retrying, but this fact is not mentioned anywhere.

Also, I am not excited about following coding style:
> + if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
> + continue;
> + else
> + {

Amit, if you want to avoid having to add the curly braces for single
line while still having else, I'd invert the expression in the if ()
statement so that true comes first. It's much less ugly to have curly
braces part first and the continue statement in the else block IMHO.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-01 Thread Tom Lane
Robert Haas  writes:
> So, are you going to, perhaps, commit this?  Or who is picking this up?

> /me knows precious little about Windows.

I'm not going to be the one to commit this either, but seems like someone
should.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-31 Thread Robert Haas
On Fri, May 26, 2017 at 10:51 AM, Magnus Hagander  wrote:
> I would definitely suggest putting it in HEAD (and thus, v10) for a while to
> get some real world exposure before backpatching. But if it does work out
> well in the end, then we can certainly consider backpatching it. But given
> the difficulty in reliably reproducing the problem etc, I think it's a good
> idea to give it some proper real world experience in 10 first.

So, are you going to, perhaps, commit this?  Or who is picking this up?

/me knows precious little about Windows.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-29 Thread Amit Kapila
On Fri, May 26, 2017 at 8:21 PM, Magnus Hagander  wrote:
>
> On Fri, May 26, 2017 at 8:24 AM, Michael Paquier 
> wrote:
>>
>> On Fri, May 26, 2017 at 8:20 AM, Amit Kapila 
>> wrote:
>> > I think the real question here is, shall we backpatch this fix or we
>> > want to do this just in Head or we want to consider it as a new
>> > feature for PostgreSQL-11.  I think it should be fixed in Head and the
>> > change seems harmless to me, so we should even backpatch it.
>>
>> The thing is not invasive, so backpatching is a low-risk move. We can
>> as well get that into HEAD first, wait a bit for dust to settle on it,
>> and then backpatch.
>
>
>
> I would definitely suggest putting it in HEAD (and thus, v10) for a while to
> get some real world exposure before backpatching.
>

make sense to me, so I have added an entry in "Older Bugs" section in
PostgreSQL 10 Open Items.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-26 Thread Magnus Hagander
On Fri, May 26, 2017 at 8:24 AM, Michael Paquier 
wrote:

> On Fri, May 26, 2017 at 8:20 AM, Amit Kapila 
> wrote:
> > I think the real question here is, shall we backpatch this fix or we
> > want to do this just in Head or we want to consider it as a new
> > feature for PostgreSQL-11.  I think it should be fixed in Head and the
> > change seems harmless to me, so we should even backpatch it.
>
> The thing is not invasive, so backpatching is a low-risk move. We can
> as well get that into HEAD first, wait a bit for dust to settle on it,
> and then backpatch.
>


I would definitely suggest putting it in HEAD (and thus, v10) for a while
to get some real world exposure before backpatching. But if it does work
out well in the end, then we can certainly consider backpatching it. But
given the difficulty in reliably reproducing the problem etc, I think it's
a good idea to give it some proper real world experience in 10 first.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-26 Thread Michael Paquier
On Fri, May 26, 2017 at 8:20 AM, Amit Kapila  wrote:
> I think the real question here is, shall we backpatch this fix or we
> want to do this just in Head or we want to consider it as a new
> feature for PostgreSQL-11.  I think it should be fixed in Head and the
> change seems harmless to me, so we should even backpatch it.

The thing is not invasive, so backpatching is a low-risk move. We can
as well get that into HEAD first, wait a bit for dust to settle on it,
and then backpatch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-26 Thread Amit Kapila
On Fri, May 26, 2017 at 5:30 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
>> Yes, I also share this opinion, the shm attach failures are due to
>> randomization behavior, so sleep won't help much.  So, I will change the
>> patch to use 100 retries unless people have other opinions.
>
> Perhaps I'm misunderstanding, but I thought it is not known yet whether the 
> cause of the original problem is ASLR.  I remember someone referred to 
> anti-virus software and something else.
>

We are here purposefully trying to resolve the randomize shm
allocation behavior due to ASLR.  The original failure was on a linux
machine and is resolved.  We presumably sometimes get the failures [1]
due to this behavior.

>  I guessed that the reason Noah suggested 1 - 5 seconds of retry is based on 
> the expectation that the address space might be freed by the anti-virus 
> software.
>

Noah is also suggesting to have a retry count, read his mail above in
this thread and refer to his comment ("Thus, measuring time is
needless complexity; retry count is a suitable proxy.")


I think the real question here is, shall we backpatch this fix or we
want to do this just in Head or we want to consider it as a new
feature for PostgreSQL-11.  I think it should be fixed in Head and the
change seems harmless to me, so we should even backpatch it.


[1] - https://www.postgresql.org/message-id/14121.1485360296%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-25 Thread Amit Kapila
On Thu, May 25, 2017 at 8:01 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Yes, I also share this opinion, the shm attach failures are due to
>> randomization behavior, so sleep won't help much.  So, I will change
>> the patch to use 100 retries unless people have other opinions.
>
> Sounds about right to me.
>

Okay.  I have changed the retry count to 100 and modified few comments
in the attached patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


win_shm_retry_reattach_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-25 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
> Yes, I also share this opinion, the shm attach failures are due to
> randomization behavior, so sleep won't help much.  So, I will change the
> patch to use 100 retries unless people have other opinions.

Perhaps I'm misunderstanding, but I thought it is not known yet whether the 
cause of the original problem is ASLR.  I remember someone referred to 
anti-virus software and something else.  I guessed that the reason Noah 
suggested 1 - 5 seconds of retry is based on the expectation that the address 
space might be freed by the anti-virus software.

Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-25 Thread Tom Lane
Amit Kapila  writes:
> Yes, I also share this opinion, the shm attach failures are due to
> randomization behavior, so sleep won't help much.  So, I will change
> the patch to use 100 retries unless people have other opinions.

Sounds about right to me.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-25 Thread Amit Kapila
On Thu, May 25, 2017 at 8:41 AM, Noah Misch  wrote:
> On Thu, May 25, 2017 at 11:41:19AM +0900, Michael Paquier wrote:
>
>> Indeed, pgrename() does so with a 100ms sleep time between each
>> iteration. Perhaps we could do that and limit to 50 iterations?
>
> pgrename() is polling for an asynchronous event, hence the sleep.  To my
> knowledge, time doesn't heal shm attach failures; therefore, a sleep is not
> appropriate here.
>

Yes, I also share this opinion, the shm attach failures are due to
randomization behavior, so sleep won't help much.  So, I will change
the patch to use 100 retries unless people have other opinions.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Noah Misch
On Thu, May 25, 2017 at 11:41:19AM +0900, Michael Paquier wrote:
> On Thu, May 25, 2017 at 11:34 AM, Tsunakawa, Takayuki
>  wrote:
> > From: pgsql-hackers-ow...@postgresql.org
> >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Noah Misch
> >> Ten feels low to me.  The value should be be low enough so users don't give
> >> up and assume a permanent hang, but there's little advantage to making it
> >> lower.
> >> I'd set it such that we give up in 1-5s on a modern Windows machine, which
> >> I expect implies a retry count of one hundred or more.
> >
> > Then, maybe we can measure the time in each iteration and give up after a 
> > particular seconds.

Exact duration is not important.  Giving up after 0.1s is needlessly early,
because a system taking that long to start a backend is still usable.  Giving
up after 50s is quite late.  In between those extremes, lots of durations
would be reasonable.  Thus, measuring time is needless complexity; retry count
is a suitable proxy.

> Indeed, pgrename() does so with a 100ms sleep time between each
> iteration. Perhaps we could do that and limit to 50 iterations?

pgrename() is polling for an asynchronous event, hence the sleep.  To my
knowledge, time doesn't heal shm attach failures; therefore, a sleep is not
appropriate here.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Michael Paquier
On Thu, May 25, 2017 at 11:34 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Noah Misch
>> Ten feels low to me.  The value should be be low enough so users don't give
>> up and assume a permanent hang, but there's little advantage to making it
>> lower.
>> I'd set it such that we give up in 1-5s on a modern Windows machine, which
>> I expect implies a retry count of one hundred or more.
>
> Then, maybe we can measure the time in each iteration and give up after a 
> particular seconds.

Indeed, pgrename() does so with a 100ms sleep time between each
iteration. Perhaps we could do that and limit to 50 iterations?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Noah Misch
> Ten feels low to me.  The value should be be low enough so users don't give
> up and assume a permanent hang, but there's little advantage to making it
> lower.
> I'd set it such that we give up in 1-5s on a modern Windows machine, which
> I expect implies a retry count of one hundred or more.

Then, maybe we can measure the time in each iteration and give up after a 
particular seconds.

Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Noah Misch
On Wed, May 24, 2017 at 09:29:11AM -0400, Michael Paquier wrote:
> On Tue, May 23, 2017 at 8:14 AM, Amit Kapila  wrote:
> > So it seems both you and Tom are leaning towards some sort of retry
> > mechanism for shm reattach on Windows.  I also think that is a viable
> > option to negate the impact of ASLR.  Attached patch does that.  Note
> > that, as I have mentioned above I think we need to do it for shm
> > reserve operation as well.  I think we need to decide how many retries
> > are sufficient before bailing out.  As of now, I have used 10 to have
> > some similarity with PGSharedMemoryCreate(), but we can choose some
> > different count as well.  One might say that we can have "number of
> > retries" as a guc parameter, but I am not sure about it, so not used.
> 
> New GUCs can be backpatched if necessary, though this does not seem
> necessary. Who is going to set up that anyway if we have a limit high
> enough. 10 looks like a sufficient number to me.

Ten feels low to me.  The value should be be low enough so users don't give up
and assume a permanent hang, but there's little advantage to making it lower.
I'd set it such that we give up in 1-5s on a modern Windows machine, which I
expect implies a retry count of one hundred or more.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Amit Kapila
On Wed, May 24, 2017 at 6:59 PM, Michael Paquier
 wrote:
> On Tue, May 23, 2017 at 8:14 AM, Amit Kapila  wrote:
>> So it seems both you and Tom are leaning towards some sort of retry
>> mechanism for shm reattach on Windows.  I also think that is a viable
>> option to negate the impact of ASLR.  Attached patch does that.  Note
>> that, as I have mentioned above I think we need to do it for shm
>> reserve operation as well.  I think we need to decide how many retries
>> are sufficient before bailing out.  As of now, I have used 10 to have
>> some similarity with PGSharedMemoryCreate(), but we can choose some
>> different count as well.  One might say that we can have "number of
>> retries" as a guc parameter, but I am not sure about it, so not used.
>
> New GUCs can be backpatched if necessary, though this does not seem
> necessary. Who is going to set up that anyway if we have a limit high
> enough. 10 looks like a sufficient number to me.
>

Okay.

>> Another point to consider is that do we want the same retry mechanism
>> for EXEC_BACKEND builds (function
>> PGSharedMemoryReAttach is different on Windows and EXEC_BACKEND
>> builds).  I think it makes sense to have retry mechanism for
>> EXEC_BACKEND builds, so done that way in the patch.  Yet another point
>> which needs some thought is for reattach operation, before retrying do
>> we want to reserve the shm by using VirtualAllocEx?
>
> -   elog(FATAL, "could not reattach to shared memory (key=%p,
> addr=%p): error code %lu",
> +   {
> +   elog(LOG, "could not reattach to shared memory (key=%p,
> addr=%p): error code %lu",
>  UsedShmemSegID, UsedShmemSegAddr, GetLastError());
> +   return false;
> +   }
> This should be a WARNING, with the attempt number reported as well?
>

I think for retry we just want to log, why you want to display it as a
warning?  During startup, other similar places (where we continue
startup even after the call has failed) also use LOG (refer
PGSharedMemoryDetach), so why do differently here?   However, I think
adding retry_count should be okay.

> -void
> -PGSharedMemoryReAttach(void)
> +bool
> +PGSharedMemoryReAttach(int retry_count)
> I think that the loop logic should be kept within
> PGSharedMemoryReAttach, this makes the code of postmaster.c cleaner,
>

Sure, we can do that, but then we need to repeat the same looping
logic in both sysv and win32 case.  Now, if decide not to do for the
sysv case, then it might make sense to consider it doing it in
function PGSharedMemoryReAttach().

> and it seems to me that each step of PGSharedMemoryReAttach() should
> be retried in order. Do we need also to worry about SysV? I agree with
> you that having consistency is better, but I don't recall seeing
> failures or complains related to cygwin for ASLR.
>

I am also not aware of Cygwin failures, but I think keeping the code
same for the cases where we are not using fork mechanism seems like an
advisable approach.  Also, if someone is testing EXEC_BACKEND on Linux
then randomization is 'on' by default, so one can hit this issue
during tests which doesn't matter much, but it still seems better to
have retry logic to prevent the issue.

> I think that you are forgetting PGSharedMemoryCreate in the retry process.
>

No, we don't need retry for PGSharedMemoryCreate as we need this only
we are trying to attach to some pre-reserved shared memory.  Do you
have something else in mind?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-24 Thread Michael Paquier
On Tue, May 23, 2017 at 8:14 AM, Amit Kapila  wrote:
> So it seems both you and Tom are leaning towards some sort of retry
> mechanism for shm reattach on Windows.  I also think that is a viable
> option to negate the impact of ASLR.  Attached patch does that.  Note
> that, as I have mentioned above I think we need to do it for shm
> reserve operation as well.  I think we need to decide how many retries
> are sufficient before bailing out.  As of now, I have used 10 to have
> some similarity with PGSharedMemoryCreate(), but we can choose some
> different count as well.  One might say that we can have "number of
> retries" as a guc parameter, but I am not sure about it, so not used.

New GUCs can be backpatched if necessary, though this does not seem
necessary. Who is going to set up that anyway if we have a limit high
enough. 10 looks like a sufficient number to me.

> Another point to consider is that do we want the same retry mechanism
> for EXEC_BACKEND builds (function
> PGSharedMemoryReAttach is different on Windows and EXEC_BACKEND
> builds).  I think it makes sense to have retry mechanism for
> EXEC_BACKEND builds, so done that way in the patch.  Yet another point
> which needs some thought is for reattach operation, before retrying do
> we want to reserve the shm by using VirtualAllocEx?

-   elog(FATAL, "could not reattach to shared memory (key=%p,
addr=%p): error code %lu",
+   {
+   elog(LOG, "could not reattach to shared memory (key=%p,
addr=%p): error code %lu",
 UsedShmemSegID, UsedShmemSegAddr, GetLastError());
+   return false;
+   }
This should be a WARNING, with the attempt number reported as well?

-void
-PGSharedMemoryReAttach(void)
+bool
+PGSharedMemoryReAttach(int retry_count)
I think that the loop logic should be kept within
PGSharedMemoryReAttach, this makes the code of postmaster.c cleaner,
and it seems to me that each step of PGSharedMemoryReAttach() should
be retried in order. Do we need also to worry about SysV? I agree with
you that having consistency is better, but I don't recall seeing
failures or complains related to cygwin for ASLR.

I think that you are forgetting PGSharedMemoryCreate in the retry process.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-05-23 Thread Amit Kapila
On Sat, May 20, 2017 at 5:56 PM, Noah Misch  wrote:
> On Sat, Apr 15, 2017 at 02:30:18PM -0700, Andres Freund wrote:
>> On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
>> > Andres Freund  writes:
>> > > On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
>> > >> Why doesn't Windows' ability to map the segment into the new process
>> > >> before it executes take care of that?
>> >
>> > > Because of ASLR of the main executable (i.e. something like PIE).
>> >
>> > Not following.  Are you saying that the main executable gets mapped into
>> > the process address space immediately, but shared libraries are not?
>
> At the time of the pgwin32_ReserveSharedMemoryRegion() call, the child process
> contains only ntdll.dll and the executable.
>
>> Without PIE/ASLR we can somewhat rely on pgwin32_ReserveSharedMemoryRegion
>> to find the space that PGSharedMemoryCreate allocated still unoccupied.
>
> I've never had access to a Windows system that can reproduce the fork
> failures.  My best theory is that antivirus or similar software injects an
> additional DLL at that early stage.
>
>> > I wonder whether we could work around that by just destroying the created
>> > process and trying again if we get a collision.  It'd be a tad
>> > inefficient, but hopefully collisions wouldn't happen often enough to be a
>> > big problem.
>>
>> That might work, although it's obviously not pretty.
>
> I didn't like that idea when Michael proposed it in 2015.  Since disabling
> ASLR on the exe proved insufficient, I do like it now.  It degrades nicely; if
> something raises the collision rate from 1% to 10%, that just looks like fork
> latency degrading.
>

So it seems both you and Tom are leaning towards some sort of retry
mechanism for shm reattach on Windows.  I also think that is a viable
option to negate the impact of ASLR.  Attached patch does that.  Note
that, as I have mentioned above I think we need to do it for shm
reserve operation as well.  I think we need to decide how many retries
are sufficient before bailing out.  As of now, I have used 10 to have
some similarity with PGSharedMemoryCreate(), but we can choose some
different count as well.  One might say that we can have "number of
retries" as a guc parameter, but I am not sure about it, so not used.
Another point to consider is that do we want the same retry mechanism
for EXEC_BACKEND builds (function
PGSharedMemoryReAttach is different on Windows and EXEC_BACKEND
builds).  I think it makes sense to have retry mechanism for
EXEC_BACKEND builds, so done that way in the patch.  Yet another point
which needs some thought is for reattach operation, before retrying do
we want to reserve the shm by using VirtualAllocEx?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


win_shm_retry_reattach_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-05-20 Thread Noah Misch
On Sat, Apr 15, 2017 at 02:30:18PM -0700, Andres Freund wrote:
> On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
> > >> Why doesn't Windows' ability to map the segment into the new process
> > >> before it executes take care of that?
> > 
> > > Because of ASLR of the main executable (i.e. something like PIE).
> > 
> > Not following.  Are you saying that the main executable gets mapped into
> > the process address space immediately, but shared libraries are not?

At the time of the pgwin32_ReserveSharedMemoryRegion() call, the child process
contains only ntdll.dll and the executable.

> Without PIE/ASLR we can somewhat rely on pgwin32_ReserveSharedMemoryRegion
> to find the space that PGSharedMemoryCreate allocated still unoccupied.

I've never had access to a Windows system that can reproduce the fork
failures.  My best theory is that antivirus or similar software injects an
additional DLL at that early stage.

> > I wonder whether we could work around that by just destroying the created
> > process and trying again if we get a collision.  It'd be a tad
> > inefficient, but hopefully collisions wouldn't happen often enough to be a
> > big problem.
> 
> That might work, although it's obviously not pretty.

I didn't like that idea when Michael proposed it in 2015.  Since disabling
ASLR on the exe proved insufficient, I do like it now.  It degrades nicely; if
something raises the collision rate from 1% to 10%, that just looks like fork
latency degrading.

> We could also just
> default to some out-of-the-way address for MapViewOfFileEx, that might
> also work.

Dave Vitek proposed that:
https://www.postgresql.org/message-id/flat/5046CAEB.4010600%40grammatech.com

I estimate more risk than retrying forks.  There's no guarantee that a fixed
address helpful today won't make the collisions worse after the next Windows
security patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-30 Thread Amit Kapila
On Mon, May 1, 2017 at 10:23 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Yeah, that's right.  Today, I have spent some time to analyze how and
>> where retry logic is required.  I think there are two places where we
>> need this retry logic, one is if we fail to reserve the memory
>> (pgwin32_ReserveSharedMemoryRegion) and second is if we fail to
>> reattach (PGSharedMemoryReAttach).
>
> I'm not following.  The point of the reserve operation is to ensure
> that the reattach will work.  What makes you think we need to add more
> code at that end?
>

We use VirtualAllocEx to allocate memory at a pre-defined address
(reserve operation) which can fail due to ASLR.  One recent example of
such a symptom is seen on pgsql-bugs [1], that failure is during
reserve operation and seems like something related to ASLR.  Another
point is the commit 7f3e17b4827b61ad84e0774e3e43da4c57c4487f (It
doesn't fail every time (which is explained by the Random part in
ASLR), but can fail with errors abut failing to reserve shared memory
region) also indicates that we can fail to reserve memory due to ASLR.


[1] - https://www.postgresql.org/message-id/14121.1485360296%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-30 Thread Tom Lane
Amit Kapila  writes:
> Yeah, that's right.  Today, I have spent some time to analyze how and
> where retry logic is required.  I think there are two places where we
> need this retry logic, one is if we fail to reserve the memory
> (pgwin32_ReserveSharedMemoryRegion) and second is if we fail to
> reattach (PGSharedMemoryReAttach).

I'm not following.  The point of the reserve operation is to ensure
that the reattach will work.  What makes you think we need to add more
code at that end?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-30 Thread Amit Kapila
On Tue, Apr 25, 2017 at 7:37 PM, Tom Lane  wrote:
> Craig Ringer  writes:
>> On 25 Apr. 2017 13:37, "Heikki Linnakangas"  wrote:
>>> For some data shared memory structures, that store no pointers, we wouldn't
>>> need to insist that they are mapped to the same address in every backend,
>>> though. In particular, shared_buffers. It wouldn't eliminate the problem,
>>> though, only make it less likely, so we'd still need to retry when it does
>>> happen.
>
>> Good point. Simply splitting out shared_buffers into a moveable segment
>> would make a massive difference. Much less chance of losing the dice roll
>> for mapping the fixed segment.
>
>> Should look at what else could be made cheaply relocatable too.
>
> I don't think it's worth spending any effort on.  We need the retry
> code anyway, and making it near impossible to hit that would only mean
> that it's very poorly tested.  The results upthread say that it isn't
> going to be hit often enough to be a performance problem, so why worry?
>

Yeah, that's right.  Today, I have spent some time to analyze how and
where retry logic is required.  I think there are two places where we
need this retry logic, one is if we fail to reserve the memory
(pgwin32_ReserveSharedMemoryRegion) and second is if we fail to
reattach (PGSharedMemoryReAttach). If we fail during reserve memory
operation, then we can simply terminate the process and recreate it
again, basically, add some kind of loop in internal_forkexec(). OTOH,
if we fail during reattach, I think we need an exit from the process
which means it can lead to a restart of all the processes unless we
want to add some special logic for handling process exit or
alternatively we might want to just try reattach operation in a loop
before giving up. Do we want to keep this retry logic for a certain
number of times say (10 times) or we want to try indefinitely?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-25 Thread Craig Ringer
On 25 April 2017 at 22:07, Tom Lane  wrote:
> Craig Ringer  writes:
>> On 25 Apr. 2017 13:37, "Heikki Linnakangas"  wrote:
>>> For some data shared memory structures, that store no pointers, we wouldn't
>>> need to insist that they are mapped to the same address in every backend,
>>> though. In particular, shared_buffers. It wouldn't eliminate the problem,
>>> though, only make it less likely, so we'd still need to retry when it does
>>> happen.
>
>> Good point. Simply splitting out shared_buffers into a moveable segment
>> would make a massive difference. Much less chance of losing the dice roll
>> for mapping the fixed segment.
>
>> Should look at what else could be made cheaply relocatable too.
>
> I don't think it's worth spending any effort on.  We need the retry
> code anyway, and making it near impossible to hit that would only mean
> that it's very poorly tested.  The results upthread say that it isn't
> going to be hit often enough to be a performance problem, so why worry?

Good point. Deal with it if it becomes an issue.

That said, I didn't see if any of those tests covered really big
shared_buffers. That could become an issue down the track at least.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-25 Thread Tom Lane
Craig Ringer  writes:
> On 25 Apr. 2017 13:37, "Heikki Linnakangas"  wrote:
>> For some data shared memory structures, that store no pointers, we wouldn't
>> need to insist that they are mapped to the same address in every backend,
>> though. In particular, shared_buffers. It wouldn't eliminate the problem,
>> though, only make it less likely, so we'd still need to retry when it does
>> happen.

> Good point. Simply splitting out shared_buffers into a moveable segment
> would make a massive difference. Much less chance of losing the dice roll
> for mapping the fixed segment.

> Should look at what else could be made cheaply relocatable too.

I don't think it's worth spending any effort on.  We need the retry
code anyway, and making it near impossible to hit that would only mean
that it's very poorly tested.  The results upthread say that it isn't
going to be hit often enough to be a performance problem, so why worry?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-25 Thread Craig Ringer
On 25 Apr. 2017 13:37, "Heikki Linnakangas"  wrote:


For some data shared memory structures, that store no pointers, we wouldn't
need to insist that they are mapped to the same address in every backend,
though. In particular, shared_buffers. It wouldn't eliminate the problem,
though, only make it less likely, so we'd still need to retry when it does
happen.


Good point. Simply splitting out shared_buffers into a moveable segment
would make a massive difference. Much less chance of losing the dice roll
for mapping the fixed segment.

Should look at what else could be made cheaply relocatable too.


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Heikki Linnakangas

On 04/24/2017 09:50 PM, Andres Freund wrote:

On 2017-04-24 14:43:11 -0400, Tom Lane wrote:

(We have accepted that kind of overhead for DSM segments, but the
intention I think is to allow only very trivial data structures in
the DSM segments.  Losing compiler pointer type checking for data
structures like the lock or PGPROC tables would be horrid.)


The relptr.h infrastructure brings some of the type-checking back, but
it's still pretty painful.  And just as important, it's quite noticeable
performance-wise.  So we have to do it for dynamic shm (until/unless we
go to using threads), but that doesn't mean we should do it for some of
the most performance critical data structures in PG...


Agreed.

For some data shared memory structures, that store no pointers, we 
wouldn't need to insist that they are mapped to the same address in 
every backend, though. In particular, shared_buffers. It wouldn't 
eliminate the problem, though, only make it less likely, so we'd still 
need to retry when it does happen.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Craig Ringer
On 25 Apr. 2017 02:51, "Andres Freund"  wrote:

On 2017-04-24 11:08:48 -0700, Andres Freund wrote:
> On 2017-04-24 23:14:40 +0800, Craig Ringer wrote:
> > In the long run we'll probably be forced toward threading or far
pointers.
>
> I'll vote for removing the windows port, before going for that.  And I'm
> not even joking.

Just to clarify: I'm talking about far pointers here, not threading.


Yeah, I'm pretty unimpressed that the accepted solution seems to be to
return to the early '90s.

Why don't platforms offer any sensible way to reserve a virtual address
range at process creation time?

It looks like you need a minimal loader process that rebases its self in
memory if it finds its self loaded in the desired area, then maps the
required memory range and loads the real process. Hard to imagine that not
causing more problems than it solves.


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Andres Freund
On 2017-04-24 11:08:48 -0700, Andres Freund wrote:
> On 2017-04-24 23:14:40 +0800, Craig Ringer wrote:
> > In the long run we'll probably be forced toward threading or far pointers.
> 
> I'll vote for removing the windows port, before going for that.  And I'm
> not even joking.

Just to clarify: I'm talking about far pointers here, not threading.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Andres Freund
On 2017-04-24 14:43:11 -0400, Tom Lane wrote:
> (We have accepted that kind of overhead for DSM segments, but the
> intention I think is to allow only very trivial data structures in
> the DSM segments.  Losing compiler pointer type checking for data
> structures like the lock or PGPROC tables would be horrid.)

The relptr.h infrastructure brings some of the type-checking back, but
it's still pretty painful.  And just as important, it's quite noticeable
performance-wise.  So we have to do it for dynamic shm (until/unless we
go to using threads), but that doesn't mean we should do it for some of
the most performance critical data structures in PG...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-24 23:14:40 +0800, Craig Ringer wrote:
>> In the long run we'll probably be forced toward threading or far pointers.

> I'll vote for removing the windows port, before going for that.  And I'm
> not even joking.

Me too.  We used to *have* that kind of code, ie relative pointers into
the shmem segment, and it was a tremendous notational mess and very
bug-prone.  I do not wish to go back.

(We have accepted that kind of overhead for DSM segments, but the
intention I think is to allow only very trivial data structures in
the DSM segments.  Losing compiler pointer type checking for data
structures like the lock or PGPROC tables would be horrid.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Andres Freund
Hi,

On 2017-04-24 14:25:34 +0530, Amit Kapila wrote:
> Error code 87 means "invalid parameter".  Some googling [1] indicates
> such an error occurs if we pass the out-of-range address to
> MapViewOfFileEx. Another possible theory is that we must pass the
> address as multiple of the system's memory allocation granularity as
> that is expected by specs of MapViewOfFileEx.  I can try doing that
> but I am not confident that this is the right approach because of
> below text mentioned in docs (msdn) of MapViewOfFileEx.
> "While it is possible to specify an address that is safe now (not used
> by the operating system), there is no guarantee that the address will
> remain safe over time. Therefore, it is better to let the operating
> system choose the address.".

Sure, that's the same as mmap() etc say. I'd not be overly deterred by
that.


On 2017-04-24 23:14:40 +0800, Craig Ringer wrote:
> In the long run we'll probably be forced toward threading or far pointers.

I'll vote for removing the windows port, before going for that.  And I'm
not even joking.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Craig Ringer
On 24 April 2017 at 16:55, Amit Kapila  wrote:

> Another thing I have tried is to just start the server by setting
> RandomizedBaseAddress="TRUE".  I have tried about 15-20 times but
> could not reproduce the problem related to shared memory attach.  We
> have tried the same on one of my colleagues (Ashutosh Sharma) machine
> as well, there we could see that error once or twice out of many tries
> but couldn't get it consistently.  I think if the chances of this
> problem to occur are so less, then probably the suggestion made by Tom
> to retry if we get collision doesn't sound bad.

It's pretty uncommon, and honestly, we might well be best off just
trying again if we lose the lottery.

Most of what I read last time I looked into this essentially assumed
that you'd "fix" your code by reinventing far pointers[1], like the
good old Win16 days. Assume all addresses in shmem are relative to the
shmem base, and offset them when accessing/storing them. Fun and
efficient for everyone! That seems to be what Boost recommends[2].

Given that Pg doesn't make much effort to differentiate between
pointers to shmem and local memory, and has no pointer transformations
between shared and local pointers, adopting that would be a
horrifyingly intrusive change as well as incredibly tedious to
implement. We'd probably land up using size_t or ptrdiff_t for shmem
pointers and some kind of macro that was a noop on !windows. For once
I'd be thoroughly in agreement with Tom's complaints about
Windows-droppings.

Other people who've faced and worked around this[3] have come up with
solutions that look way scarier than just retrying if we lose the
random numbers game.

BTW, some Windows users face issues with large contiguous
allocations/mappings even without the process sharing side[4] due to
memory fragmentation created by ASLR, though this may only be a
concern for 32-bit executables. The relevant option /LARGEADDRESSAWARE
is enabled by default for 64-bit builds.

We might want to /DELAYLOAD [5] DLLs where possible to improve our
chances of winning the dice roll, but if we're going to support
retrying at all we don't need to care too much.

I looked at image binding (prelinking), but it's disabled if ASLR is in use.

In the long run we'll probably be forced toward threading or far pointers.



[1] https://en.wikipedia.org/wiki/Far_pointer,
https://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer%5Fsizes

[2] 
http://www.boost.org/doc/libs/1_64_0/doc/html/interprocess/sharedmemorybetweenprocesses.html#interprocess.sharedmemorybetweenprocesses.mapped_region.mapped_region_address_mapping

[3] http://stackoverflow.com/a/36145019/398670

[4] https://github.com/golang/go/issues/2323

[5] On 24 April 2017 at 16:55, Amit Kapila 
wrote:   > Another thing I have tried is to just start the server by
setting  > RandomizedBaseAddress="TRUE".  I have tried about 15-20
times but  > could not reproduce the problem related to shared memory
attach.  We  > have tried the same on one of my colleagues (Ashutosh
Sharma) machine  > as well, there we could see that error once or
twice out of many tries  > but couldn't get it consistently.  I think
if the chances of this  > problem to occur are so less, then probably
the suggestion made by Tom  > to retry if we get collision doesn't
sound bad.   It's pretty uncommon, and honestly, we might well be best
off just trying again if we lose the lottery.   Most of what I read
last time I looked into this essentially assumed that you'd "fix" your
code by reinventing far pointers[1], like the good old Win16 days.
Assume all addresses in shmem are relative to the shmem base, and
offset them when accessing/storing them. Fun and efficient for
everyone! That seems to be what Boost recommends[2].   Given that Pg
doesn't make much effort to differentiate between pointers to shmem
and local memory, and has no pointer transformations between shared
and local pointers, adopting that would be a horrifyingly intrusive
change as well as incredibly tedious to implement. We'd probably land
up using size_t or ptrdiff_t for shmem pointers and some kind of macro
that was a noop on !windows. For once I'd be thoroughly in agreement
with Tom's complaints about Windows-droppings.   Other people who've
faced and worked around this[3] have come up with solutions that look
way scarier than just retrying if we lose the random numbers game.
BTW, some Windows users face issues with large contiguous
allocations/mappings even without the process sharing side[4] due to
memory fragmentation created by ASLR, though this may only be a
concern for 32-bit executables. The relevant option /LARGEADDRESSAWARE
is enabled by default for 64-bit builds.  We should /DELAYLOAD as many
DDLs as possible to improve our chances.   [1]
https://en.wikipedia.org/wiki/Far_pointer,
https://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer%5Fsizes   [2]

Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Craig Ringer
On 16 April 2017 at 05:18, Andres Freund  wrote:

> Because of ASLR of the main executable (i.e. something like PIE).  It'll
> supposedly become harder (as in only running in compatibility modes) if
> binaries don't enable that.  It's currently disabled somewhere in the VC
> project generated.

I thought we passed /DYNAMICBASE:NO directly , but I don't see that in
the code. A look at the git logs shows that we disabled it in
7f3e17b48 by emitting
false in the MSBuild
project. That'll pass /DYNAMICBASE:NO to the linker.

See https://msdn.microsoft.com/en-us/library/bb384887.aspx

It's rather better than the old registry hack, but it's a compat
option we're likely to lose at some point.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-24 Thread Amit Kapila
On Fri, Apr 21, 2017 at 12:55 AM, Andres Freund  wrote:
> On 2017-04-20 16:57:03 +0530, Amit Kapila wrote:
>> On Wed, Apr 19, 2017 at 9:01 PM, Andres Freund  wrote:
>> > On 2017-04-19 10:15:31 -0400, Tom Lane wrote:
>> >> Amit Kapila  writes:
>> >> > On Sun, Apr 16, 2017 at 3:04 AM, Tom Lane  wrote:
>> >> >> Obviously, any such fix would be a lot more likely to be reliable in
>> >> >> 64-bit machines.  There's probably not enough daylight to be sure of
>> >> >> making it work in 32-bit Windows, so I suspect we'd need some retry
>> >> >> logic anyway for that case.
>> >>
>> >> > Yeah, that kind of thing can work assuming we don't get conflicts too
>> >> > often, but it could be possible that conflicts are not reported from
>> >> > ASLR enabled environments because of commit 7f3e17b4.
>> >>
>> >> Right, but Andres' point is that we should make an effort to undo that
>> >> hack and instead allow ASLR to happen.  Not just because it's allegedly
>> >> more secure, but because we may have no choice in future Windows versions.
>> >
>> > FWIW, I think it *also* might make us more secure, because addresses in
>> > the postgres binary won't be predictable anymore.
>> >
>>
>> Agreed.  I have done some further study by using VMMap tool in Windows
>> and it seems to me that all 64-bit processes use address range
>> (0001 ~ 07FE).  I have attached two screen
>> shots to show the address range (lower range and upper range).  You
>> need to refer the lower half of the window in attached screenshots.
>> At this stage, I am not completely sure whether we can assume some
>> address out of this range to use in MapViewOfFileEx.  Let me know your
>> thoughts.
>
> That seems like a fairly restricted range (good!), and squares with
> memories of reading about this (can't find the reference anymore).  Just
> using something like 0x0F00 as the address might work
> sufficiently well.  What happens if you just hardcode that in the first
> MapViewOfFileEx call, and change RandomizedBaseAddress="FALSE" in
> src/tools/msvc/VCBuildProject.pm to true?
>

The server start generates an error:
2017-04-24 12:02:05.771 IST [8940] FATAL:  could not create shared
memory segment: error code 87
2017-04-24 12:02:05.771 IST [8940] DETAIL:  Failed system call was
MapViewOfFileEx.

Error code 87 means "invalid parameter".  Some googling [1] indicates
such an error occurs if we pass the out-of-range address to
MapViewOfFileEx. Another possible theory is that we must pass the
address as multiple of the system's memory allocation granularity as
that is expected by specs of MapViewOfFileEx.  I can try doing that
but I am not confident that this is the right approach because of
below text mentioned in docs (msdn) of MapViewOfFileEx.
"While it is possible to specify an address that is safe now (not used
by the operating system), there is no guarantee that the address will
remain safe over time. Therefore, it is better to let the operating
system choose the address.".

I have printed the addresses that system automatically picks for
MapViewOfFileEx (3a2, 377, 352, 372, 373) and they
seem to be in a range which I have posted up thread for 64-bit
processes.

Another thing I have tried is to just start the server by setting
RandomizedBaseAddress="TRUE".  I have tried about 15-20 times but
could not reproduce the problem related to shared memory attach.  We
have tried the same on one of my colleagues (Ashutosh Sharma) machine
as well, there we could see that error once or twice out of many tries
but couldn't get it consistently.  I think if the chances of this
problem to occur are so less, then probably the suggestion made by Tom
to retry if we get collision doesn't sound bad.

[1] - 
https://support.microsoft.com/en-us/help/125713/common-file-mapping-problems-and-platform-differences

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-20 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-20 16:57:03 +0530, Amit Kapila wrote:
>> Agreed.  I have done some further study by using VMMap tool in Windows
>> and it seems to me that all 64-bit processes use address range
>> (0001 ~ 07FE).  I have attached two screen
>> shots to show the address range (lower range and upper range).  You
>> need to refer the lower half of the window in attached screenshots.
>> At this stage, I am not completely sure whether we can assume some
>> address out of this range to use in MapViewOfFileEx.  Let me know your
>> thoughts.

> That seems like a fairly restricted range (good!), and squares with
> memories of reading about this (can't find the reference anymore).  Just
> using something like 0x0F00 as the address might work
> sufficiently well.  What happens if you just hardcode that in the first
> MapViewOfFileEx call, and change RandomizedBaseAddress="FALSE" in
> src/tools/msvc/VCBuildProject.pm to true?

The material I found about Linux x86_64 addressing explains that the
underlying hardware address space is only 48 bits.  Linux chooses to
normalize this by sign-extending into the upper 16 bits, so that
valid 64-bit addresses are
 .. 7FFF
and
8000 .. 

If I'm counting the bits correctly, Windows is choosing to use only
1/16th of the lower half of the available address space, which seems
a bit odd.  However, the main point here is that there should be
quite a bit of available daylight, if they will let us use addresses
above 07FE at all.  I'd be inclined to do our own low-tech
ASLR by using addresses with a random component, say 0Fxx.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-20 Thread Andres Freund
On 2017-04-20 16:57:03 +0530, Amit Kapila wrote:
> On Wed, Apr 19, 2017 at 9:01 PM, Andres Freund  wrote:
> > On 2017-04-19 10:15:31 -0400, Tom Lane wrote:
> >> Amit Kapila  writes:
> >> > On Sun, Apr 16, 2017 at 3:04 AM, Tom Lane  wrote:
> >> >> Obviously, any such fix would be a lot more likely to be reliable in
> >> >> 64-bit machines.  There's probably not enough daylight to be sure of
> >> >> making it work in 32-bit Windows, so I suspect we'd need some retry
> >> >> logic anyway for that case.
> >>
> >> > Yeah, that kind of thing can work assuming we don't get conflicts too
> >> > often, but it could be possible that conflicts are not reported from
> >> > ASLR enabled environments because of commit 7f3e17b4.
> >>
> >> Right, but Andres' point is that we should make an effort to undo that
> >> hack and instead allow ASLR to happen.  Not just because it's allegedly
> >> more secure, but because we may have no choice in future Windows versions.
> >
> > FWIW, I think it *also* might make us more secure, because addresses in
> > the postgres binary won't be predictable anymore.
> >
> 
> Agreed.  I have done some further study by using VMMap tool in Windows
> and it seems to me that all 64-bit processes use address range
> (0001 ~ 07FE).  I have attached two screen
> shots to show the address range (lower range and upper range).  You
> need to refer the lower half of the window in attached screenshots.
> At this stage, I am not completely sure whether we can assume some
> address out of this range to use in MapViewOfFileEx.  Let me know your
> thoughts.

That seems like a fairly restricted range (good!), and squares with
memories of reading about this (can't find the reference anymore).  Just
using something like 0x0F00 as the address might work
sufficiently well.  What happens if you just hardcode that in the first
MapViewOfFileEx call, and change RandomizedBaseAddress="FALSE" in
src/tools/msvc/VCBuildProject.pm to true?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-19 Thread Andres Freund
On 2017-04-19 10:15:31 -0400, Tom Lane wrote:
> Amit Kapila  writes:
> > On Sun, Apr 16, 2017 at 3:04 AM, Tom Lane  wrote:
> >> Obviously, any such fix would be a lot more likely to be reliable in
> >> 64-bit machines.  There's probably not enough daylight to be sure of
> >> making it work in 32-bit Windows, so I suspect we'd need some retry
> >> logic anyway for that case.
> 
> > Yeah, that kind of thing can work assuming we don't get conflicts too
> > often, but it could be possible that conflicts are not reported from
> > ASLR enabled environments because of commit 7f3e17b4.
> 
> Right, but Andres' point is that we should make an effort to undo that
> hack and instead allow ASLR to happen.  Not just because it's allegedly
> more secure, but because we may have no choice in future Windows versions.

FWIW, I think it *also* might make us more secure, because addresses in
the postgres binary won't be predictable anymore.  Since most of the
windows binaries are built by one source, that's some advantage on its
own.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-19 Thread Tom Lane
Amit Kapila  writes:
> On Sun, Apr 16, 2017 at 3:04 AM, Tom Lane  wrote:
>> Obviously, any such fix would be a lot more likely to be reliable in
>> 64-bit machines.  There's probably not enough daylight to be sure of
>> making it work in 32-bit Windows, so I suspect we'd need some retry
>> logic anyway for that case.

> Yeah, that kind of thing can work assuming we don't get conflicts too
> often, but it could be possible that conflicts are not reported from
> ASLR enabled environments because of commit 7f3e17b4.

Right, but Andres' point is that we should make an effort to undo that
hack and instead allow ASLR to happen.  Not just because it's allegedly
more secure, but because we may have no choice in future Windows versions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-19 Thread Amit Kapila
On Sun, Apr 16, 2017 at 3:04 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
>>> I wonder whether we could work around that by just destroying the created
>>> process and trying again if we get a collision.  It'd be a tad
>>> inefficient, but hopefully collisions wouldn't happen often enough to be a
>>> big problem.
>
>> That might work, although it's obviously not pretty.  We could also just
>> default to some out-of-the-way address for MapViewOfFileEx, that might
>> also work.
>
> Could be.  Does Microsoft publish any documentation about the range of
> addresses their ASLR uses?
>

I have look around to find some information to see if there is any
such address range which could be used for our purpose.  I am not able
to see any such predictable address range.  You might want to read the
article [1] especially the text around "What is the memory address
space range in virtual memory map where system DLLs and user DLLs
could load?"   It seems to indicate that there is no such address
unless I have misunderstood it.  I don't deny the possibility of
having such an address range, but I could not find any info on the
same.

> Obviously, any such fix would be a lot more likely to be reliable in
> 64-bit machines.  There's probably not enough daylight to be sure of
> making it work in 32-bit Windows, so I suspect we'd need some retry
> logic anyway for that case.
>

Yeah, that kind of thing can work assuming we don't get conflicts too
often, but it could be possible that conflicts are not reported from
ASLR enabled environments because of commit 7f3e17b4.

[1] - 
https://blogs.msdn.microsoft.com/winsdk/2009/11/30/how-to-disable-address-space-layout-randomization-aslr/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-17 Thread Andres Freund
On 2017-04-15 14:34:28 -0700, Andres Freund wrote:
> On 2017-04-15 17:30:21 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2017-04-15 16:48:05 -0400, Tom Lane wrote:
> > >> Concretely, I propose the attached patch.  We'd have to put it into
> > >> all supported branches, since culicidae is showing intermittent
> > >> "could not reattach to shared memory" failures in all the branches.
> > 
> > > That seems quite reasonable.
> > 
> > Pushed, please update culicidae's settings.
> 
> Done, although there were already builds running by the time I got to
> it, so there'll be a few unaffected runs.  I've increased build
> frequency of all branches to be forced once-an-hour, so we can quickly
> see how reliable it is.  Will turn down Monday or such, if everything
> looks good.

Looking through all the branches, it seems to have done the trick - the
previous irregular failures seem to be gone.  Nice.

Unless somebody complains, I'll reduce the forced build frequency to
something more normal again.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Andres Freund
On 2017-04-15 17:30:21 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-04-15 16:48:05 -0400, Tom Lane wrote:
> >> Concretely, I propose the attached patch.  We'd have to put it into
> >> all supported branches, since culicidae is showing intermittent
> >> "could not reattach to shared memory" failures in all the branches.
> 
> > That seems quite reasonable.
> 
> Pushed, please update culicidae's settings.

Done, although there were already builds running by the time I got to
it, so there'll be a few unaffected runs.  I've increased build
frequency of all branches to be forced once-an-hour, so we can quickly
see how reliable it is.  Will turn down Monday or such, if everything
looks good.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
>> I wonder whether we could work around that by just destroying the created
>> process and trying again if we get a collision.  It'd be a tad
>> inefficient, but hopefully collisions wouldn't happen often enough to be a
>> big problem.

> That might work, although it's obviously not pretty.  We could also just
> default to some out-of-the-way address for MapViewOfFileEx, that might
> also work.

Could be.  Does Microsoft publish any documentation about the range of
addresses their ASLR uses?

Obviously, any such fix would be a lot more likely to be reliable in
64-bit machines.  There's probably not enough daylight to be sure of
making it work in 32-bit Windows, so I suspect we'd need some retry
logic anyway for that case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-15 16:48:05 -0400, Tom Lane wrote:
>> Concretely, I propose the attached patch.  We'd have to put it into
>> all supported branches, since culicidae is showing intermittent
>> "could not reattach to shared memory" failures in all the branches.

> That seems quite reasonable.

Pushed, please update culicidae's settings.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Andres Freund
On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
> >> Why doesn't Windows' ability to map the segment into the new process
> >> before it executes take care of that?
> 
> > Because of ASLR of the main executable (i.e. something like PIE).
> 
> Not following.  Are you saying that the main executable gets mapped into
> the process address space immediately, but shared libraries are not?

Without PIE/ASLR we can somewhat rely on pgwin32_ReserveSharedMemoryRegion
to find the space that PGSharedMemoryCreate allocated still unoccupied.
If the main binary also uses ASLR, not just libraries/stack/other
mappings, that's not guaranteed to be the case.

But this probably needs somebody with actual windows expertise
commenting.


> I wonder whether we could work around that by just destroying the created
> process and trying again if we get a collision.  It'd be a tad
> inefficient, but hopefully collisions wouldn't happen often enough to be a
> big problem.

That might work, although it's obviously not pretty.  We could also just
default to some out-of-the-way address for MapViewOfFileEx, that might
also work.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
>> Why doesn't Windows' ability to map the segment into the new process
>> before it executes take care of that?

> Because of ASLR of the main executable (i.e. something like PIE).

Not following.  Are you saying that the main executable gets mapped into
the process address space immediately, but shared libraries are not?

I wonder whether we could work around that by just destroying the created
process and trying again if we get a collision.  It'd be a tad
inefficient, but hopefully collisions wouldn't happen often enough to be a
big problem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Andres Freund
On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > That seems quite reasonable.  I'm afraid we're going to have to figure
> > out something similar, but more robust, for windows soon-ish :/
> 
> Why doesn't Windows' ability to map the segment into the new process
> before it executes take care of that?

Because of ASLR of the main executable (i.e. something like PIE).  It'll
supposedly become harder (as in only running in compatibility modes) if
binaries don't enable that.  It's currently disabled somewhere in the VC
project generated.  Besides that, it'd also be good for security
purposes if we didn't have to disable PIE (which also prevents it from
being used for the initial backend).

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
Andres Freund  writes:
> That seems quite reasonable.  I'm afraid we're going to have to figure
> out something similar, but more robust, for windows soon-ish :/

Why doesn't Windows' ability to map the segment into the new process
before it executes take care of that?

> As a minor point, it'd probably be good to add addr=%zu, requestedAddress
> or such.

Yeah, I'd come to the same conclusion right after sending that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Andres Freund
On 2017-04-15 16:48:05 -0400, Tom Lane wrote:
> I wrote:
> > I think what may be the most effective way to proceed is to provide
> > a way to force the shmem segment to be mapped at a chosen address.
> > It looks like, at least on x86_64 Linux, mapping shmem at
> > 0x7E00 would work reliably.
> 
> > Since we only care about this for testing purposes, I don't think
> > it has to be done in any very clean or even documented way.
> > I'm inclined to propose that we put something into sysv_shmem.c
> > that will check for an environment variable named, say, PG_SHMEM_ADDR,
> > and if it's set will use the value as the address in the initial
> > shmat() call.  For a bit of extra safety we could do that only in
> > EXEC_BACKEND builds.
> 
> Concretely, I propose the attached patch.  We'd have to put it into
> all supported branches, since culicidae is showing intermittent
> "could not reattach to shared memory" failures in all the branches.

That seems quite reasonable.  I'm afraid we're going to have to figure
out something similar, but more robust, for windows soon-ish :/


>   /* OK, should be able to attach to the segment */
> - memAddress = shmat(shmid, NULL, PG_SHMAT_FLAGS);
> + memAddress = shmat(shmid, requestedAddress, PG_SHMAT_FLAGS);
>  
>   if (memAddress == (void *) -1)
>   elog(FATAL, "shmat(id=%d) failed: %m", shmid);

As a minor point, it'd probably be good to add addr=%zu, requestedAddress
or such.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
I wrote:
> I think what may be the most effective way to proceed is to provide
> a way to force the shmem segment to be mapped at a chosen address.
> It looks like, at least on x86_64 Linux, mapping shmem at
> 0x7E00 would work reliably.

> Since we only care about this for testing purposes, I don't think
> it has to be done in any very clean or even documented way.
> I'm inclined to propose that we put something into sysv_shmem.c
> that will check for an environment variable named, say, PG_SHMEM_ADDR,
> and if it's set will use the value as the address in the initial
> shmat() call.  For a bit of extra safety we could do that only in
> EXEC_BACKEND builds.

Concretely, I propose the attached patch.  We'd have to put it into
all supported branches, since culicidae is showing intermittent
"could not reattach to shared memory" failures in all the branches.

regards, tom lane

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 77cd8f3..15ae672 100644
*** a/src/backend/port/sysv_shmem.c
--- b/src/backend/port/sysv_shmem.c
*** static void *
*** 102,109 
--- 102,129 
  InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size)
  {
  	IpcMemoryId shmid;
+ 	void	   *requestedAddress = NULL;
  	void	   *memAddress;
  
+ 	/*
+ 	 * Normally we just pass requestedAddress = NULL to shmat(), allowing the
+ 	 * system to choose where the segment gets mapped.  But in an EXEC_BACKEND
+ 	 * build, it's possible for whatever is chosen in the postmaster to not
+ 	 * work for backends, due to variations in address space layout.  As a
+ 	 * rather klugy workaround, allow the user to specify the address to use
+ 	 * via setting the environment variable PG_SHMEM_ADDR.  (If this were of
+ 	 * interest for anything except debugging, we'd probably create a cleaner
+ 	 * and better-documented way to set it, such as a GUC.)
+ 	 */
+ #ifdef EXEC_BACKEND
+ 	{
+ 		char	   *pg_shmem_addr = getenv("PG_SHMEM_ADDR");
+ 
+ 		if (pg_shmem_addr)
+ 			requestedAddress = (void *) strtoul(pg_shmem_addr, NULL, 0);
+ 	}
+ #endif
+ 
  	shmid = shmget(memKey, size, IPC_CREAT | IPC_EXCL | IPCProtection);
  
  	if (shmid < 0)
*** InternalIpcMemoryCreate(IpcMemoryKey mem
*** 203,209 
  	on_shmem_exit(IpcMemoryDelete, Int32GetDatum(shmid));
  
  	/* OK, should be able to attach to the segment */
! 	memAddress = shmat(shmid, NULL, PG_SHMAT_FLAGS);
  
  	if (memAddress == (void *) -1)
  		elog(FATAL, "shmat(id=%d) failed: %m", shmid);
--- 223,229 
  	on_shmem_exit(IpcMemoryDelete, Int32GetDatum(shmid));
  
  	/* OK, should be able to attach to the segment */
! 	memAddress = shmat(shmid, requestedAddress, PG_SHMAT_FLAGS);
  
  	if (memAddress == (void *) -1)
  		elog(FATAL, "shmat(id=%d) failed: %m", shmid);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-15 Thread Tom Lane
Andres Freund  writes:
> On April 14, 2017 9:42:41 PM PDT, Tom Lane  wrote:
>> 2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to
>> shared memory (key=6280001, addr=0x7f692fece000): Invalid argument
>> 
>> Presumably, this is the same issue we've seen on Windows where the
>> shmem address range gets overlapped by code loaded at a randomized
>> address.  Is there any real hope of making that work?

> Seems to work reasonably regularly on other branches... On phone only, so 
> can't dig into details, but it seems there's some chance involved.  Let's see 
> what the next few runs will do.  Will crank frequency once home.

I poked at this on a Fedora 25 box, and was able to reproduce failures at
a rate of one every half dozen or so runs of the core regression tests,
which seems to about match what is happening on culicidae.

Looking at the postmaster's memory map, it seems that shmem segments
get mapped in the same part of the address space as shared libraries,
ie they all end up in 0x7Fxx.  So it's not terribly
surprising that there's a risk of collision with a shared library.

I think what may be the most effective way to proceed is to provide
a way to force the shmem segment to be mapped at a chosen address.
It looks like, at least on x86_64 Linux, mapping shmem at
0x7E00 would work reliably.

Since we only care about this for testing purposes, I don't think
it has to be done in any very clean or even documented way.
I'm inclined to propose that we put something into sysv_shmem.c
that will check for an environment variable named, say, PG_SHMEM_ADDR,
and if it's set will use the value as the address in the initial
shmat() call.  For a bit of extra safety we could do that only in
EXEC_BACKEND builds.

Then you'd just need to add PG_SHMEM_ADDR=0x7E00 to
culicidae's build_env and you'd be good to go.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-14 Thread Andres Freund


On April 14, 2017 9:42:41 PM PDT, Tom Lane  wrote:
>Per
>https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-04-15%2004%3A00%3A02
>
>2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to
>shared memory (key=6280001, addr=0x7f692fece000): Invalid argument
>
>Presumably, this is the same issue we've seen on Windows where the
>shmem address range gets overlapped by code loaded at a randomized
>address.  Is there any real hope of making that work?

Seems to work reasonably regularly on other branches... On phone only, so can't 
dig into details, but it seems there's some chance involved.  Let's see what 
the next few runs will do.  Will crank frequency once home.

Andres

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] OK, so culicidae is *still* broken

2017-04-14 Thread Tom Lane
Per
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-04-15%2004%3A00%3A02

2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to shared memory 
(key=6280001, addr=0x7f692fece000): Invalid argument

Presumably, this is the same issue we've seen on Windows where the
shmem address range gets overlapped by code loaded at a randomized
address.  Is there any real hope of making that work?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers