Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Trevor Talbot
On 10/21/07, Magnus Hagander [EMAIL PROTECTED] wrote:

  I tried generating idle connections in an effort to reproduce
  Laurent's problem, but I ran into a local limit instead: for each
  backend, postmaster creates a thread and burns 4MB of its 2GB address
  space.  It fails around 490.

 Oh, that's interesting. That's actually a side effect of us increasing
 the stack size for the postgres.exe executable in order to work on other
 things. By default, it burns 1MB/thread, but ours will do 4MB. Never
 really thought of the problem that it'll run out of address space.
 Unfortunately, that size can't be changed in the CreateThread() call -
 only the initially committed size can be changed there.

 There are two ways to get around it - one is not using a thread for each
 backend, but a single thread that handles them all and then some sync
 objects around it. We originally considered this but said we won't
 bother changing it because the current way is simpler, and the overhead
 of a thread is tiny compared to a process. I don't think anybody even
 thought about the fact that it'd run you out of address space...

I'd probably take the approach of combining win32_waitpid() and
threads.  You'd end up with 1 thread per 64 backends; when something
interesting happens the thread could push the info onto a queue, which
the new win32_waitpid() would check.  Use APCs to add new backends to
threads with free slots.
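
For a rough sense of the numbers in this exchange, the sketch below works through the arithmetic in C. The function names are made up for illustration and nothing here comes from the PostgreSQL source; it only assumes the figures quoted above (4MB reserved per thread stack, a 2GB user address space, up to 64 backends per waiter thread).

```c
#include <assert.h>

/* Upper bound on the number of threads whose stack reservations fit in
 * the address space. Both arguments are in MB. (Hypothetical helper.) */
static unsigned long max_threads(unsigned long address_space_mb,
                                 unsigned long stack_mb)
{
    assert(stack_mb > 0);
    return address_space_mb / stack_mb;
}

/* Waiter threads needed when each thread watches up to per_thread
 * backends, i.e. a ceiling division. (Hypothetical helper.) */
static unsigned long waiter_threads(unsigned long backends,
                                    unsigned long per_thread)
{
    assert(per_thread > 0);
    return (backends + per_thread - 1) / per_thread;
}
```

Under these assumptions max_threads(2048, 4) is 512, an upper bound that the observed failure around 490 sits just below, presumably because the executable, DLLs and heap also occupy address space. Grouping 490 backends 64 to a thread gives waiter_threads(490, 64) = 8 threads, or about 32MB of stack reservations at 4MB each.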

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Magnus Hagander
Florian Weimer wrote:
 * Magnus Hagander:
 
 Oh, that's interesting. That's actually a side effect of us increasing
 the stack size for the postgres.exe executable in order to work on other
 things. By default, it burns 1MB/thread, but ours will do 4MB. Never
 really thought of the problem that it'll run out of address space.
 Unfortunately, that size can't be changed in the CreateThread() call -
 only the initially committed size can be changed there.
 
 Windows XP supports the STACK_SIZE_PARAM_IS_A_RESERVATION flag, which
 apparently allows reducing the reserved size.  It might be better to do
 this the other way round, though (leave the reservation at its 1 MB
 default, and increase it only when necessary).

It does, but we still support Windows 2000 as well. I think it's better
to use a different method altogether - one not using one thread per child.

//Magnus



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Florian Weimer
* Magnus Hagander:

 Oh, that's interesting. That's actually a side effect of us increasing
 the stack size for the postgres.exe executable in order to work on other
 things. By default, it burns 1MB/thread, but ours will do 4MB. Never
 really thought of the problem that it'll run out of address space.
 Unfortunately, that size can't be changed in the CreateThread() call -
 only the initially committed size can be changed there.

Windows XP supports the STACK_SIZE_PARAM_IS_A_RESERVATION flag, which
apparently allows reducing the reserved size.  It might be better to do
this the other way round, though (leave the reservation at its 1 MB
default, and increase it only when necessary).
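
A minimal sketch of what passing that flag could look like, in C. The helper names, the 64KB figure and the flag_supported switch are assumptions made for illustration, not anything from the PostgreSQL source. Without the flag, the size argument of CreateThread() is the initial commit size, which is the limitation described above; with it, the same argument is taken as the reservation.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Provided by newer Windows SDK headers; fallback for older ones. */
#ifndef STACK_SIZE_PARAM_IS_A_RESERVATION
#define STACK_SIZE_PARAM_IS_A_RESERVATION 0x00010000
#endif

/* Hypothetical reservation for a thread that only waits on a handle. */
#define WAITER_STACK_RESERVE (64 * 1024)

typedef struct
{
    size_t        stack_size;  /* dwStackSize argument for CreateThread() */
    unsigned long flags;       /* dwCreationFlags argument */
} ThreadStackParams;

/* Choose the arguments: a small explicit reservation where the flag is
 * available, otherwise 0/0, which keeps the executable's default. */
static ThreadStackParams waiter_stack_params(int flag_supported)
{
    ThreadStackParams p;

    if (flag_supported)
    {
        p.stack_size = WAITER_STACK_RESERVE;
        p.flags = STACK_SIZE_PARAM_IS_A_RESERVATION;
    }
    else
    {
        p.stack_size = 0;
        p.flags = 0;
    }
    return p;
}

#ifdef _WIN32
/* Start a waiter thread with the chosen stack parameters. */
static HANDLE start_waiter(LPTHREAD_START_ROUTINE fn, void *arg,
                           int flag_supported)
{
    ThreadStackParams p = waiter_stack_params(flag_supported);

    return CreateThread(NULL, p.stack_size, fn, arg, p.flags, NULL);
}
#endif
```

How flag_supported would be determined at run time (for example by checking the Windows version) is left open here.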



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Magnus Hagander
Trevor Talbot wrote:
 On 10/21/07, Magnus Hagander [EMAIL PROTECTED] wrote:
 
 I tried generating idle connections in an effort to reproduce
 Laurent's problem, but I ran into a local limit instead: for each
 backend, postmaster creates a thread and burns 4MB of its 2GB address
 space.  It fails around 490.
 
 Oh, that's interesting. That's actually a side effect of us increasing
 the stack size for the postgres.exe executable in order to work on other
 things. By default, it burns 1MB/thread, but ours will do 4MB. Never
 really thought of the problem that it'll run out of address space.
 Unfortunately, that size can't be changed in the CreateThread() call -
 only the initially committed size can be changed there.

 There are two ways to get around it - one is not using a thread for each
 backend, but a single thread that handles them all and then some sync
 objects around it. We originally considered this but said we won't
 bother changing it because the current way is simpler, and the overhead
 of a thread is tiny compared to a process. I don't think anybody even
 thought about the fact that it'd run you out of address space...
 
 I'd probably take the approach of combining win32_waitpid() and
 threads.  You'd end up with 1 thread per 64 backends; when something
 interesting happens the thread could push the info onto a queue, which
 the new win32_waitpid() would check.  Use APCs to add new backends to
 threads with free slots.

I was planning to make it even easier and let Windows do the job for us,
just using RegisterWaitForSingleObject(). Does the same - one thread per
64 backends, but we don't have to deal with the queueing ourselves.
Should be rather trivial to do.

Keeps win32_waitpid() unchanged.

That said, refactoring win32_waitpid() to be based on a queue might be a
good idea *anyway*. Have the callback from above put something in the
queue, and go with your idea for the rest.
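
A rough sketch of that combination in C, under stated assumptions: the queue type, its size, and the names child_exited and watch_child are hypothetical, and this is not the actual postmaster code. The queue itself is plain C; the Windows-only part registers a one-shot wait on a backend's process handle and has the callback push the dead backend's pid onto the queue, where a queue-based win32_waitpid() could later pick it up.

```c
#include <assert.h>
#include <stddef.h>

#define DEAD_QUEUE_SIZE 64   /* hypothetical capacity */

/* Fixed-size FIFO of pids of backends that have exited. */
typedef struct
{
    unsigned long pids[DEAD_QUEUE_SIZE];
    size_t        head;
    size_t        count;
} DeadQueue;

/* Returns 1 on success, 0 if the queue is full. */
static int dead_queue_push(DeadQueue *q, unsigned long pid)
{
    if (q->count == DEAD_QUEUE_SIZE)
        return 0;
    q->pids[(q->head + q->count) % DEAD_QUEUE_SIZE] = pid;
    q->count++;
    return 1;
}

/* Returns 1 and stores the oldest pid, or 0 if the queue is empty. */
static int dead_queue_pop(DeadQueue *q, unsigned long *pid)
{
    if (q->count == 0)
        return 0;
    *pid = q->pids[q->head];
    q->head = (q->head + 1) % DEAD_QUEUE_SIZE;
    q->count--;
    return 1;
}

#ifdef _WIN32
#include <windows.h>

static DeadQueue        dead_children;
static CRITICAL_SECTION dead_children_lock;  /* initialise once at startup
                                              * with InitializeCriticalSection() */

typedef struct
{
    HANDLE process;  /* handle of the backend process */
    HANDLE wait;     /* wait handle, for UnregisterWait() later */
    DWORD  pid;
} ChildWatch;

/* Runs on a thread-pool wait thread once the process handle is signaled. */
static VOID CALLBACK child_exited(PVOID context, BOOLEAN timed_out)
{
    ChildWatch *w = (ChildWatch *) context;

    (void) timed_out;  /* registered with INFINITE, so never a timeout */
    EnterCriticalSection(&dead_children_lock);
    dead_queue_push(&dead_children, w->pid);
    LeaveCriticalSection(&dead_children_lock);
    /* a real postmaster would raise its SIGCHLD equivalent here */
}

/* Ask the thread pool to watch one backend; no dedicated thread needed. */
static BOOL watch_child(ChildWatch *w)
{
    return RegisterWaitForSingleObject(&w->wait, w->process, child_exited,
                                       w, INFINITE,
                                       WT_EXECUTEONLYONCE | WT_EXECUTEINWAITTHREAD);
}
#endif
```

WT_EXECUTEINWAITTHREAD keeps the callback on the wait thread itself, which seems reasonable only because the callback is this short; a longer callback would probably want the default worker-thread behaviour instead.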

//Magnus



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Trevor Talbot
On 10/22/07, Magnus Hagander [EMAIL PROTECTED] wrote:
 Trevor Talbot wrote:

  I'd probably take the approach of combining win32_waitpid() and
  threads.  You'd end up with 1 thread per 64 backends; when something
  interesting happens the thread could push the info onto a queue, which
  the new win32_waitpid() would check.  Use APCs to add new backends to
  threads with free slots.

 I was planning to make it even easier and let Windows do the job for us,
 just using RegisterWaitForSingleObject(). Does the same - one thread per
 64 backends, but we don't have to deal with the queueing ourselves.

Oh, good call -- I keep forgetting the native thread pool exists.



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I was planning to make it even easier and let Windows do the job for us,
 just using RegisterWaitForSingleObject(). Does the same - one thread per
 64 backends, but we don't have to deal with the queueing ourselves.
 Should be rather trivial to do.

How can that possibly work?  Backends have to be able to run
concurrently, and I don't see how they'll do that if they share a stack.

regards, tom lane



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Trevor Talbot
On 10/22/07, Tom Lane [EMAIL PROTECTED] wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  I was planning to make it even easier and let Windows do the job for us,
  just using RegisterWaitForSingleObject(). Does the same - one thread per
  64 backends, but we don't have to deal with the queueing ourselves.
  Should be rather trivial to do.

 How can that possibly work?  Backends have to be able to run
 concurrently, and I don't see how they'll do that if they share a stack.

This is about what postmaster does for its SIGCHLD wait equivalent on
win32.  The 64 comes from Windows' object/event mechanism, which lets
you perform a blocking wait on up to that many handles in a single
call.  Currently postmaster is creating a new thread to wait on only
one backend at a time, so it ends up with too many threads.



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 I was planning to make it even easier and let Windows do the job for us,
 just using RegisterWaitForSingleObject(). Does the same - one thread per
 64 backends, but we don't have to deal with the queueing ourselves.
 Should be rather trivial to do.
 
 How can that possibly work?  Backends have to be able to run
 concurrently, and I don't see how they'll do that if they share a stack.

We're not talking about the backends, we're talking about the backend
waiter threads whose sole purpose is to wait for a backend to die and
then raise a signal when it does. We can easily have the kernel wait for
a whole bunch of them at once, and have it call our callback function
whenever any one of them dies.

//Magnus



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 We're not talking about the backends, we're talking about the backend
 waiter threads whose sole purpose is to wait for a backend to die and
 then raise a signal when it does.

Oh, OK, I had not twigged to exactly what the threads were being used
for.  Never mind ...

regards, tom lane



Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-21 Thread Magnus Hagander
Trevor Talbot wrote:
 On 10/17/07, Magnus Hagander [EMAIL PROTECTED] wrote:
 On Wed, Oct 17, 2007 at 02:40:14AM -0400, Tom Lane wrote:
 
 Maybe we should put an #ifdef WIN32 into guc.c to limit max_connections
 to something we know the platform can stand?  It'd be more comfortable
 if we understood exactly where the limit was, but I think I'd rather
 have an "I'm sorry Dave, I can't do that" than random-seeming crashes.
 
 Yeah, that's probably a good idea - except we never managed to figure out
 where the limit is. It appears to vary pretty wildly between different
 machines, for reasons we don't really understand (total RAM has some effect
 on it, but that's not the only one, for example)
 
 I tried generating idle connections in an effort to reproduce
 Laurent's problem, but I ran into a local limit instead: for each
 backend, postmaster creates a thread and burns 4MB of its 2GB address
 space.  It fails around 490.

Oh, that's interesting. That's actually a side effect of us increasing
the stack size for the postgres.exe executable in order to work on other
things. By default, it burns 1MB/thread, but ours will do 4MB. Never
really thought of the problem that it'll run out of address space.
Unfortunately, that size can't be changed in the CreateThread() call -
only the initially committed size can be changed there.

There are two ways to get around it - one is not using a thread for each
backend, but a single thread that handles them all and then some sync
objects around it. We originally considered this but said we won't
bother changing it because the current way is simpler, and the overhead
of a thread is tiny compared to a process. I don't think anybody even
thought about the fact that it'd run you out of address space...

The other way is to finish off win64 support :-) Which I plan to look
at, but I don't think that alone should be considered a solution.

The question is if it's worth fixing that part, if it will just fall
down for other reasons before we reach these 500 connections anyway. Can
you try having your program actually run some queries and so on, and not
just do a PQconnect? To see if it falls over then, because it's been
doing more?


 Laurent's issue must depend on other load characteristics.  It's
 possible to get a trace of DLL loads, but I haven't found a
 noninvasive way of doing that.  It seems to require a debugger be
 attached.

AFAIK, it does require that, yes.

//Magnus
