On 12/09/10 20:13, Jeff Davis wrote:
On Sun, 2010-09-12 at 12:29 -0400, Tom Lane wrote:
... why throw an ERROR there if it can't happen (or
indicates an inconsistent state when it does happen)?
Are you suggesting that an Assert would be sufficient?
I'm not too picky about whether it's
On Mon, 2010-09-13 at 09:10 +0300, Heikki Linnakangas wrote:
but we should be consistent and document that:
(a) it shouldn't happen
(b) that it's just a sanity check and we're ignoring the race
Would this be sufficient?
--- a/src/backend/port/unix_latch.c
+++
On 13/09/10 20:43, Jeff Davis wrote:
On Mon, 2010-09-13 at 09:10 +0300, Heikki Linnakangas wrote:
but we should be consistent and document that:
(a) it shouldn't happen
(b) that it's just a sanity check and we're ignoring the race
Would this be sufficient?
---
On Sat, 2010-09-11 at 19:15 +0300, Heikki Linnakangas wrote:
Committed. I'll take a look at making walreceiver respond quickly when
WAL arrives in the standby, using latches, but that shouldn't interfere
with what you're doing.
I glanced at the code, and I see (in OwnLatch()):
+ if
Jeff Davis pg...@j-davis.com writes:
I glanced at the code, and I see (in OwnLatch()):
+ if (latch-owner_pid != 0)
+ elog(ERROR, latch already owned);
+ latch-owner_pid = MyProcPid;
But it looks like there may be a race there.
Yeah, that error check is only
On Sun, 2010-09-12 at 12:29 -0400, Tom Lane wrote:
... why throw an ERROR there if it can't happen (or
indicates an inconsistent state when it does happen)?
Are you suggesting that an Assert would be sufficient?
I'm not too picky about whether it's Assert, ERROR, or PANIC (Asserts
aren't
On Sun, 2010-09-12 at 10:13 -0700, Jeff Davis wrote:
I'm not too picky about whether it's Assert, ERROR, or PANIC (Asserts
aren't available in production systems, which I assume is why elog was
used); but we should be consistent and document that:
(a) it shouldn't happen
(b) that it's just
Jeff Davis pg...@j-davis.com writes:
However, that also means that the whole concept of OwnLatch/DisownLatch
is entirely redundant, and only there for asserts because it doesn't do
anything else. That seems a little strange to me, as well, so (at
minimum) it should be documented that the
On Sun, 2010-09-12 at 14:12 -0400, Tom Lane wrote:
Uh, this is nonsense. You have to have something like these functions
to support transferring ownership of a latch from one process to
another, which is required at least for the walreceiver usage.
Oh, I see. It's needed to know where to send
On 06/09/10 19:27, Heikki Linnakangas wrote:
On 06/09/10 17:18, Tom Lane wrote:
BTW, on reflection the AcquireLatch/ReleaseLatch terminology seems a bit
ill chosen: ReleaseLatch sounds way too much like something that would
just unlock or clear the latch. Perhaps OwnLatch/DisownLatch, or
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 06/09/10 19:27, Heikki Linnakangas wrote:
It seems that NumSharedLatches() is entirely wrongly placed if it's in
the win32 specific code! That needs to be somewhere shared, so people find
it,
Yeah. There's a notice of that in
On 11/09/10 18:02, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
Barring any last-minute objections, I'll commit this in the next few
days. This patch doesn't affect walreceiver yet; I think the next step
is to use the latches to eliminate the polling loop in
On 09/08/2010 08:18 PM, Tom Lane wrote:
Considering that we know that major platforms
such as FreeBSD have changed their implementations *very* recently,
it seems foolish to assume that an executable built on a machine with
corrected pselect could not be run on one with an older implementation.
On 09/08/2010 08:01 PM, Heikki Linnakangas wrote:
Yeah, there isn't much you can do about it. Perhaps you could set a
mayday flag (a global boolean variable) if it fails, and check that in
the main loop, elogging a warning there instead. But I don't think we
need to go to such lengths,
Hi,
On 09/06/2010 11:03 PM, Tom Lane wrote:
I don't entirely see the point of opening ourselves up to the risk of
using a pselect that's not safe under the hood.
It should be possible to reliably determine the platforms that provide
an atomic pselect(). For those, I'm hesitant to use a
On 08/09/10 20:36, Markus Wanner wrote:
On 09/06/2010 11:03 PM, Tom Lane wrote:
I don't entirely see the point of opening ourselves up to the risk of
using a pselect that's not safe under the hood.
It should be possible to reliably determine the platforms that provide
an atomic pselect(). For
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 08/09/10 20:36, Markus Wanner wrote:
It should be possible to reliably determine the platforms that provide
an atomic pselect(). For those, I'm hesitant to use a trick, where
pselect() clearly provides a simpler and more
Martijn van Oosterhout klep...@svana.org writes:
If the issue is just that select() doesn't get interrupted and we don't
care about a couple of syscalls, would it not be better to simply use
sigaction to turn on SA_RESTART just prior to the select() and turn it
off just after. Or are these
On Mon, Sep 06, 2010 at 07:24:59PM +0200, Markus Wanner wrote:
Do I understand correctly that the purpose of this patch is to work
around the brokenness of select() on very few platforms? Or is there any
additional feature that plain signals don't give us?
If the issue is just that
On 08/09/10 23:07, Martijn van Oosterhout wrote:
On Mon, Sep 06, 2010 at 07:24:59PM +0200, Markus Wanner wrote:
Do I understand correctly that the purpose of this patch is to work
around the brokenness of select() on very few platforms? Or is there any
additional feature that plain signals
On 06/09/10 23:10, Markus Wanner wrote:
Good. How about syscall overhead? One more write operation to the
self-pipe per signal from within the signal handler and one read to
actually clear the 'ready' state of the pipe from the waiter portion of
the code, right?
Right.
Do we plan to replace
On 06/09/10 20:24, Markus Wanner wrote:
On 09/06/2010 06:27 PM, Heikki Linnakangas wrote:
+ * It's important to reset the latch*before* checking if there's work to
+ * do. Otherwise, if someone sets the latch between the check and the
+ * ResetLatch call, you will miss it and Wait will block.
On 09/07/2010 09:06 AM, Heikki Linnakangas wrote:
Setting a latch that's already set is very fast, so you want to keep it
set until the last moment. See the coding in walsender for example, it
goes to some lengths to avoid clearing the latch until it's very sure
there's no more work for it to
On 03/09/10 21:50, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
On 03/09/10 21:16, Tom Lane wrote:
It's probably not too unreasonable to assume that pid_t assignment is
atomic. But I'm still thinking that we have bigger problems than that
if there are really
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 03/09/10 21:50, Tom Lane wrote:
Well, in that case what we need to do is presume that the latch object
has a continuing existence but the owner/receiver can come and go.
I would suggest that InitLatch needs to initialize the
On 06/09/10 17:18, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
I think we have just a terminology issue. What you're describing is
exactly how it works now, if you just s/InitLatch/AcquireLatch.
No, it isn't. What I'm suggesting requires breaking InitLatch
Hi,
On 09/06/2010 06:27 PM, Heikki Linnakangas wrote:
Here's an updated patch, with all the issues reported this far fixed,
except for that naming issue, and Fujii's suggestion to use poll()
instead of select() where available. I've also polished it quite a bit,
improving comments etc. Magnus,
Markus Wanner mar...@bluegap.ch writes:
Is pselect() really as unportable as stated in the patch? What platforms
have problems with pselect()?
Well, it's not defined in the Single Unix Spec, which is our customary
reference for portability. Also, it's alleged that some platforms have
it but
On 09/06/2010 08:46 PM, Tom Lane wrote:
Well, it's not defined in the Single Unix Spec, which is our customary
reference for portability.
FWIW, I bet the self-pipe trick isn't mentioned there, either... any
good precedence that it actually works as expected on all of the target
platforms?
Markus Wanner mar...@bluegap.ch writes:
AFAICT the custom select() implementation we are using for Windows could
easily be changed to mimic pselect() instead. Thus most reasonably
up-to-date Linux distributions plus Windows certainly provide a workable
pselect() syscall. Would it be worth
On Fri, 2010-09-03 at 18:24 -0400, Tom Lane wrote:
Now the HS case likewise appears to be set up so that the signal can
only directly interrupt ProcWaitForSignal, so I think the core issue
is
whether any deadlock situations are possible. Given that this gets
called from a low-level place
On 02/09/10 23:13, Tom Lane wrote:
The WaitLatch ...timeout API could use a bit of refinement. I'd suggest
defining negative timeout as meaning wait forever, so that timeout = 0
can be used for check but don't wait. Also, it seems like the
function shouldn't just return void but should return
Fujii Masao masao.fu...@gmail.com writes:
On Fri, Sep 3, 2010 at 11:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
elog(FATAL) is *certainly* not a better idea. I think there's really
nothing that can be done, you just have to silently ignore the error.
Hmm.. some functions called by a signal
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 02/09/10 23:13, Tom Lane wrote:
(Yeah, I realize the caller
could look into the latch to find that out, but callers really ought to
treat latches as opaque structs.)
Hmm, maybe we need a TestLatch function to check if a latch
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Fujii Masao masao.fu...@gmail.com writes:
On Fri, Sep 3, 2010 at 11:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
elog(FATAL) is *certainly* not a better idea. I think there's really
nothing that can be done, you just have to
Robert Haas robertmh...@gmail.com writes:
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
[ shrug... ] I stated before that the Hot Standby patch is doing
utterly unsafe things in signal handlers. Simon rejected that.
I am waiting for irrefutable evidence to emerge from
On Fri, Sep 3, 2010 at 12:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
[ shrug... ] I stated before that the Hot Standby patch is doing
utterly unsafe things in signal handlers. Simon
On 03/09/10 17:51, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
On 02/09/10 23:13, Tom Lane wrote:
Also, using sig_atomic_t for owner_pid is entirely not sane.
Hmm, true, it doesn't need to be set from signal handler, but is there
an atomicity problem if
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 03/09/10 17:51, Tom Lane wrote:
If there is *any* possibility of that happening then you have far worse
problems than whether the field is atomically readable or not: the
behavior will be unpredictable at just slightly larger
On 03/09/10 21:16, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
WaitLatch had to set the pid on the Latch struct to allow other
processes to send the signal. Another process could call SetLatch and
read the pid field, while WaitLatch is just setting it. I think
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 03/09/10 21:16, Tom Lane wrote:
It's probably not too unreasonable to assume that pid_t assignment is
atomic. But I'm still thinking that we have bigger problems than that
if there are really cases where SetLatch can execute at
Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
[ shrug... ] I stated before that the Hot Standby patch is doing
utterly unsafe things in signal handlers. Simon rejected that.
I am waiting for irrefutable evidence
On Fri, Sep 3, 2010 at 3:11 PM, Ron Mayer rm...@cheapcomplexdevices.com wrote:
Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
[ shrug... ] I stated before that the Hot Standby patch is doing
utterly unsafe things
On 03/09/10 19:38, Robert Haas wrote:
On Fri, Sep 3, 2010 at 12:10 PM, Tom Lanet...@sss.pgh.pa.us wrote:
Robert Haasrobertmh...@gmail.com writes:
On Fri, Sep 3, 2010 at 10:07 AM, Tom Lanet...@sss.pgh.pa.us wrote:
[ shrug... ] I stated before that the Hot Standby patch is doing
utterly
On Fri, Sep 3, 2010 at 4:20 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Maybe that's ok, if I'm reading the deadlock checker code correctly, it also
calls semop() to increment the another process' semaphore, and the deadlock
checker can be invoked from a signal handler
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
A safer approach would be to just PGSemaphoreUnlock() in the signal
handler, and do all the other processing outside it.
I don't see any particularly good reason to assume that
PGSemaphoreUnlock is safe either: you're still talking
Robert Haas robertmh...@gmail.com writes:
Color me confused; I may need to backpedal rapidly here. I had
thought that what Tom was complaining about was the fact that the
signal handler was taking LWLocks, which I would have thought to be
totally unsafe.
Well, it's unsafe if the signal could
On Fri, Sep 3, 2010 at 6:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Now the HS case likewise appears to be set up so that the signal can
only directly interrupt ProcWaitForSignal, so I think the core issue is
whether any deadlock situations are possible. Given that this gets
called from a
On 02/09/10 06:46, Fujii Masao wrote:
On Wed, Sep 1, 2010 at 4:11 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
The obvious next question is how to wait for multiple sockets and a latch at
the same time? Perhaps we should have a select()-like interface where you
can pass
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Ok, here's an updated patch with WaitLatchOrSocket that let's you do that.
Minor code review items:
s/There is three/There are three/ in unix_latch.c header comment
The header comment points out the correct usage of ResetLatch, but
On Fri, Sep 3, 2010 at 5:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Does ReleaseLatch() have any actual use-case, and if so what would it be?
I think we could do without it.
In Unix, probably we can live without that. But in Windows, we need to
free SharedEventHandles slot for upcoming process
Fujii Masao masao.fu...@gmail.com writes:
+ * XXX: Is it safe to elog(ERROR) in a signal handler?
No, it isn't.
We should use elog(FATAL) or check proc_exit_inprogress, instead?
elog(FATAL) is *certainly* not a better idea. I think there's really
nothing that can be done, you
On Fri, Sep 3, 2010 at 11:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Fujii Masao masao.fu...@gmail.com writes:
+ * XXX: Is it safe to elog(ERROR) in a signal handler?
No, it isn't.
We should use elog(FATAL) or check proc_exit_inprogress, instead?
elog(FATAL) is *certainly* not
On 31/08/10 15:47, Fujii Masao wrote:
On Fri, Aug 27, 2010 at 4:39 PM, Fujii Masaomasao.fu...@gmail.com wrote:
/*
* XXX: Should we invent an API to wait for data coming from the
* client connection too? It's not critical, but we could then
* eliminate the timeout altogether and go to
On Wed, Sep 1, 2010 at 4:11 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
The obvious next question is how to wait for multiple sockets and a latch at
the same time? Perhaps we should have a select()-like interface where you
can pass multiple file descriptors. Then again,
Here's a 2nd version of the latch patch. Now with a Windows
implementation. Comments welcome.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
diff --git a/configure b/configure
index bd9b347..432cd58 100755
--- a/configure
+++ b/configure
@@ -27773,6 +27773,13 @@ _ACEOF
On Tue, Aug 31, 2010 at 4:06 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Here's a 2nd version of the latch patch. Now with a Windows
implementation. Comments welcome.
Seems good.
Two minor comments:
rc = WaitForSingleObject(latch-event, timeout / 1000);
if (rc ==
On Fri, Aug 27, 2010 at 4:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
/*
* XXX: Should we invent an API to wait for data coming from the
* client connection too? It's not critical, but we could then
* eliminate the timeout altogether and go to sleep for good.
*/
Yes, it would be
Tom Lane wrote:
Greg Smith g...@2ndquadrant.com writes:
Tom Lane wrote:
Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
Tom Lane wrote:
Greg Smith g...@2ndquadrant.com writes:
... The only clear case where this is
always a win is when the system it totally idle.
If you'll climb down off that horse for a moment: yeah, the idle case is
*exactly* what they're complaining about.
I wasn't on a horse
On Thu, Aug 26, 2010 at 7:40 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Here's a first attempt at implementing that. To demonstrate how it works, I
modified walsender to use the new latch facility, also to respond quickly to
SIGHUP and SIGTERM.
Great!
There's two kinds
On 27/08/10 10:39, Fujii Masao wrote:
On Thu, Aug 26, 2010 at 7:40 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
There's two kinds of latches, local and global. Local latches can only be
set from the same process - allowing you to replace pg_usleep() with
something that is
Tom Lane wrote:
Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
https://bugzilla.redhat.com/show_bug.cgi?id=252129
If we're going to go to the trouble of having a mechanism
On Fri, Aug 27, 2010 at 5:54 PM, Greg Smith g...@2ndquadrant.com wrote:
Tom Lane wrote:
Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
Robert Haas robertmh...@gmail.com writes:
On Fri, Aug 27, 2010 at 5:54 PM, Greg Smith g...@2ndquadrant.com wrote:
The way the background writer wakes up periodically to absorb fsync requests
is already way too infrequent on a busy system.
Maybe instead of a fixed-duration sleep we could wake
Greg Smith g...@2ndquadrant.com writes:
Tom Lane wrote:
Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
https://bugzilla.redhat.com/show_bug.cgi?id=252129
... The only
On 24/08/10 02:44, Tom Lane wrote:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
[ latch proposal ]
This seems reasonably clean as far as signal conditions generated
internally to Postgres go, but I remain unclear on how it helps for
response to actual signals.
You can can
On 24/08/10 04:08, Alvaro Herrera wrote:
Excerpts from Tom Lane's message of lun ago 23 19:44:02 -0400 2010:
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
[ latch proposal ]
This seems reasonably clean as far as signal conditions generated
internally to Postgres go, but I
On 20/08/10 17:28, Tom Lane wrote:
[ It's way past time to change the thread title ]
Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes:
On 20/08/10 16:24, Tom Lane wrote:
You keep on proposing solutions that only work for walsender :-(.
Well yes, the other places where we use
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 20/08/10 17:28, Tom Lane wrote:
Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
Excerpts from Tom Lane's message of lun ago 23 19:44:02 -0400 2010:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
[ latch proposal ]
This seems reasonably clean as far as signal conditions generated
internally to Postgres go, but I remain unclear on how it helps for
[ It's way past time to change the thread title ]
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 20/08/10 16:24, Tom Lane wrote:
You keep on proposing solutions that only work for walsender :-(.
Well yes, the other places where we use pg_usleep() are not really a
problem
72 matches
Mail list logo