Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

Boszormenyi Zoltan Fri, 13 Jul 2012 13:34:02 -0700

2012-07-12 19:05 keltezéssel, Tom Lane írta:

Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:


* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
"pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.

OK.

* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.


Wow, my work uncovered a pre-existing bug in PostgreSQL. :-)

* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the "TimeoutName" nomenclature, because to me "name"
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.

OK.

* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.


Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in
the original code called setitimer() twice if statement timeout
was still in effect, it was done in CheckStatementTimeout().
Considering this, I think there is no performance problem with
the new code you came up with.

I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.


There is one tiny bit you might have broken. You wrote previously:

I am also underwhelmed by the "timeout_start" callback function concept.
In the first place, that's broken enable_timeout, which incorrectly
assumes that the value it gets must be "now" (see its schedule_alarm
call).  In the second place, it seems fairly likely that callers of
get_timeout_start would likewise want the clock time at which the
timeout was enabled, not the timeout_start reference time.  (If they
did want the latter, why couldn't they get it from wherever the callback
function had gotten it?)  I'm inclined to propose that we drop the
timeout_start concept and instead provide two functions for scheduling
interrupts:

        enable_timeout_after(TimeoutName tn, int delay_ms);
        enable_timeout_at(TimeoutName tn, TimestampTz fin_time);

where you use the former if you want the standard GetCurrentTimestamp +
n msec calculation, but if you want the stop time calculated in some
other way, you calculate it yourself and use the second function.


It's okay, but you haven't really followed it with STATEMENT_TIMEOUT:

-----8<----------8<----------8<----------8<----------8<-----
*** 2396,2404 ****
                /* Set statement timeout running, if any */
                /* NB: this mustn't be enabled until we are within an xact */
                if (StatementTimeout > 0)
!                       enable_sig_alarm(StatementTimeout, true);
                else
!                       cancel_from_timeout = false;

                xact_started = true;
        }
--- 2397,2405 ----
                /* Set statement timeout running, if any */
                /* NB: this mustn't be enabled until we are within an xact */
                if (StatementTimeout > 0)
!                       enable_timeout_after(STATEMENT_TIMEOUT, 
StatementTimeout);
                else
!                       disable_timeout(STATEMENT_TIMEOUT, false);

                xact_started = true;
        }
-----8<----------8<----------8<----------8<----------8<-----

It means that StatementTimeout losts its precision. It would trigger
in the future counting from "now" instead of counting from
GetCurrentStatementStartTimestamp(). It should be:

enable_timeout_at(STATEMENT_TIMEOUT,
TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), 
StatementTimeout));

I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.


Yes, I will do that.

Thanks for your review and work.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

Reply via email to