On Thu, Sep 3, 2009 at 6:47 AM, Boszormenyi Zoltan <z...@cybertec.at> wrote:

> Boszormenyi Zoltan írta:
> > Alvaro Herrera írta:
> >
> >> Boszormenyi Zoltan wrote:
> >>
> >>
> >>
> >>> The vague consensus for syntax options was that the GUC
> >>> 'lock_timeout' and WAIT [N] extension (wherever NOWAIT
> >>> is allowed) both should be implemented.
> >>>
> >>> Behaviour would be that N seconds timeout should be
> >>> applied to every lock that the statement would take.
> >>>
> >>>
> >> In
> http://archives.postgresql.org/message-id/291.1242053...@sss.pgh.pa.us
> >> Tom argues that lock_timeout should be sufficient.  I'm not sure what
> >> does WAIT [N] buy
>

I disagree with Tom on this point.  *If* I was trying to implement  a server
policy, then sure, it should not be done by embedding the timeout in the SQL
statement.  But I don't think they want this to implement a server policy.
(And if we do, why would we thump the poor victims that are waiting on the
lock, rather than the rogue who decided to take a lock and then camp out on
it?)  The use case for WAIT [N] is not a server policy, but a UI policy.  I
have two ways to do this task.  The preferred way needs to lock a row, but
waiting for it may take too long.  So if I can't get the lock within a
reasonable time, I fall back on a less-preferred but still acceptable way of
doing the task, one that doesn't need the lock.  If we move to a new server,
the appropriate value for the time out does not change, because the
appropriate level is the concern of the UI and the end users, not the
database server.  This wouldn't be scattered all over the application,
either.  In my experience, if you have an application that could benefit
from this, you might have 1 or 2 uses for WAIT [N] out of 1,000+ statements
in the application.  (From my perspective, if there were to be a WAIT [N]
option, it could plug into the statement_timeout mechanism rather than the
proposed lock_timeout mechanism.)

I think that if the use case for a GUC is to set it, run a single very
specific statement, and then unset it, that is pretty clear evidence that
this should not be a GUC in the first place.

Maybe I am biased in this because I am primarily thinking about how I would
use such a feature, rather than how Hans-Juergen intends to use it, and
maybe those uses differ.  Hans-Juergen, could you describe your use case a
little bit more?   Who do is going to be getting these time-out errors, the
queries run by the web-app, or longer running back-office queries?  And when
they do get an error, what will they do about it?


>>
> >
> > Okay, we implemented only the lock_timeout GUC.
> > Patch attached, hopefully in an acceptable form.
> > Documentation included in the patch, lock_timeout
> > works the same way as statement_timeout, takes
> > value in milliseconds and 0 disables the timeout.
> >
> > Best regards,
> > Zoltán Böszörményi
> >
>
> New patch attached. It's only regenerated for current CVS
> so it should apply cleanly.
>


In addition to the previously mentioned seg-fault issues when attempting to
use this feature (confirmed in another machine, linux, 64 bit, and
--enable-cassert does not offer any help), I have some more concerns about
the patch.  From the docs:

doc/src/sgml/config.sgml

        Abort any statement that tries to lock any rows or tables and the
lock
        has to wait more than the specified number of milliseconds, starting
        from the time the command arrives at the server from the client.
        If <varname>log_min_error_statement</> is set to <literal>ERROR</>
or
        lower, the statement that timed out will also be logged.
        A value of zero (the default) turns off the limitation.

This suggests that all row locks will have this behavior.  However, my
experiments show that row locks attempted to be taken for ordinary UPDATE
commands do not time out.  If this is only intended to apply to SELECT ....
FOR UPDATE, that should be documented here.  It is documented elsewhere that
this applies to SELECT...FOR UPDATE, but it is not documented that this the
only row-locks it applies to.

"from the time the command arrives at the server".  I am pretty sure this is
not the desired behavior, otherwise how does it differ from
statement_timeout?  I think it must be a copy and paste error for the doc.


For the implementation, I think the patch touches too much code.  In
particular, lwlock.c.  Is the time spent waiting on ProcArrayLock
significant enough that it needs all of that code to support timing it out?
I don't think it should ever take more than a few microseconds to obtain
that light-weight lock.  And if we do want to time all of the light weight
access, shouldn't those times be summed up, rather than timing out only if
any single one of them exceeds the threshold in isolation?  (That is my
interpretation of how the code works currently, I could be wrong on that.)

If the seg-faults are fixed, I am still skeptical that this patch is
acceptable, because the problem it solves seems to be poorly or incompletely
specified.

Cheers,

Jeff

Reply via email to