Re: [HACKERS] dsm_unpin_segment

Thomas Munro Sun, 21 Aug 2016 16:55:37 -0700

On Sat, Aug 20, 2016 at 11:37 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Tue, Aug 9, 2016 at 10:07 AM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> On Tue, Aug 9, 2016 at 12:53 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> The larger picture here is that Robert is exhibiting a touching but
>>> unfounded faith that extensions using this feature will contain zero bugs.
>>> IMO there needs to be some positive defense against mistakes in using the
>>> pin/unpin API.  As things stand, multiple pin requests don't have any
>>> fatal consequences (especially not on non-Windows), so I have little
>>> confidence that it's not happening in the field.  I have even less
>>> confidence that there wouldn't be too many unpin requests.
>>
>> Ok, here is a version that defends against invalid sequences of
>> pin/unpin calls.  I had to move dsm_impl_pin_segment into the block
>> protected by DynamicSharedMemoryControlLock, so that it could come
>> after the already-pinned check, but before updating any state, since
>> it makes a Windows syscall that can fail.  That said, I've only tested
>> on Unix and will need to ask someone to test on Windows.
>>
>
> Few review comments:

Thanks for the review!

> 1.
> + /* Note that 1 means no references (0 means unused slot). */
> + if (--dsm_control->item[i].refcnt == 1)
> + destroy = true;
> +
> + /*
> + * Allow implementation-specific code to run.  We have to do this before
> + * releasing the lock, because impl_private_pm_handle may get modified by
> + * dsm_impl_unpin_segment.
> + */
> + if (control_slot >= 0)
> + dsm_impl_unpin_segment(handle,
> + &dsm_control->item[control_slot].impl_private_pm_handle);
>
> If there is an error in dsm_impl_unpin_segment(), then we don't need
> to decrement the reference count.  Isn't it better to do it after the
> dsm_impl_unpin_segment() is successful.  Similarly, I think pinned
> should be set to false after dsm_impl_unpin_segment().

Hmm.  Yeah, OK.  Things are in pretty bad shape if you fail to unpin
despite having run the earlier checks, but you're right, it's better
that way.  New version attached.

> Do you need a check if (control_slot >= 0)?  In the code just above
> you have as Assert to ensure that it is >=0.

Right.  Furthermore, the code was using "i" in some places and
"control_slot" in others.  We might as well use control_slot
everywhere.

On Sun, Aug 21, 2016 at 5:45 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Aug 20, 2016 at 6:13 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Sat, Aug 20, 2016 at 7:37 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>> 2.
>>> + if (dsm_control->item[seg->control_slot].pinned)
>>> + elog(ERROR, "cannot pin a segment that is already pinned");
>>>
>>> Shouldn't this be a user facing error (which means we should use ereport)?
>>
>> Uh, certainly not.  This can't happen because of SQL the user enters;
>> it can only happen because of a C coding error.  elog() is the
>> appropriate tool for that case.
>>
>
> Okay, your point makes sense to me, but in that case why not use an
> Assert here?  I think elog() could also be used here as well, but it
> seems to me elog() is most appropriate for the cases where it is not
> straightforward to tell the behaviour of API or value of variable like
> when PageAddItem() is not successful.

Here's the rationale I'm using: if it's helpful to programmers of
client code, especially client code that might include extensions, and
nowhere near a hot code path, then why not use elog rather than
Assert?  I took inspiration for that from the pre-existing "debugging
cross-check"  in dsm_attach that does elog(ERROR, "can't attach the
same segment more than once").  On that basis, this new version
retains the elog you mentioned, and now also uses elog for the
you-tried-to-unpin-a-handle-I-couldn't-find case.  But I kept Assert
in places that detect bugs in *this* code, rather than client code.

> void
>  dsm_pin_segment(dsm_segment *seg)
>
> +void
> +dsm_unpin_segment(dsm_handle handle)
>
> Another point, I want to highlight here is that pin/unpin API's have
> different type of arguments.  The comments on top of function
> dsm_unpin_segment() explains the need of using dsm_handle which seems
> reasonable.  I just pointed out to see if anybody else has a different
> view.

Now that I have posted the DSA patch[1], it probably makes sense to
explain the path by which I arrived at the conclusion that unpin
should take a handle.  In an earlier version, dsm_unpin_segment took a
dsm_segment *, mirroring the pin function.  But then Robert complained
privately that my dsa_area destroy code path was sometimes having to
attach segments just to unpin them and then detach, which might fail
due to lack of virtual memory.  He was right to complain: that created
a fairly nasty failure mode where I'd unpinned some but not all of the
DSM segments that belong together.  So I concluded that it should be
possible to unpin a segment even when you haven't got it attached, and
rewrote it this way.  In general, any structure built from multiple
DSM segments which point to each other using handles could run into
this problem.  I think the function's comment covers it, but hopefully
this concrete example is convincing.

[1] 
https://www.postgresql.org/message-id/CAEepm%3D1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

dsm-unpin-segment-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dsm_unpin_segment

Reply via email to