Re: [HACKERS] Support for REINDEX CONCURRENTLY

Michael Paquier Thu, 26 Sep 2013 13:42:25 -0700

On Thu, Sep 26, 2013 at 8:56 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-09-26 20:47:33 +0900, Michael Paquier wrote:
>> On Thu, Sep 26, 2013 at 8:43 PM, Andres Freund <and...@2ndquadrant.com> 
>> wrote:
>> > On 2013-09-26 20:40:40 +0900, Michael Paquier wrote:
>> >> On Thu, Sep 26, 2013 at 7:34 PM, Andres Freund <and...@2ndquadrant.com> 
>> >> wrote:
>> >> > On 2013-09-26 12:13:30 +0900, Michael Paquier wrote:
>> >> >> > 2) I don't think the drop algorithm used now is correct. Your
>> >> >> > index_concurrent_set_dead() sets both indisvalid = false and 
>> >> >> > indislive =
>> >> >> > false at the same time. It does so after doing a 
>> >> >> > WaitForVirtualLocks() -
>> >> >> > but that's not sufficient. Between waiting and setting indisvalid =
>> >> >> > false another transaction could start which then would start using 
>> >> >> > that
>> >> >> > index. Which will not get updated anymore by other concurrent 
>> >> >> > backends
>> >> >> > because of inislive = false.
>> >> >> > You really need to follow index_drop's lead here and first unset
>> >> >> > indisvalid then wait till nobody can use the index for querying 
>> >> >> > anymore
>> >> >> > and only then unset indislive.
>> >> >
>> >> >> Sorry, I do not follow you here. index_concurrent_set_dead calls
>> >> >> index_set_state_flags that sets indislive and *indisready* to false,
>> >> >> not indisvalid. The concurrent index never uses indisvalid = true so
>> >> >> it can never be called by another backend for a read query. The drop
>> >> >> algorithm is made to be consistent with DROP INDEX CONCURRENTLY btw.
>> >> >
>> >> > That makes it even worse... You can do the concurrent drop only in the
>> >> > following steps:
>> >> > 1) set indisvalid = false, no future relcache lookups will have it as 
>> >> > valid
>> >
>> >> indisvalid is never set to true for the concurrent index. Swap is done
>> >> with concurrent index having indisvalid = false and former index with
>> >> indisvalid = true. The concurrent index is validated with
>> >> index_validate in a transaction before swap transaction.
>> >
>> > Yes. I've described how it *has* to be done, not how it's done.
>> >
>> > The current method of going straight to isready = false for the original
>> > index will result in wrong results because it's not updated anymore
>> > while it's still being used.
>
>> The index being dropped at the end of process is not the former index,
>> but the concurrent index. The index used after REINDEX CONCURRENTLY is
>> the old index but with the new relfilenode.
>
> That's not relevant unless I miss something.
>
> After phase 4 both indexes are valid (although only the old one is
> flagged as such), but due to the switching of the relfilenodes backends
> could have either of both open, depending on the time they built the
> relcache entry. Right?
> Then you go ahead and mark the old index - which still might be used! -
> as dead in phase 5. Which means other backends might (again, depending
> on the time they have built the relcache entry) not update it
> anymore. In read committed we very well might go ahead and use the index
> with the same plan as before, but with a new snapshot. Which now will
> miss entries.
In this case, doing a call to WaitForOldSnapshots after the swap phase
is enough. It was included in past versions of the patch but removed
in the last 2 versions.


Btw, taking the problem from another viewpoint... This feature has now
3 patches, the 2 first patches doing only code refactoring. Could it
be possible to have a look at those ones first? Straight-forward
things should go first, simplifying the core feature evaluation.

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Support for REINDEX CONCURRENTLY

Reply via email to