On Tue, Nov 14, 2017 at 3:01 AM, Peter Geoghegan <p...@bowt.ie> wrote: > On Mon, Nov 13, 2017 at 12:25 AM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: >> Commit e2c79e14 prevented multiple cleanup process for pending list in >> GIN index. But I think that there is still possibility that vacuum >> could miss tuples to be deleted if someone else is cleaning up the >> pending list. > > I've been suspicious of that commit (and related commits) for a while > now [1]. I think that it explains a couple of different problem > reports that we have seen.
Yeah, the problem here is that vacuum and analyze don't acquire a heavy weight lock for meta page using properly function. it seems not relevant with that problem. > >> In ginInsertCleanup(), we lock the GIN meta page by LockPage and could >> wait for the concurrent cleaning up process if stats == NULL. And the >> source code comment says that this happen is when ginINsertCleanup is >> called by [auto]vacuum/analyze or gin_clean_pending_list(). I agree >> with this behavior. However, looking at the callers the stats is NULL >> only either if pending list exceeds to threshold during insertions or >> if only analyzing is performed by an autovacum worker or ANALYZE >> command. So I think we should inVacuum = (stats != NULL) instead. >> Also, we might want autoanalyze and ANALYZE command to wait for >> concurrent process as well. Attached patch fixes these two issue. If >> this is a bug we should back-patch to 9.6. > > How did you figure this out? Did you just notice that the code wasn't > doing what it claimed to do, or was there a problem that you saw in > production? > I just noticed it during surveying the GIN code for the another patch[1]. [1] https://commitfest.postgresql.org/15/1133/ Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center