Mihail Nikalayeu <[email protected]> wrote:

> PART 1:
> 
> --------------
> 
> Something still wrong with 0006, check:
> 
> 'pgbench: error: client 12 script 0 aborted in command 2 query 0: ERROR:  
> attempted to overwrite invisible tuple
> https://cirrus-ci.com/task/6385612527239168?logs=test_world#L300
> 
> But it is hard to reproduce - happened once.
> 
> --------------
> 
> Also, once I got
> [16:25:18.641] #   at 
> /tmp/cirrus-ci-build/contrib/amcheck/t/007_repack_concurrently.pl line 57.
> [16:25:18.641] #                   'pgbench: error: client 6 script 0 aborted 
> in command 2 query 0: ERROR:  relation 21856 deleted while still in use
> https://cirrus-ci.com/task/4686014242881536?logs=test_world#L384
> 
> It was the PROC_IN_REPACK version (see below), but I think it is not related 
> to it. But I'm not 100% sure.

I think it *is* related. My earlier patch version, which used the
PROC_IN_VACUUM flag improperly [1] was also causing visibility issues.  Please
let me know if you manage to reproduce the issue with v32.

> PART 2:
> 
> > I'm considering a special kind of relation whose catalog entries remain in 
> > the
> >  catalog cache and are never written to the catalog tables. (Unlike 
> > temporary
> >  relation, it'd be WAL logged so that REPACK can be replayed on standby.)
> 
> I think it is too complicated, especially including replication logic.

I'm confused by hearing a complaint about complexity of code that I haven't
posted yet. And I don't understand the relationship to "replication logic":
REPACK (CONCURRENTLY) tries to avoid decoding of data changes in the *new*
(transient) relation anyway.

> Essentially we have two issues:
> 1) make sure catalog entities are not dropped because the vacuum
> 2) make sure data in new table is not vacuumed also

3) XID assigned early due to creation of catalog entries for the new table -
that XID prevents the VACUUM xmin horizon from advancing till the end of the
transaction, i.e. till the end of REPACK execution.

> Also, I am still not sure if MVCC-safe implementation is worth its complexity 
> compared with "relcheckxmin"approach [0].

IMO it's better for users to see the correct data than ERROR. But it still
needs work.

[1] https://www.postgresql.org/message-id/88003.1769511456%40localhost

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Reply via email to