On Fri, Jun 10, 2016 at 8:27 AM, Andres Freund <and...@anarazel.de> wrote:
>
> On June 9, 2016 7:46:06 PM PDT, Amit Kapila <amit.kapil...@gmail.com> wrote:
> >On Fri, Jun 10, 2016 at 8:08 AM, Andres Freund <and...@anarazel.de> wrote:
> >
> >> On 2016-06-09 19:33:52 -0700, Andres Freund wrote:
> >> > I played with it for a while, and besides
> >> > finding intentionally caused corruption, it didn't flag anything
> >> > (besides crashing on a standby, as in 2)).
> >>
> >> Ugh. Just seconds after I sent that email:
> >>
> >>        oid        |    t_ctid
> >> ------------------+--------------
> >>  pgbench_accounts | (889641,33)
> >>  pgbench_accounts | (893854,56)
> >>  pgbench_accounts | (924226,13)
> >>  pgbench_accounts | (1073457,51)
> >>  pgbench_accounts | (1084904,16)
> >>  pgbench_accounts | (1111996,26)
> >> (6 rows)
> >>
> >>  oid | t_ctid
> >> -----+--------
> >> (0 rows)
> >>
> >>        oid        |    t_ctid
> >> ------------------+--------------
> >>  pgbench_accounts | (739198,13)
> >>  pgbench_accounts | (887254,11)
> >>  pgbench_accounts | (1050391,6)
> >>  pgbench_accounts | (1158640,46)
> >>  pgbench_accounts | (1238067,18)
> >>  pgbench_accounts | (1273282,22)
> >>  pgbench_accounts | (1355816,54)
> >>  pgbench_accounts | (1361880,33)
> >> (8 rows)
> >>
> >>
> >Is this output of pg_check_visible() or pg_check_frozen()?
>
> Unfortunately I don't know. I was running a union of both, I didn't really
> expect to hit an issue... I guess I'll put a PANIC in the relevant places
> and check whether I can reproduce.
>

I have tried in multiple ways by running pgbench with read-write tests,
but could not see any such behaviour. I even tried crashing and
restarting the server and then running pgbench again. Do you see these
records on master or slave?

While looking at the code in this area, I observed that during replay of
records (heap_xlog_delete), we first clear the vm, then update the page.
So we don't hold the buffer lock while updating the vm, whereas the
patch (collect_corrupt_items()) relies on the fact that clearing a vm
bit requires acquiring the buffer lock. Can that cause a problem?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com