Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-10 Thread Wood, Dan
I found one glitch with our merge of the original dup row fix. With that corrected AND Alvaro’s Friday fix, things are solid. No dups. No index corruption. Thanks so much. On 10/10/17, 7:25 PM, "Michael Paquier" wrote: On Tue, Oct 10, 2017 at 11:14 PM, Alvaro

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-09 Thread Wood, Dan
I’m unclear on what is being repro’d in 9.6. Are you getting the duplicate rows problem or just the reindex problem? Are you testing with asserts enabled (I’m not)? If you are getting the dup rows, consider the code in the block in heapam.c that starts with the comment “replace multi by update

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Wood, Dan
On Thu, Oct 5, 2017 at 10:39 AM, Wood, Dan <hexp...@amazon.com> wrote: > Whatever you do, make sure to also test 250 clients running lock.sql. Even with the community’s fix plus YiWen’s fix I can still get duplicate rows. What works for “in-block” HOT chains may not

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-04 Thread Wood, Dan
Whatever you do, make sure to also test 250 clients running lock.sql. Even with the community’s fix plus YiWen’s fix I can still get duplicate rows. What works for “in-block” HOT chains may not work when spanning blocks. Once nearly all 250 clients have done their updates and everybody is

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Wood, Dan
| 49155 | 36963 | 36961 7 | (0,7) | 8032 | 11010 | 32771 | 36961 | 0 (7 rows) On 10/3/17, 6:20 PM, "Peter Geoghegan" <p...@bowt.ie> wrote: On Tue, Oct 3, 2017 at 6:09 PM, Wood, Dan <hexp...@amazon.com> wrote: > I’ve just started looking

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Wood, Dan
I’ve just started looking at this again after a few weeks’ break. There is a tangled web of issues here. With the community fix we get a corrupted page (invalid redirect ptr from indexed item). The cause of that is: pruneheap.c: /* * Check the tuple XMIN