Hi,
On 20/07/2020 17:51, Tom Lane wrote:
> Peter Geoghegan <p...@bowt.ie> writes:
>> Skink's latest run reports a failure that I surmise was caused by this patch:
>
> Yeah, I've just been digging through that. The patch didn't create
> the bug, but it allowed valgrind to detect it, because the column
> status array is now "just big enough" rather than being always
> MaxTupleAttributeNumber entries. To wit, the problem is that the
> code in apply_handle_update that computes target_rte->updatedCols
> is junk.
>
> The immediate issue is that it fails to apply the remote-to-local
> column number mapping, so that it's looking at the wrong colstatus
> entries, possibly including entries past the end of the array.
>
> I'm fixing that, but even after that, there's a semantic problem:
> LOGICALREP_COLUMN_UNCHANGED is just a weak optimization, cf the code
> that sends it, in proto.c around line 480. colstatus will often *not*
> be that for columns that were in fact not updated on the remote side.
> I wonder whether we need to take steps to improve that.

LOGICALREP_COLUMN_UNCHANGED is not trying to optimize anything; logical
replication itself certainly makes no effort to avoid sending columns
that were not updated. It's just something we invented to handle the
fact that values of TOASTed columns that were not updated are simply not
visible to logical decoding (unless the table has REPLICA IDENTITY FULL),
as they are neither written to WAL nor accessible via the historic
snapshot. So the output plugin simply does not see the real value.
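
To make the two points above concrete, here is a minimal standalone C
sketch. It is not the worker.c code: the function names, the attrmap
layout (local column index to remote column index, -1 when there is no
remote counterpart), the 32-column limit and the status values are all
assumptions made for illustration. It shows the status array being
indexed through the mapping, with a bounds check, and only the
"unchanged" status excluding a column.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative status codes: names echo the thread, values are assumed. */
enum
{
    COL_NULL = 'n',
    COL_UNCHANGED = 'u',    /* unchanged TOASTed value, real datum not sent */
    COL_TEXT = 't'
};

/*
 * Hypothetical model of the apply side.  attrmap[i] is the remote column
 * index that feeds local column i, or -1 if local column i has no remote
 * counterpart.  colstatus has exactly remote_ncols entries.
 *
 * Returns a bitmask with bit i set when local column i received a value.
 */
static uint32_t
updated_local_cols(const int *attrmap, int local_ncols,
                   const char *colstatus, int remote_ncols)
{
    uint32_t    updated = 0;

    assert(local_ncols <= 32);      /* limit of this sketch's bitmask */
    for (int i = 0; i < local_ncols; i++)
    {
        int         remote = attrmap[i];

        if (remote < 0)
            continue;               /* nothing was sent for this column */
        assert(remote < remote_ncols);  /* never read past the array */
        if (colstatus[remote] != COL_UNCHANGED)
            updated |= (uint32_t) 1 << i;
    }
    return updated;
}

/* Example: local columns (a, b, c, extra) fed by remote columns (c, a, b). */
static uint32_t
example_mask(void)
{
    static const int attrmap[] = {1, 2, 0, -1};
    static const char colstatus[] = {COL_UNCHANGED, COL_TEXT, COL_TEXT};

    return updated_local_cols(attrmap, 4, colstatus, 3);
}
```

In this example only local columns a and b end up in the mask. Note that
the mask is still a superset of what really changed on the remote side:
a column that was sent with its value but not modified is
indistinguishable here from one that was modified.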
--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/