Andres Freund <and...@anarazel.de> writes:
> I guess either using valgrind's gdb server on error, or putting some
> asserts checking the size would be best. I can look into it, but it'll
> not be today likely.

I believe the problem is that DecodeUpdate is not on the same page as the
WAL-writing routines about how much data there is for an old_key_tuple.
Specifically, I see this in 9.4's log_heap_update():

    if (old_key_tuple)
    {
        ...
        xlhdr_idx.t_len = old_key_tuple->t_len;

        rdata[nr].data = (char *) old_key_tuple->t_data
            + offsetof(HeapTupleHeaderData, t_bits);
        rdata[nr].len = old_key_tuple->t_len
            - offsetof(HeapTupleHeaderData, t_bits);
        ...
    }

so that the amount of tuple data that's *actually* in WAL is
offsetof(HeapTupleHeaderData, t_bits) less than what t_len says.
However, over in DecodeUpdate, this is processed with

    xl_heap_header_len xlhdr;

    memcpy(&xlhdr, data, sizeof(xlhdr));
    ...
    datalen = xlhdr.t_len + SizeOfHeapHeader;
    ...
    DecodeXLogTuple(data, datalen, change->data.tp.oldtuple);

and what DecodeXLogTuple does is

    int datalen = len - SizeOfHeapHeader;

(so we're back to datalen == xlhdr.t_len)

    ...
    memcpy(((char *) tuple->tuple.t_data) + offsetof(HeapTupleHeaderData, t_bits),
           data + SizeOfHeapHeader,
           datalen);

so that we are copying offsetof(HeapTupleHeaderData, t_bits) too much
data from the WAL buffer. Most of the time this doesn't hurt, but it's
making valgrind complain, and on an unlucky day we might crash entirely.

I have not looked to see if the bug also exists in > 9.4. Also, it's
not very clear to me whether other call sites for DecodeXLogTuple might
have related bugs.

			regards, tom lane
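
To make the length arithmetic described above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: HDR_OFFSET is an assumed stand-in for offsetof(HeapTupleHeaderData, t_bits), and the function names wal_payload_len, decoder_copy_len and overread_bytes are hypothetical, chosen only for this illustration. It models the two length computations quoted above and reports how many bytes the decoder would read past the data actually present in WAL.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: stand-in for offsetof(HeapTupleHeaderData, t_bits).
 * The real value depends on the PostgreSQL version and platform. */
#define HDR_OFFSET 23

/* Bytes of tuple data the writer puts into WAL, mirroring the quoted
 * rdata[nr].len computation. Caller must pass t_len >= HDR_OFFSET. */
static size_t
wal_payload_len(size_t t_len)
{
    return t_len - HDR_OFFSET;
}

/* Bytes the decoder ends up copying, mirroring the quoted steps:
 * DecodeUpdate adds the header size, DecodeXLogTuple subtracts it
 * again, so the result is simply t_len. */
static size_t
decoder_copy_len(size_t t_len, size_t size_of_heap_header)
{
    size_t datalen = t_len + size_of_heap_header;   /* DecodeUpdate */

    return datalen - size_of_heap_header;           /* DecodeXLogTuple */
}

/* How far the copy runs past the data that was actually logged. */
static size_t
overread_bytes(size_t t_len, size_t size_of_heap_header)
{
    return decoder_copy_len(t_len, size_of_heap_header)
        - wal_payload_len(t_len);
}
```

Under these assumptions, overread_bytes() returns HDR_OFFSET for any t_len and any header size, which matches the claim that the decoder copies offsetof(HeapTupleHeaderData, t_bits) too many bytes regardless of the tuple's size.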