I think this is an interesting idea. My first thought is that it would be nice if we didn't have to first compress the data to see whether we wanted to store it compressed or not, but I don't think there is much we can do about that. In any case, we aren't adding much work in the write path compared to the potential savings in the read path, so that is probably okay.
Do you think it is worth making this configurable? I don't think it is outside the realm of possibility for some users to care more about disk space than read performance. And maybe the threshold for this optimization could differ based on the workload. extern void toast_tuple_externalize(ToastTupleContext *ttc, int attribute, int options); +extern void toast_tuple_opt_externalize(ToastTupleContext *ttc, int attribute, + int options, Datum old_toast_value, + ToastAttrInfo *old_toast_attr); Could we bake this into toast_tuple_externalize() so that all existing callers benefit from this optimization? Is there a reason to export both functions? Perhaps toast_tuple_externalize() should be renamed and made static, and then this new function could be called toast_tuple_externalize() (which would be a wrapper around the internal function). + /* Sanity check: if data is not compressed then we can proceed as usual. */ + if (!VARATT_IS_COMPRESSED(DatumGetPointer(*value))) + toast_tuple_externalize(ttc, attribute, options); With a --enable-cassert build, this line causes assertion failures in the call to GetMemoryChunkContext() from pfree(). 'make check' is enough to reproduce it. Specifically, it fails the following assertion: AssertArg(MemoryContextIsValid(context)); I didn't see an existing commitfest entry for this patch. I'd encourage you to add one in the July commitfest: https://commitfest.postgresql.org/38/ -- Nathan Bossart Amazon Web Services: https://aws.amazon.com