On Fri, Jul 04, 2025 at 02:38:34PM +0300, Nikita Malakhov wrote:
> Hannu, we'd already made an attempt to extract the TOAST functionality as
> API and make it extensible and usable by other AMs in [1], the patch
> set was met calmly but we still have some hopes on it.

Yeah, it's one of those I have studied, and I just found it
overcomplicated, preventing us from moving on with a simpler proposal,
because I care about two things first:
- More compression methods, with more meta-data, but let's just add
more vartag_external for that once/if they're really required.
- Enlarge optionally to 8-byte values.

So I really want to stress these two points, nothing else for now,
echoing the feedback from 2022 and the fact that all proposals done
after that lacked a simple approach.  IMO, we would live fine enough,
*if* being able to plug in a pluggable TOAST engine makes sense, if we
just limit ourselves to an external interface.  We could allow
backends to load their own vartag_external with their own set of
callbacks like the ones I am proposing here, so that we can translate
from/to a Datum in heap (or a different table AM) to an external
source, with the backend able to understand what this external source
should be.  The key is to define a structure good enough for the
backend (toast_external_data in the patch).

So to answer your and Hannu's question: I had the case of different
table AMs in mind with an interface able to plug into it, yes.  And
yes, I have designed the patch set with this in scope.  Now there's
also a data type component to that, so that's assuming that a table AM
would want to rely on a varlena to store this data externally,
somewhere else that may not be exactly TOAST.  Still, we want an OID
and a value to be able to retrieve this external value, and we want to
store this external OID and this value (+extra like a compression
method and sizes) in a Datum of the main relation file.

FYI, the patch set posted on this thread is not the latest one.
I have a v2, posted on this branch, where I have reordered things:
https://github.com/michaelpq/postgres/tree/toast_64bit_v2

The refactoring to the new toast_external_data with its callbacks is
done first, and the new vartag_external with 8-byte value support is
added on top of that.  There were still two things I wanted to do, and
could not get down to them because I've spent my last week or so
working on others' stuff so I lacked room:
- Evaluate the cost of the transfer layer to toast_external_data.  The
worst case I was planning to work with is non-compressed data stored
in TOAST, then check profiles of the detoasting path by grabbing
slices of the data with pgbench and a read-only query.  The
write/insert path is not going to matter, the detoast is.  The
reordering is actually for this reason: I want to see the effect of
the new interface first, and this needs to happen before we even
consider the option of adding 8-byte values.
- Add a callback for the value ID assignment.  I was hesitating to add
that when I first hacked on the patch but I think that's the correct
design moving forward, with an extra logic to be able to check if an
8-byte value is already in use in a relation, as we do for OID
assignment, but applied to the Toast generator added to the patch.
The backend should decide if a new value is required, we should not
decide the rewrite cases in the callback.

There is a second branch that I use for development, force-pushing to
it periodically, as well:
https://github.com/michaelpq/postgres/tree/toast_64bit

That's much dirtier, always WIP, just one of my playgrounds.
--
Michael
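
To make the shape of the interface discussed above a bit more concrete,
here is a minimal standalone sketch in C.  It is not taken from the
patch or from either branch: ExternalData only stands in for the role
of toast_external_data, and Pointer32, Pointer64, ExternalInfo,
external_info(), assign_value() and value_below() are hypothetical
names invented for the illustration.  The field layouts are
assumptions as well.  It shows two pointer layouts (4-byte and 8-byte
value IDs) translated from/to one common structure through callbacks,
plus a value assignment where the callback only generates candidates
and the caller decides whether a candidate is usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical common form the backend works with, whatever the layout. */
typedef struct ExternalData {
    uint32_t rawsize;     /* original data size */
    uint32_t extsize;     /* size as stored externally */
    uint32_t relid;       /* OID of the relation holding the chunks */
    uint8_t  compression; /* compression method, 0 = none */
    uint64_t value;       /* value ID, wide enough for every layout */
} ExternalData;

/* Two illustrative pointer layouts: 4-byte and 8-byte value IDs. */
typedef struct Pointer32 {
    uint32_t rawsize, extsize, relid, value;
    uint8_t  compression;
} Pointer32;
typedef struct Pointer64 {
    uint32_t rawsize, extsize, relid;
    uint8_t  compression;
    uint64_t value;
} Pointer64;

/* Per-layout callbacks (names are assumptions made for this sketch). */
typedef struct ExternalInfo {
    size_t   pointer_size;
    void     (*to_data)(const void *ptr, ExternalData *out);
    bool     (*from_data)(const ExternalData *in, void *ptr);
    uint64_t (*next_value)(uint64_t *counter); /* candidate generator only */
} ExternalInfo;

static void p32_to_data(const void *ptr, ExternalData *out)
{
    Pointer32 p;
    memcpy(&p, ptr, sizeof(p)); /* ptr may be unaligned */
    *out = (ExternalData){ p.rawsize, p.extsize, p.relid, p.compression, p.value };
}

static bool p32_from_data(const ExternalData *in, void *ptr)
{
    Pointer32 p = { in->rawsize, in->extsize, in->relid,
                    (uint32_t) in->value, in->compression };
    if (in->value > UINT32_MAX)
        return false;           /* does not fit in the 4-byte layout */
    memcpy(ptr, &p, sizeof(p));
    return true;
}

static uint64_t p32_next_value(uint64_t *counter)
{
    *counter = (*counter + 1) & UINT32_MAX; /* wraps around at 32 bits */
    return *counter;
}

static void p64_to_data(const void *ptr, ExternalData *out)
{
    Pointer64 p;
    memcpy(&p, ptr, sizeof(p));
    *out = (ExternalData){ p.rawsize, p.extsize, p.relid, p.compression, p.value };
}

static bool p64_from_data(const ExternalData *in, void *ptr)
{
    Pointer64 p = { in->rawsize, in->extsize, in->relid, in->compression, in->value };
    memcpy(ptr, &p, sizeof(p));
    return true;
}

static uint64_t p64_next_value(uint64_t *counter)
{
    return ++*counter;
}

enum { TAG_VALUE32, TAG_VALUE64, TAG_COUNT };

static const ExternalInfo external_infos[TAG_COUNT] = {
    [TAG_VALUE32] = { sizeof(Pointer32), p32_to_data, p32_from_data, p32_next_value },
    [TAG_VALUE64] = { sizeof(Pointer64), p64_to_data, p64_from_data, p64_next_value },
};

/* Look up the callbacks for a tag, NULL if the tag is unknown. */
const ExternalInfo *external_info(int tag)
{
    return (tag >= 0 && tag < TAG_COUNT) ? &external_infos[tag] : NULL;
}

typedef bool (*ValueInUse)(uint64_t value, void *arg);

/* Caller-side loop: retry until a non-zero candidate is not already in use. */
uint64_t assign_value(const ExternalInfo *info, uint64_t *counter,
                      ValueInUse in_use, void *arg)
{
    uint64_t value;
    do {
        value = info->next_value(counter);
    } while (value == 0 || in_use(value, arg));
    return value;
}

/* Stand-in for an index probe: every value below *arg counts as taken. */
bool value_below(uint64_t value, void *arg)
{
    return value < *(const uint64_t *) arg;
}
```

A caller would fetch the callbacks with external_info(TAG_VALUE64),
fill an ExternalData, and call from_data() on a buffer of at least
pointer_size bytes.  Keeping the retry loop in assign_value() rather
than in next_value() is one way to leave the "is a new value required"
decision with the caller.  Note that the loop in this sketch never
terminates if every candidate is reported as in use.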