> On Dec 10, 2018, at 10:30 AM, Stephen Frost <sfr...@snowman.net> wrote:
>
> Greetings,
>
> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
>> On 2018-Dec-10, Paul Ramsey wrote:
>>> Your analysis looks correct to me, I'm pretty sure I had the same reaction,
>>> first time I read through. It would be nice to handle partial decompression
>>> all the way down at this level, but unfortunately the comment at the
>>> Assert() is right: there's no way to know how many of the toasted pieces
>>> need to be read in order to have enough compressed input to create the
>>> desired amount of decompressed output, so there's no choice except to read
>>> the whole compressed thing, even in a slicing context.
>>
>> It'd be useful to have some sort of iterator-style API for detoasting.
>> If you need more data, just call it again. It's more wasteful if you
>> end up retrieving all of the toasted data, but if you just need a
>> fraction it's obviously a win.
>
> I was wondering about that myself. I was looking this area with the
> idea of pushing Paul's patch here:
>
> https://www.postgresql.org/message-id/cacowwr1vbmmje1hdzgdxwx_z5mkypqa1jyw8xxunyjq1mri...@mail.gmail.com
>
> but that's just "give me all the data from the front to X point."
>
> Paul, what do you think about implementing an iterator for decompressing
> PGLZ data, and then using that? For your use-case, it'd be just one
> call since we know how much we want, but for other use-cases (such as
> searching a compressed TOAST item for something), it'd be an actual
> iteration and we could potentially eliminate a lot of work in those
> cases where we just need a boolean yes/no the TOAST'd data matches the
> query.

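To make the iterator idea quoted above concrete, here is a minimal sketch of the shape such an API could take. It is not PostgreSQL code, and several things in it are assumptions: zlib's streaming decompressor stands in for PGLZ (which has no Python binding), and CHUNK_SIZE, CountingChunkSource, iter_decompressed() and starts_with() are hypothetical names invented for the illustration. The point is only that a yes/no test such as a prefix match can stop pulling chunks as soon as the answer is known.

```python
import zlib
from typing import Iterable, Iterator, List

CHUNK_SIZE = 2000  # hypothetical chunk size, chosen only for illustration


def split_chunks(compressed: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Split a compressed value into fixed-size pieces, like stored chunks."""
    return [compressed[i:i + size] for i in range(0, len(compressed), size)]


class CountingChunkSource:
    """Hands out chunks one at a time and counts how many were fetched."""

    def __init__(self, chunks: List[bytes]) -> None:
        self.chunks = chunks
        self.fetched = 0

    def __iter__(self) -> Iterator[bytes]:
        for chunk in self.chunks:
            self.fetched += 1  # stands in for one fetch from the toast table
            yield chunk


def iter_decompressed(source: Iterable[bytes]) -> Iterator[bytes]:
    """Lazily decompress: pull one chunk, yield whatever output it produces."""
    decomp = zlib.decompressobj()  # zlib is a stand-in for PGLZ here
    for chunk in source:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


def starts_with(source: Iterable[bytes], prefix: bytes) -> bool:
    """Answer a prefix test, stopping as soon as the result is decided."""
    seen = b""
    for piece in iter_decompressed(source):
        seen += piece
        # Compare only the overlap of what we have and what we need.
        if seen[:len(prefix)] != prefix[:len(seen)]:
            return False  # mismatch: no need to fetch further chunks
        if len(seen) >= len(prefix):
            return True   # enough output and it matches
    return seen.startswith(prefix)
```

A caller that needs everything can still drain the iterator with b"".join(iter_decompressed(source)), which costs about the same as the fetch-everything path; the win is only for callers that can stop early.
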
I think an iterator on detoast is a precondition to an iterator on decompression, which in turn is a precondition to a workflow that lets functions iterate through toasted objects instead of being restricted to the fetch/slice paradigm.

I was surprised at how *few* built-in functions benefited from my partial-decompression patch, though: just substring() and left() for text/bytea. So I'm not sure how many functions could get a win from a fancy iterator. One common use case I thought *might* get some leverage is LIKE 'foo%', but I shied away from the kind of API mucking that would have been necessary to change that over to use slicing while still supporting all the other ways LIKE is called.

P