> +static void > +init_toast_buffer(ToastBuffer *buf, int32 size, bool compressed) > +{ > + buf->buf = (const char *) palloc0(size);
This API is weird -- you always palloc the ToastBuffer first, then call init_toast_bufer on it. Why not palloc the ToastBuffer struct in init_toast_buffer and return it from there instead? This is particularly strange since the ToastBuffer itself is freed by the "free" routine ... so it's not like we're thinking that caller can take ownership of the struct by embedding it in a larger struct. Also, this function needs a comment on top explaining what it does and what the params are. Why do we need ToastBuffer->buf_size? Seems unused. > + if (iter == NULL) > + { > + return; > + } Please, no braces around single-statement blocks. (Many places). > +/* > + * If "ctrlc" field in iterator is equal to INVALID_CTRLC, it means that > + * the field is invalid and need to read the control byte from the > + * source buffer in the next iteration, see pglz_decompress_iterate(). > + */ > +#define INVALID_CTRLC 8 What does CTRLC stand for? Also: this comment should explain why the value 8 is what it is. > + /* > + * Now we copy the bytes specified by the tag > from OUTPUT to > + * OUTPUT. It is dangerous and platform > dependent to use > + * memcpy() here, because the copied areas > could overlap > + * extremely! > + */ > + len = Min(len, destend - dp); > + while (len--) > + { > + *dp = dp[-off]; > + dp++; > + } So why not use memmove? > + /* > + * Otherwise it contains the match length minus > 3 and the > + * upper 4 bits of the offset. The next > following byte > + * contains the lower 8 bits of the offset. If > the length is > + * coded as 18, another extension tag byte > tells how much > + * longer the match really was (0-255). > + */ > + int32 len; > + int32 off; > + > + len = (sp[0] & 0x0f) + 3; > + off = ((sp[0] & 0xf0) << 4) | sp[1]; > + sp += 2; > + if (len == 18) > + len += *sp++; Starting this para with "Otherwise" makes no sense, since there's no previous opposite case. Please reword. However, I don't recognize this code from anywhere, and it seems to have a lot of magical numbers. Is this code completely new? Didn't much like FetchDatumIteratorData SnapshotToast struct member name. How about just "snapshot"? > +#define PG_DETOAST_ITERATE(iter, need) > \ > + do { > \ > + Assert(need >= iter->buf->buf && need <= iter->buf->capacity); > \ > + while (!iter->done && need >= iter->buf->limit) { > \ > + detoast_iterate(iter); > \ > + } > \ > + } while (0) This needs parens around each "iter" and "need" in the macro definition. Also, please add a comment documenting what the arguments are, since it's not immediately obvious. > +void free_detoast_iterator(DetoastIterator iter) > +{ > + if (iter == NULL) > + { > + return; > + } If this function is going to do this, why do callers need to check for NULL also? Seems pointless. I'd rather make callers simpler and keep only the NULL-check inside the function, since this is not perf-critical anyway. > +extern void detoast_iterate(DetoastIterator detoast_iter) > +{ Please, no "extern" in function definitions, only in prototypes in the .h files. Also, we indent the function name at the start of line, with the return type appearing on its own in the previous line. > + if (!VARATT_IS_EXTERNAL_ONDISK(attr)) > + elog(ERROR, "create_fetch_datum_itearator shouldn't be called > for non-ondisk datums"); Typo for "iterator". > + iter->fetch_datum_iterator = create_fetch_datum_iterator(attr); > + VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr); > + if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer)) > + { > [...] > + } > + else > + { > + iter->compressed = false; > + > + /* point the buffer directly at the raw data */ > + iter->buf = iter->fetch_datum_iterator->buf; > + } This arrangement where there are two ToastBuffers and they sometimes are the same is cute, but I think we need a better way to know when each needs to be freed afterwards; the proposed coding is confusing. And it certainly it needs more than zero comments about what's going on there. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services