Re: Column lookup in a row performance

Tom Lane Tue, 02 Apr 2019 08:41:38 -0700

Павлухин Иван <[email protected]> writes:
>> (1) Backwards compatibility, and (2) it's not clear that a different
>> layout would be a win for all cases.


> I am curious about (2).  For my own understanding, it would be good
> to find at least one case where a layout with lengths/offsets in a
> header would be significantly worse.  I would be happy if someone
> could elaborate.

It seems like you think the only figure of merit here is how fast
heap_deform_tuple runs.  That's not the case.  There are at least
two issues:

1.  You're not going to be able to do this without making tuples
larger overall in many cases; but more data means more I/O which
means less performance.  I base this objection on the observation
that our existing design allows single-byte length "words" in many
common cases, but it's really hard to see how you could avoid
storing a full-size offset for each column if you want to be able
to access each column in O(1) time without any examination of other
columns.
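
To put rough numbers on point 1, here is a minimal sketch that compares
the two layouts.  It is a simplification, not PostgreSQL code: it assumes
every column is variable-length, ignores alignment padding, null bitmaps
and TOAST, and models the existing format as a 1-byte header for values
of up to 126 data bytes and a 4-byte header otherwise.  The function
names and the offset-array layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the existing format: short values get a 1-byte
 * length header, longer ones a 4-byte header. */
static size_t inline_header_bytes(size_t data_len)
{
    return (data_len <= 126) ? 1 : 4;
}

/* Total data-area size when each datum carries its own length. */
static size_t size_with_inline_lengths(const size_t *lens, size_t ncols)
{
    size_t total = 0;

    for (size_t i = 0; i < ncols; i++)
        total += inline_header_bytes(lens[i]) + lens[i];
    return total;
}

/* Hypothetical alternative: a fixed-width offset per column stored up
 * front, and no per-datum length header at all. */
static size_t size_with_offset_array(const size_t *lens, size_t ncols,
                                     size_t offset_width)
{
    size_t total;

    assert(offset_width == 2 || offset_width == 4);
    total = ncols * offset_width;
    for (size_t i = 0; i < ncols; i++)
        total += lens[i];
    return total;
}
```

Under these assumptions, three short columns of 4, 10 and 20 bytes take
37 bytes with inline lengths, against 40 bytes with 2-byte offsets or
46 bytes with 4-byte offsets.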

2.  Our existing system design has an across-the-board assumption
that each variable-length datum has its length embedded in it,
so that a single pointer carries enough information for any called
function to work with the value.  If you remove the length word
and expect the length to be computed by subtracting two offsets that
are not even physically adjacent to the datum, that stops working.
There is no fix for that that doesn't add performance costs and
complexity.
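
A toy illustration of point 2, again only a sketch under stated
assumptions: ToyDatum is a made-up type with a plain 4-byte length
prefix, not the real varlena encoding, and the offset-array function is
a hypothetical stand-in for the proposed layout.  The point is what the
callee needs to be handed in each case.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical self-describing datum: length first, payload after. */
typedef struct ToyDatum
{
    uint32_t len;
    char     data[];
} ToyDatum;

/* Build a datum from a C string; the caller frees the result. */
static ToyDatum *toy_make(const char *s)
{
    uint32_t  len;
    ToyDatum *d;

    assert(s != NULL);
    len = (uint32_t) strlen(s);
    d = malloc(sizeof(ToyDatum) + len);
    if (d == NULL)
        return NULL;
    d->len = len;
    memcpy(d->data, s, len);
    return d;
}

/* With an embedded length, a single pointer is all the callee needs. */
static uint32_t toy_length(const ToyDatum *d)
{
    return d->len;
}

/* With lengths derived from offsets, a pointer to the payload is not
 * enough: the callee also needs the tuple's offset array and the
 * column number.  end_offsets[i] is where column i ends. */
static uint32_t offset_layout_length(const uint32_t *end_offsets,
                                     size_t col)
{
    uint32_t start = (col == 0) ? 0 : end_offsets[col - 1];

    return end_offsets[col] - start;
}
```

Every function that today accepts one pointer would have to grow the
extra arguments of the second form, or the caller would have to rebuild
a length-prefixed copy before each call.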

Practically speaking, even if we were willing to lose on-disk database
compatibility, point 2 breaks so many internal and extension APIs that
there's no chance whatever that we could remove the length-word datum
headers.  That means that the added fields in tuple headers would be
pure added space with no offsetting savings in the data size, making
point 1 quite a lot worse.

                        regards, tom lane
