On Mon, Jan 9, 2017 at 11:47 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Jan 9, 2017 at 7:50 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> One idea could be that we have some fixed number of
>> slots (I think we can make it variable as well, but for simplicity,
>> let's consider it as fixed) in the page header which will store the
>> offset to the transaction id inside a TPD entry of the page.  Consider
>> a TPD entry of page contains four transactions, so we will just store
>> enough information in heap page header to reach the transaction id for
>> these four transactions. I think each such page header slot could be
>> three or four bits long depending upon how many concurrent
>> transactions we want to support on a page after which a new
>> transaction has to wait (I think in most workloads supporting
>> simultaneous eight transactions on a page should be sufficient).
>> Then we can have an additional byte (or less than byte) in the tuple
>> header to store lock info which is nothing but an offset to the slot
>> in the page header.   We might find some other locking technique as
>> well, but I think keeping it same as current has benefit.
> Yes, something like this can be done.  You don't really need any new
> page-level header data, because you can get the XIDs from the TPD
> entry (or from the page itself if there's only one).  But you could
> expand the single "is-modified" bit that I've proposed adding to each
> tuple to multiple bits.  0 means not recently modified.  1 means
> modified by the first or only transaction that has recently modified
> the page.  2 means modified by the second transaction that has
> recently modified the page.  Etc.

Makes sense.
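
To make the multi-bit encoding concrete, here is a rough sketch of how a
reader could map a tuple's bits back to an XID.  Everything in it is an
assumption for illustration only: the 3-bit width, the TPDEntrySketch
layout, and the function name are hypothetical and not part of any
proposed patch.

```c
#include <stdint.h>

typedef uint32_t TransactionId;              /* 32-bit XID, as in PostgreSQL */
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical: 3 bits per tuple; value 0 means "not recently modified". */
#define TUPLE_SLOT_BITS  3
#define TUPLE_SLOT_MASK  ((1u << TUPLE_SLOT_BITS) - 1)
#define MAX_PAGE_XACTS   TUPLE_SLOT_MASK     /* values 1..7 are usable */

/* Hypothetical TPD entry: XIDs that have recently modified this page. */
typedef struct TPDEntrySketch
{
    uint8_t       nxacts;                    /* number of valid entries */
    TransactionId xids[MAX_PAGE_XACTS];
} TPDEntrySketch;

/*
 * Return the XID that most recently modified the tuple, or
 * InvalidTransactionId if the tuple is not marked as recently modified
 * (or the stored value points past the valid entries).
 */
static TransactionId
tuple_modifier_xid(const TPDEntrySketch *tpd, uint8_t tuple_bits)
{
    uint8_t slot = tuple_bits & TUPLE_SLOT_MASK;

    if (slot == 0 || slot > tpd->nxacts)
        return InvalidTransactionId;
    return tpd->xids[slot - 1];              /* value N means the Nth XID */
}
```

With 3 bits, at most seven transactions can be distinguished per page;
an eighth would have to wait or force some kind of cleanup, which is the
trade-off discussed above.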

> What I was thinking about doing instead is storing an array in the TPD
> containing the same information.  There would be one byte or one half
> a byte or whatever per TID and it would contain the index of the XID
> in the TPD that had most recently modified or locked that TID.  Your
> solution might be better, though, at least for cases where the number
> of tuples that have modified the page is small.

I think we also need to prevent multiple backends from trying to
reserve the same slot in this array at the same time, which could
become a point of contention.  Another point is that during pruning, if
TIDs change due to row movement, we need to keep this array in sync.
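
To illustrate the second point, below is a minimal sketch of keeping a
per-TID array in sync after pruning.  It assumes one byte per line
pointer, a hypothetical cap of 256 line pointers, and that pruning hands
us an old-to-new offset mapping; none of these names or layouts exist in
the tree.  It also assumes the caller holds an exclusive lock on the
buffer, which sidesteps (but does not solve) the contention question.

```c
#include <stdint.h>
#include <string.h>

#define MAX_OFFSETS 256      /* hypothetical cap on line pointers per page */
#define NO_XACT     0        /* 0 = TID not recently modified or locked */

/* Hypothetical per-TID array stored in the TPD entry. */
typedef struct TPDTidMapSketch
{
    uint8_t xact_index[MAX_OFFSETS];   /* indexed by (offset number - 1) */
} TPDTidMapSketch;

/*
 * Rebuild the array after pruning.  new_offset[i] is the new 1-based
 * offset for the tuple that used to live at offset i + 1, or 0 if that
 * tuple was removed.  Caller is assumed to hold an exclusive buffer lock.
 */
static void
tpd_remap_after_prune(TPDTidMapSketch *map,
                      const uint16_t *new_offset, int noffsets)
{
    uint8_t old[MAX_OFFSETS];
    int     i;

    if (noffsets > MAX_OFFSETS)
        noffsets = MAX_OFFSETS;

    /* Work from a copy so that moves cannot overwrite unread entries. */
    memcpy(old, map->xact_index, sizeof(old));
    memset(map->xact_index, NO_XACT, sizeof(map->xact_index));

    for (i = 0; i < noffsets; i++)
    {
        if (new_offset[i] != 0 && new_offset[i] <= MAX_OFFSETS)
            map->xact_index[new_offset[i] - 1] = old[i];
    }
}
```

Keeping the information in the tuple header instead would avoid this
remapping step, since the bits would move together with the tuple.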

>  However, I'm not
> totally sure.  I think it's important to keep the tuple headers VERY
> small, like 3 bytes.  Or 2 bytes.  Or maybe even variable size but
> only 1 byte in common cases.  So I expect bit space in those places to
> be fairly scarce and precious.

I agree that we should choose the format carefully so as to strike a
balance between performance and space savings.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)