[HACKERS] Getting rid of cmin and cmax

Heikki Linnakangas Tue, 19 Sep 2006 06:39:09 -0700

We currently use 4 int32s to store xmin, xmax, cmin, cmax and xvac (cmax and xvac share one field) on every heap tuple. That's a lot of overhead, especially on tables with narrow rows. Reduction in header size would give us considerable space and I/O savings.

I'm thinking of removing cmin and cmax, and keeping that information in backend-private memory instead. cmin and cmax are only interesting to the inserting/deleting transaction, so using precious tuple header space for that is a waste. This has been discussed before, and Manfred Koizar even had a patch for 7.4 but didn't submit it because it was incomplete (http://archives.postgresql.org/pgsql-hackers/2005-09/msg00172.php).

BTW: Manfred, do you still have the patch? It'd be interesting to look at, even if it's not finished.

This reduces the tuple header size by 4 bytes, or 8 bytes if we can later get rid of xvac as well.

There are some interesting properties we can exploit in implementing the backend-private storage:

1. in small OLTP transactions that touch few rows, the cmin/cmax information will easily fit in memory.

2. if current commandid == 1, every tuple with xmin == current xid is not visible. Similarly, every tuple with xmax == current xid is visible. So we don't need to store anything for the first command in a transaction.

3. we can forget modifications by command X as soon as there are no live snapshots with curcid <= X.

4. we don't need to record the information if there are no live snapshots and we know that the current command is not going to read existing rows.

These optimizations take care of bulk inserts nicely. In particular, pg_restore and similar applications wouldn't need to keep track of inserted records.
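
To make properties 2 and 3 a bit more concrete, here is a rough sketch in Python rather than backend C. The class and method names (CommandIdStore, record_insert, prune and so on) are invented for illustration and don't correspond to anything in the source tree, and it assumes the first command id is 1 as in the list above. It only covers tuples inserted or deleted by the current transaction; a missing entry is treated as "done by the first command", which is also the right answer for entries that have been pruned.

```python
FIRST_COMMAND_ID = 1  # assumption: command ids start at 1, as in the text


class CommandIdStore:
    """Hypothetical backend-private replacement for on-tuple cmin/cmax."""

    def __init__(self):
        self.cmin = {}  # tuple id -> command id that inserted it
        self.cmax = {}  # tuple id -> command id that deleted it

    def record_insert(self, tid, cid):
        # Property 2: nothing is stored for the first command.
        if cid > FIRST_COMMAND_ID:
            self.cmin[tid] = cid

    def record_delete(self, tid, cid):
        if cid > FIRST_COMMAND_ID:
            self.cmax[tid] = cid

    def insert_visible(self, tid, curcid):
        # Our own insert is visible only to snapshots taken by later commands.
        return self.cmin.get(tid, FIRST_COMMAND_ID) < curcid

    def delete_visible(self, tid, curcid):
        # Our own delete takes effect only for snapshots of later commands.
        return self.cmax.get(tid, FIRST_COMMAND_ID) < curcid

    def prune(self, oldest_live_curcid):
        # Property 3: entries of command X can go once every live
        # snapshot has curcid > X.
        self.cmin = {t: c for t, c in self.cmin.items()
                     if c >= oldest_live_curcid}
        self.cmax = {t: c for t, c in self.cmax.items()
                     if c >= oldest_live_curcid}
```

With this shape, a transaction consisting of a single command never allocates anything, and a long transaction only holds entries for commands that some live snapshot could still tell apart.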

By choosing a clever data structure, we might be able to get away without spilling to disk. For example, a hash table works great for small transactions. If a transaction does bulk modifications, a bitmap per relation could be used. And to optimize for the cases when the information is not actually used, we can just collect the information to an array, and turn it into a more lookup-friendly data structure the first time it's needed.
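
The "collect to an array, convert on first use" idea could look roughly like the following. Again this is only a Python sketch with made-up names (LazyCommandIdLog, record, lookup); a real implementation would live in backend memory contexts and would need a spill-to-disk or bitmap path for large transactions, neither of which is shown here.

```python
class LazyCommandIdLog:
    """Hypothetical append-only log that becomes a hash table on first lookup."""

    def __init__(self):
        self._log = []      # cheap append-only (tuple id, command id) pairs
        self._index = None  # hash table, built only if someone looks

    def record(self, tid, cid):
        if self._index is None:
            # Nobody has looked yet: just append, no hashing cost.
            self._log.append((tid, cid))
        else:
            # Index already exists: keep it up to date directly.
            self._index[tid] = cid

    def lookup(self, tid):
        if self._index is None:
            # First lookup: convert the array into a lookup-friendly form.
            self._index = dict(self._log)
            self._log.clear()
        return self._index.get(tid)
```

A transaction that only writes pays for list appends and nothing else; the conversion cost is paid once, and only by transactions that read back rows they modified.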

Even if we do have to spill to disk on large transactions that do massive updates and selects, it would still be a win for most databases. And it penalizes the transactions that do massive updates, instead of every transaction in the system as we do now.


Comments?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
