Hello all, I have been digging into the database page layout (specifically the tuples) to ensure the unsigned integer types were consuming the proper storage. While digging around, I found one thing surprising:
It appears the heap tuples are padded at the end to the MAXALIGN distance. Below is the data I used to come to this conclusion. (This test was performed on a 64-bit system built with --with-blocksize=32, i.e. 32 kB pages.)

The goal was to compare data from comparable type sizes. The heading of each table gives the type (char, uint1, int2, uint2, int4, and uint4), and the number in () is the number of columns in the table. The Length is from the .lp_len field in the ItemId structure. The Offset is from the .lp_off field in the ItemId structure. The Size is the difference between consecutive offsets, so the last row of each table has no Size.

    char (1)                    char (9)
    Length  Offset  Size        Length  Offset  Size
        25   32736    32            33   32728    40
        25   32704    32            33   32688    40
        25   32672    32            33   32648    40
        25   32640                  33   32608

    uint1 (1)                   uint1 (9)
    Length  Offset  Size        Length  Offset  Size
        25   32736    32            33   32728    40
        25   32704    32            33   32688    40
        25   32672    32            33   32648    40
        25   32640                  33   32608

    int2 (1)                    int2 (5)
    Length  Offset  Size        Length  Offset  Size
        26   32736    32            34   32728    40
        26   32704    32            34   32688    40
        26   32672    32            34   32648    40
        26   32640                  34   32608

    uint2 (1)                   uint2 (5)
    Length  Offset  Size        Length  Offset  Size
        26   32736    32            34   32728    40
        26   32704    32            34   32688    40
        26   32672    32            34   32648    40
        26   32640                  34   32608

    int4 (1)                    int4 (3)
    Length  Offset  Size        Length  Offset  Size
        28   32736    32            36   32728    40
        28   32704    32            36   32688    40
        28   32672    32            36   32648    40
        28   32640                  36   32608

    uint4 (1)                   uint4 (3)
    Length  Offset  Size        Length  Offset  Size
        28   32736    32            36   32728    40
        28   32704    32            36   32688    40
        28   32672    32            36   32648    40
        28   32640                  36   32608

From the documentation at:

http://www.postgresql.org/docs/8.3/static/storage-page-layout.html

and from the comments in src/include/access/htup.h, I understand that the offset to the user data (indicated by t_hoff) must be a multiple of the MAXALIGN distance, but I did not find anything suggesting the heap tuple itself had this requirement.

After a cursory glance at the HeapTupleHeaderData structure, it appears it could be aligned with INTALIGN instead of MAXALIGN. The one structure I was worried about was the 6-byte t_ctid structure.
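As a rough illustration of the arithmetic behind the measurements above, the sketch below rounds each observed tuple length up to a multiple of 8 and of 4. It is written in Python purely for the arithmetic and is not PostgreSQL code. The assumptions are that MAXALIGN is 8 bytes and INTALIGN is 4 bytes on this 64-bit build, and the helper name align_up is invented for this example. The 44-byte length is the tuple length mentioned at the end of this message.

```python
# Sketch of the alignment arithmetic only; this is not PostgreSQL code.
# Assumption: MAXALIGN is 8 bytes and INTALIGN is 4 bytes on this 64-bit build.
MAXALIGN = 8
INTALIGN = 4


def align_up(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


# Tuple lengths (lp_len) from the measurements above, plus the 44-byte tuple.
observed_lengths = [25, 26, 28, 33, 34, 36, 44]

for length in observed_lengths:
    max_size = align_up(length, MAXALIGN)   # space used under 8-byte padding
    int_size = align_up(length, INTALIGN)   # space used under 4-byte padding
    print(f"len={length:2d}  MAXALIGN={max_size:2d}  INTALIGN={int_size:2d}  "
          f"saved={max_size - int_size}")
```

Under these assumptions, lengths 25, 26, and 28 round up to 32 and lengths 33, 34, and 36 round up to 40 with 8-byte alignment, which matches the Size columns. With 4-byte alignment the same lengths would occupy 28 and 36 bytes, and a 44-byte tuple would occupy 44 bytes instead of 48.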
The comments in the src/include/storage/itemptr.h file indicate that the ItemPointerData structure (the type of t_ctid) is composed of 3 int16 fields, so everything in the HeapTupleHeaderData structure is 32 bits or less.

I am interested in attempting to generate a patch if this idea appears feasible. For the data set I am currently working with, it would save over 3 GB of disk space. (Back-of-the-envelope calculations indicate that 5% of my current storage is consumed by this padding. My tuple length is 44 bytes.)

Thanks,

- Ryan