Hello all,

I have been digging into the database page layout (specifically the tuples)
to ensure the unsigned integer types were consuming the proper storage.
While digging around, I found one thing surprising:

It appears the heap tuples are padded at the end to a multiple of the MAXALIGN
distance.

Below is my data that I used to come to this conclusion.
(This test was performed on a 64-bit system with --with-blocksize=32).

The goal was to compare data from comparable type sizes.
The first column indicates the type (char, uint1, int2, uint2, int4, and
uint4); the number in parentheses is the number of columns of that type in
the table.

The Length is from the .lp_len field in the ItemId structure.
The Offset is from the .lp_off field in the ItemId structure.
The Size is the difference between consecutive offsets, i.e. the space each
tuple actually occupies on the page.

char (1)                      char (9)
Length  Offset  Size          Length  Offset  Size
    25   32736    32              33   32728    40
    25   32704    32              33   32688    40
    25   32672    32              33   32648    40
    25   32640                    33   32608

uint1 (1)                     uint1 (9)
Length  Offset  Size          Length  Offset  Size
    25   32736    32              33   32728    40
    25   32704    32              33   32688    40
    25   32672    32              33   32648    40
    25   32640                    33   32608

int2 (1)                      int2 (5)
Length  Offset  Size          Length  Offset  Size
    26   32736    32              34   32728    40
    26   32704    32              34   32688    40
    26   32672    32              34   32648    40
    26   32640                    34   32608

uint2 (1)                     uint2 (5)
Length  Offset  Size          Length  Offset  Size
    26   32736    32              34   32728    40
    26   32704    32              34   32688    40
    26   32672    32              34   32648    40
    26   32640                    34   32608

int4 (1)                      int4 (3)
Length  Offset  Size          Length  Offset  Size
    28   32736    32              36   32728    40
    28   32704    32              36   32688    40
    28   32672    32              36   32648    40
    28   32640                    36   32608

uint4 (1)                     uint4 (3)
Length  Offset  Size          Length  Offset  Size
    28   32736    32              36   32728    40
    28   32704    32              36   32688    40
    28   32672    32              36   32648    40
    28   32640                    36   32608
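
The pattern in the tables above is consistent with each tuple length being
rounded up to the next multiple of 8 bytes. A minimal sketch of that rounding
in C, assuming MAXALIGN is 8 bytes on this 64-bit build (the helper names are
made up for illustration and are not PostgreSQL code):

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignment values for the 64-bit test system (not read from a build). */
enum { SKETCH_MAXALIGN = 8, SKETCH_INTALIGN = 4 };

/* Round len up to the next multiple of align; align must be a power of two. */
size_t align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Space a tuple of the given length would occupy if padded to MAXALIGN. */
size_t maxalign_size(size_t len)
{
    return align_up(len, SKETCH_MAXALIGN);
}
```

With these assumptions, maxalign_size(25) is 32 and maxalign_size(33) is 40,
which matches the Size column for the char tables; the int2 and int4 lengths
round the same way.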

From the documentation at:
http://www.postgresql.org/docs/8.3/static/storage-page-layout.html
and from the comments in src/include/access/htup.h, I understand that the user
data must start at an offset (t_hoff) that is a multiple of the MAXALIGN
distance, but I did not find anything suggesting that the heap tuple itself
has this requirement.

After a cursory glance at the HeapTupleHeaderData structure, it appears that
it could be aligned with INTALIGN instead of MAXALIGN. The one member I was
worried about was the 6-byte t_ctid field. The comments in
src/include/storage/itemptr.h indicate that the ItemPointerData structure is
composed of three int16 fields, so everything in the HeapTupleHeaderData
structure is 32 bits or less.
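
To illustrate the t_ctid point, here is a small sketch in C of a 6-byte item
pointer built from three 16-bit fields. The struct names are hypothetical
stand-ins that mirror the layout described in itemptr.h; they are not the real
headers, and the exact size and alignment depend on the compiler and platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the block id and item pointer layouts. */
typedef struct { uint16_t bi_hi; uint16_t bi_lo; } BlockIdSketch;
typedef struct { BlockIdSketch ip_blkid; uint16_t ip_posid; } ItemPointerSketch;

/* Compile-time check: the sketch never needs more than 4-byte alignment. */
static_assert(_Alignof(ItemPointerSketch) <= 4,
              "item pointer sketch needs more than INTALIGN");

/* Size in bytes of the sketch struct (6 on common platforms). */
size_t item_pointer_size(void)
{
    return sizeof(ItemPointerSketch);
}

/* Alignment requirement of the sketch struct (2 on common platforms). */
size_t item_pointer_alignment(void)
{
    return _Alignof(ItemPointerSketch);
}
```

On a typical x86-64 compiler this reports a size of 6 and an alignment of 2,
so a field laid out this way would not by itself force 8-byte alignment.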

I am interested in attempting to generate a patch if this idea appears
feasible. For the data set I am currently working with, it would save over
3GB of disk space. (Back-of-the-envelope calculations indicate that 5% of my
current storage is consumed by this padding. My tuple length is 44 bytes.)
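
The per-tuple saving can be sketched as follows, assuming MAXALIGN is 8 bytes
and INTALIGN is 4 bytes (the function names are hypothetical, and this counts
only the trailing padding, not line pointers or page headers):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of trailing padding needed to bring len up to a multiple of align. */
size_t trailing_padding(size_t len, size_t align)
{
    assert(align > 0);
    return (align - len % align) % align;
}

/* Bytes saved per tuple if lengths were padded to 4 bytes instead of 8. */
size_t bytes_saved_per_tuple(size_t len)
{
    return trailing_padding(len, 8) - trailing_padding(len, 4);
}
```

For a 44-byte tuple, bytes_saved_per_tuple(44) is 4: the tuple occupies 48
bytes when padded to 8 and 44 bytes when padded to 4.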

Thanks,

- Ryan
