Tom Lane wrote:
> Tom Dunstan <[EMAIL PROTECTED]> writes:
>> On disk, enums will occupy 4 bytes: the high 22 bits will be an enum
>> identifier, with the bottom 10 bits being the enum value. This allows
>> 1024 values for a given enum, and 2^22 different enum types, both of
>> which should be heaps. The exact distribution of bits doesn't matter all
>> that much, we just picked some that we were comfortable with.
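
For reference, the 22/10 split described here can be sketched in a few lines of C. This is only an illustration of the bit layout; the names enum_pack, enum_type_id, enum_value and the ENUM_* macros are made up for this example and are not taken from any patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: low 10 bits hold the value, high 22 bits the enum id. */
#define ENUM_VALUE_BITS   10u
#define ENUM_VALUE_MASK   ((1u << ENUM_VALUE_BITS) - 1u)   /* 0x3FF, 1024 values */
#define ENUM_MAX_TYPE_ID  ((1u << 22) - 1u)                /* 2^22 enum types */

/* Combine a per-type identifier and a per-type value into one 4-byte datum. */
uint32_t enum_pack(uint32_t type_id, uint32_t value)
{
    assert(type_id <= ENUM_MAX_TYPE_ID);
    assert(value <= ENUM_VALUE_MASK);
    return (type_id << ENUM_VALUE_BITS) | value;
}

/* Recover the 22-bit enum identifier. */
uint32_t enum_type_id(uint32_t packed)
{
    return packed >> ENUM_VALUE_BITS;
}

/* Recover the 10-bit enum value. */
uint32_t enum_value(uint32_t packed)
{
    return packed & ENUM_VALUE_MASK;
}
```

The two asserts in enum_pack are where the hard limits of this layout show up: an 1025th value or a type identifier above 2^22 - 1 simply does not fit.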
> I think this is excessive concern for bit-shaving. Make the on-disk
> representation be 8 bytes instead of 4, then you can store the OID
> directly and have no need for the separate identifier concept. This
> in turn eliminates one index, one syscache, and one set of lookup/cache
> routines. And you can have as many values of an enum as you darn please.
That's all true. It's a bit depressing to think that IMO 99% of users of
this will have enum values whose range would fit into 1 byte, but we'll
be using 8 to store it on disk. I had convinced myself that 4 was ok on
the basis that alignment issues in surrounding columns would pad out the
remaining bits anyway much of the time. Was I correct in that
assumption? Would e.g. an int after a char require 3 bytes of padding?
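
The padding question can be checked with a small C sketch. It assumes that a 4-byte integer column is aligned the same way the C compiler aligns int32_t, which is 4 bytes on common platforms. The names align_up, char_then_int and padding_after_char are invented for this illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* A one-byte field followed by a four-byte integer field. */
struct char_then_int
{
    char    c;
    int32_t i;
};

/* Number of padding bytes the compiler inserts between c and i. */
size_t padding_after_char(void)
{
    return offsetof(struct char_then_int, i) - sizeof(char);
}
```

Where int32_t is 4-byte aligned, align_up(1, 4) is 4 and padding_after_char() is 3, so a 1-byte value followed by an int would end up occupying the same space as a 4-byte value followed by an int.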
Ok, I'll run one more idea up the flagpole before giving up on a 4-byte
on-disk representation. :) How about assigning a unique 4-byte id to
each enum value, and storing that on disk? This would be unique across
the database, not per enum type. The structure of pg_enum would be a bit
different, as the per-type enum id would be gone, and there would be
multiple rows for each enum type. The columns would be: the type oid,
the associated unique id and the textual representation. That would
probably simplify the caching mechanism as well, since input function
lookups could do a straight syscache lookup on type oid and text
representation, and the output function could do a straight lookup on
the unique id. No need to muck around creating a little dynahash or
whatever to attach to the fn_extra pointer.
It does still require the extra syscache, but it removes the limitations
on number of enum types and number of values per type while keeping the
on-disk size smallish. I like that better than the original idea, actually.
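
A rough sketch of that pg_enum layout, to make the two lookups concrete. Everything here is hypothetical: EnumRow, enum_rows, enum_in_lookup, enum_out_lookup and the OIDs and ids are invented, a linear scan stands in for the syscache lookups, and 0 is assumed to be an invalid id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row per enum value: owning type, database-wide unique id, label. */
typedef struct
{
    uint32_t    type_oid;
    uint32_t    value_id;
    const char *label;
} EnumRow;

/* Made-up sample contents: two enum types that both have a "red" label. */
static const EnumRow enum_rows[] = {
    {16401, 9001, "red"},
    {16401, 9002, "green"},
    {16402, 9003, "small"},
    {16402, 9004, "red"},
};

#define N_ENUM_ROWS (sizeof(enum_rows) / sizeof(enum_rows[0]))

/* Input side: (type oid, label) -> unique id, or 0 if there is no such value. */
uint32_t enum_in_lookup(uint32_t type_oid, const char *label)
{
    assert(label != NULL);
    for (size_t i = 0; i < N_ENUM_ROWS; i++)
    {
        if (enum_rows[i].type_oid == type_oid &&
            strcmp(enum_rows[i].label, label) == 0)
            return enum_rows[i].value_id;
    }
    return 0;
}

/* Output side: unique id -> label, or NULL if the id is unknown. */
const char *enum_out_lookup(uint32_t value_id)
{
    for (size_t i = 0; i < N_ENUM_ROWS; i++)
    {
        if (enum_rows[i].value_id == value_id)
            return enum_rows[i].label;
    }
    return NULL;
}
```

Note that the output lookup needs nothing but the stored 4-byte id, while the input lookup needs the type oid to tell the two "red" labels apart.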
> If you didn't notice already: typcache is the place to put any
> type-related caching you need to add.
I hadn't. I'll investigate. Thanks.