On Thu, Nov 03, 2005 at 01:49:46PM +0000, Simon Riggs wrote:
> In other databases, CHAR(12) and NUMERIC(12) are fixed length datatypes.
> In PostgreSQL, they are dynamically varying datatypes.

Please explain how a CHAR(12) can store 12 UTF-8 characters when each
character may be 1 to 4 bytes, unless the CHAR itself is variable
length...

> What actually happens is that in many other systems the datatype is the
> same, but additional metadata is provided for that particular attribute.
> So CHAR(12) is a datatype of CHAR with a metadata item called length
> which is set to 12 for that attribute.

We already have this metadata, it's called atttypmod and it's stored in
pg_attribute. That's where the 12 for CHAR(12) is stored BTW.

> On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of
> that datatype having a 4 byte varlena header. In this example, all of
> those instantiations having the varlena header set to 12, so essentially
> wasting the 4 byte header.

Nope, the varlena header stores the actual length on disk. If you store
"hello" in a char(12) field it takes only 9 bytes (4 for the header, 5
for the data), which is less than 12.

Good ideas, but it all hinges on the assumption that CHAR(12) can take a
fixed amount of space, which simply isn't true in a multibyte encoding.

Having a different header for things shorter than 255 bytes has been
discussed before, that's another argument though.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
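
P.S. To make the multibyte point concrete, here is a minimal sketch in
Python (not PostgreSQL code) that counts the UTF-8 bytes needed for four
different 12-character strings. The helper name utf8_size and the sample
strings are invented for illustration. It looks only at the encoded data
bytes and says nothing about headers, padding or alignment.

```python
# Illustration only: how many bytes do 12 characters need in UTF-8?
# utf8_size and the sample strings are hypothetical, chosen for this sketch.

def utf8_size(text: str) -> int:
    """Return the number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# Four strings of exactly 12 characters each, with different byte widths.
samples = {
    "ascii": "a" * 12,           # U+0061, 1 byte per character
    "latin": "\u00e9" * 12,      # U+00E9, 2 bytes per character
    "cjk": "\u4e2d" * 12,        # U+4E2D, 3 bytes per character
    "emoji": "\U0001F600" * 12,  # U+1F600, 4 bytes per character
}

sizes = {name: utf8_size(text) for name, text in samples.items()}

for name, text in samples.items():
    print(f"{name}: {len(text)} characters, {sizes[name]} bytes")
```

The same declared length of 12 characters covers anything from 12 to 48
bytes of data, so the byte length has to be recorded per value.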