Tom Lane wrote:
Because the length specification is in *characters*, which is not by any
means the same as *bytes*.

We could possibly put enough intelligence into the low-level tuple
manipulation routines to count characters in whatever encoding we happen
to be using, but it's a lot faster and more robust to insist on a count
word for every variable-width field.
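For illustration, a minimal C sketch (not PostgreSQL source; names are made up): in a multi-byte encoding such as UTF-8, the character count can only be found by scanning every byte of the datum, while a stored count word is a single load regardless of encoding.

    #include <stddef.h>
    #include <stdint.h>

    /* O(n): count characters by scanning for UTF-8 lead bytes. */
    static size_t utf8_char_count(const char *s, size_t nbytes)
    {
        size_t chars = 0;
        for (size_t i = 0; i < nbytes; i++)
        {
            /* Continuation bytes look like 10xxxxxx; count everything else. */
            if (((unsigned char) s[i] & 0xC0) != 0x80)
                chars++;
        }
        return chars;
    }

    /* O(1): a variable-width field carrying its own count word needs no scan. */
    typedef struct
    {
        int32_t len;    /* count word stored with the datum */
        char    data[]; /* payload bytes follow */
    } varfield;

    static int32_t varfield_len(const varfield *f)
    {
        return f->len;  /* one load, independent of the encoding in use */
    }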

I guess what you're saying is that PostgreSQL stores characters in variable-width encodings. If it stored character data in Unicode (UCS-2) it would always take up two bytes per character. Have you considered supporting NCHAR/NVARCHAR, aka NATIONAL character data? Wouldn't UCS-2 be needed to support multi-locale clusters (as someone was inquiring about recently)?
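For example (a sketch assuming a fixed two-byte UCS-2 encoding, which covers only the Basic Multilingual Plane; full UTF-16 breaks the fixed-width property with surrogate pairs):

    #include <stddef.h>
    #include <stdint.h>

    /* With a fixed-width encoding, the nth character sits at a fixed
     * byte offset (n * 2), so character-based length limits can be
     * checked without scanning the string. */
    static uint16_t ucs2_char_at(const uint16_t *s, size_t n)
    {
        return s[n];    /* direct indexing: offset = n * 2 bytes */
    }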

Joe

