On Thu, Dec 24, 2009 at 5:40 PM, Andrew Dunstan <and...@dunslane.net> wrote:
>> 1) If I set my database and connection encoding to UTF-8, does pg (and
>> future versions of it) guarantee that Unicode code points are stored
>> unmodified? Or could it be that pg does some Unicode
>> normalization/manipulation with them before storing a string, or when
>> retrieving a string?
>>
>> The reason why I'm asking is, I've built a little program that reads
>> in and stores text and explicitly analyzes the text at a later point
>> in time, also regarding things like whether the text is in NFC, NFD or
>> neither. And since I want to store them in the database, it is very
>> important for PG not to fiddle around with the normalization unless my
>> program explicitly told PG to do that.
>
> We don't do any normalization. If the client gives us UTF8 then we store
> exactly what it gives us, and return exactly that.
OK.

>
> (This question is not really a -hackers question. The correct forum is
> pgsql-general. Please make sure you use the correct forum in future.)

Are you sure? The description for -hackers says: "Discussion of current
development issues, problems and bugs, and proposed new features.", which
seems to be exactly where you'd ask my 2nd question, which is still
unanswered.

>>
>> 2) How far is normalization support in PG? When I checked a long time
>> ago, there was no such support. Now that the SQL standard mandates a
>> NORMALIZE function, that may have changed. Any updates?
>>

Kind regards.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
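
For illustration, the kind of NFC/NFD check described in question 1 could be sketched as follows. This is a minimal Python sketch under assumptions, not the actual program from this thread: the function name classify_normalization is hypothetical, and the only library call used is unicodedata.normalize from the Python standard library.

```python
import unicodedata


def classify_normalization(text: str) -> str:
    """Report whether text is in NFC, NFD, both, or neither.

    Hypothetical helper for illustration only. A string is in a given
    normalization form if normalizing it to that form leaves it unchanged.
    """
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "both"      # e.g. plain ASCII text
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"       # mixed composed and decomposed sequences


# U+00E9 is the precomposed "e with acute"; "e" + U+0301 is its decomposed form.
composed = "\u00e9"
decomposed = "e\u0301"
mixed = composed + decomposed
```

Because such a check depends on the exact code point sequence, it only gives stable results if the string read back from the database is identical to the one that was written, which is the guarantee asked about in question 1.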