On Wed, 2023-10-04 at 14:14 -0400, Isaac Morland wrote:
> Always store only UTF-8 in the database

What problem does that solve? I don't see our encoding support as a big source of problems, given that database-wide UTF-8 already works fine. In fact, some postgres features only work with UTF-8.

I agree that we shouldn't add a bunch of bookkeeping and type system support for per-column encodings without a clear use case, because that would have a cost. But right now it's just a database-wide thing.

I don't see encodings as a major area to solve problems or innovate. At the end of the day, encodings have little semantic significance, and therefore limited upside and limited downside. Collations and normalization get more interesting, but those are happening at a higher layer than the encoding.

> What about characters not in UTF-8?

Honestly I'm not clear on this topic. Are the "private use" areas in unicode enough to cover use cases for characters not recognized by unicode? Which encodings in postgres can represent characters that can't be automatically transcoded (without failure) to unicode?

Obviously if we have some kind of unicode-based type, it would only work with encodings that are a subset of unicode.

Regards,
Jeff Davis