[Add Cc: to pgsql-hackers]

From: Zhongpu Chen <[email protected]>
Subject: Re: Proposal: tighten validation for legacy EUC encodings or document that accepted byte sequences may be unconvertible to UTF8
Date: Mon, 11 May 2026 09:56:20 +0800
Message-ID: <ca+1gyqjwpdhocim2wrctffbbtdq2gwivzzikiqfkkmtng5h...@mail.gmail.com>

> I see. The settings may be used in a finer way. For example, `set
> euc-cn-encoding-valiation = 'read_compatible'`.

That would break pg_dumpall. Suppose there is a database populated with `set euc-cn-encoding-valiation = 'native'`.

1. Dump the database cluster using pg_dumpall.
2. Create a new database cluster using initdb.
3. Set euc-cn-encoding-valiation = 'read_compatible' in postgresql.conf.
4. Restore from the dump. This fails because of disallowed EUC_CN characters.

I think encoding properties (including character validation) should belong to the encoding itself rather than to GUC parameters. If you want to have "strict" EUC_CN and "non-strict" EUC_CN at the same time, I think the best way to implement it is to add a new EUC_CN variant encoding.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
