________________________________
From: Tatsuo Ishii <is...@sraoss.co.jp>
Sent: October 6, 2020 2:15
To: t...@sss.pgh.pa.us <t...@sss.pgh.pa.us>
Cc: parker....@outlook.com <parker....@outlook.com>; pgsql-gene...@postgresql.org <pgsql-gene...@postgresql.org>
Subject: Re: Re: May "PostgreSQL server side GB18030 character set support" reconsidered?

>> Hmm ... interesting idea, basically invent our own modified version
>> of GB18030 (or SJIS?) for backend-internal storage. But I'm not
>> sure how to make it work without enlarging the string, which'd defeat
>> the OP's argument. It looks to me like the second-byte code space is
>> already pretty full in both encodings.
>
> But as he already admitted, actually GB18030 is 4 byte encoding, rather
> than 2 bytes. So maybe we could find a way to map original GB18030 to
> ASCII-safe GB18030 using 4 bytes.
>
> As for SJIS, no big demand for the encoding in Japan these days. So I
> think we can leave it as it is.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp

So the key lies in a simple ASCII-safe mapping algorithm for GB18030 (perhaps abbreviated "GB18030as", for GB18030_ascii_safe) that does not break ASCII safety while still saving a lot of storage (the ASCII-safe GB2312 contains the 6763 most frequently used characters).

In fact, it was GBK, designed by Microsoft, that broke ASCII safety around 1995 as Win95 became popular. GB18030 later inherited the problem because it had to stay compatible with GBK.

Thanks. I will try to find out whether the GB18030 standard maintainers' community has any opinions on "a simple ASCII-safe GB18030 mapping algorithm".
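
To make "ASCII-safe" concrete, here is a minimal Python sketch (not PostgreSQL code, and not the proposed "GB18030as" mapping). It assumes Python's built-in `gb2312`, `gbk` and `gb18030` codecs, and the helper name `ascii_range_bytes` is made up for illustration. It lists the bytes below 0x80 that an encoding emits for non-ASCII characters. In GB2312 both bytes of a character have the high bit set, in GBK the second byte may fall in 0x40-0x7E, and in a 4-byte GB18030 sequence the second and fourth bytes are in 0x30-0x39.

```python
def ascii_range_bytes(text: str, encoding: str) -> list:
    """Return the bytes below 0x80 produced when encoding the non-ASCII
    characters of `text`. An empty list means the encoding was ASCII-safe
    for this text. (Hypothetical helper, for illustration only.)"""
    found = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # genuine ASCII characters are expected to map to ASCII bytes
        found.extend(b for b in ch.encode(encoding) if b < 0x80)
    return found


# U+4E2D is part of GB2312, so both of its bytes have the high bit set.
print(ascii_range_bytes("\u4e2d", "gb2312"))

# U+4E02 is one of the characters added by GBK; its second byte is in the ASCII range.
print(ascii_range_bytes("\u4e02", "gbk"))

# U+0080 needs a 4-byte GB18030 sequence; bytes 2 and 4 are ASCII digits.
print(ascii_range_bytes("\u0080", "gb18030"))
```

A backend that scans stored strings byte by byte for characters such as `\` or `'` could mistake those stray ASCII-range bytes for real ASCII, which is why an ASCII-safe remapping would be needed for server-side storage.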