On Mon, Jun 21, 2021 at 02:47:16PM +0900, Tatsuo Ishii wrote:
> > I got the parse error after applying the patch:
> > 
> > release-14.sgml:3562: parser error : Input is not proper UTF-8,
> > indicate encoding !
> > Bytes: 0xE9 0x20 0x53 0x61
> >         (Juan Jos Santamara Flecha)
> >                  ^
> > 
> > Is that a problem with my environment?
> 
> Me too. I think the problem is, Bruce's patch is encoded in
> ISO-8859-1, not UTF-8. As far as I know PostgreSQL never encodes
> *.sgml files in ISO-8859-1. Anyway, attached is the Bruce's patch
> encoded in UTF-8. This works for me.
> 
> My guess is, when Bruce attached the file, his MUA automatically
> changed the file encoding from UTF-8 to ISO-8859-1 (it could happen in
> many MUA). Also that's the reason why he does not see the problem
> while compiling the sgml files. In his environment release-14.sgml is
> encoded in UTF-8, I guess. To prevent the problem next time, it's
> better to change the mime type of the attached file to
> Application/Octet-Stream.

Oh, people were testing by building from the attached patch, not from
the git tree.  Yes, I see now the email was switched to a single-byte
encoding, and the attachment header confirms it:

        Content-Type: text/x-diff; charset=iso-8859-1
                                           ----------
        Content-Disposition: attachment; filename="master.diff"
        Content-Transfer-Encoding: 8bit
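
For anyone who wants to check an attachment's declared charset before
building from it, here is a minimal sketch using Python's standard email
package. The three header lines are the ones shown above; the body line
is a made-up placeholder, and this is only an illustration, not part of
the patch or the docs build.

```python
from email import message_from_string

# Headers copied from the attachment part shown above.
# The body is a hypothetical placeholder, not the real diff.
raw_part = (
    "Content-Type: text/x-diff; charset=iso-8859-1\n"
    'Content-Disposition: attachment; filename="master.diff"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "placeholder body\n"
)

part = message_from_string(raw_part)

# get_content_charset() returns the charset parameter, lower-cased.
charset = part.get_content_charset()
filename = part.get_filename()

if charset != "utf-8":
    print(f"{filename}: declared charset is {charset}, not utf-8")
```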

I guess my email program, mutt, is trying to be helpful by using a
single-byte encoding when UTF-8 is not necessary, which makes sense for
plain text.  I will try to remember that this can cause problems with
SGML attachments.
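
To illustrate what the parser is complaining about, here is a rough
sketch in Python using the four bytes from the error message. The helper
name latin1_to_utf8 is hypothetical, and it assumes the input really is
ISO-8859-1, which cannot be detected from the bytes alone.

```python
# Bytes reported by the parser at release-14.sgml:3562.
bad = bytes([0xE9, 0x20, 0x53, 0x61])

# 0xE9 starts a multi-byte sequence in UTF-8, but 0x20 (space) is not a
# continuation byte, so strict UTF-8 decoding fails.
try:
    bad.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# In ISO-8859-1 every byte maps to exactly one character; 0xE9 is e-acute.
text = bad.decode("iso-8859-1")


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8 (hypothetical helper)."""
    return data.decode("iso-8859-1").encode("utf-8")


fixed = latin1_to_utf8(bad)
print(valid_utf8, repr(text), fixed)
```

Applying the same re-encoding to the whole attachment should give a
result equivalent to the UTF-8 version Tatsuo posted.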

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


