>>>>> "DK" == David Kastrup <d...@gnu.org> writes:

DK> Sure, it isn't.  But pdfmarks are not encoded in UTF-8.  They are
DK> encoded either in PDFDocEncoding (a subset of Latin-1) or in UTF16BE
DK> with byte order mark.

The error evince reports is about the /Metadata object (20 0 obj),
which *is* XML.  Try something like:

  mupdfshow -b unicode.pdf 20

The first line of the stream is:

  <?xpacket begin='<U+FEFF>' id='W5M0MpCehiHzreSzNTczkc9d'?>

where <U+FEFF> stands for the actual character U+FEFF, encoded in
UTF-8 (the octets EF BB BF).
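
A quick way to confirm that byte sequence, as a minimal Python sketch
(the variable names are mine, not anything taken from the PDF):

```python
# U+FEFF is the character used in the xpacket begin attribute.
bom = "\ufeff"

# Encode it as UTF-8 and show the resulting octets in hex.
utf8_bytes = bom.encode("utf-8")
print(utf8_bytes.hex(" "))  # ef bb bf
```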

At the end of the XML, one finds:

  <rdf:li><E4></rdf:li>

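where the <E4> is a single octet, the ISO 8859-1 encoding of ä (U+00E4).
In UTF-8 that character would be the two octets C3 A4; a lone E4
followed by '<' is not a valid UTF-8 sequence.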

So evince's complaint is correct.
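
For anyone who wants to check this without a PDF tool, here is a rough
Python sketch.  It assumes the two fragments quoted above are
reproduced byte for byte; the function name and the `header` and `item`
variables are hypothetical, and this is not how evince itself
validates the stream.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Assumed reconstruction of the first line of the metadata stream:
# the begin attribute holds U+FEFF encoded as UTF-8.
header = "<?xpacket begin='\ufeff' id='W5M0MpCehiHzreSzNTczkc9d'?>".encode("utf-8")

# Assumed reconstruction of the offending element: a bare 0xE4 octet.
item = b"<rdf:li>\xe4</rdf:li>"

print(first_invalid_utf8_offset(header))  # None
print(first_invalid_utf8_offset(item))    # 8
```

The header decodes cleanly as UTF-8, while the list item fails at the
E4 octet, right after the eight bytes of "<rdf:li>".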

-JimC
-- 
James Cloos <cl...@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
_______________________________________________
evince-list mailing list
evince-list@gnome.org
https://mail.gnome.org/mailman/listinfo/evince-list
