[bug #51330] preconv fails to detect utf-8 without BOM

Bertrand Garrigues Tue, 27 Jun 2017 15:32:23 -0700

URL:
  <http://savannah.gnu.org/bugs/?51330>


                 Summary: preconv fails to detect utf-8 without BOM
                 Project: GNU troff
            Submitted by: bgarrigues
            Submitted on: Tue 27 Jun 2017 10:31:10 PM UTC
                Severity: 3 - Normal
              Item Group: None
                  Status: Confirmed
                 Privacy: Public
             Assigned to: bgarrigues
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None

    _______________________________________________________

Details:

(See also comment #1 from bug #50989)

typesetting.pdf is not generated correctly in the build tree. Its source is a
UTF-8 file without a BOM that contains some characters with French accents,
and the build passes LC_ALL=C. This causes `preconv' to use "latin1" as the
default encoding, which is the expected behaviour according to the `preconv'
man page, so the accented characters are not handled properly.

There are several quick fixes for generating the mom examples:
- Add a BOM to the .mom files.
- Use '-K utf8' instead of just '-k'.
- Add a coding tag (as described in the `preconv' man page) to the .mom files.

However, it seems to me that `preconv' should not rely on the locale to
detect the file encoding.

Would it make sense to use, for example, libmagic (from the `file' utility) to
make preconv correctly detect the input file encoding?
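
To illustrate what locale-independent detection could look like, here is a
rough sketch in Python. It is not how `preconv' or libmagic work; the function
name guess_encoding and the latin-1 fallback are assumptions made for this
example only. The idea is that a BOM is decisive when present, and otherwise
input that decodes strictly as UTF-8 is very unlikely to be latin1 text.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Guess the encoding of raw input bytes without consulting the locale.

    Hypothetical helper for illustration; not part of groff or libmagic.
    """
    # A UTF-8 BOM, when present, settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8,
        # which includes almost any latin1 text containing accented letters.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return fallback
    # Pure ASCII also lands here, which is harmless: ASCII is a subset of UTF-8.
    return "utf-8"
```

With such a check, a BOM-less file containing "é" encoded as UTF-8 would be
reported as utf-8 regardless of LC_ALL, while the same character encoded as
latin1 would fall back to latin-1.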





    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?51330>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/


_______________________________________________
bug-groff mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-groff
