On 2015-06-26, Jürgen Spitzmüller wrote:
> 2015-06-26 16:44 GMT+02:00 Guenter Milde <mi...@users.sf.net>:

>> Please don't check for unencodable characters in comments.

> It's still invalid encoding, since the output file contains invalid glyphs
> (no matter if this line is processed by LaTeX or not).

I have a different view on this:

**Invalid** characters can only occur in UTF-8 encoded files,
for example in a file generated by LyX with default settings if
* the document language defaults to utf8
* a second language defaults to an 8-bit encoding:

  %% LyX 2.1.3 created this file.  For more info, see http://www.lyx.org/.
  %% Do not edit unless you really know what you are doing.
  \documentclass[a4paper]{article}
  \usepackage{lmodern}
  \renewcommand{\sfdefault}{lmss}
  \renewcommand{\ttdefault}{lmtt}
  \usepackage[T1]{fontenc}
  \usepackage[latin9,utf8]{inputenc}

  \makeatletter

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
  \special{papersize=\the\paperwidth,\the\paperheight}


  \makeatother

  \usepackage[ngerman,mongolian]{babel}
  \begin{document}
  Test

  \selectlanguage{ngerman}%
  \inputencoding{latin9}%
  Gr��e \selectlanguage{mongolian}%

  \end{document}

Here, the German word Grüße contains two invalid characters if you want to
process/view/edit the whole file as UTF-8.
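
To make the byte-level picture concrete, here is a minimal Python sketch
(not anything LyX does itself). It assumes Python's "iso-8859-15" codec
as the equivalent of inputenc's latin9; the variable names are only for
illustration.

```python
# The German word as it is stored on disk inside the latin9 part of the file.
raw = "Grüße".encode("iso-8859-15")  # ISO 8859-15 is what inputenc calls latin9

# Reading these bytes strictly as UTF-8 fails ...
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# ... a lenient reader (editor, mail archive) shows replacement characters ...
lossy = raw.decode("utf-8", errors="replace")

# ... while decoding with the encoding TeX is told to use gives the word back.
restored = raw.decode("iso-8859-15")

print(valid_utf8, lossy, restored)
```

The bytes are only "invalid" from the point of view of a tool that insists
on one encoding for the whole file.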

With TeX, there is no problem, as the German text part is read using a
different encoding.

Similarly, all text in a comment is unproblematic if the file is
processed by TeX, because comments are not decoded at all.

Just like in a code source file, comments may contain characters that are
not valid in the code.
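
What I am asking for could look roughly like the following Python sketch.
It is purely hypothetical and not LyX's actual export code: the function
names split_comment and unencodable_chars are made up, and it ignores
verbatim environments, where "%" does not start a comment.

```python
def split_comment(line):
    """Split a LaTeX source line into (code, comment) at the first unescaped %."""
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the character after a backslash, e.g. the % in \%
            continue
        if ch == "%":
            return line[:i], line[i:]
        i += 1
    return line, ""


def unencodable_chars(line, encoding):
    """Return the characters outside comments that `encoding` cannot represent."""
    code, _comment = split_comment(line)
    bad = []
    for ch in code:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Ș (U+0218) is not part of ISO 8859-15 (latin9).
print(unencodable_chars("Test % checks the letter Ș", "iso-8859-15"))
print(unencodable_chars("Ș in the text % plain comment", "iso-8859-15"))
```

With such a check, the letter is only reported when it appears in text
that TeX would actually decode.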


> Change the comment and use valid glyphs.

This would mean I have to be very verbose and write 0218 LATIN CAPITAL
LETTER S WITH COMMA BELOW (or some other unambiguous ASCII representation
of the to-be-tested letter Ș) in the LaTeX preamble of my LyX file to
be able to use LyX "unicodesymbols" export conversions.
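
For illustration, this is the kind of ASCII-only description I would have
to write by hand. It is a small Python sketch using the standard
unicodedata module; the helper name ascii_description is made up.

```python
import unicodedata


def ascii_description(ch):
    """Describe a character by hex code point and Unicode name, in ASCII only."""
    return f"{ord(ch):04X} {unicodedata.name(ch)}"


print(ascii_description("Ș"))
```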

In my view, LyX is overly restrictive here and the new feature stands in the
way.

Günter


PS: A glyph is a graphical representation of a character.
    LyX files, TeX files, C++ files and e-mails only contain characters
    (in various character encodings), not glyphs. Glyphs are defined in
    font files (otf, ttf, metafont, ...). Unicode defines code points for
    characters and (as non-binding information) shows sample glyphs for
    printable characters.

