Hello,

The issue reported by Ludovic led me to look at how space characters are handled in general in the Perl makeinfo implementation. The \s character class (and \S, which is simply its complement) is used all over, in the Parser and in the converters.

The Parser should not remove any character (except perhaps in very specific circumstances involving user errors, which are not worth looking at). However, its interpretation of spaces matters when constructing the tree, for instance when delineating paragraphs or handling spaces after commands.

Most converters output all the spaces, so input space characters are kept as is. In Plaintext/Info, however, spaces are removed as part of paragraph and line formatting. In addition, lines consisting only of spaces are emptied in @example and the like, and lines consisting only of spaces between paragraphs are completely removed.
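To make the stakes concrete, here is a minimal sketch of the "lines consisting only of spaces are emptied" step, showing how the result depends on which characters count as spaces. It is written in Python for illustration only and is not the Texinfo code. The function names are hypothetical, the narrow class is an assumed ASCII-only set, and Python's str.isspace() stands in for a broad, Unicode-aware class (it is not identical to Perl's \s).

```python
# Assumption: a narrow, ASCII-only space class used for comparison.
NARROW_SPACES = " \t\n\r"


def is_blank(line, broad):
    """Return True if every character of the line counts as a space.

    broad=True uses Python's str.isspace() as a stand-in for a
    Unicode-aware class; broad=False uses only NARROW_SPACES.
    """
    if broad:
        return all(ch.isspace() for ch in line)
    return all(ch in NARROW_SPACES for ch in line)


def empty_blank_lines(text, broad):
    """Replace lines made only of spaces with empty lines."""
    return "\n".join(
        "" if is_blank(line, broad) else line
        for line in text.split("\n")
    )


# Second line: NO-BREAK SPACE followed by EN QUAD.
# Fourth line: an ordinary space followed by a tab.
sample = "first\n\u00a0\u2000\nsecond\n \t\nthird"

narrow_result = empty_blank_lines(sample, broad=False)
broad_result = empty_blank_lines(sample, broad=True)

print(repr(narrow_result))
print(repr(broad_result))
```

With the narrow class, the line holding NO-BREAK SPACE and EN QUAD survives untouched. With the broad class it is emptied like any other blank line, so those characters disappear from the output.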
Now, what is in \s? It turns out that it is not that simple. It is explained in the 'Whitespace' part of http://perldoc.perl.org/perlrecharclass.html#Backslash-sequences. The smallest set is [\t\n\f\r ], which includes the '^L' character (\f). But depending on the setting, there may be additional characters, like '0x2000 EN QUAD'.

I have tested, with @documentencoding utf-8, that all of those appear in the HTML output, but none in Info (except for LINE TABULATION). I attach the file. So all those spaces, except for LINE TABULATION, lose their special meaning in Info.

I think it would be nice to have something that is both sensible and consistent with TeX/LaTeX, with being sensible coming first. It seems that the makeinfo in C explicitly considered spaces to be something along the lines of [\r\n\t ].

So, what should be done? Do something different for parsing, or is it OK for all the space-like characters to be considered as spaces? And for the output? Break words only at [\r\n\t ]? Keep the first space character only if it is not [\r\n]?

As a side note, when trying all the spaces advertised in the Perl documentation without @documentencoding, the result is messed up, certainly because of Unicode. If @documentencoding us-ascii is used, the result is not pretty (though this has not much to do with spaces), and perl complains (rightly), although we may want to catch that to give another error message:

  ascii "\xA0" does not map to Unicode at Texinfo/Parser.pm line 1909, <FH> line 1.

--
Pat
test_spaces.texi
Description: TeXInfo document
