Hi,
Yavor Doganov wrote:
GSHTML assumes Latin 1 if not specified.
That's a poor default these days.
True, but I fear it comes from libxml2 itself and/or the old HTML
specification, which is not the primary purpose of libxml2 anyway!
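To illustrate why a Latin 1 default hurts: a minimal Python sketch (not
GSHTML or libxml2 code, and the sample word is my own) of what happens
when UTF-8 bytes are read by a decoder that assumes Latin 1.

```python
# Hypothetical sample text; any accented character shows the effect.
text = "café"

# The file on disk is UTF-8: "é" is stored as the two bytes 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# A parser that assumes Latin 1 turns each byte into its own character.
misread = utf8_bytes.decode("latin-1")

# Decoding with the encoding the file was really written in restores it.
correct = utf8_bytes.decode("utf-8")

print(misread)
print(correct)
```

Latin 1 never fails to decode, which is why the wrong default goes
unnoticed until the text is displayed.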
The issue is that, syntax-wise, help files are HTML-like (e.g. the <br>
tag) but contain many extra structural tags that HTML does not have and
is not intended to have.
So technically XML is more appropriate, extended with some HTML
conveniences.
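A small sketch of the mismatch, using Python's standard XML parser as a
stand-in for a strict XML parser (the tag names and the is_well_formed
helper are made up for illustration, not taken from the .xlp format):

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style void tag: <br> is never closed, so strict XML rejects it.
html_style = "<section><p>first line<br>second line</p></section>"

# XML-style empty element: the same content, but well-formed.
xml_style = "<section><p>first line<br/>second line</p></section>"

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

So a pure XML route would require the HTML conveniences to be written in
their self-closing form.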
Testing different parsers, beyond hunting for bugs, is also useful for
evaluating how one standard or the other works, and may help us clean up
the format by changing it.
tests:
After first removing all the <meta charset="utf-8" /> tags from all the
.xlp files.
Worth noting that if these are retained HelpViewer from SVN trunk
displays nothing.
It works for me. I also added a specific test file which contains
subsections with different encodings.
Do all files give you issues? With which parser(s)?
So no parser is set?
The default is Internal if it's not set.
Actually there was an inconsistency. Richard modified it to be GSHTML
to maintain the existing behaviour (the internal parser was commented
out). Then I accidentally committed a change in one class but not the
other, so I don't know how the preference is set. I have just changed it
to always be Internal, which would be my goal. The other parsers are for
evaluation until a decision is made.
Accented characters are interpreted correctly.
Some unexpected double quotes after some titles.
Not only quotes but opening tags as well, plus missing text, like:
Tags Disponibles
# The entire paragraph after the title is not displayed, just this:
'<b> :
<b>votre texte en
Hmm... does this happen inside legends/boxes? We have a bug under
investigation there that causes string truncation, though only of one or
a couple of characters.
Pure XML does not handle HTML entities, so such characters need to be
written directly in UTF-8, or the entities must not be used at all.
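A minimal sketch of that point, again with Python's standard XML parser
standing in for a strict XML parser (the parse_text helper and the sample
strings are mine, only for illustration):

```python
import xml.etree.ElementTree as ET


def parse_text(markup):
    """Return the root element's text, or None if the parse fails."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


# Named HTML entity: undefined in plain XML without a DTD, so it fails.
with_entity = "<p>caf&eacute;</p>"

# Numeric character reference: part of XML itself, so it parses.
with_numeric = "<p>caf&#233;</p>"

# The character written directly, with the document encoded as UTF-8.
with_utf8 = "<p>café</p>".encode("utf-8")

print(parse_text(with_entity))
print(parse_text(with_numeric))
print(parse_text(with_utf8))
```

Only the five predefined XML entities (such as &amp; and &lt;) are
available without a DTD, so anything like &eacute; has to go.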
Riccardo