Re: Non-ASCII characters in @include search path

Patrice Dumas Wed, 23 Feb 2022 06:45:23 -0800

On Tue, Feb 22, 2022 at 08:52:56PM +0000, Gavin Smith wrote:
> 
> I've done more tonight but I still have more to do.  There will have to
> be some decoding of filenames when they are being put into
> an error message (e.g. "@include: could not find..." and possibly others).
> It would make sense to use the document encoding for this.
> 
> That's for the error messages.  As for actually finding the files, that's
> a different question.  I'll read through what you wrote again and reply
> another time.


I have constructed an example that fails for @image (non_ascii_test_epub
in tests/formatting/list-of-tests), but I'll wait for you to read
my messages and decide which encoding file names should be encoded to
before writing any code.

I also checked that in an 8-bit locale an @include file with an accent in
its name is not found (because the file name is encoded to UTF-8).
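
To illustrate the mismatch, here is a minimal Python sketch. It is not
Texinfo code: the file name and the choice of Latin-1 as the 8-bit locale
encoding are assumptions made for illustration only.

```python
# Hypothetical file name containing accented characters.
name = "résumé.texi"

# Bytes a Latin-1 (8-bit) locale would use for the name on disk.
on_disk = name.encode("latin-1")

# Bytes produced when the same name is encoded to UTF-8 before the lookup.
looked_up = name.encode("utf-8")

print(on_disk)
print(looked_up)

# The byte strings differ, so a byte-wise file lookup cannot match.
print(on_disk == looked_up)
```

The two byte strings differ in every accented character, so a lookup using
the UTF-8 bytes cannot match a file whose name was stored in Latin-1.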

> With the current code, non-ASCII bytes are output incorrectly in the
> filename parts of errors from the XS parser.  I intend to fix this
> by replacing the code in tp/Texinfo/XS/parsetexi/errors.c that outputs
> errors as a dump of Perl code that Perl part of the module has to 'eval'.
> Instead, I intend to create the error message data structures more
> directly.  This has long been a desideratum for this module.

I committed a temporary 'fix' that encodes to utf8 so that the XS and
NonXS parsers give the same result. It should be ok until you do a
better fix with a better interface.

-- 
Pat
