On Tue, Feb 22, 2022 at 08:52:56PM +0000, Gavin Smith wrote:
>
> I've done more tonight but I still have more to do. There will have to
> be some decoding of filenames when they are being put into
> an error message (e.g. "@include: could not find..." and possibly others).
> It would make sense to use the document encoding for this.
>
> That's for the error messages. As for actually finding the files, that's
> a different question. I'll read through what you wrote again and reply
> another time.

I have constructed an example that fails for @image (non_ascii_test_epub
in tests/formatting/list-of-tests), but I'll wait until you have read my
messages and decided which encoding to encode to before writing any code.
I also checked that in an 8-bit locale an @include file with an accent in
its name is not found (because the file name is encoded to UTF-8).

> With the current code, non-ASCII bytes are output incorrectly in the
> filename parts of errors from the XS parser. I intend to fix this
> by replacing the code in tp/Texinfo/XS/parsetexi/errors.c that outputs
> errors as a dump of Perl code that Perl part of the module has to 'eval'.
> Instead, I intend to create the error message data structures more
> directly. This has long been a desideratum for this module.

I committed a temporary 'fix' that encodes to UTF-8 so that the XS and
NonXS parsers give the same result. It should be ok until you do a better
fix with a better interface.

-- 
Pat
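
To make the 8-bit locale failure concrete, here is a minimal sketch in Python (not the Perl or C code actually in texinfo). The file name and the helper decode_for_message are hypothetical and serve only as illustration. It shows that the same name yields different bytes in ISO-8859-1 and in UTF-8, so a lookup that always encodes to UTF-8 cannot match a file created in an 8-bit locale. It also shows how file name bytes might be decoded with the document encoding before being put into an error message.

```python
# Hypothetical file name, not taken from the Texinfo test suite.
name = "caf\u00e9.texi"

# Bytes of the name as created on disk in an ISO-8859-1 (8-bit) locale.
on_disk = name.encode("iso-8859-1")

# Bytes used for the lookup when the name is always encoded to UTF-8.
looked_up = name.encode("utf-8")


def decode_for_message(raw: bytes, document_encoding: str) -> str:
    """Decode file name bytes for display in an error message.

    Assumption: undecodable bytes are replaced rather than raising,
    so that reporting an error can never itself fail.
    """
    return raw.decode(document_encoding, errors="replace")


print(on_disk)               # b'caf\xe9.texi'
print(looked_up)             # b'caf\xc3\xa9.texi'
print(on_disk == looked_up)  # False: the lookup cannot match the file

# Decoding with the matching encoding gives back a readable name.
print(decode_for_message(on_disk, "iso-8859-1"))
```

Whether the bytes used for the lookup should follow the locale, the document encoding, or something else is exactly the open question in this thread. The sketch only shows why the choice matters.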
