On Wed, May 27, 2015 at 12:21 PM, luigi scarso <[email protected]>
wrote:

>
>
> On Wed, May 20, 2015 at 4:15 PM, David Kastrup <[email protected]> wrote:
>
>> Hans Hagen <[email protected]> writes:
>>
>> > (Concerning parsing logs: as the cnf is under user control you cannot
>> > assume that the log lines are the same always, as some users can set
>> > them different; i always did. So log file parsers should be flexible
>> > in this respect.)
>>
>> Standard TeX is the most fun in that respect.  It wraps after 79 bytes,
>> never mind whether you are in the middle of a UTF-8 character or not.
>>
>> That's sort of ugly to process with a UTF-8-aware system.
>>
>
> In fact, I see different output in pdftex, luatex and xetex:
>
> Hello\message{%
> 1xxxxxxxxxx%
> 2xxxxxxxxx%
> 3xxxxxxxxx%
> 4xxxxxxxxx%
> 5xxxxxxxxx%
> 6xxxxxxxxx%
> 7xxxxxxxxx%
> 8xxxxxx鹿xx%
> 9xxxxxxxxx%
> 10xxxxxxxx%
> 11xxxxxxxx%
> 12xxxxxxxx%
> 13xxxxxxxx%
> 14xxxxxxxx%
> 15xxxxxxxx%
> 16xxxxxxxx%
> }
> \bye
>
> xetex and luatex correctly display 鹿 but luatex has this off-by-one "bug"
> that I still have to catch.
> --
> luigi
>
Ok, not a bug.
0) xetex and luatex don't break a UTF-8 sequence (or rather, at least
luatex should not break a UTF-8 sequence in its output);
1) xetex shows 79 (i.e. max_print_line) Unicode characters, encoded as UTF-8,
so in this case the line with 鹿 is longer than 79 bytes;
2) luatex always shows at most 79 bytes, so in this case that line is
shorter.

So an application that expects at most 79 bytes per line is fine with luatex,
as is an application that expects a valid UTF-8 line ending with "\n".
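
To make the three behaviours concrete, here is a rough Python sketch of the wrapping rules as described in this thread. It is only an illustration, not the engines' actual code: the function names are made up, and the 79 is just the max_print_line value discussed above.

```python
MAX_PRINT_LINE = 79  # the max_print_line value discussed in this thread


def wrap_bytes(text, width=MAX_PRINT_LINE):
    """Break after `width` bytes, even in the middle of a UTF-8 sequence
    (standard TeX, as David describes it)."""
    data = text.encode("utf-8")
    return [data[i:i + width] for i in range(0, len(data), width)]


def wrap_chars(text, width=MAX_PRINT_LINE):
    """Break after `width` Unicode characters (point 1, xetex)."""
    return [text[i:i + width].encode("utf-8")
            for i in range(0, len(text), width)]


def wrap_bytes_whole_chars(text, width=MAX_PRINT_LINE):
    """At most `width` bytes per line, never splitting a character
    (point 2, luatex)."""
    lines, current = [], b""
    for ch in text:
        encoded = ch.encode("utf-8")
        # Start a new line if this character would exceed the byte limit.
        if current and len(current) + len(encoded) > width:
            lines.append(current)
            current = b""
        current += encoded
    if current:
        lines.append(current)
    return lines


# Hypothetical message: 鹿 (3 bytes in UTF-8) sits right at the wrap point.
sample = "x" * 78 + "鹿" + "x" * 10
```

With this sample, the first line from wrap_bytes is 79 bytes and ends in a truncated UTF-8 sequence, the first line from wrap_chars is 81 bytes, and the first line from wrap_bytes_whole_chars is 78 bytes, which matches the "longer" and "shorter" cases in points 1 and 2.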




-- 
luigi
_______________________________________________
dev-luatex mailing list
[email protected]
http://www.ntg.nl/mailman/listinfo/dev-luatex
