On Wed, May 27, 2015 at 12:21 PM, luigi scarso <[email protected]> wrote:
>
>
> On Wed, May 20, 2015 at 4:15 PM, David Kastrup <[email protected]> wrote:
>
>> Hans Hagen <[email protected]> writes:
>>
>> > (Concerning parsing logs: as the cnf is under user control you cannot
>> > assume that the log lines are the same always, as some users can set
>> > them different; i always did. So log file parsers should be flexible
>> > in this respect.)
>>
>> Standard TeX is the most fun in that respect. It wraps after 79 bytes,
>> never mind whether you are in the middle of a UTF-8 character or not.
>>
>> That's sort of ugly to process with a UTF-8-aware system.
>>
>
> In fact I see different output in pdftex, luatex and xetex:
>
> Hello\message{%
> 1xxxxxxxxxx%
> 2xxxxxxxxx%
> 3xxxxxxxxx%
> 4xxxxxxxxx%
> 5xxxxxxxxx%
> 6xxxxxxxxx%
> 7xxxxxxxxx%
> 8xxxxxx鹿xx%
> 9xxxxxxxxx%
> 10xxxxxxxx%
> 11xxxxxxxx%
> 12xxxxxxxx%
> 13xxxxxxxx%
> 14xxxxxxxx%
> 15xxxxxxxx%
> 16xxxxxxxx%
> }
> \bye
>
> xetex and luatex correctly display 鹿 but luatex has this off-by-one "bug"
> that I still have to catch.
> --
> luigi
>

OK, not a bug.

0) xetex and luatex don't break a UTF-8 sequence (or better, at least
   luatex should not break a UTF-8 sequence in its output);
1) xetex shows 79 (i.e. max_print_line) Unicode characters, encoded as
   UTF-8, so in this case the line containing 鹿 is longer than 79 bytes;
2) luatex always shows at most 79 bytes, so in this case that line is
   shorter.

So an application that expects at most 79 bytes per line is fine with
luatex, as is an application that expects a valid UTF-8 line ending
with "\n".

--
luigi
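To make the three behaviours concrete, here is a minimal Python sketch of the wrapping rules as described in this thread. It is only an illustration, not the engines' actual code: the function names are made up, the width of 79 stands in for max_print_line, and the "flush the line before a character that would not fit" rule in the third function is my assumption about how a byte budget can be kept without splitting a UTF-8 sequence.

```python
from __future__ import annotations


def wrap_bytes(text: str, width: int = 79) -> list[bytes]:
    """Cut every `width` bytes, even in the middle of a UTF-8 sequence."""
    data = text.encode("utf-8")
    return [data[i:i + width] for i in range(0, len(data), width)]


def wrap_chars(text: str, width: int = 79) -> list[bytes]:
    """Cut every `width` Unicode characters; a line may exceed `width` bytes."""
    return [text[i:i + width].encode("utf-8") for i in range(0, len(text), width)]


def wrap_bytes_whole_chars(text: str, width: int = 79) -> list[bytes]:
    """Keep each line within `width` bytes without splitting a character.

    Assumption: a character that would overflow the budget starts a new line.
    """
    lines: list[bytes] = []
    current = b""
    for ch in text:
        enc = ch.encode("utf-8")
        if current and len(current) + len(enc) > width:
            lines.append(current)
            current = b""
        current += enc
    if current:
        lines.append(current)
    return lines


def is_valid_utf8(line: bytes) -> bool:
    """Return True if `line` decodes cleanly as UTF-8."""
    try:
        line.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: the 3-byte character straddles the 79-byte boundary.
sample = "x" * 77 + "鹿" + "x" * 20

for name, wrap in [
    ("bytes", wrap_bytes),
    ("chars", wrap_chars),
    ("bytes, whole chars", wrap_bytes_whole_chars),
]:
    for line in wrap(sample):
        print(f"{name}: {len(line)} bytes, valid UTF-8: {is_valid_utf8(line)}")
```

With this sample, the first rule yields lines that are not valid UTF-8 on their own, the second yields a first line longer than 79 bytes, and the third stays within 79 bytes while keeping every line decodable. A log parser that wants to be safe in all three cases would probably read the log as bytes and rejoin wrapped lines before decoding.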
_______________________________________________
dev-luatex mailing list
[email protected]
http://www.ntg.nl/mailman/listinfo/dev-luatex
