Does that mean we are good to go and can ship another release? I would love to do that and then update TomEE and Meecrowave with it (tests look good in both).
WDYT? I would start the release train in 2h; I would love to get it out before the weekend.

LieGrue,
strub

> On 28.04.2022 at 08:27, David Blevins <david.blev...@gmail.com> wrote:
>
>> On Apr 26, 2022, at 10:55 AM, David Blevins <david.blev...@gmail.com> wrote:
>>
>> I'd need to check on the character encoding issue you mention. In my mind
>> the original code and current code are trying to create a string of max
>> snippet length. If it doesn't do that, it's a bug.
>
> So I dug into this, and it looks like counting bytes is very flawed and
> counting chars is as good as it gets in Java.
>
> It looks like even with UTF-8 a single character can be anywhere
> from 1 to 4 bytes. The character `ñ` has a string length of 1 but a byte
> length of 2. If you grabbed the first 3 bytes of "mañana" you'd get "ma�..."
>
> If you create a UTF-8 string from a four-byte UTF-8 character you of
> course get 4 bytes in the OutputStream, but you also get a String instance
> that claims to be of length 2, not 1. If you call substring(0,1) on that you
> get an unprintable result.
>
> So we fixed a bug in the switch from OutputStream to Writer. Any remaining
> issues come from counting chars passed to the Writer and are shared by
> java.lang.String, so users should not be surprised if they occasionally see
> a funny character at the end of the snippet.
>
> -David
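
For anyone who wants to reproduce the byte-versus-char behaviour described above, here is a minimal Java sketch. The class name SnippetLengthDemo and the helpers firstBytes and firstChars are hypothetical and not part of the project code; they only mimic cutting a snippet by bytes versus cutting it by chars. The non-ASCII characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;

public class SnippetLengthDemo {

    // Hypothetical helper: decode only the first n bytes of the UTF-8 form of s.
    public static String firstBytes(String s, int n) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, 0, Math.min(n, utf8.length), StandardCharsets.UTF_8);
    }

    // Hypothetical helper: keep only the first n chars (UTF-16 code units) of s.
    public static String firstChars(String s, int n) {
        return s.substring(0, Math.min(n, s.length()));
    }

    public static void main(String[] args) {
        String manana = "ma\u00F1ana"; // "manana" with n-tilde as third character

        // 6 chars, but 7 bytes in UTF-8 because n-tilde needs 2 bytes.
        System.out.println(manana.length());
        System.out.println(manana.getBytes(StandardCharsets.UTF_8).length);

        // Cutting at 3 bytes splits the n-tilde; the decoder substitutes U+FFFD.
        System.out.println(firstBytes(manana, 3).equals("ma\uFFFD"));

        // Cutting at 3 chars keeps the n-tilde intact.
        System.out.println(firstChars(manana, 3).equals("ma\u00F1"));

        // U+1F600 takes 4 bytes in UTF-8 and 2 chars (a surrogate pair) in a String.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(emoji.length());
        System.out.println(emoji.codePointCount(0, emoji.length()));

        // substring(0, 1) leaves a lone high surrogate, which is not printable.
        System.out.println(Character.isHighSurrogate(firstChars(emoji, 1).charAt(0)));
    }
}
```

So cutting by chars can only go wrong for characters outside the Basic Multilingual Plane, whereas cutting by bytes can already break any non-ASCII character.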