https://bz.apache.org/bugzilla/show_bug.cgi?id=60677
--- Comment #1 from Tim Allison <talli...@mitre.org> ---
This was caused by my change to byte[] from String. For multibyte encodings (e.g. Shift_JIS), things get interesting. The byte length might not equal the String's length(). The x information is stored in dx[], an array parallel to the byte array text[]. dx[] stores the x position at the first byte of a multibyte character, but 0 for the other bytes in that character. We need to map this information to the String offsets:

dx[0] = 13    text[0] = -125
dx[1] = 0     text[1] = 118
dx[2] = 14    text[2] = -125
dx[3] = 0     text[3] = -115

needs to be remapped as:

dxNormed[0] = 13    textString.get(0) = U+30D7
dxNormed[1] = 14    textString.get(1) = U+30ED

I have a patch ready, and I'll apply it once 3.16-beta2 is released.
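For illustration only, here is a minimal sketch of the byte-offset to String-offset remapping described in the comment. It is not the actual POI patch: the class name DxRemap and the method remap are hypothetical, and it assumes that the bytes decode cleanly and that each decoded character re-encodes to the same number of bytes it came from (true for well-formed Shift_JIS input, not for malformed bytes).

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper, not part of POI: maps per-byte dx values onto
// per-char positions of the decoded String.
final class DxRemap {

    static int[] remap(byte[] text, int[] dx, Charset charset) {
        String s = new String(text, charset);
        int[] dxNormed = new int[s.length()];
        int bytePos = 0;
        int charPos = 0;
        while (charPos < s.length() && bytePos < dx.length) {
            int cp = s.codePointAt(charPos);
            // The x position lives at the first byte of the character.
            dxNormed[charPos] = dx[bytePos];
            // Assumption: re-encoding the character yields its original
            // byte length, so we can step through text[] in parallel.
            int byteLen = new String(Character.toChars(cp)).getBytes(charset).length;
            bytePos += byteLen;
            // A supplementary code point occupies two chars; the second
            // char keeps the default value 0.
            charPos += Character.charCount(cp);
        }
        return dxNormed;
    }

    public static void main(String[] args) {
        // Values taken from the comment above.
        byte[] text = {-125, 118, -125, -115};
        int[] dx = {13, 0, 14, 0};
        int[] out = remap(text, dx, Charset.forName("Shift_JIS"));
        System.out.println(Arrays.toString(out)); // prints [13, 14]
    }
}
```

Walking the decoded String and re-encoding each character avoids treating a dx value of 0 as a continuation-byte marker, which would be ambiguous if a character legitimately had an x value of 0.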