https://bz.apache.org/bugzilla/show_bug.cgi?id=63813
--- Comment #3 from Axel Howind <a...@dua3.com> ---
When reading the Word file, text pieces are read by converting `byte[]` to String in `buildInitSB()`. I investigated the raw data passed to that method:

- According to the Unicode table, the "greater than or equal to" sign has the code 0x2265, which I also see in the debugger right before the "good one" bytes.
- Right before "bad one" there's a 0x0028, which in Unicode is the left parenthesis.

So it seems that the error happens at a very low level when reading the byte stream.

-----

Additional findings: LibreOffice doesn't render the symbol in front of "bad one" at all. Pages displays the correct symbol.

-----

Extracting the file on the command line yields:

axel@xiaolong tmp % unzip ../symbol_test.doc
Archive:  ../symbol_test.doc
warning [../symbol_test.doc]:  10574 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  inflating: [Content_Types].xml
  inflating: _rels/.rels
  inflating: theme/theme/themeManager.xml
  inflating: theme/theme/theme1.xml
  inflating: theme/theme/_rels/themeManager.xml.rels

Could it be that the file is corrupt? Compare with a simple test document:

axel@xiaolong tmp % unzip ../Test.docx
Archive:  ../Test.docx
  inflating: [Content_Types].xml
  inflating: _rels/.rels
  inflating: word/_rels/document.xml.rels
  inflating: word/document.xml
  inflating: word/theme/theme1.xml
  inflating: word/settings.xml
  inflating: docProps/core.xml
  inflating: word/fontTable.xml
  inflating: word/webSettings.xml
  inflating: word/styles.xml
  inflating: docProps/app.xml

But since Apple Pages renders it correctly, and you said that you have multiple such documents, maybe I am missing something.

Anyway, I'm out of this one.
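
As an aside on the two code points mentioned at the start of this comment: the sketch below shows, in isolation, what a `byte[]`-to-String conversion yields for two-byte sequences decoded as UTF-16LE. The class name `TextPieceDecodeSketch`, its helper methods, and the sample byte arrays are hypothetical. They are not POI code and not bytes copied from the attached file. That the text piece in question is decoded as UTF-16LE is also an assumption.

```java
import java.nio.charset.StandardCharsets;

public class TextPieceDecodeSketch {

    /** Decodes raw bytes as UTF-16LE (assumed encoding for this illustration). */
    public static String decodeUtf16Le(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    /** Renders every code point of the text as U+XXXX, separated by spaces. */
    public static String describeCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample bytes, little-endian: 65 22 is U+2265, 28 00 is U+0028.
        byte[] beforeGood = {0x65, 0x22};
        byte[] beforeBad = {0x28, 0x00};

        System.out.println(describeCodePoints(decodeUtf16Le(beforeGood))); // U+2265
        System.out.println(describeCodePoints(decodeUtf16Le(beforeBad)));  // U+0028
    }
}
```

If the bytes handed to `buildInitSB()` already contain 28 00 at that position, the String conversion itself would be doing what it should, and the substitution would have to come from an earlier stage or from some other way the symbol is encoded in the file.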