https://bz.apache.org/bugzilla/show_bug.cgi?id=63813
--- Comment #6 from teresa....@linguamatics.com ---
(In reply to Dominik Stadler from comment #5)
> Unfortunately this seems to be caused somewhere deep in the Microsoft DOC
> binary format, the text-bytes that we read from the document-stream in class
> TextPiece already results in ") bad one", so there is no conversion in
> Apache POI as far as I see, but still LibreOffice can display this
> correctly, so it seems there is some additional information stored somewhere
> in the data which Apache POI does not read/interpret yet.
>
> This would need much more knowledge about this format than I can provide,
> sorry, hopefully someone else can come up with a clue why this happens.

Thanks, Dominik.

I downloaded LibreOffice and saved the document as HTML. You're right that
LibreOffice renders this text correctly. Is there any way to mimic this
behaviour in Apache POI?