https://bz.apache.org/bugzilla/show_bug.cgi?id=58858
Bug ID: 58858
Summary: hidden characters not removed
Product: POI
Version: unspecified
Hardware: PC
Status: NEW
Severity: critical
Priority: P2
Component: HWPF
Assignee: dev@poi.apache.org
Reporter: sebastian.a.agui...@gmail.com

Created attachment 33442
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=33442&action=edit
sample doc file to test

After reading the file and turning it into a String, the hidden characters are not removed. This happens in XWPF as well.

For reading the file I'm using a very simple method:

    File file = new File("file.doc");
    FileInputStream fis;
    fis = new FileInputStream(file);
    HWPFDocument doc = new HWPFDocument(fis);
    WordExtractor ex = new WordExtractor(doc);
    String toReturn = ex.getText();

The same thing happens when using XWPF, with equally simple code:

    XWPFDocument doc = new XWPFDocument(fis);
    XWPFWordExtractor ex = new XWPFWordExtractor(doc);
    String toReturn = ex.getText();

I'm attaching a file you can use as a sample. You can show/hide the hidden characters with Ctrl+Shift+8.

Thanks.
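
To make the expected behavior concrete, here is a minimal sketch of the filtering the report asks for: concatenate the text of each run and skip any run flagged as hidden. The `Run` class and `visibleText` method are hypothetical stand-ins so that the example runs without POI on the classpath. In HWPF the corresponding flag is assumed to be `CharacterRun.isVanished()`; whether iterating runs that way recovers exactly what Word hides in the attached sample has not been verified here.

```java
import java.util.Arrays;
import java.util.List;

class HiddenTextFilter {

    /** Hypothetical stand-in for a run of text plus its "hidden" flag. */
    static final class Run {
        final String text;
        final boolean hidden;

        Run(String text, boolean hidden) {
            this.text = text;
            this.hidden = hidden;
        }
    }

    /** Concatenates the text of every run that is not marked hidden. */
    static String visibleText(List<Run> runs) {
        StringBuilder sb = new StringBuilder();
        for (Run run : runs) {
            if (!run.hidden) {
                sb.append(run.text);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("Visible ", false),
                new Run("secret ", true),   // would be hidden in Word
                new Run("text", false));
        System.out.println(visibleText(runs)); // prints "Visible text"
    }
}
```

With a real document, the same loop would presumably walk the character runs of the document range and test the hidden flag on each one instead of using the stand-in class.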