Hi. When I extract plain text from a .doc file with org.apache.poi.hwpf.extractor.WordExtractor, the output contains rubbish such as
\* ARABIC 3 PAGE PAGE 7 and other unreadable characters. Is it possible to suppress this during extraction, or to strip it afterwards with some additional POI tools? Thanks in advance.

--
View this message in context: http://www.nabble.com/Rubbish-in-extracted-text-tp17207175p17207175.html
Sent from the POI - User mailing list archive at Nabble.com.
