Hi.
When I try to extract plain text from a .doc file with
org.apache.poi.hwpf.extractor.WordExtractor, I get text containing rubbish like


\* ARABIC 3
PAGE  


PAGE  7

and other unreadable characters.

Is it possible to filter this out during extraction, or with some additional
POI tools?
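
To make the question concrete, below is a rough sketch of the kind of
post-processing I have in mind. It rests on an assumption I have not verified
against my files: that the rubbish is Word field markup, where a field starts
with the control character 0x13, its instruction (e.g. PAGE) is separated from
its displayed result by 0x14, and the field ends with 0x15. FieldCodeFilter and
stripFieldCodes are names I made up for illustration; they are not part of POI.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not a POI class: drops field instructions
// (the text between 0x13 and 0x14, or 0x13 and 0x15 when there is no result)
// and keeps the field results plus ordinary text.
public class FieldCodeFilter {
    // Assumed Word field delimiters (see the note above the code).
    private static final char FIELD_BEGIN = '\u0013';
    private static final char FIELD_SEPARATOR = '\u0014';
    private static final char FIELD_END = '\u0015';

    public static String stripFieldCodes(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: TRUE while we are still in its instruction.
        Deque<Boolean> inInstruction = new ArrayDeque<>();
        // Number of open fields whose instruction part we are currently inside.
        int hidden = 0;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == FIELD_BEGIN) {
                inInstruction.push(Boolean.TRUE);
                hidden++;
            } else if (c == FIELD_SEPARATOR) {
                // Switch the innermost field from instruction to result.
                if (!inInstruction.isEmpty() && inInstruction.peek()) {
                    inInstruction.pop();
                    inInstruction.push(Boolean.FALSE);
                    hidden--;
                }
            } else if (c == FIELD_END) {
                // Close the innermost field; it may have had no result part.
                if (!inInstruction.isEmpty() && inInstruction.pop()) {
                    hidden--;
                }
            } else if (hidden == 0
                    && (c >= ' ' || c == '\n' || c == '\r' || c == '\t')) {
                // Keep visible text and common whitespace, drop other
                // control characters.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

I would call this on the string returned by WordExtractor.getText(), but I
would much prefer something built into POI if it exists.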

Thanks in advance.