On Thu, 13 Jul 2006, [EMAIL PROTECTED] wrote:
I used WordExtractor to extract text from MS Word documents. The documents have many non-text characters that display as squares, and sometimes as lines. However, most of the text appears clearly. I did hex dumps of the text and found that some squares have the value A0 and some have B7. I tried to remove them using the String method "String replace(char oldChar, char newChar)", but it does not remove them.

That sounds like a string replacement issue rather than a POI issue. My guess is that you're not correctly identifying the codes for the characters. Most good introductory Java books should help you there.
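
For illustration, here is a minimal sketch of one way the replacement could be done. It assumes the hex values A0 and B7 correspond to the characters U+00A0 (non-breaking space) and U+00B7 (middle dot) once the text is in a Java String; that depends on how the dump was produced, so check it against your data. The class and method names are hypothetical and not part of POI. Two things commonly go wrong here: String.replace returns a new String instead of changing the original, and replace(char, char) can only substitute a character, not delete it.

```java
// Hypothetical helper; the class and method names are illustrative, not part of POI.
class WordTextCleaner {

    // Replaces non-breaking spaces (U+00A0) with plain spaces and drops middle dots (U+00B7).
    // Assumption: the A0 and B7 values from the hex dump map to these two characters.
    static String clean(String text) {
        // Strings are immutable: replace() returns a new String, so keep the result.
        String result = text.replace('\u00A0', ' ');
        // replace(char, char) cannot delete a character, so use the CharSequence
        // overload with an empty replacement to remove the middle dots.
        result = result.replace("\u00B7", "");
        return result;
    }

    public static void main(String[] args) {
        // Sample input standing in for text returned by the extractor.
        String extracted = "Hello\u00A0world\u00B7";
        System.out.println(clean(extracted));
    }
}
```

If the squares remain after this, printing each character's code with Integer.toHexString((int) c) would show which values are really in the String.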

Nick

