Ok. After some digging, I've found that cChFtnEdn has to do with footnotes and endnotes (judging by the name, a character count for them), not with character sets.
But I did find what I'm looking for: getChs() and setChs() deal with the default extended character set id for the text stream, and getChsTables() with the default extended character set id for internal data. They correspond to field_11 and field_12, respectively. Is there any documentation on what values these fields take, or how they map to charsets?

Thanks

-----Original Message-----
From: Justin Warren
Sent: Thursday, May 03, 2007 11:56 AM
To: POI Users List
Subject: character encoding and charsets

Hi guys,

I have an interesting problem. I am using POI to extract text from a Word doc (usually Word 2000/2003), but the document is written in Chinese. So, naturally, when I write the extracted text to a plaintext file, I get random ASCII characters. I want to be able to convert the text to UTF-8. Is there any way to determine the charset so I can decode it?

In Eclipse, I am calling WordExtractor.getParagraphs(), and if I set a breakpoint I can see the Chinese characters.

Also, I noticed that there is a property in HWPFDocument called field_27_cChFtnEdn. Is that possibly what I should be looking at?

Thanks
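
A side note on the symptom in the quoted message: since the debugger already shows the Chinese characters, the Java strings are probably decoded correctly, and the garbling may come from writing the file with the platform default encoding. Below is a minimal sketch using only the JDK, assuming that is the cause. The class and method names (Utf8TextWriter, writeUtf8, readUtf8) are made up for illustration and are not part of POI, and the paragraph array stands in for whatever the extractor returns.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a POI class: writes already-decoded Java strings as UTF-8.
class Utf8TextWriter {

    // Write each paragraph to the file, naming UTF-8 explicitly instead of
    // relying on the platform default encoding.
    static void writeUtf8(Path file, String[] paragraphs) throws IOException {
        try (Writer out = new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8)) {
            for (String paragraph : paragraphs) {
                out.write(paragraph);
            }
        }
    }

    // Read the file back, decoding the bytes as UTF-8.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for extracted paragraphs (assumption: they are ordinary Java strings).
        String[] paragraphs = { "\u4e2d\u6587\u6587\u672c\n" };
        Path file = Files.createTempFile("extracted", ".txt");
        writeUtf8(file, paragraphs);
        // Round-trip check: the text read back should equal what was written.
        System.out.println(readUtf8(file).equals(paragraphs[0]));
        Files.delete(file);
    }
}
```

If the output file still looks wrong after this, the viewer may be opening it with a different encoding, or the text may really need decoding based on the chs value, which is the open question above.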