Hi guys,
I have an interesting problem. I am using POI to extract text from a Word document (usually Word 2000/2003), and the document is written in Chinese. When I write the extracted text to a plain-text file, I get garbled characters instead of Chinese. So I want to convert the text to UTF-8. Is there any way to determine the document's charset so I can decode it? In Eclipse, I am calling WordExtractor.getParagraphs(), and if I set a breakpoint I can see the Chinese characters in the debugger. I also noticed that there is a property in HWPFDocument called field_27_cChFtnEdn. Is that possibly what I should be looking at? Thanks.
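
One thing that may be worth checking first: since the Chinese characters look right in the debugger, the Java strings are probably already correct (Java strings are Unicode internally), and the garbling may come from the file writer using the platform default charset. Below is a minimal sketch of writing the text with an explicit UTF-8 encoding. The class name Utf8TextWriter and the method writeParagraphs are hypothetical names made up for illustration, and the hard-coded sample strings stand in for whatever the extractor returns; no POI calls are used here.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8TextWriter {

    // Hypothetical helper: writes each paragraph to the file as UTF-8,
    // regardless of the platform default charset.
    public static void writeParagraphs(String[] paragraphs, Path out) throws IOException {
        try (Writer w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (String paragraph : paragraphs) {
                w.write(paragraph);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the real program these strings would come from the
        // POI extractor; here they are sample values ("\u4f60\u597d" is Chinese text).
        String[] paragraphs = { "\u4f60\u597d\n", "Second paragraph\n" };

        Path out = Files.createTempFile("extracted", ".txt");
        writeParagraphs(paragraphs, out);

        // Read the file back as UTF-8 to check that the text survived the round trip.
        String roundTrip = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(String.join("", paragraphs)));
    }
}
```

If the output file still looks wrong after this, the viewer may be opening it with a different encoding, so it could help to check the editor's encoding setting as well.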