Hello POI Community, I have written some Java code, very much like the example on the HPSF HOW-TO page <http://poi.apache.org/hpsf/how-to.html>, which reads Microsoft Office document properties (metadata). This code works fine for documents saved with English versions of Microsoft Office. However, when I run it on a document saved with a Chinese version of Office, the code fails: I get an exception whose only message is "GBK". I assume this refers to the GBK character set.
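If the "GBK" message is the text of an UnsupportedEncodingException (an assumption; the exception type has not been confirmed), a minimal sketch like the one below would show whether the JVM in use can decode GBK at all. The names CharsetCheck and isCharsetAvailable are hypothetical and not part of POI or HPSF; the only library call used is java.nio.charset.Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical diagnostic helper; not part of POI or HPSF.
final class CharsetCheck {

    // Returns true if this JVM has a decoder for the named charset.
    // Syntactically illegal charset names are reported as unavailable.
    static boolean isCharsetAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // GBK is the charset named in the exception; the others are for comparison.
        for (String name : new String[] {"UTF-8", "GBK", "GB2312", "Big5"}) {
            System.out.println(name + " supported: " + isCharsetAvailable(name));
        }
    }
}
```

If GBK is reported as unsupported, the failure may come from the Java runtime installation rather than from the document itself, although that remains a guess until the full stack trace is examined.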
Has anyone succeeded in using POI to get property data for UTF-8 documents (especially Chinese)? I have tried Tika and it failed as well. At this point I'm running out of options. Any help would be greatly appreciated.

Thank you,
Ian Kaplan
www.bearcave.com
