Problems getting summary information with HPSF for Chinese documents

Ian Kaplan Mon, 24 Nov 2008 08:10:28 -0800

  Hello POI Community,

  I have written some Java code very much like the HPSF example on the HPSF
HOW-TO <http://poi.apache.org/hpsf/how-to.html> page which reads Microsoft
Office document property (or metadata) information.  This code works fine
for documents that have been saved with English versions of Microsoft
Office.  However, when I try it on a Microsoft Office document that was
saved with a Chinese version of Office, the code fails.  I get an exception
whose only message is "GBK".  I assume that this refers to the character
set.
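
  An exception whose only message is a charset name is often a
java.io.UnsupportedEncodingException, which would point at the JVM lacking
that charset rather than at the document itself.  That is an assumption here,
since the exception class is not shown above.  A minimal sketch for checking
it, using only the standard library (CharsetCheck and isCharsetAvailable are
hypothetical names, not part of POI):

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of POI: reports whether the running JVM
// can decode text in the named charset.
class CharsetCheck {
    static boolean isCharsetAvailable(String name) {
        try {
            // True only if this JVM ships a decoder for the charset.
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name is not a syntactically legal charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // "GBK" is the name taken from the exception message.
        for (String name : new String[] {"UTF-8", "GBK"}) {
            System.out.println(name + " available: " + isCharsetAvailable(name));
        }
    }
}
```

  If GBK is reported as unavailable, the JRE install may be missing its
extended charsets, and the failure would be independent of HPSF.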


  Has anyone succeeded in using POI to get property data for documents in
non-Latin character sets (especially Chinese)?  I have tried Tika and it
failed as well.  At this
point I'm running out of options.  Any help would be greatly appreciated.

  Thank you,

  Ian Kaplan
  www.bearcave.com
