On Tue, 8 Sep 2009, Som Satpathy wrote:
Does Apache POI follow any particular encoding internally while extracting MS Office documents? If so, what is the encoding that POI uses?
POI is written in Java, so it uses native Java strings almost everywhere. These are Unicode.
The Microsoft file formats generally store text as either US-ASCII or UCS-2. The type of the record/block/etc. tells you which it is, so we can turn that into Java (Unicode) strings.
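As a rough illustration of that last step, here is a minimal sketch of turning raw record bytes into a Java string once you know which of the two encodings applies. This is not POI's actual code: the class name `RecordTextSketch`, the method `decode` and the `twoBytesPerChar` flag are all made up for the example, and it assumes the two-byte form is little-endian (decoded here as UTF-16LE, which covers UCS-2).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the POI API.
final class RecordTextSketch {

    /**
     * Decodes the raw text bytes of a record into a Java (Unicode) string.
     *
     * @param raw             the text bytes as stored in the file
     * @param twoBytesPerChar hypothetical flag derived from the record type:
     *                        true for two-byte text, false for one-byte text
     */
    static String decode(byte[] raw, boolean twoBytesPerChar) {
        // Assumption: two-byte text is little-endian, so UTF-16LE is used.
        Charset charset = twoBytesPerChar
                ? StandardCharsets.UTF_16LE
                : StandardCharsets.US_ASCII;
        return new String(raw, charset);
    }
}
```

Whichever branch is taken, the caller gets back an ordinary Java string, so the on-disk encoding does not leak any further into the calling code.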
Nick
