On Thu, 19 Aug 2010, Thomas P Laford wrote:
Every so often, the line "PowerPointExtractor txex1 = new PowerPointExtractor(pptFile);" causes the message "Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4006" to be printed to the console. What is causing this? The PPT file looks fine when I open it.
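This is caused by the structure of the PowerPoint file not matching what POI expects to find: a TextHeaderAtom record was followed by a record of type 4006 rather than the TextBytesAtom or TextCharsAtom that POI looks for. If you could upload an example file to a new entry in Bugzilla, then someone can take a look and hopefully update POI to handle the structure found in your file.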
We use the code snippet below to extract custom properties from a directory full of PowerPoint files.
If you only want the OLE2 properties, then you don't need the full PowerPointExtractor. HPSFPropertiesExtractor should do you just fine.
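As a rough sketch of that suggestion, something along these lines should read only the OLE2 property streams and never parse the slide records. It assumes a POI 3.x jar on the classpath and binary .ppt files; the class name PptPropertiesDump and the helpers isPptFile and readProperties are made up for illustration and are not part of POI.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;

import org.apache.poi.hpsf.extractor.HPSFPropertiesExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Illustrative only: the class and helper names are assumptions, not POI API.
public class PptPropertiesDump {

    // Hypothetical helper: true for names ending in ".ppt", ignoring case.
    static boolean isPptFile(String name) {
        return name.toLowerCase(Locale.ROOT).endsWith(".ppt");
    }

    // Opens the OLE2 container and returns the property text.
    // The PowerPoint record stream itself is not parsed here.
    static String readProperties(File pptFile) throws IOException {
        InputStream in = new FileInputStream(pptFile);
        try {
            POIFSFileSystem fs = new POIFSFileSystem(in);
            HPSFPropertiesExtractor extractor = new HPSFPropertiesExtractor(fs);
            // getText() covers both the summary and the document summary
            // property sets.
            return extractor.getText();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: PptPropertiesDump <directory>");
            return;
        }
        File[] files = new File(args[0]).listFiles();
        if (files == null) {
            System.err.println("Not a readable directory: " + args[0]);
            return;
        }
        for (File f : files) {
            if (f.isFile() && isPptFile(f.getName())) {
                System.out.println("== " + f.getName());
                System.out.println(readProperties(f));
            }
        }
    }
}
```

Run it with the directory as the only argument. If custom properties are present, they should show up alongside the standard document summary properties, but check the output against one of your own files.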
Nick
