Hi.
I am trying to use the PowerPointExtractor class to extract text from a Chinese
PowerPoint document. I have been able to extract text from other Chinese
PowerPoint documents, but with this particular document I get the following
exception:
The Current User stream must be at least 28 bytes long, but was only 7
not continuing anyways
org.apache.poi.hslf.exceptions.CorruptPowerPointFileException: The Current User stream must be at least 28 bytes long, but was only 7
    at org.apache.poi.hslf.record.CurrentUserAtom.<init>(CurrentUserAtom.java:120)
    at org.apache.poi.hslf.HSLFSlideShow.readCurrentUserStream(HSLFSlideShow.java:276)
    at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:133)
    at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:115)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.<init>(PowerPointExtractor.java:98)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.<init>(PowerPointExtractor.java:91)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.<init>(PowerPointExtractor.java:84)
    at TestPW.main(TestPW.java:15)
I know that this is a valid PowerPoint file, since I am able to open and view
it in Microsoft PowerPoint. This issue is happening with a handful of files.
What could be causing this problem, and how can I overcome it?
Please let me know.
Thanks!
Sana