On Thu, 23 Jun 2011, Tom Gross wrote:
> which tika 0.9 can't parse. It fails with:
>
> Caused by: java.lang.NullPointerException
>     at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:47)
>     at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(PAPX.java:136)

This is an Apache POI bug. Can you try with a newer copy of Apache POI?
(If you build Tika from svn it'll pull down a newer one, 3.8 beta 3)
If that doesn't fix it, please file a bug report with POI, but do
test against the latest version first!
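
When retesting, it can help to confirm which POI jar is actually being
loaded, since an older copy elsewhere on the classpath may shadow the new
one. The following is only a minimal sketch using the standard JDK; the
class name WhichJar and the helper locate() are made up for illustration
and are not part of Tika or POI.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Tika or POI): reports which jar or
// directory a class is loaded from, to check which POI copy is in use.
class WhichJar {
    static String locate(String className) {
        try {
            // Look the class up without running its static initialisers.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // JDK core classes typically have no code source.
                return className + " -> (no code source)";
            }
            return className + " -> " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        System.out.println(locate("org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor"));
    }
}
```

Run it with the same classpath as the failing Tika setup; the printed
location should point at the newer POI jar rather than the one bundled
with Tika 0.9.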
Nick