On Wed, 7 Mar 2012, Harry Simons wrote:
When converting a bunch of Microsoft Word documents using the command

   java -jar tika-app-1.1-SNAPSHOT.jar -v -t

I'm getting the following exception:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 487
   at org.apache.poi.hwpf.sprm.SprmOperation.initSize(SprmOperation.java:174)
   at org.apache.poi.hwpf.sprm.SprmOperation.<init>(SprmOperation.java:80)

This looks like a POI bug.


Because these are internal business documents, I may not be able to share them with you guys, so I would greatly appreciate a fix or a workaround.

That's going to make fixing it much trickier. You'll need to raise a POI bug, and be willing to do a lot of the investigating yourself.

It may also be worth running the Binary File Format Validator <http://poi.apache.org/faq.html#faq-N10109> against the file, to check that it's valid and not corrupted.
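
Before running the full validator, a much cruder sanity check is possible: binary .doc files are OLE2 compound files, which always begin with the same 8-byte signature. The sketch below is not part of POI or Tika; the class name Ole2HeaderCheck and its methods are made up for illustration. It only tells you whether a file is an OLE2 container at all. A file that passes can still be damaged deeper inside, which is where SprmOperation is failing, so treat it as a first filter and nothing more.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a POI or Tika class.
public class Ole2HeaderCheck {

    // Signature found in the first 8 bytes of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given bytes start with the OLE2 signature.
    static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first 8 bytes of the file and checks them against the signature.
    static boolean hasOle2Signature(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == header.length && hasOle2Signature(header);
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            Path file = Paths.get(arg);
            String verdict = hasOle2Signature(file) ? "OLE2 signature found" : "NOT an OLE2 file";
            System.out.println(file + ": " + verdict);
        }
    }
}
```

Run it as `java Ole2HeaderCheck first.doc second.doc` (file names here are placeholders). Any file reported as not OLE2 was probably never a binary Word document in the first place, for example an RTF or .docx file saved with a .doc extension.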

Nick
