Using Tika from trunk produces the same error with the XLS.

When the .XLS is saved as XLSX, another error comes up:

Could not parse document:class java.lang.NoSuchMethodError:org.apache.poi.poifs.filesystem.POIFSFileSystem.hasPOIFSHeader(Ljava/io/InputStream;)Z
java.lang.NoSuchMethodError: org.apache.poi.poifs.filesystem.POIFSFileSystem.hasPOIFSHeader(Ljava/io/InputStream;)Z
    at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:148)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:65)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:67)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:150)
    at metricAv.TikaParser.parse(TikaParser.java:57)
    at metricAv.TikaParser.main(TikaParser.java:39)
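
A NoSuchMethodError like this usually means that classes from different
versions of a library are mixed on the classpath: here ExtractorFactory
calls a method that the loaded POIFSFileSystem class does not have. A minimal
sketch to check this is below. It only uses standard reflection; the class
name PoiClasspathCheck and its two helpers are hypothetical, not part of
Tika or POI, and it has to be run with the same classpath as the failing
program.

```java
import java.io.InputStream;
import java.security.CodeSource;

public class PoiClasspathCheck {

    /** True if the class is loadable and has a public method with this name and parameter types. */
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // Load without running static initializers; we only inspect the class.
            Class<?> c = Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            c.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Jar or directory the class was loaded from, or a placeholder if that is not known. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(class not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack trace above.
        String poifs = "org.apache.poi.poifs.filesystem.POIFSFileSystem";
        String factory = "org.apache.poi.extractor.ExtractorFactory";
        System.out.println("POIFSFileSystem loaded from:  " + locationOf(poifs));
        System.out.println("ExtractorFactory loaded from: " + locationOf(factory));
        System.out.println("hasPOIFSHeader(InputStream) present: "
                + hasMethod(poifs, "hasPOIFSHeader", InputStream.class));
    }
}
```

If the two classes come from jars with different version numbers, or the
last line prints false, the POI jars on the classpath probably do not match
each other.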


I see that a new release of POI (3.7) has been available since 29 Oct.
I would like to build Tika against it, but I am not familiar with Maven.
Could someone explain how to modify the POMs so that Tika uses POI 3.7?

Thanks
Roland





On 11/05/2010 04:49 PM, Nick Burch wrote:
> On Fri, 5 Nov 2010, Roland Cornelissen wrote:
>> Caused by: java.io.IOException: Unable to read entire block; 1 byte
>> read; expected 512 bytes
>>    at
>> org.apache.poi.poifs.storage.RawDataBlock.<init>(RawDataBlock.java:62)
>
> This is normally caused by truncated files. However, it might be worth
> trying with a recent nightly build of tika, as that has a newer POI in
> it, and I can't remember if there have been POIFS fixes since 0.7
>
> Nick
>
