Parikshit Phukan created TIKA-3163:
--------------------------------------

             Summary: Java null pointer exception thrown while parsing an xlsx 
file to string even though the xlsx file is working fine in the wps
                 Key: TIKA-3163
                 URL: https://issues.apache.org/jira/browse/TIKA-3163
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.24.1
            Reporter: Parikshit Phukan
         Attachments: CVLKRA-KYC_Download_File_Structure_V3.1.xlsx

I am using Tika to extract text and feed it to my Lucene indexer. Tika throws 
a NullPointerException for one particular xlsx file. Parsing works fine for 
other xlsx files and fails only on this one. I have attached the xlsx file so 
you can reproduce the problem. Kindly help me out. 

Code :-

String path = "D:\\CVLKRA-KYC_Download_File_Structure_V3.1.xlsx";

File file = new File(path); 

System.out.print(tika.parseToString(file));

 

Error :-

Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@54a67a45
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.tika.Tika.parseToString(Tika.java:527)
    at org.apache.tika.Tika.parseToString(Tika.java:642)
    at poc.please.TikaPoc.main(TikaPoc.java:42)
Caused by: java.lang.NullPointerException
    at org.apache.poi.xssf.usermodel.XSSFTableStyle.<init>(XSSFTableStyle.java:64)
    at org.apache.poi.xssf.model.StylesTable.readFrom(StylesTable.java:245)
    at org.apache.poi.xssf.model.StylesTable.<init>(StylesTable.java:138)
    at org.apache.poi.xssf.eventusermodel.XSSFReader.getStylesTable(XSSFReader.java:127)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:143)
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:126)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:210)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    ... 5 more
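
The trace shows a TikaException wrapping a NullPointerException raised inside POI's XSSFTableStyle constructor. As a minimal sketch of how calling code could surface that innermost cause when logging a failed file, the helper below walks the getCause() chain using only the standard library. The class name RootCause and its methods are hypothetical names chosen for illustration; they are not part of Tika or POI, and the exception built in main is only a stand-in for the real failure.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class RootCause {
    /** Follows getCause() links to the innermost exception, guarding against cycles. */
    public static Throwable of(Throwable t) {
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Throwable current = t;
        while (current.getCause() != null && seen.add(current)) {
            current = current.getCause();
        }
        return current;
    }

    /** Describes the root cause and the first frame where it was thrown, if any. */
    public static String describe(Throwable t) {
        Throwable root = of(t);
        StackTraceElement[] frames = root.getStackTrace();
        String where = frames.length > 0 ? frames[0].toString() : "unknown location";
        return root.getClass().getName() + " at " + where;
    }

    public static void main(String[] args) {
        // Stand-in for the wrapped failure in the trace; no Tika or POI classes are used here.
        Exception wrapped = new RuntimeException("Unexpected RuntimeException",
                new NullPointerException());
        System.out.println(describe(wrapped));
    }
}
```

In an indexing loop, a call such as RootCause.describe(e) inside the catch block for the parse call would log the underlying exception type and its top frame next to the file name, which may make it easier to tell which files hit the same underlying problem.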



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
