[ https://issues.apache.org/jira/browse/TIKA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-2014.
------------------------------
    Resolution: Duplicate

> Unable to parse doc file
> ------------------------
>
>                 Key: TIKA-2014
>                 URL: https://issues.apache.org/jira/browse/TIKA-2014
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.12, 1.13
>         Environment: Ubuntu 14.04
>            Reporter: Richa Garg
>            Priority: Critical
>              Labels: maven
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@65a3ca0
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>       at com.headhonchos.dal.operations.TikaParsing.process(TikaParsing.java:59)
> .......
> Caused by: java.lang.UnsupportedOperationException: Non-extended character Pascal strings are not supported right now. Please, contact POI developers for update.
>       at org.apache.poi.hwpf.model.Sttb.fillFields(Sttb.java:82)
>       at org.apache.poi.hwpf.model.Sttb.<init>(Sttb.java:61)
>       at org.apache.poi.hwpf.model.SttbUtils.readSttbSavedBy(SttbUtils.java:52)
>       at org.apache.poi.hwpf.model.SavedByTable.<init>(SavedByTable.java:53)
>       at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:361)
>       at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:81)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:201)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:172)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       ... 34 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
