[ 
https://issues.apache.org/jira/browse/TIKA-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589660#comment-14589660
 ] 

Nick Burch commented on TIKA-1658:
----------------------------------

It looks like this is caused by an older Visio file with v5 pointers. Apache 
POI (which powers this area of Tika) only supports version 6 at the moment.

Are you willing to spend some time reading file format specs, helping provide 
some sample files with this kind of pointer, and contributing the missing 
functionality?
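
To illustrate the v5 versus v6 pointer limitation described above, here is a 
minimal sketch of a version-dispatching factory. It is not POI's actual code: 
the class PointerFactorySketch, the Pointer fields and the 12-byte layout are 
all hypothetical, chosen only to show why a file with an unimplemented pointer 
version fails as soon as the first pointer is read.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical, simplified stand-in for a version-dispatching pointer factory. */
class PointerFactorySketch {

    /** Minimal pointer record; this field layout is an assumption for illustration. */
    static final class Pointer {
        final int type;
        final int offset;
        final int length;

        Pointer(int type, int offset, int length) {
            this.type = type;
            this.offset = offset;
            this.length = length;
        }
    }

    private final int version;

    PointerFactorySketch(int version) {
        this.version = version;
    }

    /** Reads one pointer starting at 'start', or fails for unimplemented versions. */
    Pointer createPointer(byte[] data, int start) {
        if (version >= 6) {
            // Assumed layout: three little-endian 32-bit integers.
            ByteBuffer buf = ByteBuffer.wrap(data, start, 12).order(ByteOrder.LITTLE_ENDIAN);
            return new Pointer(buf.getInt(), buf.getInt(), buf.getInt());
        }
        if (version == 5) {
            // The branch a v5 file would hit: recognised, but not implemented.
            throw new UnsupportedOperationException("v5 pointers are not implemented in this sketch");
        }
        throw new IllegalArgumentException("Unsupported version: " + version);
    }

    public static void main(String[] args) {
        byte[] data = new byte[12];
        Pointer p = new PointerFactorySketch(6).createPointer(data, 0);
        System.out.println("v6 pointer read, length=" + p.length);
        try {
            new PointerFactorySketch(5).createPointer(data, 0);
        } catch (UnsupportedOperationException e) {
            System.out.println("v5 rejected: " + e.getMessage());
        }
    }
}
```

In a design like this, supporting v5 would mean filling in the version 5 
branch with the correct field layout, which is why sample files and the format 
specs are needed.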

> unable to parse microsoft visio files with tika
> -----------------------------------------------
>
>                 Key: TIKA-1658
>                 URL: https://issues.apache.org/jira/browse/TIKA-1658
>             Project: Tika
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 1.3, 1.4, 1.5, 1.8
>         Environment: ubuntu 14.04 and windows 7
>            Reporter: senthil
>
> Hi,
> When parsing a Microsoft Visio file, Tika throws an exception:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@13d28e3
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>
> Caused by: java.lang.RuntimeException: TODO
>       at org.apache.poi.hdgf.pointers.PointerFactory.createPointer(PointerFactory.java:45)
>       at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:99)
> application/vnd.visio
>       at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:55)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:200)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       ... 4 more
> Please help with a resolution
> regards
> sentil



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
