[ https://issues.apache.org/jira/browse/TIKA-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589660#comment-14589660 ]
Nick Burch commented on TIKA-1658:
----------------------------------
It looks like this is caused by an older Visio file that uses v5 pointers.
Apache POI (which powers this area of Tika) only supports version 6 at the
moment.

Are you willing to spend some time reading file format specs, helping provide
some sample files with this kind of pointer, and contributing the missing
functionality?
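
To illustrate the failure mode, here is a minimal sketch, assuming a factory
that picks a pointer layout based on the file's format version. The class and
method names below are hypothetical and are not POI's actual source; the real
code path is the PointerFactory.createPointer call shown in the stack trace.

```java
// Hypothetical sketch only: illustrative names, not Apache POI's real classes.
class PointerFactorySketch {
    private final int version;

    PointerFactorySketch(int version) {
        this.version = version;
    }

    /** Returns a label for the pointer layout that would decode the file. */
    String createPointer() {
        if (version >= 6) {
            return "v6 pointer";
        }
        // Older layouts have no decoder in this sketch, so fail loudly
        // instead of misreading the bytes.
        throw new UnsupportedOperationException(
                "Pointer format v" + version + " is not supported");
    }

    public static void main(String[] args) {
        System.out.println(new PointerFactorySketch(6).createPointer());
        try {
            new PointerFactorySketch(5).createPointer();
        } catch (UnsupportedOperationException e) {
            System.out.println("Failed as expected: " + e.getMessage());
        }
    }
}
```

Supporting older files would mean adding a branch for the v5 layout, which is
where the format specs and sample files come in.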
> unable to parse microsoft visio files with tika
> -----------------------------------------------
>
> Key: TIKA-1658
> URL: https://issues.apache.org/jira/browse/TIKA-1658
> Project: Tika
> Issue Type: Bug
> Components: metadata
> Affects Versions: 1.3, 1.4, 1.5, 1.8
> Environment: ubuntu 14.04 and windows 7
> Reporter: senthil
>
> hi
> Parsing a Microsoft Visio file throws an exception:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@13d28e3
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>
> Caused by: java.lang.RuntimeException: TODO
> at org.apache.poi.hdgf.pointers.PointerFactory.createPointer(PointerFactory.java:45)
> at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:99)
> application/vnd.visio
> at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:55)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:200)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> ... 4 more
> Please help with a resolution
> regards
> sentil