[ 
https://issues.apache.org/jira/browse/TIKA-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605212#comment-14605212
 ] 

Jason Borg commented on TIKA-530:
---------------------------------

I have replicated this issue. Using MS PowerPoint 2013 together with 
MS Excel 2013, do the following:

1. Create a spreadsheet in MS Excel 2013 and save it.
2. Create a new presentation in PowerPoint 2013.
3. "Copy" the Excel spreadsheet (a file system copy will do).
4. Use "Paste Special" in the PowerPoint presentation and choose the 
"Paste Link" option instead of the regular "Paste".
5. Save the presentation. The resulting file triggers the issue.

The "linked" file is referenced by an absolute URI, as the originally 
reported stack trace shows, and this leads to an unhandled exception.
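
To illustrate the condition, here is a minimal sketch that uses only 
java.net.URI. It is not the POI or Tika code. The class and method names 
(LinkTargetCheck, isAbsoluteTarget, internalTargets) and the relative 
example path are hypothetical. It shows one way a caller could detect 
absolute link targets and skip them instead of aborting.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of POI or Tika.
final class LinkTargetCheck {

    // A URI is absolute when it carries a scheme such as "file:" or "http:".
    static boolean isAbsoluteTarget(String target) {
        return URI.create(target).isAbsolute();
    }

    // Keeps only the targets that could name a part inside the package.
    static List<String> internalTargets(List<String> targets) {
        List<String> kept = new ArrayList<>();
        for (String target : targets) {
            try {
                if (!isAbsoluteTarget(target)) {
                    kept.add(target);
                }
            } catch (IllegalArgumentException e) {
                // Unparseable target: skip it rather than fail the whole document.
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> targets = Arrays.asList(
            // Linked target taken from the reported stack trace.
            "file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi",
            // Assumed example of a relative, in-package target.
            "../embeddings/sheet1.xlsx");
        System.out.println(internalTargets(targets));
    }
}
```

Skipping is only one possible policy. Whether the linked target should 
be ignored or reported as metadata is a separate decision for the parser.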

> InvalidFormatException on a PackagePart in OOXML
> ------------------------------------------------
>
>                 Key: TIKA-530
>                 URL: https://issues.apache.org/jira/browse/TIKA-530
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Sjoerd Smeets
>         Attachments: Presentation1.pptx
>
>
> Hi,
> I receive the following error when parsing an ooxml file:
> Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Absolute URI forbidden: file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi
>     at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfAbsoluteUri(PackagePartName.java:426) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfInvalidPartUri(PackagePartName.java:175) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagePartName.<init>(PackagePartName.java:83) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagingURIHelper.createPartName(PackagingURIHelper.java:470) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:95) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:84) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.xslf.XSLFSlideShow.<init>(XSLFSlideShow.java:89) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.xslf.extractor.XSLFPowerPointExtractor.<init>(XSLFPowerPointExtractor.java:45) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:183) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:150) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:53) ~[tika-parsers-0.8-SNAPSHOT.jar:na]
> I can see that an absolute URI is forbidden. However, should POI not just 
> ignore this PackagePartName and move on to the other parts?



