[ https://issues.apache.org/jira/browse/TIKA-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977380#comment-15977380 ]
Tim Allison commented on TIKA-2024:
-----------------------------------
This can wait until after Tika 1.15, but while I'm thinking about it:
{{<x15ac:absPath>}} in an xlsx file's workbook.xml can store the full path of the
directory where the file was last saved, but not the file name, e.g.
{noformat}
<x15ac:absPath url="C:\Users\tallison\Desktop\working\xlsb\"
xmlns:x15ac="http://schemas.microsoft.com/office/spreadsheetml/2010/11/ac"/>
{noformat}
The same information is also stored in xlsb's workbook.bin, as seen in our
existing unit test file {{testEXCEL_various.xlsb}}.
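For the xlsx case, a rough sketch of how the path might be pulled out is below. It is Python rather than Tika code, just to illustrate the idea. The function name {{extract_abs_path}} is hypothetical, and it assumes the workbook part sits at the conventional {{xl/workbook.xml}} location inside the package. It does not handle xlsb's workbook.bin.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace copied from the x15ac:absPath example above.
X15AC_NS = "http://schemas.microsoft.com/office/spreadsheetml/2010/11/ac"


def extract_abs_path(xlsx_bytes):
    """Return the url attribute of the first x15ac:absPath element, or None.

    Hypothetical helper for illustration only; not part of Tika.
    """
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        try:
            # Assumption: the workbook part is at the conventional location.
            data = zf.read("xl/workbook.xml")
        except KeyError:
            return None
    root = ET.fromstring(data)
    # Search the whole tree, since the element may be nested inside
    # other wrapper elements rather than directly under the root.
    for elem in root.iter("{%s}absPath" % X15AC_NS):
        return elem.get("url")
    return None
```

Since the element is optional, callers would need to treat a {{None}} result as "no path recorded" rather than as an error.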
> Extract original filename/path when possible
> --------------------------------------------
>
> Key: TIKA-2024
> URL: https://issues.apache.org/jira/browse/TIKA-2024
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Fix For: 2.0, 1.14
>
>
> Several file formats include original file names or original file paths for
> themselves or for embedded documents. Let's extract that information.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)