Tim Allison created TIKA-4033:
---------------------------------

             Summary: Improve metadata for incremental updates, take 2
                 Key: TIKA-4033
                 URL: https://issues.apache.org/jira/browse/TIKA-4033
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


We're currently generating a "resourceName" in the PDFParser for incremental 
updates.  This isn't well documented (I don't think?), but we try to reserve 
"resourceName" for the actual name that the container document gives an 
embedded file.  

Now, we need some kind of name for the embedded resource path in 
RecursiveParserWrapper, so we generate something based on the resourceName or, 
if that doesn't exist, the relationship id, and if that doesn't exist either, 
we create /embedded-NUM.
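
The fallback order described above can be sketched roughly as follows. This 
is not the actual RecursiveParserWrapper code: the class and method names are 
made up for illustration, a plain Map stands in for Tika's Metadata object, 
and the "embeddedRelationshipId" key name is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not Tika's RecursiveParserWrapper implementation. */
class EmbeddedPathNamer {

    // Metadata key names assumed for this sketch.
    static final String RESOURCE_NAME = "resourceName";
    static final String RELATIONSHIP_ID = "embeddedRelationshipId";

    /** Picks a path segment: resourceName, else relationship id, else /embedded-NUM. */
    static String pathFor(Map<String, String> metadata, int embeddedIndex) {
        String name = metadata.get(RESOURCE_NAME);
        if (name == null || name.trim().isEmpty()) {
            // No name from the container document: fall back to the relationship id.
            name = metadata.get(RELATIONSHIP_ID);
        }
        if (name == null || name.trim().isEmpty()) {
            // Nothing usable at all: generate a name from the embedded index.
            name = "embedded-" + embeddedIndex;
        }
        return "/" + name;
    }

    public static void main(String[] args) {
        Map<String, String> named = new HashMap<>();
        named.put(RESOURCE_NAME, "report.docx");
        System.out.println(pathFor(named, 1));           // prints /report.docx

        Map<String, String> relOnly = new HashMap<>();
        relOnly.put(RELATIONSHIP_ID, "rId7");
        System.out.println(pathFor(relOnly, 2));         // prints /rId7

        System.out.println(pathFor(new HashMap<>(), 3)); // prints /embedded-3
    }
}
```

The point of the sketch is that any value placed in "resourceName" wins 
outright, which is why a generated value there ends up in the embedded path.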

But that's a separate issue.

We should use another option so that RecursiveParserWrapper knows to name the 
path /version-number-0 or similar.  We should not misuse "resourceName".
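
One possible shape for this, as a sketch only: the parser sets a dedicated 
key for the incremental update, and the path logic checks that key before it 
looks at "resourceName". The "versionNumber" key and the class below are 
hypothetical (this issue does not name the option), and a plain Map again 
stands in for Tika's Metadata. The relationship id step is left out to keep 
the example short.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; the key name and class are hypothetical. */
class VersionAwarePathNamer {

    // Hypothetical key that a parser would set for an incremental update.
    static final String VERSION_NUMBER = "versionNumber";
    static final String RESOURCE_NAME = "resourceName";

    static String pathFor(Map<String, String> metadata, int embeddedIndex) {
        String version = metadata.get(VERSION_NUMBER);
        if (version != null && !version.trim().isEmpty()) {
            // Incremental update: name the path from the version, not resourceName.
            return "/version-number-" + version.trim();
        }
        String name = metadata.get(RESOURCE_NAME);
        if (name != null && !name.trim().isEmpty()) {
            // Ordinary embedded file: use the name the container gave it.
            return "/" + name;
        }
        return "/embedded-" + embeddedIndex;
    }

    public static void main(String[] args) {
        Map<String, String> update = new HashMap<>();
        update.put(VERSION_NUMBER, "0");
        System.out.println(pathFor(update, 1));  // prints /version-number-0
    }
}
```

With something like this, "resourceName" stays empty for incremental updates 
and keeps its meaning as the name the container document actually uses.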



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
