Tim Allison created SOLR-7229:
---------------------------------

             Summary: Allow DIH to handle attachments as separate documents
                 Key: SOLR-7229
                 URL: https://issues.apache.org/jira/browse/SOLR-7229
             Project: Solr
          Issue Type: Improvement
            Reporter: Tim Allison
            Priority: Minor


With Tika 1.7's RecursiveParserWrapper, it is possible to maintain metadata of 
individual attachments/embedded documents.  Tika's default handling was to 
maintain the metadata of the container document and concatenate the contents of 
all embedded files.  With SOLR-7189, we added this legacy behavior to DIH.

It might be handy, for example, to be able to send an MSG file through DIH and 
treat the container email as well as each attachment as separate (child?) 
documents, or send a ZIP of JPEG files and correctly index the geolocations 
for each image file.
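
To make the difference concrete, here is a minimal sketch of the two modes 
described above.  It deliberately uses no Tika or DIH classes: ParsedDoc, 
legacy(), separate(), the "parent" and "content" field names, and the sample 
file names and coordinates are all hypothetical stand-ins for illustration, 
not an existing API or a proposed implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttachmentSketch {

    /** Hypothetical stand-in for a parsed file: own metadata, own text, embedded files. */
    public static final class ParsedDoc {
        public final Map<String, String> metadata = new LinkedHashMap<>();
        public final String content;
        public final List<ParsedDoc> embedded = new ArrayList<>();

        public ParsedDoc(String name, String content) {
            this.metadata.put("resourceName", name);
            this.content = content;
        }
    }

    /** Legacy mode: a single document with container metadata and concatenated text. */
    public static Map<String, String> legacy(ParsedDoc container) {
        Map<String, String> doc = new LinkedHashMap<>(container.metadata);
        StringBuilder text = new StringBuilder();
        appendText(container, text);
        doc.put("content", text.toString());
        return doc;
    }

    private static void appendText(ParsedDoc doc, StringBuilder out) {
        if (out.length() > 0) {
            out.append('\n');
        }
        out.append(doc.content);
        for (ParsedDoc child : doc.embedded) {
            appendText(child, out);
        }
    }

    /** Separate mode: one document per file, each keeping its own metadata. */
    public static List<Map<String, String>> separate(ParsedDoc container) {
        List<Map<String, String>> docs = new ArrayList<>();
        collect(container, null, docs);
        return docs;
    }

    private static void collect(ParsedDoc doc, String parentName,
                                List<Map<String, String>> out) {
        Map<String, String> flat = new LinkedHashMap<>(doc.metadata);
        flat.put("content", doc.content);
        if (parentName != null) {
            // Hypothetical field linking an attachment back to its container.
            flat.put("parent", parentName);
        }
        out.add(flat);
        for (ParsedDoc child : doc.embedded) {
            collect(child, doc.metadata.get("resourceName"), out);
        }
    }

    /** Made-up sample: an email with one geotagged image attached. */
    public static ParsedDoc sampleEmail() {
        ParsedDoc msg = new ParsedDoc("report.msg", "See attached photo.");
        ParsedDoc jpg = new ParsedDoc("photo.jpg", "");
        jpg.metadata.put("geo:lat", "37.77");
        jpg.metadata.put("geo:long", "-122.42");
        msg.embedded.add(jpg);
        return msg;
    }

    public static void main(String[] args) {
        ParsedDoc msg = sampleEmail();
        System.out.println("legacy:   " + legacy(msg));
        System.out.println("separate: " + separate(msg));
    }
}
```

In the legacy result the image's coordinates are lost, because only the 
container's metadata survives.  In the separate result the image is its own 
entry with its own geo fields, which is what would let each attachment be 
indexed as a distinct (possibly child) document.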



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
