Tim Allison created SOLR-7229:
---------------------------------
Summary: Allow DIH to handle attachments as separate documents
Key: SOLR-7229
URL: https://issues.apache.org/jira/browse/SOLR-7229
Project: Solr
Issue Type: Improvement
Reporter: Tim Allison
Priority: Minor
With Tika 1.7's RecursiveParserWrapper, it is possible to maintain the metadata
of individual attachments/embedded documents. Tika's legacy handling was to keep
only the metadata of the container document and concatenate the contents of
all embedded files. With SOLR-7189, we added that legacy behavior to DIH.
It might be handy, for example, to be able to send an MSG file through DIH and
treat the container email as well as each attachment as separate (child?)
documents, or to send a zip of JPEG files and correctly index the geo locations
of each image file.
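
To make the difference concrete, here is a rough sketch in plain Java. It does
not use any Tika or DIH classes: ParsedFile, flatten() and separate() are
hypothetical names made up for illustration, and the "parent" field is only one
possible way to link an attachment to its container.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttachmentDocsSketch {

    /** Hypothetical stand-in for one parsed file: metadata, text, embedded files. */
    static final class ParsedFile {
        final Map<String, String> metadata = new LinkedHashMap<>();
        final String content;
        final List<ParsedFile> embedded = new ArrayList<>();

        ParsedFile(String name, String content) {
            this.metadata.put("name", name);
            this.content = content;
        }
    }

    /** Legacy-style handling: container metadata only, all contents concatenated. */
    static Map<String, String> flatten(ParsedFile container) {
        Map<String, String> doc = new LinkedHashMap<>(container.metadata);
        StringBuilder text = new StringBuilder();
        appendContent(container, text);
        doc.put("content", text.toString().trim());
        return doc;
    }

    private static void appendContent(ParsedFile file, StringBuilder out) {
        out.append(file.content).append('\n');
        for (ParsedFile child : file.embedded) {
            appendContent(child, out);
        }
    }

    /** Proposed handling: one document per file, container first, metadata kept per file. */
    static List<Map<String, String>> separate(ParsedFile container) {
        List<Map<String, String>> docs = new ArrayList<>();
        collect(container, null, docs);
        return docs;
    }

    private static void collect(ParsedFile file, String parentName,
                                List<Map<String, String>> docs) {
        Map<String, String> doc = new LinkedHashMap<>(file.metadata);
        doc.put("content", file.content);
        if (parentName != null) {
            doc.put("parent", parentName); // assumed linking field, not a DIH convention
        }
        docs.add(doc);
        for (ParsedFile child : file.embedded) {
            collect(child, file.metadata.get("name"), docs);
        }
    }

    public static void main(String[] args) {
        // A zip of two images, each with its own (made-up) geo metadata.
        ParsedFile zip = new ParsedFile("photos.zip", "");
        ParsedFile a = new ParsedFile("a.jpg", "");
        a.metadata.put("geo", "40.7,-74.0");
        ParsedFile b = new ParsedFile("b.jpg", "");
        b.metadata.put("geo", "48.9,2.4");
        zip.embedded.add(a);
        zip.embedded.add(b);

        System.out.println("flattened: " + flatten(zip));
        for (Map<String, String> doc : separate(zip)) {
            System.out.println("separate:  " + doc);
        }
    }
}
```

In the flattened document the per-image geo values are lost, because only the
container's metadata survives; in the separate documents each image keeps its
own.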