Tim Allison created TIKA-4449:
---------------------------------
Summary: Improve xmp metadata key precision for PDFs
Key: TIKA-4449
URL: https://issues.apache.org/jira/browse/TIKA-4449
Project: Tika
Issue Type: Task
Reporter: Tim Allison
PDFs (and other file formats) may have conflicting information within them
about, for example, the "title" field or the "author" field.
Tika's parsers typically pick one source over another and normalize the keys to
Dublin Core or other standards.
[~peterhoogendijk] and likely other users want to be able to identify
whether a given piece of information comes from the XMP or the docinfo. This is
follow-on work from TIKA-4444. The proposal is to add new metadata keys that
indicate when Dublin Core information comes directly from the XMP.
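
A rough sketch of the idea, not Tika's actual implementation: plain Java maps stand in for Tika's Metadata object, the "xmp-source:" prefix is a hypothetical key name (this issue does not name the new keys), and the choice to let the docinfo win for the normalized key is only an assumption made for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class XmpKeySketch {

    // Hypothetical prefix for keys whose values come directly from the XMP.
    // The real key names are not specified in this issue.
    static final String XMP_PREFIX = "xmp-source:";

    private XmpKeySketch() {
    }

    /**
     * Merges docinfo and XMP values into one map.
     * Normalized keys (e.g. "dc:title") hold a single winner; in this sketch
     * the docinfo wins on conflict (an assumption, not Tika's documented rule).
     * Every XMP value is also stored under a prefixed key so its origin stays visible.
     */
    static Map<String, String> merge(Map<String, String> docInfo, Map<String, String> xmp) {
        Map<String, String> out = new LinkedHashMap<>();
        out.putAll(xmp);      // start with the XMP values
        out.putAll(docInfo);  // docinfo overwrites on conflict (sketch assumption)
        for (Map.Entry<String, String> e : xmp.entrySet()) {
            // keep the XMP value under a source-specific key
            out.put(XMP_PREFIX + e.getKey(), e.getValue());
        }
        return out;
    }
}
```

With a docinfo title and a different XMP title, a caller can read the normalized "dc:title" as before and check "xmp-source:dc:title" to see what the XMP packet actually said.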