On Mon, 28 Dec 2020, Peter Kronenberg wrote:
For the metadata that comes back from a parse (example below), clearly, the fields are dependent on the file type and information available. Are there any 'standard' fields that come back for all/any files? Such as Author, date, x-parsed-by, etc. Is there a list of these somewhere?

The main ones are taken from Dublin Core, see:
http://tika.apache.org/1.25/api/org/apache/tika/metadata/DublinCore.html

Others that a fair number of parsers use come from:
http://tika.apache.org/1.25/api/org/apache/tika/metadata/TikaMetadataKeys.html
http://tika.apache.org/1.25/api/org/apache/tika/metadata/HttpHeaders.html

The full set of properties is defined in the interfaces at:
http://tika.apache.org/1.25/api/org/apache/tika/metadata/package-summary.html
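
As a rough illustration of checking a parse result for those common fields, here is a minimal sketch in plain Java with no Tika dependency. It assumes the metadata has already been copied into a Map, and the key strings are my reading of the 1.25 interfaces linked above (DublinCore, TikaMetadataKeys, HttpHeaders), so verify them against the Javadoc. The class name MetadataKeys, the helper commonKeys, and the sample values are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MetadataKeys {
    // Assumed key names, taken from the interfaces linked above; check the Javadoc.
    static final Set<String> COMMON = Set.of(
        "dc:title", "dc:creator", "dc:subject", "dc:description", // DublinCore
        "resourceName",                                           // TikaMetadataKeys
        "Content-Type", "Content-Length", "Content-Encoding"      // HttpHeaders
    );

    // Returns the keys of the given metadata that are in the COMMON set,
    // in the iteration order of the map.
    static List<String> commonKeys(Map<String, String> metadata) {
        List<String> found = new ArrayList<>();
        for (String name : metadata.keySet()) {
            if (COMMON.contains(name)) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the metadata of a parsed PDF.
        Map<String, String> parsed = new LinkedHashMap<>();
        parsed.put("Content-Type", "application/pdf");
        parsed.put("dc:title", "Example");
        parsed.put("pdf:PDFVersion", "1.4");

        System.out.println(commonKeys(parsed)); // prints [Content-Type, dc:title]
    }
}
```

With Tika itself on the classpath, you would more likely use the constants from those interfaces (for example DublinCore.TITLE or HttpHeaders.CONTENT_TYPE) rather than hard-coded strings; anything not in such a list is best treated as parser-specific.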

Nick