The attached file was created in Google Docs with an image inside and saved as an .odt file. After saving, I opened the file with LibreOffice and added a hyperlink to the image.
When I parse the file with Tika, neither LinkContentHandler nor ToXMLContentHandler shows any trace of the hyperlink. The link is clickable when I open the document, and I can see it inside content.xml after extracting the document with 7zip: <draw:a xlink:type="simple" xlink:href="http://example.test/"> I tried enabling all options in OfficeParserConfig and OOXMLParser, but it hasn't made a difference so far. The X-Parsed-By header shows the file is being parsed with org.apache.tika.parser.odf.OpenDocumentParser. Could this be a bug in org.apache.tika.parser.odf.OpenDocumentParser?
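For anyone who wants to repeat the 7zip check without unpacking the file by hand, here is a minimal sketch that uses only the JDK (no Tika). It opens the .odt as a zip, parses content.xml, and lists the xlink:href of every draw:a element. The class name OdtDrawLinks and both method names are made up for illustration. It only confirms that the link is present in the file; it says nothing about what OpenDocumentParser does with it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika: lists hyperlinks wrapped around drawings/images.
public class OdtDrawLinks {
    // Namespace URIs bound to the draw: and xlink: prefixes in ODF content.xml.
    static final String DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Parse a content.xml stream and collect xlink:href from each <draw:a>.
    static List<String> parseDrawLinks(InputStream contentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the *NS lookups below
        NodeList anchors = factory.newDocumentBuilder()
                .parse(contentXml)
                .getElementsByTagNameNS(DRAW_NS, "a");
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            hrefs.add(anchor.getAttributeNS(XLINK_NS, "href"));
        }
        return hrefs;
    }

    // Open the .odt as a zip archive and read its content.xml entry.
    static List<String> drawLinksInOdt(String odtPath) throws Exception {
        try (ZipFile zip = new ZipFile(odtPath)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IOException("content.xml not found in " + odtPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return parseDrawLinks(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java OdtDrawLinks link-gdocs.odt
        for (String href : drawLinksInOdt(args[0])) {
            System.out.println(href);
        }
    }
}
```

If this prints the URL but the Tika handlers still show nothing, the link is in the document and is being lost somewhere during parsing.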
link-gdocs.odt
Description: application/vnd.oasis.opendocument.text
