Yes, this looks like a bug. Please open an issue on our JIRA:
https://issues.apache.org/jira/projects/TIKA/summary
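
When you file it, it may help to include a check that does not depend on
Tika and confirms the hyperlink is present in content.xml. Below is a
minimal sketch using only the JDK. The class and method names
(OdtDrawLinkCheck, drawLinks, drawLinksInOdt) are made up for illustration
and are not part of Tika; the sketch assumes the link is stored as a draw:a
element with an xlink:href attribute, as in the snippet quoted below.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tika: lists the xlink:href of every
// draw:a element found in an OpenDocument content.xml.
public class OdtDrawLinkCheck {
    // Namespace URIs used by the OpenDocument format.
    static final String DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Parses a content.xml stream and collects the draw:a link targets.
    public static List<String> drawLinks(InputStream contentXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on namespace URI + local name
        List<String> hrefs = new ArrayList<>();
        factory.newSAXParser().parse(contentXml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (DRAW_NS.equals(uri) && "a".equals(localName)) {
                    String href = atts.getValue(XLINK_NS, "href");
                    if (href != null) {
                        hrefs.add(href);
                    }
                }
            }
        });
        return hrefs;
    }

    // Opens the .odt as a zip archive and scans its content.xml entry.
    public static List<String> drawLinksInOdt(String odtPath) throws Exception {
        try (ZipFile zip = new ZipFile(odtPath)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IllegalArgumentException("no content.xml in " + odtPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return drawLinks(in);
            }
        }
    }

    // Usage: java OdtDrawLinkCheck path/to/file.odt
    public static void main(String[] args) throws Exception {
        for (String href : drawLinksInOdt(args[0])) {
            System.out.println(href);
        }
    }
}
```

If this prints the URL while the handler output from Tika contains no
corresponding link, that contrast would make a compact reproduction for the
issue.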

On Wed, Aug 5, 2020 at 8:27 PM Robert Kaulbach <[email protected]>
wrote:

> The attached file was created in Google Docs with an image inside and
> saved as an .odt file. After saving, I opened the file with LibreOffice and
> added a hyperlink to the image.
>
> When I parse the file with Tika, neither LinkContentHandler nor
> ToXMLContentHandler shows any trace of the hyperlink.
>
> The link is clickable when I open the document, and I can see it inside
> content.xml after extracting the document with 7zip:
> <draw:a xlink:type="simple" xlink:href="http://example.test/">
>
> I tried enabling all options in OfficeParserConfig and OOXMLParser but it
> hasn't made a difference so far. The X-Parsed-By header shows it is being
> parsed with org.apache.tika.parser.odf.OpenDocumentParser.
>
> Could this be a bug with the org.apache.tika.parser.odf.OpenDocumentParser?
