Andrew Skiba created TIKA-1344:
----------------------------------
Summary: Ability to generate self-contained HTML with images
Key: TIKA-1344
URL: https://issues.apache.org/jira/browse/TIKA-1344
Project: Tika
Issue Type: Improvement
Components: parser
Reporter: Andrew Skiba
In the current code, the images from Word documents are referenced by
"embedded:xxx" links in the generated HTML. Browsers cannot resolve these
links, so they display an "x" icon instead of the image.
The proposed patch encodes the images as Data URIs when the
-Dtika.parsers.urlimages system property is set.
http://en.wikipedia.org/wiki/Data_URI_scheme
The default behavior is therefore unchanged, but users of the library can
optionally generate self-contained HTML in which the images display correctly.
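
For illustration only, here is a minimal sketch of the Data URI idea described
above. It is not the attached patch: the class and method names
(DataUriSketch, toDataUri, imageSrc) are hypothetical, the check for the
property's mere presence is an assumption, and java.util.Base64 requires
Java 8 or later.

```java
import java.util.Base64;

// Hypothetical helper, not part of Tika: shows how image bytes can be
// inlined into HTML as a Data URI instead of an "embedded:xxx" link.
final class DataUriSketch {

    // Property name taken from the issue text. Treating "present" as
    // "enabled" is an assumption about how the patch reads it.
    static final String PROPERTY = "tika.parsers.urlimages";

    static boolean inlineImagesEnabled() {
        return System.getProperty(PROPERTY) != null;
    }

    // Builds "data:<mime type>;base64,<payload>" from the raw image bytes.
    static String toDataUri(String mimeType, byte[] imageBytes) {
        String payload = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + payload;
    }

    // Returns the value for an <img src="..."> attribute: a Data URI when
    // the property is set, otherwise the existing "embedded:" reference.
    static String imageSrc(String embeddedName, String mimeType, byte[] imageBytes) {
        return inlineImagesEnabled()
                ? toDataUri(mimeType, imageBytes)
                : "embedded:" + embeddedName;
    }
}
```

With the property unset, imageSrc keeps returning the "embedded:" link, which
matches the unchanged default described above. Note that Data URIs make the
HTML roughly a third larger than the raw image bytes because of the Base64
encoding.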
--
This message was sent by Atlassian JIRA
(v6.2#6252)