Tim Allison created TIKA-2563:
---------------------------------
Summary: Extract embedded files in HTML
Key: TIKA-2563
URL: https://issues.apache.org/jira/browse/TIKA-2563
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison

Files (esp. images) can be base64 encoded in HTML files. We should extract those like any other embedded file.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
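
To illustrate the kind of content this issue is about: base64-encoded files usually appear in HTML as "data:" URIs (RFC 2397) in attributes such as src or href. The following is a minimal, language-neutral sketch in Python of how such payloads could be located and decoded. It is not Tika's implementation; the names decode_data_uri, DataUriCollector and extract_embedded are hypothetical, and the choice to inspect only src and href attributes is an assumption made for brevity.

```python
import base64
from html.parser import HTMLParser


def decode_data_uri(uri):
    """Return (mime_type, raw_bytes) for a base64 data: URI, else None.

    Hypothetical helper: only the ";base64" form is handled here.
    """
    if not uri.startswith("data:"):
        return None
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep or not header.lower().endswith(";base64"):
        return None
    # RFC 2397 defaults the media type to text/plain when it is omitted.
    mime = header[:-len(";base64")].split(";")[0].strip() or "text/plain"
    try:
        # Drop embedded whitespace, then reject anything outside the alphabet.
        return mime, base64.b64decode("".join(payload.split()), validate=True)
    except ValueError:
        return None


class DataUriCollector(HTMLParser):
    """Collects decoded data: URIs found in src/href attributes."""

    def __init__(self):
        super().__init__()
        self.embedded = []  # list of (mime_type, raw_bytes)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                decoded = decode_data_uri(value)
                if decoded is not None:
                    self.embedded.append(decoded)


def extract_embedded(html):
    """Return every base64-embedded file found in the given HTML string."""
    collector = DataUriCollector()
    collector.feed(html)
    collector.close()
    return collector.embedded
```

Calling extract_embedded on a page containing an img tag whose src is "data:image/png;base64,..." would return a list with one ("image/png", bytes) pair. In a parser such as Tika's, each pair would presumably then be handed off for processing like any other embedded file.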
