[jira] [Updated] (TIKA-2563) Extract embedded files in HTML
[ https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-2563:
--
    Description: Files (esp images) and other objects can be embedded in html/css/javascript with the {{data: uri scheme}}. We should extract those like any other embedded file.  (was: Files (esp images) can be base64 encoded in HTML files. We should extract those like any other embedded file.)

> Extract embedded files in HTML
> --
>
>                 Key: TIKA-2563
>                 URL: https://issues.apache.org/jira/browse/TIKA-2563
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Trivial
>         Attachments: consumentenbond.html, testHTML_embedded_img.html
>
>
> Files (esp images) and other objects can be embedded in html/css/javascript
> with the {{data: uri scheme}}. We should extract those like any other
> embedded file.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TIKA-2563) Extract embedded files in HTML
[ https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-2563:
--
    Description: Files (esp images) and other objects can be embedded in html/css/javascript with the [data: uri scheme|https://en.wikipedia.org/wiki/Data_URI_scheme]. We should extract those like any other embedded file.  (was: Files (esp images) and other objects can be embedded in html/css/javascript with the {{data: uri scheme}}. We should extract those like any other embedded file.)

> Extract embedded files in HTML
> --
>
>                 Key: TIKA-2563
>                 URL: https://issues.apache.org/jira/browse/TIKA-2563
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Trivial
>         Attachments: consumentenbond.html, testHTML_embedded_img.html
>
>
> Files (esp images) and other objects can be embedded in html/css/javascript
> with the [data: uri scheme|https://en.wikipedia.org/wiki/Data_URI_scheme].
> We should extract those like any other embedded file.
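To make the request in the description above concrete, here is a minimal sketch of what "extracting" data: URIs from markup could look like. It is not Tika's implementation and does not use any Tika API; the function name `extract_data_uris` and the regular expression are assumptions made for illustration, and the pattern only covers the common `data:[<mediatype>][;base64],<data>` shape found in attribute values and CSS `url(...)` references.

```python
import base64
import re
from urllib.parse import unquote, unquote_to_bytes

# Hypothetical pattern for data:[<mediatype>][;key=value...][;base64],<data>
DATA_URI = re.compile(
    r"""data:
        (?P<mime>[\w.+-]+/[\w.+-]+)?          # optional media type
        (?P<params>(?:;[\w-]+=[^;,\s"')]+)*)  # optional ;key=value parameters
        (?P<b64>;base64)?                     # optional base64 marker
        ,
        (?P<data>[^\s"')<>]*)                 # payload up to a quote, paren, or space
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_data_uris(text):
    """Return a list of (media_type, raw_bytes) for each data: URI in text."""
    results = []
    for match in DATA_URI.finditer(text):
        # A data: URI without a media type defaults to text/plain.
        mime = (match.group("mime") or "text/plain").lower()
        payload = match.group("data")
        if match.group("b64"):
            payload = unquote(payload)
            payload += "=" * (-len(payload) % 4)  # tolerate missing padding
            try:
                raw = base64.b64decode(payload)
            except ValueError:
                continue  # skip payloads that are not valid base64
        else:
            raw = unquote_to_bytes(payload)  # percent-encoded payload
        results.append((mime, raw))
    return results


if __name__ == "__main__":
    sample = (
        '<img src="data:image/png;base64,iVBORw0KGgo=">'
        '<div style="background:url(data:text/plain,Hello%20World)"></div>'
    )
    for mime, raw in extract_data_uris(sample):
        print(mime, len(raw), "bytes")
```

Each decoded payload could then be handed to whatever handles embedded documents, the same way an attachment in another container format would be. A regex scan is used here only because it works across HTML, CSS, and JavaScript text alike; a real parser would more likely hook into attribute and stylesheet handling.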
[jira] [Updated] (TIKA-2563) Extract embedded files in HTML
[ https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-2563:
--
    Attachment: testHTML_embedded_img.html

> Extract embedded files in HTML
> --
>
>                 Key: TIKA-2563
>                 URL: https://issues.apache.org/jira/browse/TIKA-2563
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Trivial
>         Attachments: consumentenbond.html, testHTML_embedded_img.html
>
>
> Files (esp images) can be base64 encoded in HTML files. We should extract
> those like any other embedded file.
[jira] [Updated] (TIKA-2563) Extract embedded files in HTML
[ https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-2563:
--
    Attachment: consumentenbond.html

> Extract embedded files in HTML
> --
>
>                 Key: TIKA-2563
>                 URL: https://issues.apache.org/jira/browse/TIKA-2563
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Trivial
>         Attachments: consumentenbond.html
>
>
> Files (esp images) can be base64 encoded in HTML files. We should extract
> those like any other embedded file.