Tim Allison created TIKA-4226:
---------------------------------
Summary: Use jsoup for epubs
Key: TIKA-4226
URL: https://issues.apache.org/jira/browse/TIKA-4226
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison

We're getting quite a few XML exceptions when parsing EPUBs (roughly 1k out of 8k total). We should use jsoup to handle the contents of EPUBs more robustly. This is a proposal for 3.x. WDYT?
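
To illustrate the kind of failure described above, here is a minimal sketch using only the JDK's strict XML parser. It is not Tika's EPUB parser code; the class name `StrictXmlCheck`, the method `isWellFormed`, and the sample markup are hypothetical. The assumption is that many of the failing EPUBs contain HTML-style markup (unclosed void tags, undeclared entities such as `&nbsp;`) that a strict XML parser rejects outright.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only (not part of Tika).
class StrictXmlCheck {

    // Returns true if the markup parses as well-formed XML,
    // false if the strict parser reports a fatal error.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Malformed content: this is the class of exception the issue describes.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for in-memory input with default configuration.
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, `StrictXmlCheck.isWellFormed("<html><body><p>one<br>two</p></body></html>")` returns false because `<br>` is never closed, and the same happens for content that uses `&nbsp;` without a DTD declaring it. A lenient parser such as jsoup (for example `Jsoup.parse(String)`) is designed to accept input like this and build a document anyway, which is the robustness the proposal is after.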
