Tim Allison created TIKA-1786:
---------------------------------

             Summary: Downgrade logging severity in FileResourceConsumer and fix handling of illegal xml characters
                 Key: TIKA-1786
                 URL: https://issues.apache.org/jira/browse/TIKA-1786
             Project: Tika
          Issue Type: Improvement
          Components: tika-batch
            Reporter: Tim Allison
            Assignee: Tim Allison
            Priority: Trivial

FileResourceConsumer logs an xmlified snippet to record problems encountered during parsing. If a parser includes illegal XML characters in the ParseException, the resulting exception is caught by the xmlification code and then logged as an error. The xmlification code should be robust against illegal characters, and we should downgrade the logging severity from error to warning when a parser did not throw an actual error.
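
One way to make the xmlification step robust is to filter the message before it is written as XML. The following is only a minimal sketch, not the code in tika-batch: the class name XmlSanitizer and its methods are hypothetical, and replacing each illegal character with a space is an assumption about the desired behavior. The legal ranges follow the Char production of the XML 1.0 specification.

```java
// Hypothetical helper, not part of Tika: removes characters that are
// not allowed in XML 1.0 so that a message can be safely xmlified.
final class XmlSanitizer {

    private XmlSanitizer() {
    }

    // True if the code point matches the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of s in which every illegal code point, including
    // an unpaired surrogate, is replaced by a single space.
    // A null input yields an empty string.
    static String sanitize(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            sb.appendCodePoint(isLegalXmlChar(cp) ? cp : ' ');
        }
        return sb.toString();
    }
}
```

With a filter like this applied to the exception message first, the xmlification step itself no longer has a reason to fail on illegal characters, so any remaining log entry at error level would reflect a real problem rather than a formatting one.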