HtmlParser's http-equiv code needs to be more flexible
------------------------------------------------------
Key: TIKA-349
URL: https://issues.apache.org/jira/browse/TIKA-349
Project: Tika
Issue Type: Improvement
Affects Versions: 0.6
Reporter: Ken Krugler
Priority: Minor

Some http-equiv meta tags in HTML documents specify the charset in their content attribute in ways that currently aren't handled properly. For example, the charset parameter may be repeated:

<meta http-equiv="content-type" content="text/html; charset=utf-8; charset=UTF-8">

Or the value may contain empty parameters, as in content="text/html;; charset="utf-8" (note the double semi-colons).

The parsing code needs to be more flexible to handle these edge cases.
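
One possible way to tolerate the content values quoted in this issue is to stop parsing the attribute as a strict parameter list and instead search it for the first charset=value pair. The sketch below is only an illustration of that idea, not the actual HtmlParser code or the fix that was applied; the class name LenientCharsetSketch and the method extractCharset are hypothetical, and only java.util and java.util.regex are used.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: lenient charset lookup for the
// content attribute of an http-equiv="content-type" meta tag.
final class LenientCharsetSketch {

    // Find "charset", "=", an optional opening quote, then the value up to
    // the next whitespace, semicolon, or quote. Matching is case-insensitive.
    private static final Pattern CHARSET = Pattern.compile(
            "\\bcharset\\s*=\\s*[\"']?([^\\s;\"']+)",
            Pattern.CASE_INSENSITIVE);

    private LenientCharsetSketch() {
    }

    // Returns the first charset value found in the content string, or an
    // empty Optional if there is none. Empty parameters (";;"), repeated
    // charset parameters, and unbalanced quotes are all tolerated.
    static Optional<String> extractCharset(String content) {
        if (content == null) {
            return Optional.empty();
        }
        Matcher m = CHARSET.matcher(content);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

With this approach, a repeated charset parameter resolves to its first occurrence, and the double semi-colon or a stray quote does not prevent a match. Whether the first or the last occurrence should win is a design choice that this issue does not settle; the sketch picks the first. A caller would likely still want to check that the returned name is a charset the JVM supports before using it.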