[ https://issues.apache.org/jira/browse/TIKA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-349.
--------------------------------

    Resolution: Fixed
    Fix Version/s: 0.6
    Assignee: Jukka Zitting

Patch committed in revision 891074.

> HtmlParser's http-equiv code needs to be more flexible
> ------------------------------------------------------
>
>                 Key: TIKA-349
>                 URL: https://issues.apache.org/jira/browse/TIKA-349
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Ken Krugler
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: TIKA-349.patch
>
>
> Some http-equiv meta tags in HTML documents have charset attributes that currently aren't handled properly.
> For example, <meta http-equiv="content-type" content="text/html; charset=utf-8; charset=UTF-8">
> Or where content="text/html;; charset="utf-8" (note double semi-colons)
> The parsing code needs to be more flexible to handle these edge cases.
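
For readers following along, here is a minimal sketch of one way the edge cases quoted in the issue could be tolerated. It is not the patch committed in revision 891074; the class name LenientCharset and the method extract are hypothetical names chosen for illustration. The idea is to search the content attribute for the first charset=... token instead of splitting strictly on single semicolons, so repeated charset parameters, doubled semicolons, and a stray quote do not break the lookup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Tika's API.
final class LenientCharset {

    // "charset", optional whitespace around "=", an optional opening quote,
    // then the charset name up to the next whitespace, semicolon, or quote.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s;\"']+)", Pattern.CASE_INSENSITIVE);

    // Returns the first charset name found in the content value, or null if none.
    static String extract(String content) {
        if (content == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(content);
        return m.find() ? m.group(1) : null;
    }
}
```

Under these assumptions, LenientCharset.extract("text/html; charset=utf-8; charset=UTF-8") yields "utf-8" (the first occurrence wins), and the double-semicolon variant with an unbalanced quote also yields "utf-8". Whether the first or the last duplicate should win is a design choice that this sketch settles arbitrarily.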