On Thu, 12 Apr 2012, William Hays wrote:
Using the API, I extracted the supported media types for the AutoDetectParser in Tika 1.1. I'm not seeing the HTML or XHTML MIME types in that list of 92 items, though it parses such files fine.
Hmm, HTML is showing up for me:
java -jar tika-app-1.1.jar --list-parser-details | grep -A 4 HtmlParser
org.apache.tika.parser.html.HtmlParser
    application/x-asp
    application/xhtml+xml
    application/vnd.wap.xhtml+xml
    text/html
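
If you want to compare that listing against the 92 types you extracted without eyeballing it, something along these lines might help. It is only a rough sketch using the plain JDK, not anything from Tika itself: the class name ParserListing and the method group are made up for illustration, and it assumes that every line starting with "org." names a parser and every other non-blank line is a media type belonging to the parser above it. The real --list-parser-details output may contain other lines that this simple rule would mishandle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParserListing {

    // Groups media types under the parser class name that precedes them.
    // Assumption (not taken from Tika): a line starting with "org." is a
    // parser name; any other non-blank line is a media type for that parser.
    static Map<String, List<String>> group(String listing) {
        Map<String, List<String>> byParser = new LinkedHashMap<>();
        List<String> current = null;
        for (String raw : listing.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("org.")) {
                current = byParser.computeIfAbsent(line, k -> new ArrayList<>());
            } else if (current != null) {
                current.add(line);
            }
        }
        return byParser;
    }

    public static void main(String[] args) {
        // Sample input: the grep output quoted in this message.
        String listing = String.join("\n",
                "org.apache.tika.parser.html.HtmlParser",
                "application/x-asp",
                "application/xhtml+xml",
                "application/vnd.wap.xhtml+xml",
                "text/html");

        Map<String, List<String>> byParser = group(listing);
        List<String> htmlTypes = byParser.get("org.apache.tika.parser.html.HtmlParser");
        System.out.println("HtmlParser types: " + htmlTypes);
        System.out.println("Has text/html: " + htmlTypes.contains("text/html"));
    }
}
```

You could feed it the full output of the command instead of the hard-coded sample and then check each type against your own list.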
Nick
