I'm crawling URLs that end with .shtml, and these pages may not be parsed by Tika. How do I set up Nutch and tika-mimetypes.xml to parse .shtml files? Thanks,
--
AJ Chen, PhD
Chair, Semantic Web SIG, sdforum.org
http://web2express.org
twitter @web2express
Palo Alto, CA, USA

