Hi Renato,

Regarding places in the Nutch code to look:

Start with HtmlParser.getParse(), which lives in the parse-html plugin (plugin/parse-html in the Nutch source distribution). The call chain that reaches it is:

ParserJob.$ParserMapper.map() invokes ParseUtil.process(), which calls ParseUtil.parse(), which calls Parser.getParse() (HtmlParser.getParse() in this case).

Regards,
Alexey
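P.S. In case it helps to see the shape of that call chain, here is a rough, self-contained sketch. It does not use the real Nutch classes: the classes below are simplified stand-ins that only borrow the names from the chain above, and the method signatures, the TRACE list, and the run() helper are all my own assumptions for illustration. The real signatures in your Nutch version will differ.

```java
import java.util.ArrayList;
import java.util.List;

public class CallChainSketch {
    // Records the order in which the stand-in methods are entered.
    static final List<String> TRACE = new ArrayList<>();

    // Stand-in for the Parser extension point (hypothetical, simplified signature).
    interface Parser {
        String getParse(String url, String content);
    }

    // Stand-in for the parse-html plugin's parser; just strips tags.
    static class HtmlParser implements Parser {
        @Override
        public String getParse(String url, String content) {
            TRACE.add("HtmlParser.getParse");
            return content.replaceAll("<[^>]*>", "").trim();
        }
    }

    // Stand-in for ParseUtil: process() delegates to parse(),
    // which delegates to the selected Parser.
    static class ParseUtil {
        private final Parser parser;

        ParseUtil(Parser parser) {
            this.parser = parser;
        }

        String process(String url, String content) {
            TRACE.add("ParseUtil.process");
            return parse(url, content);
        }

        String parse(String url, String content) {
            TRACE.add("ParseUtil.parse");
            return parser.getParse(url, content);
        }
    }

    // Stand-in for the mapper inside ParserJob.
    static class ParserMapper {
        private final ParseUtil parseUtil = new ParseUtil(new HtmlParser());

        String map(String url, String content) {
            TRACE.add("ParserMapper.map");
            return parseUtil.process(url, content);
        }
    }

    // Hypothetical helper: runs one document through the chain and returns the trace.
    static List<String> run(String url, String html) {
        TRACE.clear();
        new ParserMapper().map(url, html);
        return new ArrayList<>(TRACE);
    }

    public static void main(String[] args) {
        for (String step : run("http://example.com/", "<p>hello</p>")) {
            System.out.println(step);
        }
    }
}
```

If the parse is slow, the innermost step (the getParse() implementation) is probably the first place I would put timing or logging, since the outer methods in this chain mostly delegate.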

