Hi Nutchers I think parse-html parse should be enhanced. In some of my projects(Intranet search engine), we only need the content in the specified detectors and filter the junk, say the content between <div class="start-here"> and </div> or some detectors like XPath. Any thoughts on this enhancement?
Regards /Jack -- Keep Discovering ... ... http://www.jroller.com/page/jmars ------------------------------------------------------- SF.Net email is Sponsored by the Better Software Conference & EXPO September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf _______________________________________________ Nutch-developers mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nutch-developers
