Would an extension from an existing point be a solution? Our ongoing project also needs to deal with specific crawling cases on some sites. We are thinking about extending the current Java class to fit our usage.
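To make the idea concrete, here is a minimal sketch of the kind of filtering such an extension might do: keep only the text of nodes matched by an XPath expression, such as the <div class="start-here"> region mentioned in the quoted message. This is not Nutch's parse-html code. The class RegionExtractor and its extract method are hypothetical names used only for illustration, and the sketch assumes the page is already well-formed XHTML (real-world HTML would first need a tolerant HTML parser to build the DOM).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Nutch: keeps only the text of the
// nodes selected by an XPath expression and drops everything else.
class RegionExtractor {

    // Assumes xhtml is well-formed XML; plain HTML would need a
    // tolerant parser in front of this step.
    static String extract(String xhtml, String xpathExpr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                xpathExpr, doc, XPathConstants.NODESET);

        // Join the text of all matched nodes, one per line.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (text.length() > 0) {
                text.append('\n');
            }
            text.append(nodes.item(i).getTextContent().trim());
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body>"
                + "<div class=\"nav\">menu junk</div>"
                + "<div class=\"start-here\">The content we want.</div>"
                + "<div class=\"footer\">more junk</div>"
                + "</body></html>";
        System.out.println(extract(page, "//div[@class='start-here']"));
    }
}
```

Making the XPath expression a parameter means each site could supply its own selector without code changes. How such a step would hook into the existing parser class is left open here.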
Michael Ji

--- Jack Tang <[EMAIL PROTECTED]> wrote:

> Hi Nutchers
>
> I think parse-html parse should be enhanced. In some of my
> projects (Intranet search engine), we only need the content in the
> specified detectors and filter the junk, say the content between
> <div class="start-here"> and </div> or some detectors like XPath.
> Any thoughts on this enhancement?
>
> Regards
> /Jack
> --
> Keep Discovering ... ...
> http://www.jroller.com/page/jmars

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
