Would extending an existing extension point be a solution?

Our ongoing project also needs to deal with specific
crawling cases on some sites. We are thinking about extending
the current Java class to fit our usage.
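
To make the idea concrete, here is a minimal sketch of the kind of
filtering Jack describes in the quoted message: keep only the text
inside <div class="start-here"> and drop the rest. It uses only the
JDK's XPath support and is not Nutch code. The class name
StartHereExtractor and the method extract are made up for
illustration, and the sketch assumes the page is already well-formed
XHTML (real pages would first need a tolerant HTML parser).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Nutch: selects one region of a page by XPath.
final class StartHereExtractor {

    /**
     * Returns the trimmed text of the first node matching xpathExpr,
     * or an empty string when nothing matches.
     * Assumption: xhtml is well-formed XML.
     */
    static String extract(String xhtml, String xpathExpr) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node match = (Node) xpath.evaluate(xpathExpr, doc, XPathConstants.NODE);
            return match == null ? "" : match.getTextContent().trim();
        } catch (Exception e) {
            // Malformed input or a bad expression ends up here.
            throw new IllegalArgumentException("could not parse or evaluate", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
            + "<div class=\"nav\">menu junk</div>"
            + "<div class=\"start-here\">The content we want.</div>"
            + "</body></html>";
        // prints: The content we want.
        System.out.println(extract(page, "//div[@class='start-here']"));
    }
}
```

Taking the XPath expression as a parameter would let each site get its
own selector from configuration instead of a hard-coded class name.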

Michael Ji,

--- Jack Tang <[EMAIL PROTECTED]> wrote:

> Hi Nutchers,
> 
> I think the parse-html parser should be enhanced. In some
> of my projects (an intranet search engine), we only need the
> content inside specified markers and want to filter out the
> junk: say, the content between <div class="start-here"> and
> </div>, or content selected by something like XPath. Any
> thoughts on this enhancement?
> 
> Regards
> /Jack
> -- 
> Keep Discovering ... ...
> http://www.jroller.com/page/jmars
> 


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
