Would an extension built on an existing extension point be a solution? Our ongoing project also needs to deal with site-specific crawling cases. We are thinking about extending the current Java class to fit our usage.
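
As a rough sketch of the kind of filtering Jack describes in the quoted message, here is a standalone helper that keeps only the text matched by an XPath expression. It uses only the JDK's built-in XML and XPath classes and assumes the page is already well-formed XHTML; real-world HTML would first need a tolerant parser to produce a DOM. The names RegionExtractor and extract are hypothetical and not part of Nutch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a Nutch class: keeps only the text of the
// nodes selected by an XPath expression and drops everything else.
public class RegionExtractor {

    // Assumes the input is well-formed XHTML; malformed HTML will make
    // the DOM parser throw an exception.
    public static String extract(String xhtml, String xpathExpr) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);

        // Join the text of every matched node with a single space.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(nodes.item(i).getTextContent().trim());
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body>"
                + "<div class=\"nav\">Home | About</div>"
                + "<div class=\"start-here\">Quarterly report text</div>"
                + "<div class=\"footer\">Copyright</div>"
                + "</body></html>";
        // Only the content of the marked div survives the filter.
        System.out.println(extract(page, "//div[@class='start-here']"));
    }
}
```

An extension of the existing parser class could call a helper like this on the parsed DOM before the text is indexed, with the XPath expression read from per-site configuration.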

Michael Ji

--- Jack Tang <[EMAIL PROTECTED]> wrote:

> Hi Nutchers
>
> I think parse-html parse should be enhanced. In some of my
> projects (Intranet search engine), we only need the content in the
> specified detectors and filter the junk, say the content between
> <div class="start-here"> and </div> or some detectors like XPath.
> Any thoughts on this enhancement?
>
> Regards
> /Jack
> --
> Keep Discovering ... ...
> http://www.jroller.com/page/jmars
