Thanks, Sourabh. Actually, I was asking why Nutch has two classes that parse content.

Tiger
2010-1-19

2011/1/19 Sourabh Kasliwal <[email protected]>

> I think parse is called only once.
> There is a config param in nutch-default.xml that might help you:
> <property>
>   <name>fetcher.parse</name>
>   <value>true</value>
>   <description>If true, fetcher will parse content.</description>
> </property>
>
> regards
> Sourabh
>
> On Tue, Jan 18, 2011 at 2:39 PM, 黄淑明 <[email protected]> wrote:
>
> > I use Nutch-1.1.
> > I want to add a plugin to parse webpages and store them in my database.
> > I added a class that implements HtmlParseFilter,
> > but found that even when the page is redirected to another
> > page, HtmlParseFilter still gets called.
> > I thought ParseSegment.parse would be better, but why does Nutch 1.1 use
> > the parse function in both the Fetcher.output method and ParseSegment.parse?
> >
> > thanks in advance.
> >
> > Tiger
> > 2011-1-18
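
For reference, a minimal sketch of how the fetcher.parse property quoted above might be overridden so that parsing is left to the separate ParseSegment step rather than done inside the fetcher. Placing the override in conf/nutch-site.xml and the description wording are assumptions, not something stated in this thread:

```xml
<!-- Assumed location: conf/nutch-site.xml, which overrides nutch-default.xml -->
<property>
  <name>fetcher.parse</name>
  <!-- false: the fetcher only stores raw content; parse the segment afterwards -->
  <value>false</value>
  <description>If false, the fetcher does not parse content.</description>
</property>
```

With this setting the segment would then be parsed in its own step, for example with bin/nutch parse <segment>, which runs ParseSegment.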

