Thanks Sourabh,
Actually, I was asking why Nutch has two classes that parse content (Fetcher and ParseSegment).

Tiger
2011-1-19


2011/1/19 Sourabh Kasliwal <[email protected]>

> I think parse is called only once.
> There is a config param in nutch-default.xml that might help you:
> <property>
>  <name>fetcher.parse</name>
>  <value>true</value>
>  <description>If true, fetcher will parse content.</description>
> </property>
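> For example (a sketch only, assuming you want the fetcher to skip parsing
> so that content is parsed just once, in the separate ParseSegment step),
> you could override it in conf/nutch-site.xml:
> <property>
>  <name>fetcher.parse</name>
>  <value>false</value>
> </property>
> and then parse each fetched segment yourself with bin/nutch parse <segment>.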
>
> regards
> Sourabh
>
> On Tue, Jan 18, 2011 at 2:39 PM, 黄淑明 <[email protected]> wrote:
>
> > I use Nutch 1.1.
> > I want to add a plugin that parses web pages and stores them in my
> > database, so I added a class that implements HtmlParseFilter, but I
> > found that HtmlParseFilter still gets called even when the page is
> > redirected to another page.
> > I thought ParseSegment.parse would be a better place, but why does
> > Nutch 1.1 call parse in both the Fetcher.output method and
> > ParseSegment.parse?
> >
> > Thanks in advance.
> >
> > Tiger
> > 2011-1-18
> >
>
