Thanks Markus, I will try putting it on the classpath. I believe I did already try this:

> <property>
>   <name>db.fetch.schedule.class</name>
>   <value>com.custom.CustomEventFetchScheduler</value>
> </property>

but I will give it another try and let the group know.
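
One quick way to rule out a classpath problem is to check whether the configured class name can be loaded at all. Below is a minimal sketch in plain Java. It assumes the class named in db.fetch.schedule.class is resolved by name through reflection, which is an assumption about the mechanism and not a copy of Nutch's code. ScheduleClassLookupSketch, instantiate and isOnClasspath are hypothetical names, and java.util.Properties stands in for the Nutch/Hadoop configuration object.

```java
import java.util.Properties;

// Illustrative sketch only: NOT Nutch code. All names here are hypothetical.
class ScheduleClassLookupSketch {

    // Property name taken from the thread.
    static final String KEY = "db.fetch.schedule.class";

    /** True if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Reads the class name from the configuration stand-in, falls back to
     * defaultClassName, and creates an instance via the no-arg constructor.
     */
    static Object instantiate(Properties conf, String defaultClassName) {
        String className = conf.getProperty(KEY, defaultClassName);
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // java.util.ArrayList is only a stand-in for a real schedule class.
        conf.setProperty(KEY, "java.util.ArrayList");
        System.out.println(instantiate(conf, "java.lang.Object").getClass().getName());
        System.out.println(isOnClasspath("com.custom.CustomEventFetchScheduler"));
    }
}
```

If isOnClasspath returns false for com.custom.CustomEventFetchScheduler when run with the same classpath as the crawl, the jar is probably not being picked up, and the property value itself is unlikely to be the problem.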

On Tue, May 29, 2012 at 2:49 PM, Markus Jelsma
<[email protected]> wrote:

> -----Original message-----
> > From:Vikas Hazrati <[email protected]>
> > Sent: Mon 28-May-2012 13:55
> > To: [email protected]
> > Subject: Re: Setting the Fetch time with a CustomFetchSchedule
> >
> > Thanks Markus, what I understand from the code is that I should be able to
> > extract and pass meta information from my ParsePlugin and access that as a
> > part of the custom fetch schedule which extends AbstractFetchSchedule.
> >
> > If I create a custom fetch class as
> >
> > class CustomEventFetchScheduler extends AbstractFetchSchedule { ...}
> >
> > How do I include this custom class as part of my crawl cycle? I understand
> > that there is no extension point for this?
>
> Indeed, there is no extension point so you cannot make a nice plugin. What
> you can do is make sure it's on the classpath and simply tell the scheduler
> to use it via db.fetch.schedule.class, that should work just fine.
>
> >
> > I get this -> Caused by: java.lang.RuntimeException: Plugin
> > (12kdaggregator), extension point: org.apache.nutch.crawl.FetchSchedule
> > does not exist.
> >
> > Also, I could not plug it in successfully through nutch-site.xml by
> > overriding the value from nutch-default.xml:
> >
> >
> > <property>
> >   <name>db.fetch.schedule.class</name>
> >   <value>com.custom.CustomEventFetchScheduler</value>
> > </property>
> >
> >
> > How do I include my custom logic so that it gets picked up as a part of
> > the crawl cycle?
> >
> > Regards | Vikas
> >
> > On Mon, May 21, 2012 at 6:14 PM, Markus Jelsma
> > <[email protected]> wrote:
> >
> > > Yes, you can pass ParseMeta keys to the FetchSchedule as part of the
> > > CrawlDatum's metadata, as I did in:
> > > https://issues.apache.org/jira/browse/NUTCH-1024
> > >
> > >
> > > -----Original message-----
> > > > From:Vikas Hazrati <[email protected]>
> > > > Sent: Mon 21-May-2012 13:44
> > > > To: [email protected]
> > > > Subject: Setting the Fetch time with a CustomFetchSchedule
> > > >
> > > > Hi,
> > > >
> > > > I would like to write a custom implementation of AbstractFetchSchedule
> > > > and change the FetchTime on the basis of some parameters that I get
> > > > as a part of my parsing.
> > > >
> > > > // something like this
> > > > datum.setFetchTime(fetchTime + (long)datum.getFetchInterval() * 1000 + customLogic);
> > > >
> > > > Right now I have a custom URLFilter and a custom parser which extends
> > > > HtmlParseFilter. At the time of custom parsing, I come across some
> > > > parameters which would help me decide how to set the fetch time for
> > > > that URL. I would like to pass these values to my CustomFetchSchedule.
> > > >
> > > > Is there a way to do that? Can I pass them as a part of configuration?
> > > >
> > > > Since I would get the data that I need to make a decision only as a
> > > > part of Parse, would it be possible to pass this data to the
> > > > FetchSchedule?
> > > >
> > > > Thoughts?
> > > >
> > > > Regards | Vikas
> > > >
> > >
> >
>
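
The fetch-time arithmetic from the original question at the bottom of this thread can be tried out in isolation. The following is a rough sketch in plain Java, assuming the value found at parse time arrives as a string entry in the datum's metadata. Datum is a hypothetical stand-in for the few CrawlDatum accessors used in the thread, and OFFSET_KEY, customOffsetMillis and scheduleNextFetch are made-up names. A real schedule would extend AbstractFetchSchedule and read the value from the actual CrawlDatum metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: these are stand-ins, NOT Nutch classes.
class FetchTimeSketch {

    /** Hypothetical stand-in for the CrawlDatum fields used in the thread. */
    static class Datum {
        private long fetchTime;                  // epoch milliseconds
        private final int fetchIntervalSeconds;  // refetch interval in seconds
        private final Map<String, String> meta = new HashMap<>();

        Datum(int fetchIntervalSeconds) { this.fetchIntervalSeconds = fetchIntervalSeconds; }
        int getFetchInterval() { return fetchIntervalSeconds; }
        long getFetchTime() { return fetchTime; }
        void setFetchTime(long fetchTime) { this.fetchTime = fetchTime; }
        Map<String, String> getMeta() { return meta; }
    }

    // Hypothetical metadata key that a parse filter might have written.
    static final String OFFSET_KEY = "custom.refetch.offset.ms";

    /** Reads the parse-time value; a missing or malformed entry yields 0. */
    static long customOffsetMillis(Datum datum) {
        String raw = datum.getMeta().get(OFFSET_KEY);
        if (raw == null) {
            return 0L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    /** Same formula as in the question: fetchTime + interval * 1000 + custom value. */
    static void scheduleNextFetch(Datum datum, long fetchTime) {
        long intervalMillis = (long) datum.getFetchInterval() * 1000L;
        datum.setFetchTime(fetchTime + intervalMillis + customOffsetMillis(datum));
    }

    public static void main(String[] args) {
        Datum datum = new Datum(60);             // 60-second interval
        datum.getMeta().put(OFFSET_KEY, "500");  // value "found" during parsing
        scheduleNextFetch(datum, 1000L);
        System.out.println(datum.getFetchTime());
    }
}
```

Falling back to an offset of 0 for a missing or malformed entry keeps the default interval behaviour for pages where the parser found nothing useful.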
