Anyone? Any idea what could be going wrong? Is it possible to inject a
custom fetch schedule?
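
To make the question concrete, the calculation from the original message
quoted below (fetchTime + interval * 1000 + customLogic) can be sketched
roughly as follows. This is only an illustration under stated assumptions:
DatumSketch, FetchTimeSketch and the key "custom.fetch.offset.ms" are
hypothetical stand-ins so the snippet runs without Nutch on the classpath.
They are not Nutch classes, and the real logic would live in the class that
extends AbstractFetchSchedule.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a crawl record; NOT the Nutch CrawlDatum class.
class DatumSketch {
    long fetchTimeMs;        // next fetch time, epoch milliseconds
    int fetchIntervalSec;    // re-fetch interval, seconds
    final Map<String, String> meta = new HashMap<>(); // values set during parsing
}

class FetchTimeSketch {
    // Hypothetical metadata key that a parse filter would have written.
    static final String OFFSET_KEY = "custom.fetch.offset.ms";

    // Reads the parse-supplied offset; falls back to 0 when absent or malformed.
    static long customOffsetMs(DatumSketch datum) {
        String raw = datum.meta.get(OFFSET_KEY);
        if (raw == null) {
            return 0L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    // Mirrors: fetchTime + (long) interval * 1000 + customLogic
    static void reschedule(DatumSketch datum, long fetchTimeMs) {
        datum.fetchTimeMs = fetchTimeMs
                + (long) datum.fetchIntervalSec * 1000L
                + customOffsetMs(datum);
    }
}
```

The offset travels with the record as metadata rather than through the
configuration, since it differs per URL and is only known after parsing.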

On Mon, May 28, 2012 at 5:25 PM, Vikas Hazrati <[email protected]> wrote:

> Thanks Markus. From what I understand of the code, I should be able to
> extract and pass meta information from my ParsePlugin and access it in
> the custom fetch schedule, which extends AbstractFetchSchedule.
>
> If I create a custom fetch schedule class such as
>
> class CustomEventFetchScheduler extends AbstractFetchSchedule { ...}
>
> how do I include it as part of my crawl cycle? As I understand it, there
> is no extension point for this.
>
> I get this -> Caused by: java.lang.RuntimeException: Plugin
> (myaggregator), extension point: org.apache.nutch.crawl.FetchSchedule does
> not exist.
>
> I also could not plug it in through nutch-site.xml by overriding the
> nutch-default.xml setting:
>
>
> <property>
>   <name>db.fetch.schedule.class</name>
>   <value>com.custom.CustomEventFetchScheduler</value>
> </property>
>
>
> How do I include my custom logic so that it gets picked up as part of the
> crawl cycle?
>
> Regards | Vikas
>
> On Mon, May 21, 2012 at 6:14 PM, Markus Jelsma <[email protected]> wrote:
>
>> Yes, you can pass ParseMeta keys to the FetchSchedule as part of the
>> CrawlDatum's metadata, as I did in:
>> https://issues.apache.org/jira/browse/NUTCH-1024
>>
>>
>> -----Original message-----
>> > From: Vikas Hazrati <[email protected]>
>> > Sent: Mon 21-May-2012 13:44
>> > To: [email protected]
>> > Subject: Setting the Fetch time with a CustomFetchSchedule
>> >
>> > Hi,
>> >
>> > I would like to write a custom implementation of AbstractFetchSchedule
>> > and change the FetchTime based on some parameters that I get as part
>> > of my parsing.
>> >
>> > // something like this
>> > datum.setFetchTime(fetchTime + (long)datum.getFetchInterval() * 1000 +
>> > customLogic);
>> >
>> > Right now I have a custom URLFilter and a custom parser which implements
>> > HtmlParseFilter. During custom parsing, I come across some parameters
>> > that would help me decide how to set the fetch time for that URL. I
>> > would like to pass these values to my CustomFetchSchedule.
>> >
>> > Is there a way to do that? Can I pass them as a part of configuration?
>> >
>> > Since I would get the data that I need to make a decision only as part
>> > of Parse, would it be possible to pass this data to the FetchSchedule?
>> >
>> > Thoughts?
>> >
>> > Regards | Vikas
>> >
>>
>
>
