OK, the class gets called once I add it to the classpath.
Thanks
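
For anyone following along, here is a minimal sketch of the fetch-time arithmetic discussed further down in this thread. It deliberately uses plain Java stand-ins rather than Nutch classes, so it compiles on its own: `FetchTimeSketch`, the metadata key `custom.fetch.offset.ms`, and the `Map<String, String>` that plays the role of the CrawlDatum meta data are all hypothetical names chosen for illustration. It also assumes the fetch interval is in seconds and the fetch time in milliseconds, as the `* 1000` in the original expression suggests. In a real setup this logic would presumably live in the class that extends AbstractFetchSchedule.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a stand-in, not a Nutch class.
final class FetchTimeSketch {

    // Hypothetical metadata key that a custom parse filter would write.
    static final String OFFSET_KEY = "custom.fetch.offset.ms";

    // Mirrors the expression from the thread:
    // fetchTime + (long) fetchInterval * 1000 + customLogic
    static long nextFetchTime(long fetchTimeMs, int fetchIntervalSec,
                              Map<String, String> meta) {
        long base = fetchTimeMs + (long) fetchIntervalSec * 1000L;
        return base + customOffsetMs(meta);
    }

    // Reads the custom offset; falls back to 0 when the key is
    // missing or its value is not a valid number.
    static long customOffsetMs(Map<String, String> meta) {
        String raw = meta.get(OFFSET_KEY);
        if (raw == null) {
            return 0L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put(OFFSET_KEY, "3600000"); // push the next fetch out by one extra hour
        long fetchTimeMs = System.currentTimeMillis();
        System.out.println(nextFetchTime(fetchTimeMs, 86_400, meta));
    }
}
```

Falling back to an offset of 0 keeps the default behaviour for pages where the parser found nothing useful, which seemed the safest choice for a sketch.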

On Tue, May 29, 2012 at 4:28 PM, Vikas Hazrati <[email protected]> wrote:

> Thanks Markus, I will try with the classpath. I believe I did try that
>
> > <property>
> >   <name>db.fetch.schedule.class</name>
> >   <value>com.custom.CustomEventFetchScheduler</value>
> > </property>
>
> but I will give it another try and let the group know...
>
> On Tue, May 29, 2012 at 2:49 PM, Markus Jelsma <[email protected]> wrote:
>
>> -----Original message-----
>> > From:Vikas Hazrati <[email protected]>
>> > Sent: Mon 28-May-2012 13:55
>> > To: [email protected]
>> > Subject: Re: Setting the Fetch time with a CustomFetchSchedule
>> >
>> > Thanks Markus, what I understand from the code is that I should be able to
>> > extract and pass meta information from my ParsePlugin and access that as a
>> > part of the custom fetch schedule which extends AbstractFetchSchedule.
>> >
>> > If I create a custom fetch class as
>> >
>> > class CustomEventFetchScheduler extends AbstractFetchSchedule { ...}
>> >
>> > How do I include this custom class as part of my crawl cycle? I understand
>> > that there is no extension point for this?
>>
>> Indeed, there is no extension point so you cannot make a nice plugin.
>> What you can do is make sure it's on the classpath and simply tell the
>> scheduler to use it via db.fetch.schedule.class, that should work just fine.
>>
>> >
>> > I get this -> Caused by: java.lang.RuntimeException: Plugin
>> > (12kdaggregator), extension point: org.apache.nutch.crawl.FetchSchedule
>> > does not exist.
>> >
>> > Also I could not successfully plug it as a part of nutch-site.xml by
>> > overriding the nutch-default.xml
>> >
>> >
>> > <property>
>> >   <name>db.fetch.schedule.class</name>
>> >   <value>com.custom.CustomEventFetchScheduler</value>
>> > </property>
>> >
>> >
>> > How do I include my custom logic so that it gets picked up as a part of the
>> > crawl cycle?
>> >
>> > Regards | Vikas
>> >
>> > On Mon, May 21, 2012 at 6:14 PM, Markus Jelsma <[email protected]> wrote:
>> >
>> > > Yes, you can pass ParseMeta keys to the FetchSchedule as part of the
>> > > CrawlDatum's meta data, as I did with:
>> > > https://issues.apache.org/jira/browse/NUTCH-1024
>> > >
>> > >
>> > > -----Original message-----
>> > > > From:Vikas Hazrati <[email protected]>
>> > > > Sent: Mon 21-May-2012 13:44
>> > > > To: [email protected]
>> > > > Subject: Setting the Fetch time with a CustomFetchSchedule
>> > > >
>> > > > Hi,
>> > > >
>> > > > I would like to implement a custom implementation of AbstractFetchSchedule
>> > > > and would like to change the FetchTime on the basis of some parameters that
>> > > > I get as a part of my parsing.
>> > > >
>> > > > // something like this
>> > > > datum.setFetchTime(fetchTime + (long) datum.getFetchInterval() * 1000
>> > > >     + customLogic);
>> > > >
>> > > > Right now I have a custom URLFilter and a custom parser which extends
>> > > > HtmlParseFilter. At the time of custom parsing, I come across some
>> > > > parameters which would help me decide how to set the fetch time for
>> > > > that URL. I would like to pass these values to my CustomFetchSchedule.
>> > > >
>> > > > Is there a way to do that? Can I pass them as a part of configuration?
>> > > >
>> > > > Since I would get the data that I need to make a decision only as a part of
>> > > > Parse, would it be possible to pass this data to the FetchSchedule?
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > Regards | Vikas
>> > > >
>> > >
>> >
>>
>
>
