OK, the class gets called once I include it on the classpath. Thanks.
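
For the archive, here is a minimal sketch of the fetch-time arithmetic discussed in this thread. It does not use the Nutch API: `Datum` is a hypothetical stand-in for CrawlDatum, and the metadata key `custom.extra.delay.ms` is an invented name for whatever value the parse filter stores. In a real setup the same calculation would presumably live in a class extending AbstractFetchSchedule and read the value from the CrawlDatum's metadata.

```java
import java.util.HashMap;
import java.util.Map;

public class CustomFetchTimeSketch {

    /** Hypothetical stand-in for Nutch's CrawlDatum; holds only what this sketch needs. */
    static final class Datum {
        long fetchTimeMs;          // next fetch time, epoch milliseconds
        int fetchIntervalSec;      // regular fetch interval, seconds
        final Map<String, String> meta = new HashMap<>();
    }

    /** Hypothetical metadata key; assumed to be written during parsing. */
    static final String EXTRA_DELAY_KEY = "custom.extra.delay.ms";

    /** Reads the extra delay from the metadata; falls back to 0 if missing or malformed. */
    static long extraDelayMs(Datum datum) {
        String raw = datum.meta.get(EXTRA_DELAY_KEY);
        if (raw == null) {
            return 0L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    /** fetchTime + interval (converted to ms) + custom adjustment taken from the parse metadata. */
    static long nextFetchTime(Datum datum, long fetchTimeMs) {
        return fetchTimeMs + (long) datum.fetchIntervalSec * 1000L + extraDelayMs(datum);
    }

    public static void main(String[] args) {
        Datum datum = new Datum();
        datum.fetchIntervalSec = 3600;              // one hour
        datum.meta.put(EXTRA_DELAY_KEY, "60000");   // one extra minute, set at parse time
        datum.fetchTimeMs = nextFetchTime(datum, 1_000_000L);
        System.out.println(datum.fetchTimeMs);
    }
}
```

Falling back to 0 when the key is absent keeps URLs that were never parsed on the regular interval.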

On Tue, May 29, 2012 at 4:28 PM, Vikas Hazrati <[email protected]> wrote:

> Thanks Markus, would try with the classpath. I believe I did try that
>
> <property>
>   <name>db.fetch.schedule.class</name>
>   <value>com.custom.CustomEventFetchScheduler</value>
> </property>
>
> but would give it a try again and let the group know...
>
> On Tue, May 29, 2012 at 2:49 PM, Markus Jelsma <[email protected]> wrote:
>
>> -----Original message-----
>> > From:Vikas Hazrati <[email protected]>
>> > Sent: Mon 28-May-2012 13:55
>> > To: [email protected]
>> > Subject: Re: Setting the Fetch time with a CustomFetchSchedule
>> >
>> > Thanks Markus, what I understand from the code is that I should be able to
>> > extract and pass meta information from my ParsePlugin and access that as a
>> > part of the custom fetch schedule which extends AbstractFetchSchedule.
>> >
>> > If I create a custom fetch class as
>> >
>> > class CustomEventFetchScheduler extends AbstractFetchSchedule { ...}
>> >
>> > how do i include this custom class a part of my crawl cycle? I understand
>> > that there is no extension point for this?
>>
>> Indeed, there is no extension point so you cannot make a nice plugin.
>> What you can do is make sure it's on the classpath and simply tell the
>> scheduler to use it via db.fetch.schedule.class, that should work just fine.
>>
>> >
>> > I get this -> Caused by: java.lang.RuntimeException: Plugin
>> > (12kdaggregator), extension point: org.apache.nutch.crawl.FetchSchedule
>> > does not exist.
>> >
>> > Also I could not successfully plug it as a part of nutch-site.xml by
>> > overriding the nutch-default.xml
>> >
>> > <property>
>> >   <name>db.fetch.schedule.class</name>
>> >   <value>com.custom.CustomEventFetchScheduler</value>
>> > </property>
>> >
>> > How do I include my custom logic so that it gets picked as a part of the
>> > crawl cycle.
>> >
>> > Regards | Vikas
>> >
>> > On Mon, May 21, 2012 at 6:14 PM, Markus Jelsma <[email protected]> wrote:
>> >
>> > > Yes, you can pass ParseMeta keys to the FetchSchedule as part of the
>> > > CrawlDatum's meta data as i did with:
>> > > https://issues.apache.org/jira/browse/NUTCH-1024
>> > >
>> > > -----Original message-----
>> > > > From:Vikas Hazrati <[email protected]>
>> > > > Sent: Mon 21-May-2012 13:44
>> > > > To: [email protected]
>> > > > Subject: Setting the Fetch time with a CustomFetchSchedule
>> > > >
>> > > > Hi,
>> > > >
>> > > > I would like to implement a custom implementation of AbstractFetchSchedule
>> > > > and would like to change the FetchTime on the basis of some parameters that
>> > > > I get as a part of my parsing.
>> > > >
>> > > > // something like this
>> > > > datum.setFetchTime(fetchTime + (long)datum.getFetchInterval() * 1000 +
>> > > > customLogic);
>> > > >
>> > > > Right now I have a custom URLFilter and a custom parser which extends
>> > > > HtmlParseFilter. At the time of custom parsing, I come across some
>> > > > parameters which would help me define how should I define the fetchtime for
>> > > > that URL. I would like to pass these values to my CustomFetchSchedule.
>> > > >
>> > > > Is there a way to do that? Can I pass them as a part of configuration?
>> > > >
>> > > > Since I would get the data that i need to make a decision only as a part of
>> > > > Parse, would it be possible to pass this data to the FetchSchedule?
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > Regards | Vikas

