Anyone? Any idea on what could be going wrong? Is it possible to inject a custom fetch scheduler?
On Mon, May 28, 2012 at 5:25 PM, Vikas Hazrati <[email protected]> wrote:

> Thanks Markus, what I understand from the code is that I should be able to
> extract and pass meta information from my ParsePlugin and access that as a
> part of the custom fetch schedule which extends AbstractFetchSchedule.
>
> If I create a custom fetch class as
>
> class CustomEventFetchScheduler extends AbstractFetchSchedule { ...}
>
> how do i include this custom class a part of my crawl cycle? I understand
> that there is no extension point for this?
>
> I get this -> Caused by: java.lang.RuntimeException: Plugin
> (myaggregator), extension point: org.apache.nutch.crawl.FetchSchedule does
> not exist.
>
> Also I could not successfully plug it as a part of nutch-site.xml by
> overriding the nutch-default.xml
>
> <property>
>   <name>db.fetch.schedule.class</name>
>   <value>com.custom.CustomEventFetchScheduler</value>
> </property>
>
> How do I include my custom logic so that it gets picked as a part of the
> crawl cycle.
>
> Regards | Vikas
>
> On Mon, May 21, 2012 at 6:14 PM, Markus Jelsma <[email protected]> wrote:
>
>> Yes, you can pass ParseMeta keys to the FetchSchedule as part of the
>> CrawlDatum's meta data as i did with:
>> https://issues.apache.org/jira/browse/NUTCH-1024
>>
>> -----Original message-----
>> > From:Vikas Hazrati <[email protected]>
>> > Sent: Mon 21-May-2012 13:44
>> > To: [email protected]
>> > Subject: Setting the Fetch time with a CustomFetchSchedule
>> >
>> > Hi,
>> >
>> > I would like to implement a custom implementation of AbstractFetchSchedule
>> > and would like to change the FetchTime on the basis of some parameters that
>> > I get as a part of my parsing.
>> >
>> > // something like this
>> > datum.setFetchTime(fetchTime + (long)datum.getFetchInterval() * 1000 +
>> > customLogic);
>> >
>> > Right now I have a custom URLFilter and a custom parser which extends
>> > HtmlParseFilter. At the time of custom parsing, I come across some
>> > parameters which would help me define how should I define the fetchtime for
>> > that URL. I would like to pass these values to my CustomFetchSchedule.
>> >
>> > Is there a way to do that? Can I pass them as a part of configuration?
>> >
>> > Since I would get the data that i need to make a decision only as a part of
>> > Parse, would it be possible to pass this data to the FetchSchedule?
>> >
>> > Thoughts?
>> >
>> > Regards | Vikas
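
For reference, here is a minimal, self-contained sketch of the fetch-time arithmetic from the original question. It is not Nutch code: a plain Map stands in for the CrawlDatum metadata, and the class name FetchTimeSketch and the key "custom.extra.delay.ms" are made up for illustration. It assumes the fetch interval is in seconds and that the parse step stored an extra delay in milliseconds under that key.

```java
import java.util.HashMap;
import java.util.Map;

public class FetchTimeSketch {

    // Hypothetical metadata key; a real parse filter would choose its own name.
    public static final String EXTRA_DELAY_KEY = "custom.extra.delay.ms";

    /**
     * Mirrors the expression from the question:
     *   fetchTime + (long) fetchInterval * 1000 + customLogic
     * where customLogic is read from metadata written during parsing.
     * The Map is a stand-in for the datum's metadata, not the Nutch type.
     */
    public static long nextFetchTime(long fetchTime,
                                     int fetchIntervalSeconds,
                                     Map<String, String> meta) {
        long customLogic = 0L;
        String raw = meta.get(EXTRA_DELAY_KEY);
        if (raw != null) {
            try {
                customLogic = Long.parseLong(raw.trim());
            } catch (NumberFormatException e) {
                // Ignore unparsable values and fall back to the plain interval.
                customLogic = 0L;
            }
        }
        return fetchTime + (long) fetchIntervalSeconds * 1000L + customLogic;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put(EXTRA_DELAY_KEY, "60000"); // value the parse step might have stored
        System.out.println(nextFetchTime(1_000_000L, 3600, meta));
    }
}
```

In a real schedule the same computation would presumably live in the class that extends AbstractFetchSchedule, with the value read from the datum's metadata as Markus describes; how that class gets wired into the crawl cycle is the open question of this thread and is not answered by the sketch.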

