I believe the Plugin system caches plugins, but you will need to confirm (haven’t looked in a long time).
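
Whichever way the caching question turns out, the earlier suggestion quoted below (read a path from the config, keep the reader in a static field) can be sketched so that it does not depend on how many plugin instances exist. This is only a rough illustration, not the Nutch API: `UrlFilterSketch`, `SeenUrlReader`, the `Map`-based config and the `dedup.store.path` property are all hypothetical stand-ins for `URLFilter`, `LinkDbReader`, the Nutch configuration object and a real property name. The stand-in interface follows the usual filter convention of returning the URL to keep it and `null` to drop it.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for the filter contract: return the URL to keep it, null to drop it.
interface UrlFilterSketch {
    String filter(String url);
}

// Hypothetical stand-in for an expensive reader such as LinkDbReader.
final class SeenUrlReader {
    static int opened = 0; // counts how many times the store was opened
    private final Set<String> seen = new HashSet<>();

    SeenUrlReader(String path) {
        opened++;
        // A real reader would open the data stored under `path`;
        // this sketch just fakes a single known URL.
        seen.add("http://example.com/a");
    }

    boolean contains(String url) {
        return seen.contains(url);
    }
}

final class DedupUrlFilter implements UrlFilterSketch {
    // Static, so it is shared by every filter instance. It therefore does not
    // matter whether the plugin system caches the plugin or builds a new one.
    private static SeenUrlReader reader;

    private final Map<String, String> conf;

    DedupUrlFilter(Map<String, String> conf) {
        this.conf = conf;
    }

    // Opens the reader on first use only; synchronized for multi-threaded use.
    private static synchronized SeenUrlReader reader(Map<String, String> conf) {
        if (reader == null) {
            // "dedup.store.path" is a made-up property name for this sketch.
            reader = new SeenUrlReader(conf.getOrDefault("dedup.store.path", "crawl/linkdb"));
        }
        return reader;
    }

    @Override
    public String filter(String url) {
        // Drop URLs the reader already knows about, keep everything else.
        return reader(conf).contains(url) ? null : url;
    }
}
```

With this shape, two separately constructed `DedupUrlFilter` objects share one `SeenUrlReader`, so the cost of opening the store is paid once per JVM rather than once per plugin instance.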

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Renxia Wang <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Sunday, February 22, 2015 at 6:37 PM
To: "[email protected]" <[email protected]>
Subject: Re: How to read metadata/content of an URL in URLFilter?

>Is there only one instance of a plugin for all fetch cycles? I am
>assuming that when the job is started, a plugin instance is initialized
>and used in every fetch cycle. Is that correct?
>
>On Sunday, February 22, 2015, Mattmann, Chris A (3980)
><[email protected]> wrote:
>
>In the constructor of your URLFilter, why not consider passing
>in a NutchConfiguration object, and then reading the path to, e.g.,
>the LinkDb from the config? Then have a private member variable
>for the LinkDbReader (maybe statically initialized for efficiency)
>and use that in your interface method.
>
>Cheers,
>Chris
>
>-----Original Message-----
>From: Renxia Wang <[email protected]>
>Reply-To: "[email protected]" <[email protected]>
>Date: Sunday, February 22, 2015 at 3:36 PM
>To: "[email protected]" <[email protected]>
>Subject: How to read metadata/content of an URL in URLFilter?
>
>>Hi,
>>
>>I want to develop a URLFilter which takes a URL, takes its metadata or
>>even the fetched content, then uses some duplicate detection algorithm to
>>determine if it is a duplicate of any URL in the batch. However, the only
>>parameter passed into the URLFilter is the URL. Is it possible to get the
>>data I want for that input URL in the URLFilter?
>>
>>Thanks,
>>
>>Zhique

