Hi Sebastian,

Alright. What about the performance penalty if we create a new instance of the filters and normalizers for each parse? Right now each thread has its own instances, and some filters are very costly to load if that happens too frequently.
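
To make the concern concrete, here is a minimal sketch of the two options. It is not Nutch code: ExpensiveFilter, filterPerParse and filterPerThread are hypothetical names used only for illustration, and the load counter stands in for whatever a real filter does at construction time (reading rule files, compiling patterns and so on).

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FilterCacheSketch {

    /** Hypothetical stand-in for a URL filter that is expensive to construct. */
    static final class ExpensiveFilter {
        /** Counts how many times a filter has been constructed ("loaded"). */
        static final AtomicInteger LOADS = new AtomicInteger();

        ExpensiveFilter() {
            // Assumption: a real filter would read rule files or compile patterns here.
            LOADS.incrementAndGet();
        }

        /** Returns the URL if it is accepted, or null if it is rejected. */
        String filter(String url) {
            return url.startsWith("http") ? url : null;
        }
    }

    /** One filter per thread, created lazily on the first call in that thread. */
    private static final ThreadLocal<ExpensiveFilter> PER_THREAD =
            ThreadLocal.withInitial(ExpensiveFilter::new);

    /** Option A: build a fresh filter for every parse (pays the load cost each time). */
    static String filterPerParse(String url) {
        return new ExpensiveFilter().filter(url);
    }

    /** Option B: reuse the calling thread's filter (pays the load cost once per thread). */
    static String filterPerThread(String url) {
        return PER_THREAD.get().filter(url);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            filterPerParse("http://example.org/" + i);
        }
        System.out.println("loads, new instance per parse: " + ExpensiveFilter.LOADS.get());

        ExpensiveFilter.LOADS.set(0);
        for (int i = 0; i < 5; i++) {
            filterPerThread("http://example.org/" + i);
        }
        System.out.println("loads, one instance per thread: " + ExpensiveFilter.LOADS.get());
    }
}
```

With option B the load cost grows with the number of threads rather than the number of parsed documents, which is roughly what we have today.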

Thanks,
Markus

-----Original message-----
> From: Sebastian Nagel <[email protected]>
> Sent: Tue 29-Jan-2013 22:22
> To: [email protected]
> Subject: Re: Outlinks in parse filter
>
> Hi Markus,
>
> this would mean that urlfilter and urlnormalizer plugins are accessed from
> parse plugins.
> At a first glance, sounds somewhat oddish. But it's already the case for the
> feed parser.
>
> We would have to do it for all parse plugins. Since there not so many that's
> no argument against.
>
> Supposed you can still switch it off via the parse.(filter|normalize).urls
> properties I see no
> serious reason why it can't be done.
>
> Sebastian
>
> On 01/29/2013 01:16 PM, Markus Jelsma wrote:
> > Hi,
> >
> > Outlinks that reach the parse filters via ParseData are not normalized or
> > filtered but i believe they should be. If you would try to do something
> > sensible with the outlinks in the parse filter you cannot rely on their
> > accuracy. Should we not move the calls to ParseOutputFormat.filterNormalize
> > to the parse plugin?
> >
> > Any thoughts?
> > Markus

