Hi,

Which setting should I modify so that normalization runs before filtering? Should I swap the order in the "plugin.includes" property?
Regards

On 7 June 2012 21:24, Lewis John Mcgibbney <[email protected]> wrote:

> Hi,
>
> On Wed, Jun 6, 2012 at 10:16 AM, Markus Jelsma
> <[email protected]> wrote:
> >
> > Better would be to run the normalizers first because that will solve the
> > problem. The default normalizers add a trailing slash to hosts when it's
> > missing, that means .au/ is not a suffix anymore and is not going to be
> > filtered out.
> >
>
> Please check out NUTCH-1373 which 'may' begin to help solve the
> problem. I've only scanned over this one so apologies if this is out
> of context... I'm trying to get caught up with traffic!!!
>
> hth
>
> Lewis
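
To illustrate the ordering effect Markus describes, here is a minimal sketch in Python. It is not Nutch code: the functions normalize and passes_suffix_filter, the blocked suffix list, and the URL http://example.com.au are all hypothetical stand-ins for a normalizer that adds a missing trailing slash and a filter that rejects URLs by suffix.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Hypothetical normalizer: add a trailing slash when the URL has no path."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


def passes_suffix_filter(url, blocked_suffixes=(".au",)):
    """Hypothetical suffix filter: reject URLs that end in a blocked suffix."""
    return not url.endswith(tuple(blocked_suffixes))


def filter_then_normalize(url):
    # Filtering sees the raw URL, which still ends in ".au", so it is dropped.
    return normalize(url) if passes_suffix_filter(url) else None


def normalize_then_filter(url):
    # Normalizing first appends "/", so ".au" is no longer the suffix.
    normalized = normalize(url)
    return normalized if passes_suffix_filter(normalized) else None


if __name__ == "__main__":
    url = "http://example.com.au"  # hypothetical host without a trailing slash
    print("filter first:   ", filter_then_normalize(url))
    print("normalize first:", normalize_then_filter(url))
```

With the filter running first the URL is rejected; with the normalizer running first it survives, which is the behaviour described in the quoted reply. How the order is actually controlled in Nutch is a separate question that this sketch does not answer.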

