Hi Lewis:

I'll try to find time to do it. Thanks for the reply.

Regards
Andy

On 31 May 2012 20:37, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Andy,
>
> This is a good catch and I would suggest you open an issue on the Jira
> and submit a patch for the few instances of where this actually
> occurs... e.g. I think there are currently 4 such instances in
> nutch-default which concern the ordering of such tools. Admittedly
> though I haven't dug down into the code to see if it is consistent as
> you assume...
>
> If you begin by investigating (and patching if necessary) these parts
> then this would make a nice patch. As you are using trunk, I wouldn't
> imagine it would take you too long.
>
> Thanks very much
>
> Lewis
>
> On Thu, May 31, 2012 at 2:34 AM, Andy Xue <[email protected]> wrote:
> > Hi all:
> >
> > The following situation has come to my attention regarding
> > "*nutch-site.xml*" when I'm using nutch trunk:
> > When listing multiple scoring filters in the property
> > "*scoring.filter.order*", it is vital that no spaces/newlines/tabs
> > are placed in front of the first value. E.g.:
> > This is fine:
> > <value>org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
> >
> > Either of these will generate an exception:
> > <value> org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
> > <value>
> > org.apache.nutch.scoring.opic.OPICScoringFilter
> > myFilter
> > </value>
> >
> > The reason is: In *org.apache.nutch.scoring.ScoringFilters*, a statement
> > (on line 59) "orderedFilters = order.split("\\s+");" tries to split the
> > aforementioned string. The leading spaces will cause an empty separate
> > array element as the first element, hence result in a ClassNotFound /
> > NullPointer exception.
> >
> > It can be easily fixed of course, but what concerns me is that I suspect
> > that other properties will have the same problem (i.e., must have
> > the value content immediately follow the *<value>* tag). This is not
> > considered robust.
> >
> > Any thoughts?
> >
> > Regards
> > Andy
>
> --
> Lewis
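
For reference, here is a minimal standalone sketch of the splitting behaviour described in the quoted message. It does not use any Nutch code: the class name ScoringFilterOrderDemo and both helper methods are hypothetical, and trimming before the split is only one possible fix, not necessarily what a patch would do.

```java
import java.util.Arrays;

// Hypothetical demo class; it only imitates the split statement quoted above.
public class ScoringFilterOrderDemo {

    // Same call as the quoted statement: split on runs of whitespace.
    // Leading whitespace yields an empty string as the first element.
    public static String[] naiveSplit(String order) {
        return order.split("\\s+");
    }

    // Assumed fix: trim the value first, and treat a blank value as "no filters".
    public static String[] safeSplit(String order) {
        String trimmed = order.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Value as it would read when written on separate lines inside <value>.
        String order = "\n  org.apache.nutch.scoring.opic.OPICScoringFilter\n  myFilter\n";

        // First element is "", which cannot be resolved to a class name.
        System.out.println(Arrays.toString(naiveSplit(order)));

        // Only the two filter names remain.
        System.out.println(Arrays.toString(safeSplit(order)));
    }
}
```

With the multi-line value, naiveSplit returns three elements (an empty string followed by the two names), while safeSplit returns just the two names. Trailing whitespace is harmless in both cases because String.split drops trailing empty strings.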

