Hi Lewis:

Sorry for the delay. Sure, I'll open a ticket in a bit.

Regards
Andy

On 7 June 2012 21:28, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Andy,
>
> Even opening a ticket and getting it logged would be great.
>
> Thanks
> Lewis
>
> On Wed, Jun 6, 2012 at 3:53 AM, Andy Xue <[email protected]> wrote:
> > Hi Lewis:
> >
> > I'll try to find time to do it. Thanks for the reply.
> >
> > Regards
> > Andy
> >
> > On 31 May 2012 20:37, Lewis John Mcgibbney <[email protected]> wrote:
> >
> >> Hi Andy,
> >>
> >> This is a good catch and I would suggest you open an issue on the Jira
> >> and submit a patch for the few instances where this actually
> >> occurs... e.g. I think there are currently 4 such instances in
> >> nutch-default which concern the ordering of such tools. Admittedly
> >> though I haven't dug down into the code to see if it is consistent as
> >> you assume...
> >>
> >> If you begin by investigating (and patching if necessary) these parts
> >> then this would make a nice patch. As you are using trunk, I wouldn't
> >> imagine it would take you too long.
> >>
> >> Thanks very much
> >>
> >> Lewis
> >>
> >> On Thu, May 31, 2012 at 2:34 AM, Andy Xue <[email protected]> wrote:
> >> > Hi all:
> >> >
> >> > The following situation has come to my attention regarding
> >> > "*nutch-site.xml*" when I'm using nutch trunk:
> >> > When listing multiple scoring filters in the property
> >> > "*scoring.filter.order*", it is vital that no spaces/newlines/tabs
> >> > are placed in front of the first value.
> >> > E.g.:
> >> > This is fine:
> >> > <value>org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
> >> >
> >> > Either of these will generate an exception:
> >> > <value> org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
> >> > <value>
> >> > org.apache.nutch.scoring.opic.OPICScoringFilter
> >> > myFilter
> >> > </value>
> >> >
> >> > The reason is: In *org.apache.nutch.scoring.ScoringFilters*, a
> >> > statement (on line 59) "orderedFilters = order.split("\\s+");" tries
> >> > to split the aforementioned string. The leading whitespace produces
> >> > an empty string as the first array element, which results in a
> >> > ClassNotFound / NullPointer exception.
> >> >
> >> > It can be easily fixed of course, but what concerns me is that I
> >> > suspect other properties will have the same problem (i.e., the value
> >> > content must immediately follow the *<value>* tag). This is not
> >> > robust.
> >> >
> >> > Any thoughts?
> >> >
> >> > Regards
> >> > Andy
> >>
> >> --
> >> Lewis
>
> --
> Lewis
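
For the ticket, here is a minimal standalone sketch of the behaviour described in the original message. It only reproduces the quoted statement order.split("\\s+") on a value with leading whitespace and shows one possible fix (trimming before splitting). The class and method names are made up for illustration and are not taken from the Nutch source, and the trim-based fix is an assumption about what a patch could look like, not the actual change.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Nutch.
final class SplitLeadingWhitespace {

    // Same call as the statement quoted from ScoringFilters: split on
    // runs of whitespace without trimming the value first.
    static String[] splitRaw(String order) {
        return order.split("\\s+");
    }

    // One possible fix (an assumption, not the committed patch):
    // trim the value first so no empty leading element is produced.
    static String[] splitTrimmed(String order) {
        String trimmed = order.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Value as it would be read from a <value> element whose content
        // starts on a new, indented line.
        String value = "\n  org.apache.nutch.scoring.opic.OPICScoringFilter\n  myFilter\n";

        String[] raw = splitRaw(value);
        String[] fixed = splitTrimmed(value);

        // The raw split has an extra, empty first element.
        System.out.println(raw.length + " elements: " + Arrays.toString(raw));
        System.out.println(fixed.length + " elements: " + Arrays.toString(fixed));
    }
}
```

The empty first element is what would later be treated as a class name. Trailing whitespace is harmless here because String.split drops trailing empty strings, so only the leading case needs handling.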

