Hi all:

As I suspected, this problem affects more properties than the ones I
described in NUTCH-1385.
For instance, take the property "plugin.includes":

      <value>plugin_1|plugin_2</value>
This is fine, it will load both plugins.

      <value>plugin_1|plugin_2
      </value>
This is not fine since (I guess) the program will try to find a plugin
named "plugin_2\n" (maybe not precise, but you get the idea).
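
To make the guess above concrete, here is a minimal sketch. It assumes the
"plugin.includes" value is treated as a regular expression that must match
the whole plugin id; that is an assumption for illustration, not something
confirmed in this thread. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class PluginIncludesDemo {

    // Hypothetical helper (not Nutch code): true if the plugin id fully
    // matches the configured expression.
    static boolean included(String pluginIncludes, String pluginId) {
        return Pattern.compile(pluginIncludes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Value written on one line.
        String oneLine = "plugin_1|plugin_2";
        // Value whose closing tag was wrapped onto the next line, so the
        // line break and indentation are part of the text content.
        String wrapped = "plugin_1|plugin_2\n      ";

        System.out.println(included(oneLine, "plugin_2"));        // true
        System.out.println(included(wrapped, "plugin_1"));        // true
        System.out.println(included(wrapped, "plugin_2"));        // false
        System.out.println(included(wrapped.trim(), "plugin_2")); // true
    }
}
```

Under that assumption only the last alternative is affected, which would
explain why one plugin loads and the other silently does not.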

I spent hours debugging this before I finally found it. The cause is that
my editor automatically reformats a long line by splitting it into
multiple lines.

So the rule here is: no matter how long a property value is, keep it on a
single line. Otherwise the line break and indentation become part of the
value and something unexpected will happen.
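
For code that reads such values, a defensive alternative is to trim before
splitting. The sketch below is plain Java, not a patch against Nutch: the
class and helper names are hypothetical, and it only mirrors the
split("\\s+") call quoted from ScoringFilters further down this thread.

```java
import java.util.Arrays;

public class ValueSplitDemo {

    // Hypothetical helper: split a whitespace-separated property value,
    // ignoring leading and trailing whitespace. An empty or blank value
    // yields an empty array instead of a single empty string.
    static String[] splitOnWhitespace(String raw) {
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // A value spread over several lines, as an editor might wrap it.
        String raw = "\n  org.apache.nutch.scoring.opic.OPICScoringFilter\n  myFilter\n";

        // Splitting directly keeps an empty first element.
        String[] naive = raw.split("\\s+");
        System.out.println(naive.length);       // 3
        System.out.println(naive[0].isEmpty()); // true

        // Trimming first leaves only the two class names.
        String[] safe = splitOnWhitespace(raw);
        System.out.println(Arrays.toString(safe));
    }
}
```

This only helps for whitespace-separated lists; a value used as a single
string or as a pattern would need its own trim() at the point of use.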

At this point, I'm not sure whether I should submit another ticket because
I don't know exactly which properties are affected by this problem. Just a
heads up for all of you who might encounter the same problem in the future.

Regards
Andy


On 9 June 2012 11:42, Andy Xue <[email protected]> wrote:

> Hi Lewis:
>
> Sorry for the delay. Sure, I'll open a ticket in a bit.
>
> Regards
> Andy
>
>
>
> On 7 June 2012 21:28, Lewis John Mcgibbney <[email protected]> wrote:
>
>> Hi Andy,
>> Even opening a ticket and getting it logged would be great.
>> Thanks
>> Lewis
>>
>> On Wed, Jun 6, 2012 at 3:53 AM, Andy Xue <[email protected]> wrote:
>> > Hi Lewis:
>> >
>> > I'll try to find a time to do it. Thanks for the reply.
>> >
>> > Regards
>> > Andy
>> >
>> >
>> >
>> > On 31 May 2012 20:37, Lewis John Mcgibbney <[email protected]> wrote:
>> >
>> >> Hi Andy,
>> >>
>> >> This is a good catch and I would suggest you open an issue on the Jira
>> >> and submit a patch for the few instances of where this actually
>> >> occurs... e.g. I think there are currently 4 such instances in
>> >> nutch-default which concern the ordering of such tools. Admittedly
>> >> though I haven't dug down into the code to see if it is consistent as
>> >> you assume...
>> >>
>> >> If you begin by investigating (and patching if necessary) these parts
>> >> then this would make a nice patch. As you are using trunk, I wouldn't
>> >> imagine it would take you too long.
>> >>
>> >> Thanks very much
>> >>
>> >> Lewis
>> >>
>> >> On Thu, May 31, 2012 at 2:34 AM, Andy Xue <[email protected]> wrote:
>> >> > Hi all:
>> >> >
>> >> > The following situation has come to my attention regarding
>> >> > "*nutch-site.xml*" when I'm using nutch trunk:
>> >> > When listing multiple scoring filters in the property
>> >> > "*scoring.filter.order*", it is vital that no spaces/newlines/tabs
>> >> > are placed in front of the first value. E.g.:
>> >> > This is fine:
>> >> > <value>org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
>> >> >
>> >> > Either of these will generate an exception:
>> >> > <value> org.apache.nutch.scoring.opic.OPICScoringFilter myFilter</value>
>> >> > <value>
>> >> > org.apache.nutch.scoring.opic.OPICScoringFilter
>> >> > myFilter
>> >> > </value>
>> >> >
>> >> > The reason is: In *org.apache.nutch.scoring.ScoringFilters*, a
>> >> > statement (on line 59) "orderedFilters = order.split("\\s+");" tries
>> >> > to split the aforementioned string. The leading whitespace produces
>> >> > an empty string as the first array element, which results in a
>> >> > ClassNotFound / NullPointer exception.
>> >> >
>> >> >
>> >> > It can be easily fixed of course, but what concerns me is that I
>> >> > suspect other properties will have the same problem (i.e., the value
>> >> > content must immediately follow the *<value>* tag). This is not
>> >> > robust.
>> >> >
>> >> > Any thoughts?
>> >> >
>> >> > Regards
>> >> > Andy
>> >>
>> >>
>> >>
>> >> --
>> >> Lewis
>> >>
>>
>>
>>
>> --
>> Lewis
>>
>
>
