[ 
https://issues.apache.org/jira/browse/NIFI-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-1632:
------------------------------
    Description: 
If we use ExtractText and configure it with a regular expression that contains 
a Capturing Group that is "optional" (followed by a ?), and the regex matches 
some content where the capturing group is not present, then it will throw a 
NullPointerException.
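
The underlying java.util.regex behaviour can be reproduced outside NiFi. The 
following is a minimal sketch, not the ExtractText code: the class name, method 
name, and shortened pattern are made up for illustration. It shows that 
Matcher.group(n) returns null when an optional group did not take part in the 
match, so calling any method on that value would throw a NullPointerException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of NiFi.
public class OptionalGroupDemo {

    // Shortened, illustrative pattern: group 2 sits inside an optional
    // non-capturing group, like the ICMPType section in the report.
    private static final Pattern PATTERN =
            Pattern.compile("Protocol=(\\w+)(?: ICMPType=(\\d+))? IFName=(\\w+)");

    // Returns the text captured by group 2, or null if the optional
    // section was absent from the input.
    static String icmpType(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("input does not match");
        }
        return m.group(2);
    }

    public static void main(String[] args) {
        // Optional section absent: the overall match succeeds, group 2 is null.
        System.out.println(icmpType("Protocol=tcp IFName=eth0"));
        // Optional section present: group 2 holds the captured digits.
        System.out.println(icmpType("Protocol=icmp ICMPType=8 IFName=eth0"));
    }
}
```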



Conrad Crampton on the users mailing list reported:
Hi,
I don’t know if this is expected behaviour but I think I understand why this is 
happening now. I have a regexp in the ExtractText processors viz:
(?s:^.+: (\d\d?)(\w\w\w)(\d{4}) ([\d ]\d:\d\d:\d\d) Product=(.?) OriginIP=(.?) 
Origin=(.?) Action=(.?) 
SIP=(.?) Source=(.?) SPort=(\d+?) DIP=(.) Destination=(.?) DPort=(\d+?) 
Protocol=(.?)(?: ICMPType=(.?) ICMPCode=(.?))? IFName=(.?) IFDirection=(.?) 
Reason=(.?) Rule=(.?) PolicyName=(.?) Info=(.?) XlateSIP=(.?) 
XlateSPort=([\d]|-?) XlateDIP=(.?) XlateDPort=([\d]+|-?)(.*)$)
I think this part is the problem: (?: ICMPType=(.?) ICMPCode=(.?))? Because I 
have made a non-capturing group optional, for those log lines that don’t have 
this section the match returns null for the capture groups inside it, so the 
dynamic variable can’t set the index correctly. Obviously I haven’t gone too 
deep into the code, but if I put a RouteOnContent processor before this one to 
test for this string, and remove this part from the regexp (and have two 
ExtractText processors), then it works. It appears that all the NPEs were 
thrown for those lines that didn’t match the optional group.
Has this been observed before?
Thanks
Conrad
Looking at the code, this line looks like the culprit:
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L325
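
One way to guard against this, sketched here under assumptions rather than 
taken from the ExtractText source or from any committed fix, is to skip capture 
groups whose value is null before doing anything with them. The class name, 
method name, parameters, and the "baseKey.index" attribute naming below are 
hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the actual ExtractText implementation.
public class NullSafeCapture {

    // Collects capture groups of the first match into a map keyed as
    // "<baseKey>.<groupIndex>". Groups that did not participate in the
    // match (value == null) are skipped instead of being dereferenced.
    static Map<String, String> extract(String baseKey, Pattern pattern,
                                       String content, int maxLength) {
        Map<String, String> attributes = new LinkedHashMap<>();
        Matcher matcher = pattern.matcher(content);
        if (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                String value = matcher.group(i);
                if (value == null) {
                    continue; // optional group absent: nothing to record
                }
                if (value.length() > maxLength) {
                    value = value.substring(0, maxLength);
                }
                attributes.put(baseKey + "." + i, value);
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("a(b)?(c)");
        // Group 1 is absent for "ac", so only indexes 0 and 2 are recorded.
        System.out.println(extract("log", p, "ac", 10));
    }
}
```

Whether an absent group should be skipped or recorded as an empty string is a 
design choice this sketch does not settle.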


  was:If we use ExtractText and configure it with a regular expression that 
contains a Capturing Group that is "optional" (ends with a ?), and the regex 
matches some content where the capturing group is not present, then it will 
throw a NullPointerException


> ExtractText throws NullPointerException if Regular Expression has optional 
> Capturing Group that is not present
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-1632
>                 URL: https://issues.apache.org/jira/browse/NIFI-1632
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>             Fix For: 0.6.0
>
>         Attachments: 
> 0001-NIFI-1632-Generated-unit-test-to-prove-broken-behavi.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)