[
https://issues.apache.org/jira/browse/NIFI-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph Witt updated NIFI-1632:
------------------------------
Description:
If ExtractText is configured with a regular expression that contains a
Capturing Group that is "optional" (followed by a ?), and the regex matches
some content where the capturing group is not present, then the processor
throws a NullPointerException.
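The underlying Java behaviour can be reproduced outside NiFi. The sketch below
is an illustration only, not ExtractText code: the class name
OptionalGroupDemo and the pattern a(b)?c are made up for the example. It shows
that Matcher.group(int) returns null for a capturing group that took no part in
an otherwise successful match, so calling a method on that result throws a
NullPointerException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroupDemo {
    // Hypothetical pattern for illustration: capturing group 1 is optional.
    static final Pattern PATTERN = Pattern.compile("a(b)?c");

    // Returns the text captured by group 1, or null when the overall match
    // succeeded but group 1 did not participate in it.
    public static String groupOne(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("no match: " + input);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(groupOne("abc"));   // prints "b"

        String missing = groupOne("ac");       // the regex matches, group 1 is absent
        System.out.println(missing == null);   // prints "true"

        try {
            missing.length();                  // dereferencing the null group value
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```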
Conrad Crampton on the users mailing list reported:
Hi,
I don’t know if this is expected behaviour but I think I understand why this is
happening now. I have a regexp in the ExtractText processors viz:
(?s:^.+: (\d\d?)(\w\w\w)(\d{4}) ([\d ]\d:\d\d:\d\d) Product=(.?) OriginIP=(.?)
Origin=(.?) Action=(.?)
SIP=(.?) Source=(.?) SPort=(\d+?) DIP=(.) Destination=(.?) DPort=(\d+?)
Protocol=(.?)(?: ICMPType=(.?) ICMPCode=(.?))? IFName=(.?) IFDirection=(.?)
Reason=(.?) Rule=(.?) PolicyName=(.?) Info=(.?) XlateSIP=(.?)
XlateSPort=([\d]|-?) XlateDIP=(.?) XlateDPort=([\d]+|-?)(.*)$)
I think this part is the problem: (?: ICMPType=(.?) ICMPCode=(.?))?. Because I
have made a non-capturing group optional, for those log lines that don’t have
this section the dynamic variable can’t be set correctly, as the match returns
null for the capture groups inside it. Obviously I haven’t gone too deep into
the code, but if I put a RouteOnContent processor before this one to test for
this string, remove this part from the regexp, and use two ExtractText
processors, then it works. It appeared that all the NPEs were thrown for those
lines that didn’t match the optional group.
Has this been observed before?
Thanks
Conrad
In looking at the code, this line looks like the offending one:
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L325
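One way such a line could be guarded is a null check on each group value before
it is used. The following is a minimal sketch under assumptions, not the actual
ExtractText source or its fix: the class NullSafeGroups, the method extract, the
"<baseKey>.<index>" key format, the maxLength parameter, and the simplified
pattern are all hypothetical and chosen only to mirror the shape of the report.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NullSafeGroups {
    // Collects capture groups as "<baseKey>.<index>" -> value.
    // Groups that did not participate in the match (null) are skipped.
    public static Map<String, String> extract(Pattern pattern, String input,
                                              String baseKey, int maxLength) {
        Map<String, String> results = new LinkedHashMap<>();
        Matcher matcher = pattern.matcher(input);
        if (!matcher.find()) {
            return results;
        }
        for (int i = 0; i <= matcher.groupCount(); i++) {
            String value = matcher.group(i);
            if (value == null) {
                continue; // optional group absent: nothing to record
            }
            if (value.length() > maxLength) {
                value = value.substring(0, maxLength);
            }
            results.put(baseKey + "." + i, value);
        }
        return results;
    }

    public static void main(String[] args) {
        // Simplified, hypothetical pattern with an optional non-capturing
        // group that wraps two capturing groups.
        Pattern p = Pattern.compile(
            "Protocol=(\\w+)(?: ICMPType=(\\d+) ICMPCode=(\\d+))? IFName=(\\w+)");
        System.out.println(extract(p, "Protocol=tcp IFName=eth0", "log", 1024));
        System.out.println(
            extract(p, "Protocol=icmp ICMPType=8 ICMPCode=0 IFName=eth0", "log", 1024));
    }
}
```

Skipping absent groups leaves gaps in the index sequence (here log.2 and log.3
are missing for the first input). Storing an empty string instead would keep
the indexes contiguous; which is preferable depends on how downstream
processors consume the attributes.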
was: If we use ExtractText and configure it with a regular expression that
contains a Capturing Group that is "optional" (ends with a ?), and the regex
matches some content where the capturing group is not present, then it will
throw a NullPointerException
> ExtractText throws NullPointerException if Regular Expression has optional
> Capturing Group that is not present
> --------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-1632
> URL: https://issues.apache.org/jira/browse/NIFI-1632
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Mark Payne
> Fix For: 0.6.0
>
> Attachments:
> 0001-NIFI-1632-Generated-unit-test-to-prove-broken-behavi.patch
>
>