[ https://issues.apache.org/jira/browse/NIFI-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060041#comment-16060041 ]

ASF subversion and git services commented on NIFI-4095:
-------------------------------------------------------

Commit 253ea2e73bd271e82dcfd6c706f679ddad014101 in nifi's branch 
refs/heads/master from [~alopresto]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=253ea2e ]

NIFI-4095 Changed minimum capture group count in ExtractText from 1 to 0.
Added unit test and removed obsolete test.
Added custom validation to enforce capture group if "include capture group 0" 
is false.
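
The validation rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name CaptureGroupCheck, its method names, and the MAX_GROUPS constant (taken from the "between 1 and 40" wording in the issue) are assumptions made for the example. Only java.util.regex is used.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual ExtractText validator.
class CaptureGroupCheck {
    // Upper bound assumed from the issue text ("between 1 and 40 capture groups").
    static final int MAX_GROUPS = 40;

    // Number of explicit capture groups in the expression (group 0 is not counted).
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // An expression with zero groups is acceptable only when capture group 0
    // (the whole match) is included; otherwise there would be nothing to extract.
    static boolean isValid(String regex, boolean includeGroupZero) {
        int groups = countGroups(regex);
        int minimum = includeGroupZero ? 0 : 1;
        return groups >= minimum && groups <= MAX_GROUPS;
    }
}
```

Under these assumptions, isValid("\\d+", true) accepts the bare expression, while isValid("\\d+", false) rejects it because no group would be left to extract.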


> ExtractText should not require a capture group in every regular expression
> --------------------------------------------------------------------------
>
>                 Key: NIFI-4095
>                 URL: https://issues.apache.org/jira/browse/NIFI-4095
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.3.0
>            Reporter: Andy LoPresto
>            Assignee: Andy LoPresto
>              Labels: extracttext, regular_expression, validation
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The {{ExtractText}} processor currently validates every regular expression 
> and requires that it contain "between 1 and 40 capture groups". This seems to 
> be a design decision, as the values are hardcoded into the 
> [validator|https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L262-L262],
>  but there are valid regular expressions that do not need an explicit capture 
> group (especially when the expression is small and the full expression is the 
> desired match). Requiring a group forces users to wrap the entire expression 
> in one, which produces unnecessary duplicate matches ("some_attr" and 
> "some_attr.1" being identical). 
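
The duplication described in the issue can be reproduced with plain java.util.regex. The sketch below is hypothetical: DuplicateGroupDemo and its extract method are names invented for the example, and the attribute naming only loosely mimics the "some_attr" / "some_attr.1" pattern quoted above. It is not ExtractText's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of NiFi.
class DuplicateGroupDemo {
    // For the first match, stores group 0 under attrName and each
    // explicit group N under attrName.N.
    static Map<String, String> extract(String attrName, String regex, String input) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            attrs.put(attrName, m.group(0));
            for (int i = 1; i <= m.groupCount(); i++) {
                attrs.put(attrName + "." + i, m.group(i));
            }
        }
        return attrs;
    }
}
```

With the wrapped expression "(\\d+)" and the input "id 42", this sketch yields two entries holding the same value, whereas the bare expression "\\d+" yields a single entry for the whole match.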



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
