[ 
https://issues.apache.org/jira/browse/NIFI-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292703#comment-16292703
 ] 

ASF GitHub Bot commented on NIFI-2169:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2343#discussion_r157230846
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/RouteText.java ---
    @@ -209,6 +215,30 @@
         private volatile Map<Relationship, PropertyValue> propertyMap = new HashMap<>();
         private volatile Pattern groupingRegex = null;
     
    +    @VisibleForTesting
    +    final static int PATTERNS_CACHE_MAXIMUM_ENTRIES = 10;
    --- End diff --
    
    We could probably cache more than 10 here. I think the idea on the PR was
    simply to convey that we need a reasonable upper bound, rather than
    allowing the cache to grow indefinitely. Personally, I would lean more
    toward 100, or even 1024 or so. A compiled Pattern is fairly small in terms
    of heap utilization, I believe, so I wouldn't be concerned with such a limit.
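
    As a rough illustration of the bounded cache being discussed, here is a
    minimal sketch of a size-limited, least-recently-used cache of compiled
    patterns. It is not the code from the pull request: the class name
    BoundedPatternCache and its methods are hypothetical, and the eviction
    policy (LRU via LinkedHashMap) is an assumption chosen for brevity.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi: caches compiled patterns up to a fixed limit.
public class BoundedPatternCache {

    private final Map<String, Pattern> cache;

    public BoundedPatternCache(final int maxEntries) {
        // accessOrder=true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the entry that was used longest ago.
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(final Map.Entry<String, Pattern> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    // Returns the cached Pattern for this regex, compiling it only on a cache miss.
    public Pattern get(final String regex) {
        return cache.computeIfAbsent(regex, Pattern::compile);
    }

    public int size() {
        return cache.size();
    }
}
```

    With a structure like this, raising the limit from 10 to 100 or 1024 is a
    one-argument change, and the map can never hold more entries than the
    limit passed to the constructor.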


> Improve RouteText performance with pre-compilation of RegEx in certain cases
> ----------------------------------------------------------------------------
>
>                 Key: NIFI-2169
>                 URL: https://issues.apache.org/jira/browse/NIFI-2169
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 0.6.1
>            Reporter: Stephane Maarek
>            Assignee: Oleg Zhurakousky
>              Labels: beginner, easy
>
> When using RegEx matches for the RouteText processor (and possibly other
> processors), the RegEx gets recompiled every time the processor runs. The
> RegEx could be precompiled / cached under certain conditions, in order to
> improve the performance of the processor.
> See email from Mark Payne:
> Re #2: The regular expression is compiled every time. This is done, though,
> because the Regex allows the Expression Language to be used, so the Regex
> could actually be different for each FlowFile. That being said, it could
> certainly be improved by either (a) pre-compiling in the case that no
> Expression Language is used and/or (b) caching up to, say, 10 regexes once
> they are compiled.
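
Option (a) above could look roughly like the following sketch. It is an
illustration only, not NiFi code: the class PrecompiledOrDynamicRegex is
hypothetical, the check for "${" is a simplified stand-in for detecting
Expression Language, and the evaluator argument stands in for whatever
resolves the expression against a FlowFile.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

// Hypothetical sketch: compile once when the regex is static, per use when it is dynamic.
public class PrecompiledOrDynamicRegex {

    private final String rawExpression;
    private final Pattern precompiled; // null when the expression must be evaluated per use

    public PrecompiledOrDynamicRegex(final String rawExpression) {
        this.rawExpression = rawExpression;
        // Assumption: the presence of "${" means the value depends on Expression Language.
        this.precompiled = rawExpression.contains("${") ? null : Pattern.compile(rawExpression);
    }

    public boolean isPrecompiled() {
        return precompiled != null;
    }

    // The evaluator resolves the raw expression into a concrete regex string.
    // It is only invoked when the expression is dynamic.
    public Pattern patternFor(final UnaryOperator<String> evaluator) {
        if (precompiled != null) {
            return precompiled;
        }
        return Pattern.compile(evaluator.apply(rawExpression));
    }
}
```

The dynamic branch still compiles on every call, which is where option (b),
a bounded cache keyed by the evaluated regex string, would apply.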



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
