[jira] [Commented] (NIFI-4789) Enhance ExtractGrok processor to handle multiple grok expressions

ASF GitHub Bot (JIRA) Mon, 29 Jan 2018 13:09:24 -0800

    [ https://issues.apache.org/jira/browse/NIFI-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344031#comment-16344031 ]


ASF GitHub Bot commented on NIFI-4789:
--------------------------------------

Github user charlesporter commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2411#discussion_r164564264
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -181,15 +243,28 @@ public void onStopped() {
     
         @OnScheduled
         public void onScheduled(final ProcessContext context) throws GrokException {
    +        grokList.clear();
             for (int i = 0; i < context.getMaxConcurrentTasks(); i++) {
                 final int maxBufferSize = context.getProperty(MAX_BUFFER_SIZE).asDataSize(DataUnit.B).intValue();
                 final byte[] buffer = new byte[maxBufferSize];
                 bufferQueue.add(buffer);
             }
     
    -        grok = new Grok();
    -        grok.addPatternFromFile(context.getProperty(GROK_PATTERN_FILE).getValue());
    -        grok.compile(context.getProperty(GROK_EXPRESSION).getValue(), context.getProperty(NAMED_CAPTURES_ONLY).asBoolean());
    +        resultPrefix = context.getProperty(RESULT_PREFIX).getValue();
    +        breakOnFirstMatch = context.getProperty(BREAK_ON_FIRST_MATCH).asBoolean() ;
    +        matchedExpressionAttribute = context.getProperty(MATCHED_EXP_ATTR).getValue();
    +        expressionSeparator = context.getProperty(EXPRESSION_SEPARATOR).getValue();
    +
    +        String patterns  = context.getProperty(GROK_EXPRESSION).getValue();
    +        for (String patternName : patterns.split(expressionSeparator)) {
    +            Grok grok = new Grok();
    +            final String patternFileListString = context.getProperty(GROK_PATTERN_FILE).getValue();
    +            for (String patternFile : patternFileListString.split(PATTERN_FILE_LIST_SEPARATOR)) {
    +                grok.addPatternFromFile(patternFile);
    --- End diff --
    
    hmmm... ok. My feeling would be that if they want to put spaces in their list, they should include them in the separator, but your way is more tolerant. Will fix.
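    The "more tolerant" behaviour discussed here could look roughly like the sketch below: split the list on the separator, trim whitespace from each entry, and skip empty entries. This is only an illustration, not the code from the pull request. The class `SeparatorSplit` and the method `splitTolerant` are hypothetical names. Treating the separator as literal text via `Pattern.quote` is also an assumption, made because `String.split` interprets its argument as a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of ExtractGrok or the pull request.
final class SeparatorSplit {

    private SeparatorSplit() {
    }

    // Splits a list on a literal separator, trims each entry, and drops empty entries.
    static List<String> splitTolerant(final String list, final String separator) {
        final List<String> result = new ArrayList<>();
        // Assumption: the separator is literal text, so quote it before handing it to split().
        for (final String piece : list.split(Pattern.quote(separator))) {
            final String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```

    With this approach, a value such as "a.txt , b.txt" with a "," separator yields the two entries "a.txt" and "b.txt", so users do not have to fold the spaces into the separator themselves.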



> Enhance ExtractGrok processor to handle multiple grok expressions
> -----------------------------------------------------------------
>
>                 Key: NIFI-4789
>                 URL: https://issues.apache.org/jira/browse/NIFI-4789
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework
>    Affects Versions: 1.2.0, 1.5.0
>         Environment: all
>            Reporter: Charles Porter
>            Priority: Minor
>              Labels: features
>
> Many flows require running several grok expressions against an input to 
> correctly tag and extract data. Using many separate grok processors to 
> accomplish this is unwieldy and hard to maintain. Supporting multiple grok 
> expressions, delimited by a comma or a user-selected delimiter, greatly 
> simplifies this.
> The feature is coded and tested, and ready for a pull request if the feature is approved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
