[ 
https://issues.apache.org/jira/browse/NIFI-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708637#comment-16708637
 ] 

ASF GitHub Bot commented on NIFI-5826:
--------------------------------------

Github user bdesert commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3183#discussion_r238642945
  
    --- Diff: 
nifi-commons/nifi-record-path/src/main/java/org/apache/nifi/record/path/util/RecordPathUtils.java
 ---
    @@ -39,4 +39,52 @@ public static String getFirstStringValue(final 
RecordPathSegment segment, final
     
             return stringValue;
         }
    +
    +    /**
    +     * This method handles backslash sequences after ANTLR parser converts 
all backslash into double ones
    +     * with exception for \t, \r and \n. See
    +     * <a 
href="file:../../../../../../../../../src/main/antlr3/org/apache/nifi/record/path/RecordPathParser.g">org/apache/nifi/record/path/RecordPathParser.g</a>
    --- End diff --
    
    @ijokarumawak ,
    I understand what you are talking about, that was my first idea as well.
    the problem is coming with regression on WORKING functionality, like EL.
    If we change Lexer in all three cases, we will have a problem of backward 
compatibility.
    As an example: define GFF with attribute "a1" having value (with actual new 
line): 
    ```
    "this is new line
    and this is just a backslash \n"
    ```
    (doesn't matter with quotes  or w/o).
    next: UpdateAttribute with:
    `a1: "${a:replaceAll('\\n','_')}"`
    and
    `a2: "${a:replaceAll('\n','_')}"`
    (note single and double backslash)
    In both cases the only character that will be replaced will be actual new 
line, resulting in:
    `a1=a2="this is new line_and this is just a backslash \n"
    `
    If we change Lexer for EL, this will be changed and will behave differently 
and will result in:
    `a1 = "his is new line`
    and this is just a backslash _"
    `a2 = "his is new line_and this is just a backslash \n"`
    
    Of course, we can do "right" way for RecordPath regex-based functions, but 
then we will have different user experience with regex in RecordPath and EL, 
which I think should be avoided.
    
    ----
    Regarding Java method "unescapeBackslash". As a name suggests, this 
function it to treat string values having "backslash" in their values. I do 
agree that for the test cases we have, some parts of the code are dead, but 
since this is public method in utility class, it can have generic functionality 
to support wider varieties of the use cases.
    
    Would appreciate your feedback!


> UpdateRecord processor throwing PatternSyntaxException
> ------------------------------------------------------
>
>                 Key: NIFI-5826
>                 URL: https://issues.apache.org/jira/browse/NIFI-5826
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1
>         Environment: Nifi in docker container
>            Reporter: ravi kargam
>            Assignee: Ed Berezitsky
>            Priority: Minor
>         Attachments: NIFI-5826_PR-3183.patch, 
> UpdateRecord_Config_Exception.JPG
>
>
> with replacement value strategy as Record Path Value,
> I am trying to replace square bracket symbol *[* with parenthesis symbol *(* 
> in my employeeName column in my csvReader structure with below syntax
> replaceRegex(/employeeName, "[\[]", "(")
> Processor is throwing following exception.
> RecordPathException: java.util.regex.PatternSyntaxException: Unclosed 
> character class near index 4 [\\[]
> It worked fine with other special characters such as \{, }, <, >, ;, _, "
> For double qoute ("), i had to use single escape character, for above listed 
> other characters, worked fine without any escape character. Other folks in 
> Nifi Slack tried \s, \d, \w, \. 
> looks like non of them worked.
> replace function worked for replacing [ and ]characters. didn't test any 
> other characters.
> Please address resolve the issue.
> Regards,
> Ravi
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to