[ 
https://issues.apache.org/jira/browse/NIFI-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051604#comment-17051604
 ] 

Otto Fowler commented on NIFI-3303:
-----------------------------------

So, the issue here is this:

In RegexReplace, for each match found we:

{code:java}
String replacement = replacementValueProperty
        .evaluateAttributeExpressions(flowFile, additionalAttrs, escapeBackRefDecorator)
        .getValue();
replacement = escapeLiteralBackReferences(replacement, numCapturingGroups);

String replacementFinal = normalizeReplacementString(replacement);

matcher.appendReplacement(sb, replacementFinal);
{code}

So we find the matched text, evaluate the expressions (with the group vars 
added), and then escape some literals.

When we make the appendReplacement call, the string is correct.  The issue is 
that appendReplacement itself still interprets $ group references and \ 
escapes in the replacement string.  In this case it does exactly that, and the 
escape handling wipes out the backslashes in front of the inner quotes.
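
To make that concrete, here is a minimal sketch using plain java.util.regex 
rather than the NiFi code.  The class name, the helper replaceRaw and the 
sample strings are made up for illustration; only Pattern, Matcher, 
appendReplacement and appendTail are real API.

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementDemo {

    // Hypothetical helper: replaces every match of regex in input and hands
    // the replacement text to appendReplacement without any further quoting.
    static String replaceRaw(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // appendReplacement treats '\' as an escape and '$' as a group
            // reference inside the replacement string.
            matcher.appendReplacement(sb, replacement);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for an already evaluated replacement: He said \"Stop\"
        String evaluated = "He said \\\"Stop\\\"";
        // The backslashes are consumed, so this prints: He said "Stop"
        System.out.println(replaceRaw("X", "X", evaluated));
    }
}
{code}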

Adding a call to Matcher.quoteReplacement before appendReplacement resolves 
this issue.
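
Roughly what that change does, again as a standalone sketch and not the actual 
ReplaceText patch (the class name, the helper replaceQuoted and the sample 
strings are assumptions for illustration):

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {

    // Hypothetical helper: same loop as a plain regex replace, but the
    // replacement is passed through Matcher.quoteReplacement first, so
    // appendReplacement sees every '\' and '$' as a literal character.
    static String replaceQuoted(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for an already evaluated replacement: He said \"Stop\"
        String evaluated = "He said \\\"Stop\\\"";
        // The backslashes survive, so this prints: He said \"Stop\"
        System.out.println(replaceQuoted("X", "X", evaluated));
    }
}
{code}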

HOWEVER.

The regex stuff is very complex, and kind of fragile.  It tries to support so 
many things with overlapping symbols and escaping rules.

This fix actually breaks and regresses the following tests:

{code:bash}
[ERROR] Failures: 
[ERROR]   TestReplaceText.testBackRefFollowedByNumbers:504 expected:<H[ell]23o, 
World!> but was:<H[$1]23o, World!>
[ERROR]   TestReplaceText.testBackRefWithNoCapturingGroup:520 
expected:<H[ell]123o, World!> but was:<H[$0]123o, World!>
[ERROR]   TestReplaceText.testBackReference:486 expected:<H[[ell]]o, World!> 
but was:<H[[$1]]o, World!>
[ERROR]   TestReplaceText.testBackReferenceEscapeWithRegexReplaceUsingEL:1565 
expected:<WO[]$RD> but was:<WO[\]$RD>
[ERROR]   TestReplaceText.testBackReferenceWithInvalidReferenceIsEscaped:626 
expected:<H[]$do, World!> but was:<H[\]$do, World!>
[ERROR]   TestReplaceText.testBackReferenceWithTooLargeOfIndexIsEscaped:608 
expected:<H[ell]$2o, World!> but was:<H[$1\]$2o, World!>
[ERROR]   TestReplaceText.testConfigurationCornerCase:65 FlowFile content 
differs from input at byte 0 with input having value 72 and FlowFile having 
value 36
[ERROR]   TestReplaceText.testEscapingDollarSign:644 expected:<H[]$1o, World!> 
but was:<H[\]$1o, World!>
[ERROR]   TestReplaceText.testGetExistingContent:775
[ERROR]   TestReplaceText.testIterativeRegexReplace:79 
expected:<{"NAME":"[Smith","MIDDLE":"nifi","FIRSTNAME":"John]"}> but 
was:<{"NAME":"[$2","MIDDLE":"$2","FIRSTNAME":"$2]"}>
[ERROR]   TestReplaceText.testRegexNoCaptureDefaultReplacement Expected test to 
throw (an instance of java.lang.AssertionError and exception with message a 
string containing "java.lang.IndexOutOfBoundsException: No group 1")
[ERROR]   TestReplaceText.testReplacementWithExpressionLanguageIsEscaped:554 
expected:<H[[]$1]o, World!> but was:<H[[\]$1]o, World!>
[ERROR]   TestReplaceText.testWithEscaped$InReplacement:123 FlowFile content 
differs from input at byte 2 with input having value 36 and FlowFile having 
value 92
[ERROR]   TestReplaceText.testWithUnEscaped$InReplacement:137 FlowFile content 
differs from input at byte 1 with input having value 36 and FlowFile having 
value 92
[INFO] 
[ERROR] Tests run: 1497, Failures: 14, Errors: 0, Skipped: 23
[INFO] 
{code}
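
The "but was:<H[$1]...>" pattern in these failures is what you would expect 
once the whole replacement is quoted: $1 stops being a back reference and 
comes out as literal text.  A standalone sketch of that effect follows.  The 
input, regex and replacement are picked for illustration and are not copied 
from TestReplaceText; the class and helper names are hypothetical.

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackRefRegressionDemo {

    // Hypothetical helper: regex replace with optional quoting of the
    // replacement string before it reaches appendReplacement.
    static String replace(String input, String regex, String replacement, boolean quote) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String r = quote ? Matcher.quoteReplacement(replacement) : replacement;
            matcher.appendReplacement(sb, r);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unquoted: $1 is resolved to the captured group.
        // Prints: H[ell]o, World!
        System.out.println(replace("Hello, World!", "(ell)", "[$1]", false));
        // Quoted: $1 is emitted literally, so the back reference is lost.
        // Prints: H[$1]o, World!
        System.out.println(replace("Hello, World!", "(ell)", "[$1]", true));
    }
}
{code}

So quoting everything trades one bug for another: the escaped quotes survive, 
but intentional back references no longer do.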

I am not sure how we can untangle this.  

[~joewitt] [~mcgilman] ?

Who is the SME on this?

> escapeJson in ReplaceText
> -------------------------
>
>                 Key: NIFI-3303
>                 URL: https://issues.apache.org/jira/browse/NIFI-3303
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: tianzk
>            Priority: Major
>         Attachments: ReplaceText_Bug.xml, config.png, dataflow.png
>
>
> I have some problems while using escapeJson and unescapeJson in ReplaceText 
> processor.
> When I give a string: He didn’t say, “Stop”!  to ReplaceText as input,and 
> configure ReplaceText like: attachment config.png
> The output of ReplaceText is same with the input: He didn’t say, “Stop!” 
> ,nothing changed.
> As described in NiFI Documentation the output should be: He didn’t say, 
> \"Stop!\”.Did I miss something?
> Also there are problems with unescapeJson.If input is: He didn’t say, 
> \”Sto\\\"p!\”,the return string will be: He didn’t say, ”Sto"p!”.
> My dataflow:(GetFile just read a file with a  string as content.)
> dataflow.png
> Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
