[
https://issues.apache.org/jira/browse/NIFI-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051604#comment-17051604
]
Otto Fowler commented on NIFI-3303:
-----------------------------------
So, the issue here is this:
In RegexReplace, for each match found we:
{code:java}
String replacement = replacementValueProperty
        .evaluateAttributeExpressions(flowFile, additionalAttrs, escapeBackRefDecorator)
        .getValue();
replacement = escapeLiteralBackReferences(replacement, numCapturingGroups);
String replacementFinal = normalizeReplacementString(replacement);
matcher.appendReplacement(sb, replacementFinal);
{code}
So we find the matched text, evaluate the expressions (with the group
variables added), and then escape some literals.
When we make the appendReplacement call, the string is correct. The issue is
that appendReplacement itself still interprets $ group references and \ escapes
in the replacement string. In this case it processes the \ escapes, which
strips the backslashes in front of the inner quotes.
Adding a call to Matcher.quoteReplacement before appendReplacement resolves
this issue.
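A minimal standalone sketch of that behavior, using only java.util.regex. This is not the NiFi code: the class name, the helper method, and the sample strings are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of NiFi.
class AppendReplacementDemo {

    // Replaces every match of regex in input with replacement.
    // When quote is true, the replacement is passed through
    // Matcher.quoteReplacement first, so '\' and '$' are kept literally.
    static String replaceAll(String input, String regex, String replacement, boolean quote) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String r = quote ? Matcher.quoteReplacement(replacement) : replacement;
            matcher.appendReplacement(sb, r);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // An already JSON-escaped value: He said, \"Stop!\"
        String escaped = "He said, \\\"Stop!\\\"";

        // appendReplacement treats '\' as an escape character, so the backslashes are dropped.
        System.out.println(replaceAll("x", "x", escaped, false)); // He said, "Stop!"

        // With quoteReplacement the backslashes survive.
        System.out.println(replaceAll("x", "x", escaped, true));  // He said, \"Stop!\"
    }
}
```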
HOWEVER.
The regex handling is very complex and rather fragile: it tries to support
many features with overlapping symbols and escaping rules.
This fix causes regressions in the existing tests:
{code:bash}
[ERROR] Failures:
[ERROR] TestReplaceText.testBackRefFollowedByNumbers:504 expected:<H[ell]23o, World!> but was:<H[$1]23o, World!>
[ERROR] TestReplaceText.testBackRefWithNoCapturingGroup:520 expected:<H[ell]123o, World!> but was:<H[$0]123o, World!>
[ERROR] TestReplaceText.testBackReference:486 expected:<H[[ell]]o, World!> but was:<H[[$1]]o, World!>
[ERROR] TestReplaceText.testBackReferenceEscapeWithRegexReplaceUsingEL:1565 expected:<WO[]$RD> but was:<WO[\]$RD>
[ERROR] TestReplaceText.testBackReferenceWithInvalidReferenceIsEscaped:626 expected:<H[]$do, World!> but was:<H[\]$do, World!>
[ERROR] TestReplaceText.testBackReferenceWithTooLargeOfIndexIsEscaped:608 expected:<H[ell]$2o, World!> but was:<H[$1\]$2o, World!>
[ERROR] TestReplaceText.testConfigurationCornerCase:65 FlowFile content differs from input at byte 0 with input having value 72 and FlowFile having value 36
[ERROR] TestReplaceText.testEscapingDollarSign:644 expected:<H[]$1o, World!> but was:<H[\]$1o, World!>
[ERROR] TestReplaceText.testGetExistingContent:775
[ERROR] TestReplaceText.testIterativeRegexReplace:79 expected:<{"NAME":"[Smith","MIDDLE":"nifi","FIRSTNAME":"John]"}> but was:<{"NAME":"[$2","MIDDLE":"$2","FIRSTNAME":"$2]"}>
[ERROR] TestReplaceText.testRegexNoCaptureDefaultReplacement Expected test to throw (an instance of java.lang.AssertionError and exception with message a string containing "java.lang.IndexOutOfBoundsException: No group 1")
[ERROR] TestReplaceText.testReplacementWithExpressionLanguageIsEscaped:554 expected:<H[[]$1]o, World!> but was:<H[[\]$1]o, World!>
[ERROR] TestReplaceText.testWithEscaped$InReplacement:123 FlowFile content differs from input at byte 2 with input having value 36 and FlowFile having value 92
[ERROR] TestReplaceText.testWithUnEscaped$InReplacement:137 FlowFile content differs from input at byte 1 with input having value 36 and FlowFile having value 92
[INFO]
[ERROR] Tests run: 1497, Failures: 14, Errors: 0, Skipped: 23
[INFO]
{code}
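The back-reference failures above follow from what Matcher.quoteReplacement does: once the whole replacement is quoted, $1 is no longer a group reference. A small sketch under assumptions: the input, regex, and replacement here are hypothetical values modelled on the failure output, not copied from TestReplaceText, and the class is not NiFi code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of NiFi.
class BackRefQuotingDemo {

    // Same append loop as a plain regex replace; quote toggles Matcher.quoteReplacement.
    static String replace(String input, String regex, String replacement, boolean quote) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String r = quote ? Matcher.quoteReplacement(replacement) : replacement;
            matcher.appendReplacement(sb, r);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unquoted: $1 is expanded to the captured group.
        System.out.println(replace("Hello, World!", "(ell)", "[$1]", false)); // H[ell]o, World!

        // Quoted: $1 is emitted literally, so the back reference is lost.
        System.out.println(replace("Hello, World!", "(ell)", "[$1]", true));  // H[$1]o, World!
    }
}
```

Quoting the entire replacement is all-or-nothing, so it cannot preserve literal backslashes and honor intended back references at the same time.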
I am not sure how we can untangle this.
[~joewitt] [~mcgilman] ?
Who is the SME on this?
> escapeJson in ReplaceText
> -------------------------
>
> Key: NIFI-3303
> URL: https://issues.apache.org/jira/browse/NIFI-3303
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.1.1
> Reporter: tianzk
> Priority: Major
> Attachments: ReplaceText_Bug.xml, config.png, dataflow.png
>
>
> I have some problems while using escapeJson and unescapeJson in the
> ReplaceText processor.
> When I give the string: He didn’t say, “Stop”! to ReplaceText as input, and
> configure ReplaceText as shown in the attachment config.png,
> the output of ReplaceText is the same as the input: He didn’t say, “Stop!”
> (nothing changed).
> As described in the NiFi documentation, the output should be: He didn’t say,
> \"Stop!\”. Did I miss something?
> There are also problems with unescapeJson. If the input is: He didn’t say,
> \”Sto\\\"p!\”, the returned string will be: He didn’t say, ”Sto"p!”.
> My dataflow (GetFile just reads a file with a string as content):
> dataflow.png
> Thanks.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)