Joe,

These would be breaking changes and a lot of existing workflows would begin to 
behave differently. I would suggest making an incremental change here — simply 
adding replaceFirst as a non-destructive change as a solution for this issue, 
and opening a new Jira for the changes which break backward compatibility.

I do agree that option 1 is probably the cleaner way forward, and if we 
introduce new method names, we may be able to use StandardFlowSynchronizer to 
detect legacy methods from a pre-1.0 flow and update them automatically during 
flow migration.

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On May 25, 2016, at 4:06 PM, Joe Percivall <[email protected]> 
> wrote:
> 
> Andy,
> 
> Nice write-up and thanks for bringing attention to this. I definitely assumed 
> for a while that replace vs replaceAll was the number of things replaced. The 
> underlying problem, I think, is that these EL methods are just wrappers 
> around the Java String methods and the Java String methods are named in a 
> confusing manner.
> 
> I am on board with adding a "replaceFirst(regex, replacement)" method. This 
> adds a bit more functionality and is just exposing another Java String method.
> 
> In addition to that, I would suggest doing something to alleviate the 
> confusion between "replace" and "replaceAll". In a similar fashion to adding 
> decimal support, I see two avenues we could take:
> 
> 1. Change the names of the functions to "replaceLiteral" and "replaceRegex" 
> (or "replaceAllLiteral" and "replaceAllRegex")
> 2. Remove one of the methods and add a third field to the remaining method to 
> indicate whether to replace literally or interpret as a regex
> 
> Both of these would be breaking changes and introduced with 1.0 and I am 
> leaning towards option 1 with the base name "replace". I believe when the 
> "replaceFirst" method is added, "replaceLiteral" and "replaceRegex" would be 
> easy to understand that they replace all occurrences.
> 
> Joe
> 
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
> 
> 
> 
> On Wednesday, May 25, 2016 6:24 PM, Andy LoPresto <[email protected]> 
> wrote:
> 
> 
> 
> Hi all,
> 
> During investigation of an expression language issue posted to the list, I 
> discovered that replace explicitly delegates to a String#replace invocation 
> that only accepts literal expressions, not regular expressions, while 
> replaceAll accepts regular expressions. I thought this was an oversight and 
> filed NIFI-1919 [1] to document and fix this, by changing the 
> ReplaceEvaluator [2] to use String#replaceFirst, which accepts regular 
> expressions. I wrote failing unit tests [3] to capture the fix. After 
> implementing the change, two existing unit tests [4] broke, which indicated a 
> regression. At first, I believed these two tests to be incorrect, but further 
> investigation showed they were merely surprising.
> 
> TestQuery#testQuotingQuotes (below) fails on the second verifyEquals call, 
> but the test is asserting that replace should replace multiple instances of 
> the single quote. While this is similar to String#replace, because the 
> expression language exposes only two methods — replace vs. replaceAll — one 
> could easily assume the difference between the two was the number of 
> attempted replacements, rather than the actual difference, which is literal 
> expression vs. pattern.
> 
> @Test
> public void testQuotingQuotes() {
> final Map<String, String> attributes = new HashMap<>();
> attributes.put("xx", "say 'hi'");
> 
> String query = "${xx:replaceAll( \"'.*'\", '\\\"hello\\\"' )}";
> verifyEquals(query, attributes, "say \"hello\"");
> 
> query = "${xx:replace( \"'\", '\"')}";
> verifyEquals(query, attributes, "say \"hi\"");
> 
> query = "${xx:replace( '\\'', '\"')}";
> System.out.println(query);
> verifyEquals(query, attributes, "say \"hi\"");
> }
> TestQuery#testReplaceAllWithOddNumberOfBackslashPairs (below) also fails on 
> the first verifyEquals call with a PatternSyntaxException. I am investigating 
> that further.
> 
> @Test
> public void testReplaceAllWithOddNumberOfBackslashPairs() {
> final Map<String, String> attributes = new HashMap<>();
> attributes.put("filename", "C:\\temp\\.txt");
> 
> verifyEquals("${filename:replace('\\\\', '/')}", attributes, "C:/temp/.txt");
> verifyEquals("${filename:replaceAll('\\\\\\\\', '/')}", attributes, 
> "C:/temp/.txt");
> verifyEquals("${filename:replaceAll('\\\\\\.txt$', '')}", attributes, 
> "C:\\temp");
> }
> While I originally had just modified replace, after looking at the EL 
> documentation [5], replace is explicitly documented to only replace literal 
> expressions, and does so multiple times, as does Java’s String#replace [6]. I 
> now propose to add another method replaceFirst, which accepts a pattern and 
> replaces only the first occurrence. I will update the unit tests to properly 
> capture this, and will update the documentation to reflect the new method.
> 
> Thoughts from the community?
> 
> [1] https://issues.apache.org/jira/browse/NIFI-1919
> [2] 
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/ReplaceEvaluator.java
> [3] 
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/groovy/org/apache/nifi/attribute/expression/language/QueryGroovyTest.groovy
> [4] 
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
> [5] 
> https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#replace
> [6] 
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(java.lang.CharSequence,%20java.lang.CharSequence)
> 
> 
> 
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to