I think there is transitional path that is non-breaking and could prevent
0.x to 1.x migration issues, even if at the expense of a little extra code
for a period of time.

   1. Leave the existing "replace" and "replaceAll" functions as is for
   now.
   2. Implement the new "replaceRegex" and "replaceLiteral" functions.  I'd
   prefer giving them a parameter selecting first, last, or all occurrences
   over function variants, it could even be optional with all as the default.
   3. In a future release deprecate the "replace" and "replaceAll"
   functions.
   4. In a future, future release remove the "replace" and "replaceAll"
   functions.

>From past experience managing enterprise systems, users preferred a
transition period like this instead of requiring everything be updated at
once.  We could then provide them a means to find any outstanding use of
the old functionality and clean it up before it was retired.  And from the
system development and management perspective, that saved us a lot of pain
too since we didn't have an influx of minor but troublesome issues in the
midst transitions that could demand our attention be on other bigger
problems.

It would be possible to have a signle function that takes another parameter
indicating whether the target is a literal or regular expression, but I
like the separate functions, especially if they will be implemented with
different underlying APIs.

On Wed, May 25, 2016 at 10:43 PM, Andy LoPresto <[email protected]>
wrote:

> Joe,
>
> These would be breaking changes and a lot of existing workflows would
> begin to behave differently. I would suggest making an incremental change
> here — simply adding replaceFirst as a non-destructive change as a solution
> for this issue, and opening a new Jira for the changes which break backward
> compatibility.
>
> I do agree that option 1 is probably the cleaner way forward, and if we
> introduce new method names, we may be able to use StandardFlowSynchronizer
> to detect legacy methods from a pre-1.0 flow and update them automatically
> during flow migration.
>
> Andy LoPresto
> [email protected]
> *[email protected] <[email protected]>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On May 25, 2016, at 4:06 PM, Joe Percivall <[email protected]
> <[email protected]>> wrote:
>
> Andy,
>
> Nice write-up and thanks for bringing attention to this. I definitely
> assumed for a while that replace vs replaceAll was the number of things
> replaced. The underlying problem, I think, is that these EL methods are
> just wrappers around the Java String methods and the Java String methods
> are named in a confusing manner.
>
> I am on board with adding a "replaceFirst(regex, replacement)" method.
> This adds a bit more functionality and is just exposing another Java String
> method.
>
> In addition to that, I would suggest doing something to alleviate the
> confusion between "replace" and "replaceAll". In a similar fashion to
> adding decimal support, I see two avenues we could take:
>
> 1. Change the names of the functions to "replaceLiteral" and
> "replaceRegex" (or "replaceAllLiteral" and "replaceAllRegex")
> 2. Remove one of the methods and add a third field to the remaining method
> to indicate whether to replace literally or interpret as a regex
>
> Both of these would be breaking changes and introduced with 1.0 and I am
> leaning towards option 1 with the base name "replace". I believe when the
> "replaceFirst" method is added, "replaceLiteral" and "replaceRegex" would
> be easy to understand that they replace all occurrences.
>
> Joe
>
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
>
>
>
> On Wednesday, May 25, 2016 6:24 PM, Andy LoPresto <[email protected]>
> wrote:
>
>
>
> Hi all,
>
> During investigation of an expression language issue posted to the list, I
> discovered that replace explicitly delegates to a String#replace invocation
> that only accepts literal expressions, not regular expressions, while
> replaceAll accepts regular expressions. I thought this was an oversight and
> filed NIFI-1919 [1] to document and fix this, by changing the
> ReplaceEvaluator [2] to use String#replaceFirst, which accepts regular
> expressions. I wrote failing unit tests [3] to capture the fix. After
> implementing the change, two existing unit tests [4] broke, which indicated
> a regression. At first, I believed these two tests to be incorrect, but
> further investigation showed they were merely surprising.
>
> TestQuery#testQuotingQuotes (below) fails on the second verifyEquals call,
> but the test is asserting that replace should replace multiple instances of
> the single quote. While this is similar to String#replace, because the
> expression language exposes only two methods — replace vs. replaceAll — one
> could easily assume the difference between the two was the number of
> attempted replacements, rather than the actual difference, which is literal
> expression vs. pattern.
>
> @Test
> public void testQuotingQuotes() {
> final Map<String, String> attributes = new HashMap<>();
> attributes.put("xx", "say 'hi'");
>
> String query = "${xx:replaceAll( \"'.*'\", '\\\"hello\\\"' )}";
> verifyEquals(query, attributes, "say \"hello\"");
>
> query = "${xx:replace( \"'\", '\"')}";
> verifyEquals(query, attributes, "say \"hi\"");
>
> query = "${xx:replace( '\\'', '\"')}";
> System.out.println(query);
> verifyEquals(query, attributes, "say \"hi\"");
> }
> TestQuery#testReplaceAllWithOddNumberOfBackslashPairs (below) also fails
> on the first verifyEquals call with a PatternSyntaxException. I am
> investigating that further.
>
> @Test
> public void testReplaceAllWithOddNumberOfBackslashPairs() {
> final Map<String, String> attributes = new HashMap<>();
> attributes.put("filename", "C:\\temp\\.txt");
>
> verifyEquals("${filename:replace('\\\\', '/')}", attributes,
> "C:/temp/.txt");
> verifyEquals("${filename:replaceAll('\\\\\\\\', '/')}", attributes,
> "C:/temp/.txt");
> verifyEquals("${filename:replaceAll('\\\\\\.txt$', '')}", attributes,
> "C:\\temp");
> }
> While I originally had just modified replace, after looking at the EL
> documentation [5], replace is explicitly documented to only replace literal
> expressions, and does so multiple times, as does Java’s String#replace [6].
> I now propose to add another method replaceFirst, which accepts a pattern
> and replaces only the first occurrence. I will update the unit tests to
> properly capture this, and will update the documentation to reflect the new
> method.
>
> Thoughts from the community?
>
> [1] https://issues.apache.org/jira/browse/NIFI-1919
> [2]
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/ReplaceEvaluator.java
> [3]
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/groovy/org/apache/nifi/attribute/expression/language/QueryGroovyTest.groovy
> [4]
> https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
> [5]
> https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#replace
> [6]
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(java.lang.CharSequence,%20java.lang.CharSequence)
>
>
>
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
>
>

Reply via email to