Repository: nifi
Updated Branches:
  refs/heads/master 3db6fffa6 -> 4d88aaedc


NIFI-1258: Added a new function named getDelimitedField to the Expression 
Language and put together a guide that walks through how to add a new function

Signed-off-by: Aldrin Piri <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/nifi/repo
Commit: http://git-wip-us.apache.org/repos/asf/nifi/commit/4d88aaed
Tree: http://git-wip-us.apache.org/repos/asf/nifi/tree/4d88aaed
Diff: http://git-wip-us.apache.org/repos/asf/nifi/diff/4d88aaed

Branch: refs/heads/master
Commit: 4d88aaedc58a6aded0b5dc546469a0f9b8bf513b
Parents: 3db6fff
Author: Mark Payne <[email protected]>
Authored: Sun Dec 13 10:13:27 2015 -0500
Committer: Aldrin Piri <[email protected]>
Committed: Thu Jan 21 22:09:25 2016 -0800

----------------------------------------------------------------------
 nifi-commons/nifi-expression-language/README    | 105 +++++++++++
 .../language/antlr/AttributeExpressionLexer.g   |   2 +
 .../language/antlr/AttributeExpressionParser.g  |   6 +-
 .../attribute/expression/language/Query.java    |  39 +++++
 .../functions/GetDelimitedFieldEvaluator.java   | 174 +++++++++++++++++++
 .../expression/language/TestQuery.java          |  61 +++++++
 .../asciidoc/expression-language-guide.adoc     |  39 +++++
 7 files changed, 423 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/README
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/README b/nifi-commons/nifi-expression-language/README
new file mode 100644
index 0000000..6281dca
--- /dev/null
+++ b/nifi-commons/nifi-expression-language/README
@@ -0,0 +1,105 @@
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+    http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+
+
+
+This document is intended to provide a walk-through of what is necessary
+in order to add a new function to the Expression Language. Doing so requires
+a handful of steps, so we will outline each of those steps here, in the order
+that they must be done. While this documentation is fairly verbose, it is often
+the case that reading it takes longer than performing the tasks that it
+describes.
+
+
+1) In order to make the nifi-expression-language Maven module compile in your IDE, you may need to add the ANTLR-generated sources to your IDE's classpath.
+   This can be done using Eclipse, as follows:
+    - Right-click on the nifi-expression-language project
+    - Go to "Properties" on the context menu
+    - Go to the "Java Build Path" item in the left tree and choose the "Source" tab.
+    - Click "Add Folder..."
+    - Add the target/generated-sources/antlr3 folder. If this folder does not exist, first build the project from Maven and then
+      right-click on the nifi-expression-language project in Eclipse and click Refresh.
+    - Click OK to close all dialogs.
+
+2) Add the method name to the Tokens for the Lexer
+       - Open the src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g file
+       - Add the function name to the list of tokens in this file. These functions are grouped by the number of arguments
+         that they take. This grouping mechanism could probably be made better, perhaps grouping by the type of function
+         provided. However, for now, it is best to keep some sort of structure, at least. If the function has optional
+         arguments, the function should be grouped by the maximum number of arguments that it takes (for example, the
+         substring function can take 1 or 2 arguments, so it is grouped with the '2 argument functions').
+         The syntax to use is:
+         
+         <Token Name> : '<function name>';
+         
+         The Token Name should be all-caps and words should be separated by underscores. The Token Name is what will be used to
+         identify the token when ANTLR parses an Expression. The function name should use camel case starting with a lower-case
+         letter. This is the name of the function as it will be referenced in the Expression Language.
+       - Save the AttributeExpressionLexer.g file
+       
+3) Add the method to the grammar
+       - Open the src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g file
+       - Starting around line 75, the functions are defined, grouped by the type of value returned. We can add the new function
+         into the grammar here. Please see the ANTLR documentation for syntax on the grammar used. Note that this is ANTLR 3, NOT ANTLR 4.
+         The idea here is to spell out the syntax that should be used for the function. So generally, we do this by specifying the function name,
+         "LPAREN!" (which indicates a left parenthesis and the ! indicates that we do not want this passed to us when obtaining the parsed tokens),
+         and then a list of arguments that are separated by "COMMA!" (again, indicating a comma character and that we do not want the token passed
+         to us when we are looking at parsed tokens). We then end with the matching "RPAREN!".
+       - Save this file.
+
+4) Rebuild via Maven
+       - In order to make sure that we now can reference the tokens that are generated for our new function, we need to rebuild via Maven.
+         We can do this by building just the nifi-expression-language project, rather than rebuilding the entire NiFi code base.
+       - If necessary, right-click on the nifi-expression-language project in your IDE and refresh / update project from new Maven build.
+         This is generally necessary when using Eclipse.
+       
+5) Add the logic for the function
+       - In the src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions package directory, we will need to create a new
+         class that is capable of implementing the logic of the new function. Create a class that uses the standard naming convention of
+         <function name>Evaluator and that extends the appropriate abstract evaluator. If the function will return a String, the evaluator should extend
+         StringEvaluator. If the function will return a boolean, the evaluator should extend BooleanEvaluator. There are also evaluators for Date
+         and Number return types.
+       - Generally the constructor for the evaluator will take an Evaluator for the "Subject" and an Evaluator for each argument. The subject is the
+         value that the function will be evaluated against. The substring function, for instance, takes a subject of type String. Thinking in terms of
+         Java, the "subject" is the object on which the function is being called. It is important to take Evaluator objects and not just a String,
+         for instance, as we have to ensure that we determine the actual values to use dynamically at runtime.
+       - Implement the functionality as appropriate by implementing the abstract methods provided by the abstract Evaluator that is being extended by
+         your newly created Evaluator.
+       - The Evaluator need not be thread-safe. The existing Evaluators are numerous and provide great examples for understanding the API.
+
+6) Add the logic to the query parser
+       - Generally, when using ANTLR, the preferred method to parse the input is to use a Tree Walker. However, this is far less intuitive for many
+         Java developers (including those of us who wrote the Expression Language originally). As a result, we instead use ANTLR to tokenize and parse the
+         input and then obtain an Abstract Syntax Tree and process this "manually" in Java code. This occurs in the Query class.
+       - We can add the function into our parsing logic by updating the #buildFunctionEvaluator method of the org.apache.nifi.attribute.expression.language.Query class.
+         A static import will likely need to be added to the Query class in order to reference the new token. The token can then be added as a new 'case' in the existing
+         'switch' statement, which will return a new instance of the Evaluator that was just added.
+
+7) Add Unit Tests!
+       - Unit tests are critical for the Expression Language. These expressions can be used throughout the entire application and it is important that each function
+         perform its task properly. Otherwise, incorrect routing decisions could be made, or data could become corrupted as a result.
+       - Each function should have its battery of unit tests added to the TestQuery class. This class includes a convenience method named #verifyEquals that is
+         used to ensure that the Expression returns the same value, regardless of how it is compiled and evaluated.
+
+8) Add Documentation!
+       - The documentation for each function is provided in the nifi-docs module, under src/main/asciidoc/expression-language-guide.adoc.
+         The format of the document is crucial to maintain, as this document is not only rendered as HTML in the NiFi Documentation page, but the
+         CSS classes that are used in the rendered docs are also made use of by the NiFi UI. When a user is entering an Expression Language expression and
+         presses Ctrl+Space, the UI provides auto-completion information as well as inline documentation for each function. This information is pulled
+         directly from the HTML that is generated from this expression-language-guide file.
+       - Rebuild NiFi and run the application. Add an UpdateAttribute Processor to the graph and add a new property. For the value, type the Expression Language
+         opening tokens ${ and then press Ctrl+Space to ensure that the function and its documentation are presented as expected. Most functions that are added
+         will require a Subject. In order to see the function, then, you will need to provide a subject, such as typing "${myVariable:" (without the quotes)
+         and then press Ctrl+Space. This step is important, as it is quite easy to make a mistake when creating the documentation using a free-form text editor,
+         and this will ensure that users receive a very consistent and quality experience when using the new function.
+
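
Step 5 of the README asks for Evaluator objects rather than plain values so
that the subject and arguments are resolved at evaluation time. The following
is only a minimal sketch of that idea. SketchEvaluator,
AttributeSketchEvaluator and PrependSketchEvaluator are hypothetical names
invented for illustration; they are not NiFi's Evaluator API.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for an evaluator abstraction (not the NiFi API).
interface SketchEvaluator<T> {
    T evaluate(Map<String, String> attributes);
}

// Looks up an attribute by name each time it is evaluated.
final class AttributeSketchEvaluator implements SketchEvaluator<String> {
    private final String attributeName;

    AttributeSketchEvaluator(final String attributeName) {
        this.attributeName = attributeName;
    }

    @Override
    public String evaluate(final Map<String, String> attributes) {
        return attributes.get(attributeName);
    }
}

// A function evaluator holds evaluators for its subject and argument,
// so both values are determined when evaluate() is called, not at construction.
final class PrependSketchEvaluator implements SketchEvaluator<String> {
    private final SketchEvaluator<String> subject;
    private final SketchEvaluator<String> prefix;

    PrependSketchEvaluator(final SketchEvaluator<String> subject, final SketchEvaluator<String> prefix) {
        this.subject = subject;
        this.prefix = prefix;
    }

    @Override
    public String evaluate(final Map<String, String> attributes) {
        final String subjectValue = subject.evaluate(attributes);
        if (subjectValue == null) {
            return null;
        }
        final String prefixValue = prefix.evaluate(attributes);
        return prefixValue == null ? subjectValue : prefixValue + subjectValue;
    }
}
```

Because the evaluator stores other evaluators instead of strings, the same
instance yields different results when it is evaluated against different
attribute maps.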

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g b/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g
index 80581f5..d56a27b 100644
--- a/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g
+++ b/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionLexer.g
@@ -157,6 +157,8 @@ SUBSTRING   : 'substring';
 REPLACE        : 'replace';
 REPLACE_ALL : 'replaceAll';
 
+// 5 arg functions
+GET_DELIMITED_FIELD    : 'getDelimitedField';
 
 // STRINGS
 STRING_LITERAL

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g b/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g
index 7c37530..780d8c5 100644
--- a/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g
+++ b/nifi-commons/nifi-expression-language/src/main/antlr3/org/apache/nifi/attribute/expression/language/antlr/AttributeExpressionParser.g
@@ -79,7 +79,7 @@ oneArgString : ((SUBSTRING_BEFORE | SUBSTRING_BEFORE_LAST | SUBSTRING_AFTER | SU
                           (TO_RADIX LPAREN! anyArg (COMMA! anyArg)? RPAREN!);
 twoArgString : ((REPLACE | REPLACE_ALL) LPAREN! anyArg COMMA! anyArg RPAREN!) |
                           (SUBSTRING LPAREN! anyArg (COMMA! anyArg)? RPAREN!);
-
+fiveArgString : GET_DELIMITED_FIELD LPAREN! anyArg (COMMA! anyArg (COMMA! anyArg (COMMA! anyArg (COMMA! anyArg)?)?)?)? RPAREN!;
 
 // functions that return Booleans
 zeroArgBool : (IS_NULL | NOT_NULL | IS_EMPTY | NOT) LPAREN! RPAREN!;
@@ -95,11 +95,11 @@ oneArgNum   : ((INDEX_OF | LAST_INDEX_OF) LPAREN! anyArg RPAREN!) |
                          (TO_DATE LPAREN! anyArg? RPAREN!) |
                          ((MOD | PLUS | MINUS | MULTIPLY | DIVIDE) LPAREN! anyArg RPAREN!);
 
-stringFunctionRef : zeroArgString | oneArgString | twoArgString;
+stringFunctionRef : zeroArgString | oneArgString | twoArgString | fiveArgString;
 booleanFunctionRef : zeroArgBool | oneArgBool;
 numberFunctionRef : zeroArgNum | oneArgNum;
 
-anyArg : NUMBER | numberFunctionRef | STRING_LITERAL | zeroArgString | oneArgString | twoArgString | booleanLiteral | zeroArgBool | oneArgBool | expression;
+anyArg : NUMBER | numberFunctionRef | STRING_LITERAL | zeroArgString | oneArgString | twoArgString | fiveArgString | booleanLiteral | zeroArgBool | oneArgBool | expression;
 stringArg : STRING_LITERAL | zeroArgString | oneArgString | twoArgString | expression;
 functionRef : stringFunctionRef | booleanFunctionRef | numberFunctionRef;
 

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/Query.java
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/Query.java b/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/Query.java
index 2c27e4d..b3a364a 100644
--- a/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/Query.java
+++ b/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/Query.java
@@ -50,6 +50,7 @@ import org.apache.nifi.attribute.expression.language.evaluation.functions.Equals
 import org.apache.nifi.attribute.expression.language.evaluation.functions.EqualsIgnoreCaseEvaluator;
 import org.apache.nifi.attribute.expression.language.evaluation.functions.FindEvaluator;
 import org.apache.nifi.attribute.expression.language.evaluation.functions.FormatEvaluator;
+import org.apache.nifi.attribute.expression.language.evaluation.functions.GetDelimitedFieldEvaluator;
 import org.apache.nifi.attribute.expression.language.evaluation.functions.GreaterThanEvaluator;
 import org.apache.nifi.attribute.expression.language.evaluation.functions.GreaterThanOrEqualEvaluator;
 import org.apache.nifi.attribute.expression.language.evaluation.functions.HostnameEvaluator;
@@ -138,6 +139,7 @@ import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpre
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.FALSE;
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.FIND;
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.FORMAT;
+import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.GET_DELIMITED_FIELD;
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.GREATER_THAN;
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.GREATER_THAN_OR_EQUAL;
 import static org.apache.nifi.attribute.expression.language.antlr.AttributeExpressionParser.HOSTNAME;
@@ -1288,6 +1290,43 @@ public class Query {
             case NOT: {
                 return addToken(new NotEvaluator(toBooleanEvaluator(subjectEvaluator)), "not");
             }
+            case GET_DELIMITED_FIELD: {
+                if (argEvaluators.size() == 1) {
+                    // Only a single argument - the index to return.
+                    return addToken(new GetDelimitedFieldEvaluator(toStringEvaluator(subjectEvaluator),
+                        toNumberEvaluator(argEvaluators.get(0), "first argument of getDelimitedField")), "getDelimitedField");
+                } else if (argEvaluators.size() == 2) {
+                    // two arguments - index and delimiter.
+                    return addToken(new GetDelimitedFieldEvaluator(toStringEvaluator(subjectEvaluator),
+                        toNumberEvaluator(argEvaluators.get(0), "first argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(1), "second argument of getDelimitedField")),
+                        "getDelimitedField");
+                } else if (argEvaluators.size() == 3) {
+                    // 3 arguments - index, delimiter, quote char.
+                    return addToken(new GetDelimitedFieldEvaluator(toStringEvaluator(subjectEvaluator),
+                        toNumberEvaluator(argEvaluators.get(0), "first argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(1), "second argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(2), "third argument of getDelimitedField")),
+                        "getDelimitedField");
+                } else if (argEvaluators.size() == 4) {
+                    // 4 arguments - index, delimiter, quote char, escape char
+                    return addToken(new GetDelimitedFieldEvaluator(toStringEvaluator(subjectEvaluator),
+                        toNumberEvaluator(argEvaluators.get(0), "first argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(1), "second argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(2), "third argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(3), "fourth argument of getDelimitedField")),
+                        "getDelimitedField");
+                } else {
+                    // 5 arguments - index, delimiter, quote char, escape char, strip escape/quote chars flag
+                    return addToken(new GetDelimitedFieldEvaluator(toStringEvaluator(subjectEvaluator),
+                        toNumberEvaluator(argEvaluators.get(0), "first argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(1), "second argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(2), "third argument of getDelimitedField"),
+                        toStringEvaluator(argEvaluators.get(3), "fourth argument of getDelimitedField"),
+                        toBooleanEvaluator(argEvaluators.get(4), "fifth argument of getDelimitedField")),
+                        "getDelimitedField");
+                }
+            }
             default:
                 throw new AttributeExpressionLanguageParsingException("Expected a Function-type expression but got " + tree.toString());
         }
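
The GET_DELIMITED_FIELD case above selects a constructor by argument count,
and the shorter constructors fill in defaults for the missing arguments. As a
rough illustration of that defaulting only, the sketch below works on plain
strings. DefaultArgsSketch and withDefaults are hypothetical names, and the
default values listed are an assumption read from the constructor chain in
this commit (in particular, that the escape default resolves to a single
backslash).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DefaultArgsSketch {
    // Assumed defaults for arguments 2..5: delimiter, quote char, escape char, strip flag.
    private static final List<String> DEFAULTS = Arrays.asList(",", "\"", "\\", "false");

    // Returns a five-element argument list: the supplied arguments followed by
    // the defaults for every position that was not supplied.
    static List<String> withDefaults(final List<String> supplied) {
        if (supplied.isEmpty() || supplied.size() > 5) {
            throw new IllegalArgumentException("expected between 1 and 5 arguments but got " + supplied.size());
        }
        final List<String> all = new ArrayList<>(supplied);
        for (int i = supplied.size(); i < 5; i++) {
            // Argument at position i (0-based) takes its default from DEFAULTS[i - 1],
            // because the first argument (the index) has no default.
            all.add(DEFAULTS.get(i - 1));
        }
        return all;
    }
}
```

The index argument has no default, which is why at least one argument is
required.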

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/GetDelimitedFieldEvaluator.java
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/GetDelimitedFieldEvaluator.java b/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/GetDelimitedFieldEvaluator.java
new file mode 100644
index 0000000..e5695a8
--- /dev/null
+++ b/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/GetDelimitedFieldEvaluator.java
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.attribute.expression.language.evaluation.functions;
+
+import java.util.Map;
+
+import org.apache.nifi.attribute.expression.language.evaluation.Evaluator;
+import org.apache.nifi.attribute.expression.language.evaluation.QueryResult;
+import org.apache.nifi.attribute.expression.language.evaluation.StringEvaluator;
+import org.apache.nifi.attribute.expression.language.evaluation.StringQueryResult;
+import org.apache.nifi.attribute.expression.language.evaluation.literals.BooleanLiteralEvaluator;
+import org.apache.nifi.attribute.expression.language.evaluation.literals.StringLiteralEvaluator;
+import org.apache.nifi.attribute.expression.language.exception.AttributeExpressionLanguageException;
+
+public class GetDelimitedFieldEvaluator extends StringEvaluator {
+    private final Evaluator<String> subjectEval;
+    private final Evaluator<Long> indexEval;
+    private final Evaluator<String> delimiterEval;
+    private final Evaluator<String> quoteCharEval;
+    private final Evaluator<String> escapeCharEval;
+    private final Evaluator<Boolean> stripCharsEval;
+
+    public GetDelimitedFieldEvaluator(final Evaluator<String> subject, final Evaluator<Long> index) {
+        this(subject, index, new StringLiteralEvaluator(","));
+    }
+
+    public GetDelimitedFieldEvaluator(final Evaluator<String> subject, final Evaluator<Long> index, final Evaluator<String> delimiter) {
+        this(subject, index, delimiter, new StringLiteralEvaluator("\""));
+    }
+
+    public GetDelimitedFieldEvaluator(final Evaluator<String> subject, final Evaluator<Long> index, final Evaluator<String> delimiter,
+        final Evaluator<String> quoteChar) {
+        this(subject, index, delimiter, quoteChar, new StringLiteralEvaluator("\\\\"));
+    }
+
+    public GetDelimitedFieldEvaluator(final Evaluator<String> subject, final Evaluator<Long> index, final Evaluator<String> delimiter,
+        final Evaluator<String> quoteChar, final Evaluator<String> escapeChar) {
+        this(subject, index, delimiter, quoteChar, escapeChar, new BooleanLiteralEvaluator(false));
+    }
+
+    public GetDelimitedFieldEvaluator(final Evaluator<String> subject, final Evaluator<Long> index, final Evaluator<String> delimiter,
+        final Evaluator<String> quoteChar, final Evaluator<String> escapeChar, final Evaluator<Boolean> stripChars) {
+        this.subjectEval = subject;
+        this.indexEval = index;
+        this.delimiterEval = delimiter;
+        this.quoteCharEval = quoteChar;
+        this.escapeCharEval = escapeChar;
+        this.stripCharsEval = stripChars;
+    }
+
+    @Override
+    public QueryResult<String> evaluate(final Map<String, String> attributes) {
+        final String subject = subjectEval.evaluate(attributes).getValue();
+        if (subject == null || subject.isEmpty()) {
+            return new StringQueryResult("");
+        }
+
+        final Long index = indexEval.evaluate(attributes).getValue();
+        if (index == null) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the index (which field to obtain) was not specified");
+        }
+        if (index < 1) {
+            return new StringQueryResult("");
+        }
+
+        final String delimiter = delimiterEval.evaluate(attributes).getValue();
+        if (delimiter == null || delimiter.isEmpty()) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the delimiter was not specified");
+        } else if (delimiter.length() > 1) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the delimiter evaluated to \"" + delimiter
+                + "\", but only a single character is allowed.");
+        }
+
+        final String quoteString = quoteCharEval.evaluate(attributes).getValue();
+        if (quoteString == null || quoteString.isEmpty()) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the quote character "
+                + "(which character is used to enclose values that contain the delimiter) was not specified");
+        } else if (quoteString.length() > 1) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the quote character "
+                + "(which character is used to enclose values that contain the delimiter) evaluated to \"" + quoteString + "\", but only a single character is allowed.");
+        }
+
+        final String escapeString = escapeCharEval.evaluate(attributes).getValue();
+        if (escapeString == null || escapeString.isEmpty()) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the escape character "
+                + "(which character is used to escape the quote character or delimiter) was not specified");
+        } else if (escapeString.length() > 1) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the escape character "
+                + "(which character is used to escape the quote character or delimiter) evaluated to \"" + escapeString + "\", but only a single character is allowed.");
+        }
+
+        Boolean stripChars = stripCharsEval.evaluate(attributes).getValue();
+        if (stripChars == null) {
+            stripChars = Boolean.FALSE;
+        }
+
+        final char quoteChar = quoteString.charAt(0);
+        final char delimiterChar = delimiter.charAt(0);
+        final char escapeChar = escapeString.charAt(0);
+
+        // ensure that quoteChar, delimiterChar, escapeChar are all different.
+        if (quoteChar == delimiterChar) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the quote character and the delimiter are the same");
+        }
+        if (quoteChar == escapeChar) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the quote character and the escape character are the same");
+        }
+        if (delimiterChar == escapeChar) {
+            throw new AttributeExpressionLanguageException("Cannot evaluate getDelimitedField function because the delimiter and the escape character are the same");
+        }
+
+        // Iterate through each character in the subject, trying to find the field index that we care about and extracting the chars from it.
+        final StringBuilder fieldBuilder = new StringBuilder();
+        final int desiredFieldIndex = index.intValue();
+        final int numChars = subject.length();
+
+        boolean inQuote = false;
+        int curFieldIndex = 1;
+        boolean lastCharIsEscape = false;
+        for (int i = 0; i < numChars; i++) {
+            final char c = subject.charAt(i);
+
+            if (c == quoteChar && !lastCharIsEscape) {
+                // we found a quote character that is not escaped. Flip the value of 'inQuote'
+                inQuote = !inQuote;
+                if (!stripChars && curFieldIndex == desiredFieldIndex) {
+                    fieldBuilder.append(c);
+                }
+            } else if (c == delimiterChar && !lastCharIsEscape && !inQuote) {
+                // We found a delimiter that is not escaped and we are not in quotes - or we ran out of characters so we consider this
+                // the last character.
+                final int indexJustFinished = curFieldIndex++;
+                if (indexJustFinished == desiredFieldIndex) {
+                    return new StringQueryResult(fieldBuilder.toString());
+                }
+            } else if (curFieldIndex == desiredFieldIndex) {
+                if (c != escapeChar || !stripChars) {
+                    fieldBuilder.append(c);
+                }
+            }
+
+            lastCharIsEscape = (c == escapeChar) && !lastCharIsEscape;
+        }
+
+        if (curFieldIndex == desiredFieldIndex) {
+            // we have run out of characters and we are on the desired field. Return the characters from this field.
+            return new StringQueryResult(fieldBuilder.toString());
+        }
+
+        // We did not find enough fields. Return an empty string.
+        return new StringQueryResult("");
+    }
+
+    @Override
+    public Evaluator<?> getSubjectEvaluator() {
+        return subjectEval;
+    }
+
+}
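
The character scan in evaluate() can be easier to follow in isolation. The
following is a simplified standalone sketch, not the committed class: it takes
plain values instead of Evaluators, skips the argument validation, and treats
the last field as found when the scan ends on the desired index.
DelimitedFieldSketch and getField are hypothetical names used only here.

```java
final class DelimitedFieldSketch {
    // Returns the 1-based field 'desiredIndex' of 'subject', or "" if there is no such field.
    static String getField(final String subject, final int desiredIndex, final char delimiter,
        final char quote, final char escape, final boolean strip) {
        if (subject == null || subject.isEmpty() || desiredIndex < 1) {
            return "";
        }

        final StringBuilder field = new StringBuilder();
        boolean inQuote = false;
        boolean lastCharIsEscape = false;
        int curFieldIndex = 1;

        for (int i = 0; i < subject.length(); i++) {
            final char c = subject.charAt(i);

            if (c == quote && !lastCharIsEscape) {
                // Unescaped quote: toggle quoting, keep the quote unless stripping.
                inQuote = !inQuote;
                if (!strip && curFieldIndex == desiredIndex) {
                    field.append(c);
                }
            } else if (c == delimiter && !lastCharIsEscape && !inQuote) {
                // Unescaped, unquoted delimiter: the current field has ended.
                if (curFieldIndex++ == desiredIndex) {
                    return field.toString();
                }
            } else if (curFieldIndex == desiredIndex) {
                // Ordinary character of the desired field; drop escape chars when stripping.
                if (c != escape || !strip) {
                    field.append(c);
                }
            }

            // An escape character escapes only the character that follows it.
            lastCharIsEscape = (c == escape) && !lastCharIsEscape;
        }

        // End of input: the last field has no trailing delimiter.
        return curFieldIndex == desiredIndex ? field.toString() : "";
    }
}
```

With the subject "Name, Age, Title", a comma delimiter and stripping disabled,
field 2 is " Age" (the leading space is preserved) and field 4 is the empty
string.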

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
----------------------------------------------------------------------
diff --git a/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java b/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
index 131bcde..c42931e 100644
--- a/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
+++ b/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java
@@ -1156,6 +1156,67 @@ public class TestQuery {
         
verifyEquals("${allMatchingAttributes('a.*'):contains('2'):equals('true'):and( 
${literal(true)} )}", attributes, true);
     }
 
+    @Test
+    public void testGetDelimitedField() {
+        final Map<String, String> attributes = new HashMap<>();
+
+        attributes.put("line", "Name, Age, Title");
+
+        // Test "simple" case - comma separated with no quoted or escaped text
+        verifyEquals("${line:getDelimitedField(2)}", attributes, " Age");
+        verifyEquals("${line:getDelimitedField(2, ',')}", attributes, " Age");
+        verifyEquals("${line:getDelimitedField(2, ',', '\"')}", attributes, " Age");
+        verifyEquals("${line:getDelimitedField(2, ',', '\"', '\\\\')}", attributes, " Age");
+
+        // test with a space in column
+        attributes.put("line", "First Name, Age, Title");
+        verifyEquals("${line:getDelimitedField(1)}", attributes, "First Name");
+        verifyEquals("${line:getDelimitedField(1, ',')}", attributes, "First Name");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"')}", attributes, "First Name");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"', '\\\\')}", attributes, "First Name");
+
+        // test quoted value
+        attributes.put("line", "\"Name (Last, First)\", Age, Title");
+        verifyEquals("${line:getDelimitedField(1)}", attributes, "\"Name (Last, First)\"");
+        verifyEquals("${line:getDelimitedField(1, ',')}", attributes, "\"Name (Last, First)\"");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"')}", attributes, "\"Name (Last, First)\"");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"', '\\\\')}", attributes, "\"Name (Last, First)\"");
+
+        // test non-standard quote char
+        attributes.put("line", "_Name (Last, First)_, Age, Title");
+        verifyEquals("${line:getDelimitedField(1)}", attributes, "_Name (Last");
+        verifyEquals("${line:getDelimitedField(1, ',', '_')}", attributes, "_Name (Last, First)_");
+
+        // test escape char
+        attributes.put("line", "Name (Last\\, First), Age, Title");
+        verifyEquals("${line:getDelimitedField(1)}", attributes, "Name (Last\\, First)");
+
+        attributes.put("line", "Name (Last__, First), Age, Title");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"', '_')}", attributes, "Name (Last__");
+
+        attributes.put("line", "Name (Last_, First), Age, Title");
+        verifyEquals("${line:getDelimitedField(1, ',', '\"', '_')}", attributes, "Name (Last_, First)");
+
+        // test escape for enclosing chars
+        attributes.put("line", "\\\"Name (Last, First), Age, Title");
+        verifyEquals("${line:getDelimitedField(1)}", attributes, "\\\"Name (Last");
+
+        // get non existing field
+        attributes.put("line", "Name, Age, Title");
+        verifyEquals("${line:getDelimitedField(12)}", attributes, "");
+
+        // test escape char within quotes
+        attributes.put("line", "col 1, col 2, \"The First, Second, and \\\"Last\\\" Column\", Last");
+        verifyEquals("${line:getDelimitedField(3):trim()}", attributes, "\"The First, Second, and \\\"Last\\\" Column\"");
+
+        // test stripping chars
+        attributes.put("line", "col 1, col 2, \"The First, Second, and \\\"Last\\\" Column\", Last");
+        verifyEquals("${line:getDelimitedField(3, ',', '\"', '\\\\', true):trim()}", attributes, "The First, Second, and \"Last\" Column");
+
+        attributes.put("line", "\"Jacobson, John\", 32, Mr.");
+        verifyEquals("${line:getDelimitedField(2)}", attributes, " 32");
+    }
+
     private void verifyEquals(final String expression, final Map<String, String> attributes, final Object expectedResult) {
         Query.validateExpression(expression, false);
         assertEquals(String.valueOf(expectedResult), Query.evaluateExpressions(expression, attributes, null));

http://git-wip-us.apache.org/repos/asf/nifi/blob/4d88aaed/nifi-docs/src/main/asciidoc/expression-language-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/expression-language-guide.adoc b/nifi-docs/src/main/asciidoc/expression-language-guide.adoc
index 9593dbf..00ce0bb 100644
--- a/nifi-docs/src/main/asciidoc/expression-language-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/expression-language-guide.adoc
@@ -739,6 +739,45 @@ then the following Expressions will result in the following values:
 
 
 
+[.function]
+=== getDelimitedField
+
+*Description*: [.description]#Parses the Subject as a delimited line of text and returns just a single field
+       from that delimited text.#
+
+*Subject Type*: [.subject]#String#
+
+*Arguments*:
+
+       - [.argName]#_index_# : [.argDesc]#The index of the field to return. A value of 1 will return the first field,
+               a value of 2 will return the second field, and so on.#
+       - [.argName]#_delimiter_# : [.argDesc]#Optional argument that provides the character to use as a field separator.
+               If not specified, a comma will be used. This value must be exactly 1 character.#
+       - [.argName]#_quoteChar_# : [.argDesc]#Optional argument that provides the character that can be used to quote values
+               so that the delimiter can be used within a single field. If not specified, a double-quote (") will be used. This value
+               must be exactly 1 character.#
+       - [.argName]#_escapeChar_# :  [.argDesc]#Optional argument that provides the character that can be used to escape the Quote Character
+           or the Delimiter within a field. If not specified, a backslash (\) is used. This value must be exactly 1 character.#
+       - [.argName]#_stripChars_# : [.argDesc]#Optional argument that specifies whether or not quote characters and escape characters should
+           be stripped. For example, if we have a field value "1, 2, 3" and this value is true, we will get the value `1, 2, 3`, but if this
+           value is false, we will get the value `"1, 2, 3"` with the quotes. The default value is false. This value must be either `true`
+           or `false`.#
+
+*Return Type*: [.returnType]#String#
+
+*Examples*: If the "line" attribute contains the value _"Jacobson, John", 32, Mr._
+       and the "altLine" attribute contains the value _Jacobson, John|32|Mr._
+    then the following Expressions will result in the following values:
+
+.GetDelimitedField Examples
+|======================================================================
+| Expression | Value
+| `${line:getDelimitedField(2)}` | _(space)_32
+| `${line:getDelimitedField(2):trim()}` | 32
+| `${line:getDelimitedField(1)}` | "Jacobson, John"
+| `${line:getDelimitedField(1, ',', '"', '\\', true)}` | Jacobson, John
+| `${altLine:getDelimitedField(1, '\|')}` | Jacobson, John
+|======================================================================
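
The argument semantics documented above (1-based index, delimiter, quote character, escape character, optional stripping) can be illustrated with a minimal standalone sketch. This is not the GetDelimitedFieldEvaluator from this commit: the class name DelimitedFieldSketch and the method getField are hypothetical, and the scanning logic is an assumption written only to reproduce the values in the examples table.

```java
// Hypothetical sketch only; not the NiFi implementation.
class DelimitedFieldSketch {

    // Returns the field at the 1-based index, or "" if the line has fewer fields.
    static String getField(String line, int index, char delimiter,
                           char quote, char escape, boolean strip) {
        StringBuilder field = new StringBuilder();
        int current = 1;          // 1-based index of the field being scanned
        boolean inQuote = false;  // true while inside a quoted section
        boolean escaped = false;  // true if the previous char was an unescaped escape char

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);

            if (escaped) {
                // An escaped character is always taken literally.
                if (current == index) field.append(c);
                escaped = false;
            } else if (c == escape) {
                escaped = true;
                // Keep the escape character itself unless stripping was requested.
                if (current == index && !strip) field.append(c);
            } else if (c == quote) {
                inQuote = !inQuote;
                // Keep the quote character itself unless stripping was requested.
                if (current == index && !strip) field.append(c);
            } else if (c == delimiter && !inQuote) {
                // An unquoted, unescaped delimiter ends the current field.
                if (current == index) return field.toString();
                current++;
            } else if (current == index) {
                field.append(c);
            }
        }

        // Ran out of characters: return the field if it is the desired one.
        return current == index ? field.toString() : "";
    }

    public static void main(String[] args) {
        String line = "\"Jacobson, John\", 32, Mr.";
        System.out.println("[" + getField(line, 2, ',', '"', '\\', false) + "]");
        System.out.println("[" + getField(line, 1, ',', '"', '\\', true) + "]");
    }
}
```

As in the documented function, leading whitespace after a delimiter is kept as part of the field, which is why the examples table chains trim() onto the result.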
 
 
 
