This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit e65b6d77a1afec28e3fe8cf592d61a1e4ea8656c
Author: Grant Henke <[email protected]>
AuthorDate: Sun Jul 14 18:20:24 2019 -0500

    [examples] Add a complete NiFi quickstart example
    
    This patch adds a brief example using Apache NiFi
    to ingest data into Apache Kudu.
    
    Change-Id: I71f3bc5898c15d7bc19cffb3a91b9efac3f6928b
    Reviewed-on: http://gerrit.cloudera.org:8080/13878
    Tested-by: Grant Henke <[email protected]>
    Reviewed-by: Andrew Wong <[email protected]>
---
 docs/quickstart.adoc                          |   12 +-
 examples/quickstart/nifi/README.adoc          |  165 ++++
 examples/quickstart/nifi/Random_User_Kudu.xml | 1002 +++++++++++++++++++++++++
 examples/quickstart/spark/README.adoc         |    2 +-
 4 files changed, 1174 insertions(+), 7 deletions(-)

diff --git a/docs/quickstart.adoc b/docs/quickstart.adoc
index e02506b..46e06a9 100644
--- a/docs/quickstart.adoc
+++ b/docs/quickstart.adoc
@@ -30,7 +30,7 @@
 Follow these instructions to set up and run a local Kudu Cluster using Docker,
 and get started using Apache Kudu in minutes.
 
-Note: This is intended for demonstration purposes only and shouldn't
+NOTE: This is intended for demonstration purposes only and shouldn't
 be used for production or performance/scale testing.
 
 [[quickstart_vm]]
@@ -48,8 +48,8 @@ Clone the Apache Kudu repository using Git and change to the `kudu` directory:
 
 [source,bash]
 ----
-$ git clone https://github.com/apache/kudu
-$ cd kudu
+git clone https://github.com/apache/kudu
+cd kudu
 ----
 
 == Start the Quickstart Cluster
@@ -60,7 +60,7 @@ Set the `KUDU_QUICKSTART_IP` environment variable to your ip address:
 
 [source,bash]
 ----
-$ export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1)
+export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1)
 ----
 
 === Bring up the Cluster
@@ -75,7 +75,7 @@ you can specify the master addresses with `localhost:7051,localhost:7151,localho
 docker-compose -f docker/quickstart.yml up
 ----
 
-Note: You can include the `-d` flag to run the cluster in the background.
+NOTE: You can include the `-d` flag to run the cluster in the background.
 
 === View the Web-UI
 
@@ -106,7 +106,7 @@ export KUDU_USER_NAME=kudu
 kudu cluster ksck localhost:7051,localhost:7151,localhost:7251
 ----
 
-Note: Setting `KUDU_USER_NAME=kudu` simplifies using Kudu from various user
+NOTE: Setting `KUDU_USER_NAME=kudu` simplifies using Kudu from various user
 accounts in a non-secure environment.
 
 == Running a Brief Example
diff --git a/examples/quickstart/nifi/README.adoc b/examples/quickstart/nifi/README.adoc
new file mode 100644
index 0000000..3d4e168
--- /dev/null
+++ b/examples/quickstart/nifi/README.adoc
@@ -0,0 +1,165 @@
+= Apache NiFi Quickstart
+
+Below is a brief example using Apache NiFi to ingest data into Apache Kudu.
+
+== Start the Kudu Quickstart Environment
+
+See the Apache Kudu
+link:https://kudu.apache.org/docs/quickstart.html[quickstart documentation]
+to set up and run the Kudu quickstart environment.
+
+== Run Apache NiFi
+
+Use the following command to run the latest Apache NiFi Docker image:
+
+[source,bash]
+----
+docker run --name kudu-nifi --network="docker_default" -p 8080:8080 apache/nifi:latest
+----
+
+You can view the running NiFi instance at link:http://localhost:8080/nifi[localhost:8080/nifi].
+
+NOTE: `--network="docker_default"` is specified to connect the container to the
+same network as the quickstart cluster.
+
+NOTE: You can include the `-d` flag to run the container in the background.
+
+== Create the Kudu table
+
+Create the `random_user` Kudu table that matches the expected Schema.
+
+In order to do this without any dependencies on your host machine, we will
+use the `jshell` REPL in a Docker container to create the table using the
+Java API. First, set up the Docker container, download the jar, and run the REPL:
+
+[source,bash]
+----
+docker run -it --rm --network="docker_default" maven:latest /bin/bash
+# Download the kudu-client-tools jar which has the kudu-client and all the dependencies.
+mkdir jars
+mvn dependency:copy \
+    -Dartifact=org.apache.kudu:kudu-client-tools:1.10.0 \
+    -DoutputDirectory=jars
+# Run the jshell with the jar on the classpath.
+jshell --class-path jars/*
+----
+
+NOTE: `--network="docker_default"` is specified to connect the container to the
+same network as the quickstart cluster.
+
+Then, once in the `jshell` REPL, create the table using the Java API:
+
+[source,java]
+----
+import org.apache.kudu.client.CreateTableOptions
+import org.apache.kudu.client.KuduClient
+import org.apache.kudu.client.KuduClient.KuduClientBuilder
+import org.apache.kudu.ColumnSchema.ColumnSchemaBuilder
+import org.apache.kudu.Schema
+import org.apache.kudu.Type
+
+KuduClient client =
+  new KuduClientBuilder("kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251").build();
+
+if(client.tableExists("random_user")) {
+  client.deleteTable("random_user");
+}
+
+Schema schema = new Schema(Arrays.asList(
+  new ColumnSchemaBuilder("ssn", Type.STRING).key(true).build(),
+  new ColumnSchemaBuilder("firstName", Type.STRING).build(),
+  new ColumnSchemaBuilder("lastName", Type.STRING).build(),
+  new ColumnSchemaBuilder("email", Type.STRING).build())
+);
+CreateTableOptions tableOptions =
+  new CreateTableOptions().setNumReplicas(3).addHashPartitions(Arrays.asList("ssn"), 4);
+client.createTable("random_user", schema, tableOptions);
+----
+
+Once complete, you can use `CTRL + D` to exit the REPL and `exit` to exit the container.
+
+== Load the Dataflow Template
+
+The `Random_User_Kudu.xml` template downloads randomly generated user data from
+http://randomuser.me and then pushes the data into Kudu. The data is pulled in
+100 records at a time and then split into individual records. The incoming data
+is in JSON Format.
+
+Next, the user's social security number, first name, last name, and e-mail
+address are extracted from the JSON into FlowFile Attributes and the content is
+modified to become a new JSON document consisting of only 4 fields:
+`ssn`, `firstName`, `lastName`, and `email`. Finally, this smaller JSON is then pushed to
+Kudu as a single row, each field being a separate column in that row.
+
+Follow the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Import_Template["Importing a Template" documentation]
+to load `Random_User_Kudu.xml`.
+
+Then follow the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#instantiating-a-template["Instantiating a Template" documentation]
+to add the `Random User Kudu` template to the canvas.
+
+Once the template is added to the canvas, you need to start the JsonTreeReader
+controller service. You can do this via the PutKudu processor configuration
+or via the NiFi Flow configuration in the Operate panel. See the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Controller_Services_for_Dataflows["Controller Service" documentation]
+for more details.
+
+Now you can start individual processors by right-clicking each processor and selecting `Start`.
+You can also explore the configuration, queue contents, and more by right-clicking on each element.
+Alternatively, you can use the Operate panel and start the entire flow at once.
+More about starting and stopping NiFi components can be read in the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#starting-a-component["Starting a Component" documentation].
+
+== Shutdown NiFi
+
+Once you are done with the NiFi container, you can shut it down in a couple of ways.
+If you ran NiFi without the `-d` flag, you can use `ctrl + c` to stop the container.
+
+If you ran NiFi with the `-d` flag, you can use the following to
+gracefully shut down the container:
+
+[source,bash]
+----
+docker stop kudu-nifi
+----
+
+To permanently remove the container run the following:
+
+[source,bash]
+----
+docker rm kudu-nifi
+----
+
+== Next steps
+
+The above example showed how to ingest data into Kudu using Apache NiFi.
+Next, explore the other quickstart guides to learn how to query or process
+the data using other tools.
+
+For example, the link:https://github.com/apache/kudu/tree/master/examples/quickstart/spark[Spark quickstart guide]
+will walk you through how to set up and query Kudu tables with the `spark-kudu`
+integration.
+
+If you have already run through the Spark quickstart, the following is a brief
+example of the code that lets you query the `random_user` table:
+
+[source,bash]
+----
+spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.10.0
+----
+
+[source,scala]
+----
+:paste
+val random_user = spark.read
+       .option("kudu.master", "localhost:7051,localhost:7151,localhost:7251")
+       .option("kudu.table", "random_user")
+       // We need to use leader_only because Kudu on Docker currently doesn't
+       // support Snapshot scans due to `--use_hybrid_clock=false`.
+       .option("kudu.scanLocality", "leader_only")
+       .format("kudu").load
+random_user.createOrReplaceTempView("random_user")
+spark.sql("SELECT count(*) FROM random_user").show()
+spark.sql("SELECT * FROM random_user LIMIT 5").show()
+----
diff --git a/examples/quickstart/nifi/Random_User_Kudu.xml b/examples/quickstart/nifi/Random_User_Kudu.xml
new file mode 100644
index 0000000..158992a
--- /dev/null
+++ b/examples/quickstart/nifi/Random_User_Kudu.xml
@@ -0,0 +1,1002 @@
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<template encoding-version="1.2">
+    <description>This template downloads randomly generated user data from
+http://randomuser.me and then pushes the data into Kudu. The data is pulled in
+100 records at a time and then split into individual records. The incoming data
+is in JSON Format.
+
+Next, the user's social security number, first name, last name, and e-mail
+address are extracted from the JSON into FlowFile Attributes and the content is
+modified to become a new JSON document consisting of only 4 fields:
+ssn, firstName, lastName, email. Finally, this smaller JSON is then pushed to
+Kudu as a single row, each value being a separate column in that row.</description>
+    <groupId>00304107-016c-1000-2e69-8f2347fbf5c3</groupId>
+    <name>Random User Kudu</name>
+    <snippet>
+        <connections>
+            <id>2ebb7ae0-bb19-386d-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>10000</backPressureObjectThreshold>
+            <bends>
+                <x>469.6021968790567</x>
+                <y>1017.9549013717346</y>
+            </bends>
+            <bends>
+                <x>469.6021968790567</x>
+                <y>1067.9549013717346</y>
+            </bends>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>failure</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>7400e70c-689c-353f-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>61b913f5-e84d-33c4-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>split</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>786748c8-7a7c-3dd4-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <bends>
+                <x>173.46475219726562</x>
+                <y>179.42988967895508</y>
+            </bends>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>0</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>Response</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>6ada961c-399a-30dd-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>91e420fd-87d5-39e6-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>failure</selectedRelationships>
+            <selectedRelationships>success</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>5946c6a3-44fa-3784-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>c518dc9b-e66c-3664-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>5946c6a3-44fa-3784-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>matched</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>61b913f5-e84d-33c4-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <controllerServices>
+            <id>d8092989-d6ef-3313-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <bundle>
+                <artifact>nifi-record-serialization-services-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <descriptors>
+                <entry>
+                    <key>schema-access-strategy</key>
+                    <value>
+                        <name>schema-access-strategy</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-registry</key>
+                    <value>
+                        <identifiesControllerService>org.apache.nifi.schemaregistry.services.SchemaRegistry</identifiesControllerService>
+                        <name>schema-registry</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-name</key>
+                    <value>
+                        <name>schema-name</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-version</key>
+                    <value>
+                        <name>schema-version</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-branch</key>
+                    <value>
+                        <name>schema-branch</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-text</key>
+                    <value>
+                        <name>schema-text</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-inference-cache</key>
+                    <value>
+                        <identifiesControllerService>org.apache.nifi.serialization.RecordSchemaCacheService</identifiesControllerService>
+                        <name>schema-inference-cache</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Date Format</key>
+                    <value>
+                        <name>Date Format</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Time Format</key>
+                    <value>
+                        <name>Time Format</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Timestamp Format</key>
+                    <value>
+                        <name>Timestamp Format</name>
+                    </value>
+                </entry>
+            </descriptors>
+            <name>JsonTreeReader</name>
+            <persistsState>false</persistsState>
+            <properties>
+                <entry>
+                    <key>schema-access-strategy</key>
+                </entry>
+                <entry>
+                    <key>schema-registry</key>
+                </entry>
+                <entry>
+                    <key>schema-name</key>
+                </entry>
+                <entry>
+                    <key>schema-version</key>
+                </entry>
+                <entry>
+                    <key>schema-branch</key>
+                </entry>
+                <entry>
+                    <key>schema-text</key>
+                </entry>
+                <entry>
+                    <key>schema-inference-cache</key>
+                </entry>
+                <entry>
+                    <key>Date Format</key>
+                </entry>
+                <entry>
+                    <key>Time Format</key>
+                </entry>
+                <entry>
+                    <key>Timestamp Format</key>
+                </entry>
+            </properties>
+            <state>ENABLED</state>
+            <type>org.apache.nifi.json.JsonTreeReader</type>
+        </controllerServices>
+        <processors>
+            <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>5.00505561901673</x>
+                <y>268.45753564705933</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>JsonPath Expression</key>
+                        <value>
+                            <name>JsonPath Expression</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>
+                            <name>Null Value Representation</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>JsonPath Expression</key>
+                        <value>$.results[*]</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>empty string</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>SplitJson</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>original</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>split</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.SplitJson</type>
+        </processors>
+        <processors>
+            <id>5946c6a3-44fa-3784-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>5.00036773572856</x>
+                <y>744.0256629035371</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Attributes List</key>
+                        <value>
+                            <name>Attributes List</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>attributes-to-json-regex</key>
+                        <value>
+                            <name>attributes-to-json-regex</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Destination</key>
+                        <value>
+                            <name>Destination</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Include Core Attributes</key>
+                        <value>
+                            <name>Include Core Attributes</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value</key>
+                        <value>
+                            <name>Null Value</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Attributes List</key>
+                        <value>ssn, firstName, lastName, email</value>
+                    </entry>
+                    <entry>
+                        <key>attributes-to-json-regex</key>
+                    </entry>
+                    <entry>
+                        <key>Destination</key>
+                        <value>flowfile-content</value>
+                    </entry>
+                    <entry>
+                        <key>Include Core Attributes</key>
+                        <value>true</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value</key>
+                        <value>false</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>AttributesToJSON</name>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>success</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.AttributesToJSON</type>
+        </processors>
+        <processors>
+            <id>61b913f5-e84d-33c4-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>6.4349386870426315</x>
+                <y>504.31885574631224</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Destination</key>
+                        <value>
+                            <name>Destination</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Return Type</key>
+                        <value>
+                            <name>Return Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Path Not Found Behavior</key>
+                        <value>
+                            <name>Path Not Found Behavior</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>
+                            <name>Null Value Representation</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>email</key>
+                        <value>
+                            <name>email</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>firstName</key>
+                        <value>
+                            <name>firstName</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>lastName</key>
+                        <value>
+                            <name>lastName</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>ssn</key>
+                        <value>
+                            <name>ssn</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Destination</key>
+                        <value>flowfile-attribute</value>
+                    </entry>
+                    <entry>
+                        <key>Return Type</key>
+                        <value>auto-detect</value>
+                    </entry>
+                    <entry>
+                        <key>Path Not Found Behavior</key>
+                        <value>ignore</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>empty string</value>
+                    </entry>
+                    <entry>
+                        <key>email</key>
+                        <value>$.email</value>
+                    </entry>
+                    <entry>
+                        <key>firstName</key>
+                        <value>$.name.first</value>
+                    </entry>
+                    <entry>
+                        <key>lastName</key>
+                        <value>$.name.last</value>
+                    </entry>
+                    <entry>
+                        <key>ssn</key>
+                        <value>$.id.value</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>EvaluateJsonPath</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>matched</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>unmatched</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.EvaluateJsonPath</type>
+        </processors>
+        <processors>
+            <id>6ada961c-399a-30dd-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>0.0</x>
+                <y>0.0</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>HTTP Method</key>
+                        <value>
+                            <name>HTTP Method</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Remote URL</key>
+                        <value>
+                            <name>Remote URL</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>SSL Context Service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
+                            <name>SSL Context Service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Connection Timeout</key>
+                        <value>
+                            <name>Connection Timeout</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Read Timeout</key>
+                        <value>
+                            <name>Read Timeout</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Include Date Header</key>
+                        <value>
+                            <name>Include Date Header</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Follow Redirects</key>
+                        <value>
+                            <name>Follow Redirects</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Attributes to Send</key>
+                        <value>
+                            <name>Attributes to Send</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Username</key>
+                        <value>
+                            <name>Basic Authentication Username</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Password</key>
+                        <value>
+                            <name>Basic Authentication Password</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>proxy-configuration-service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.proxy.ProxyConfigurationService</identifiesControllerService>
+                            <name>proxy-configuration-service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Host</key>
+                        <value>
+                            <name>Proxy Host</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Port</key>
+                        <value>
+                            <name>Proxy Port</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Type</key>
+                        <value>
+                            <name>Proxy Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-user</key>
+                        <value>
+                            <name>invokehttp-proxy-user</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-password</key>
+                        <value>
+                            <name>invokehttp-proxy-password</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Put Response Body In Attribute</key>
+                        <value>
+                            <name>Put Response Body In Attribute</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Max Length To Put In Attribute</key>
+                        <value>
+                            <name>Max Length To Put In Attribute</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Digest Authentication</key>
+                        <value>
+                            <name>Digest Authentication</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Always Output Response</key>
+                        <value>
+                            <name>Always Output Response</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Trusted Hostname</key>
+                        <value>
+                            <name>Trusted Hostname</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Add Response Headers to Request</key>
+                        <value>
+                            <name>Add Response Headers to Request</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Content-Type</key>
+                        <value>
+                            <name>Content-Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>send-message-body</key>
+                        <value>
+                            <name>send-message-body</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Use Chunked Encoding</key>
+                        <value>
+                            <name>Use Chunked Encoding</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Penalize on "No Retry"</key>
+                        <value>
+                            <name>Penalize on "No Retry"</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>use-etag</key>
+                        <value>
+                            <name>use-etag</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>etag-max-cache-size</key>
+                        <value>
+                            <name>etag-max-cache-size</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>HTTP Method</key>
+                        <value>GET</value>
+                    </entry>
+                    <entry>
+                        <key>Remote URL</key>
+                        <value>http://api.randomuser.me?nat=us&amp;results=100</value>
+                    </entry>
+                    <entry>
+                        <key>SSL Context Service</key>
+                    </entry>
+                    <entry>
+                        <key>Connection Timeout</key>
+                        <value>5 secs</value>
+                    </entry>
+                    <entry>
+                        <key>Read Timeout</key>
+                        <value>15 secs</value>
+                    </entry>
+                    <entry>
+                        <key>Include Date Header</key>
+                        <value>True</value>
+                    </entry>
+                    <entry>
+                        <key>Follow Redirects</key>
+                        <value>True</value>
+                    </entry>
+                    <entry>
+                        <key>Attributes to Send</key>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Username</key>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Password</key>
+                    </entry>
+                    <entry>
+                        <key>proxy-configuration-service</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Host</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Port</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Type</key>
+                        <value>http</value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-user</key>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-password</key>
+                    </entry>
+                    <entry>
+                        <key>Put Response Body In Attribute</key>
+                    </entry>
+                    <entry>
+                        <key>Max Length To Put In Attribute</key>
+                        <value>256</value>
+                    </entry>
+                    <entry>
+                        <key>Digest Authentication</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Always Output Response</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Trusted Hostname</key>
+                    </entry>
+                    <entry>
+                        <key>Add Response Headers to Request</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Content-Type</key>
+                        <value>${mime.type}</value>
+                    </entry>
+                    <entry>
+                        <key>send-message-body</key>
+                        <value>true</value>
+                    </entry>
+                    <entry>
+                        <key>Use Chunked Encoding</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Penalize on "No Retry"</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>use-etag</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>etag-max-cache-size</key>
+                        <value>10MB</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>10 seconds</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>Fetch User Data</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>No Retry</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Original</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>Response</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Retry</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.InvokeHTTP</type>
+        </processors>
+        <processors>
+            <id>d18d7c78-8767-35c5-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>6.6021968790567485</x>
+                <y>977.9549013717346</y>
+            </position>
+            <bundle>
+                <artifact>nifi-kudu-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Kudu Masters</key>
+                        <value>
+                            <name>Kudu Masters</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Table Name</key>
+                        <value>
+                            <name>Table Name</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>kerberos-credentials-service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.kerberos.KerberosCredentialsService</identifiesControllerService>
+                            <name>kerberos-credentials-service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Skip head line</key>
+                        <value>
+                            <name>Skip head line</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>record-reader</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.serialization.RecordReaderFactory</identifiesControllerService>
+                            <name>record-reader</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Insert Operation</key>
+                        <value>
+                            <name>Insert Operation</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Flush Mode</key>
+                        <value>
+                            <name>Flush Mode</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>FlowFiles per Batch</key>
+                        <value>
+                            <name>FlowFiles per Batch</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Batch Size</key>
+                        <value>
+                            <name>Batch Size</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Kudu Masters</key>
+                        <value>kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251</value>
+                    </entry>
+                    <entry>
+                        <key>Table Name</key>
+                        <value>random_user</value>
+                    </entry>
+                    <entry>
+                        <key>kerberos-credentials-service</key>
+                    </entry>
+                    <entry>
+                        <key>Skip head line</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>record-reader</key>
+                        <value>d8092989-d6ef-3313-0000-000000000000</value>
+                    </entry>
+                    <entry>
+                        <key>Insert Operation</key>
+                        <value>UPSERT</value>
+                    </entry>
+                    <entry>
+                        <key>Flush Mode</key>
+                        <value>AUTO_FLUSH_BACKGROUND</value>
+                    </entry>
+                    <entry>
+                        <key>FlowFiles per Batch</key>
+                        <value>1</value>
+                    </entry>
+                    <entry>
+                        <key>Batch Size</key>
+                        <value>100</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>PutKudu</name>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>success</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.kudu.PutKudu</type>
+        </processors>
+    </snippet>
+    <timestamp>07/18/2019 14:13:34 UTC</timestamp>
+</template>
diff --git a/examples/quickstart/spark/README.adoc b/examples/quickstart/spark/README.adoc
index 42953fe..b7ec637 100644
--- a/examples/quickstart/spark/README.adoc
+++ b/examples/quickstart/spark/README.adoc
@@ -3,7 +3,7 @@
 Below is a brief example using Apache Spark to load, query, and modify a real
 data set in Apache Kudu.
 
-== Start the Kudu Quickstart
+== Start the Kudu Quickstart Environment
 
 See the Apache Kudu
 link:https://kudu.apache.org/docs/quickstart.html[quickstart documentation]
