nifi-do...

jstorck Tue, 03 Oct 2017 06:31:52 -0700

Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta 
charset="utf-8"></meta><title>AvroRecordSetWriter</title><link rel="stylesheet" 
href="../../../../../css/component-usage.css" 
type="text/css"></link></head><script type="text/javascript">window.onload = 
function(){if(self==top) { document.getElementById('nameHeader').style.display 
= "inherit"; } }</script><body><h1 id="nameHeader" style="display: 
none;">AvroRecordSetWriter</h1><h2>Description: </h2><p>Writes the contents of 
a RecordSet in Binary Avro format.</p><h3>Tags: </h3><p>avro, result, set, 
writer, serializer, record, recordset, row</p><h3>Properties: </h3><p>In the 
list below, the names of required properties appear in <strong>bold</strong>. 
Any other properties (not in bold) are considered optional. The table also 
indicates any default values, and whether a property supports the <a 
href="../../../../../html/expression-language-guide.html">NiFi Expression 
Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default 
Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td 
id="name"><strong>Schema Write Strategy</strong></td><td 
id="default-value">avro-embedded</td><td id="allowable-values"><ul><li>Embed 
Avro Schema <img src="../../../../../html/images/iconInfo.png" alt="The 
FlowFile will have the Avro schema embedded into the content, as is typical 
with Avro" title="The FlowFile will have the Avro schema embedded into the 
content, as is typical with Avro"></img></li><li>Set 'schema.name' Attribute 
<img src="../../../../../html/images/iconInfo.png" alt="The FlowFile will be 
given an attribute named 'schema.name' and this attribute will indicate the 
name of the schema in the Schema Registry. Note that if the schema for a 
record is not obtained from a Schema Registry, then no attribute will be 
added." title="The FlowFile will be given an attribute named 'schema.name' 
and this attribute will indicate the name of the schema in the Schema 
Registry. Note that if the schema for a record is not obtained 
from a Schema Registry, then no attribute will be added."></img></li><li>Set 
'avro.schema' Attribute <img src="../../../../../html/images/iconInfo.png" 
alt="The FlowFile will be given an attribute named 'avro.schema' and this 
attribute will contain the Avro Schema that describes the records in the 
FlowFile. The contents of the FlowFile need not be Avro, but the text of the 
schema will be used." title="The FlowFile will be given an attribute named 
'avro.schema' and this attribute will contain the Avro Schema that describes 
the records in the FlowFile. The contents of the FlowFile need not be Avro, but 
the text of the schema will be used."></img></li><li>HWX Schema Reference 
Attributes <img src="../../../../../html/images/iconInfo.png" alt="The FlowFile 
will be given a set of 3 attributes to describe the schema: 
'schema.identifier', 'schema.version', and 'schema.protocol.version'. Note that 
if the schema for a record does not contain the necessary identifier and 
version, an Exception
  will be thrown when attempting to write the data." title="The FlowFile will 
be given a set of 3 attributes to describe the schema: 'schema.identifier', 
'schema.version', and 'schema.protocol.version'. Note that if the schema for a 
record does not contain the necessary identifier and version, an Exception will 
be thrown when attempting to write the data."></img></li><li>HWX 
Content-Encoded Schema Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the FlowFile 
will contain a reference to a schema in the Schema Registry service. The 
reference is encoded as a single byte indicating the 'protocol version', 
followed by 8 bytes indicating the schema identifier, and finally 4 bytes 
indicating the schema version, as per the Hortonworks Schema Registry 
serializers and deserializers, as found at 
https://github.com/hortonworks/registry. This will be prepended to each 
FlowFile. Note that if the schema for a record does not contain the necessary 
identifier and version, an Exception will be thrown when attempting to write 
the data." title="The 
content of the FlowFile will contain a reference to a schema in the Schema 
Registry service. The reference is encoded as a single byte indicating the 
'protocol version', followed by 8 bytes indicating the schema identifier, and 
finally 4 bytes indicating the schema version, as per the Hortonworks Schema 
Registry serializers and deserializers, as found at 
https://github.com/hortonworks/registry. This will be prepended to each 
FlowFile. Note that if the schema for a record does not contain the necessary 
identifier and version, an Exception will be thrown when attempting to write 
the data."></img></li><li>Confluent Schema Registry Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the FlowFile 
will contain a reference to a schema in the Schema Registry service. The 
reference is encoded as a single 'Magic Byte' followed by 4 bytes representing 
the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This will be prepended to each FlowFile. Note that if the schema for a record 
does not contain the necessary identifier and version, an Exception will be 
thrown when attempting to write the data. This is based on the encoding used by 
version 3.2.x of the Confluent Schema Registry." title="The content of the 
FlowFile will contain a reference to a schema in the Schema Registry service. 
The reference is encoded as a single 'Magic Byte' followed by 4 bytes 
representing the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This will be prepended to each FlowFile. Note that if the schema for a record 
does not contain the necessary identifier and version, an Exception will be 
thrown when attempting to write the data. This is based on the encoding used by 
version 3.2.x of the Confluent Schema Registry."></img></li><li>Do Not Write 
 Schema <img src="../../../../../html/images/iconInfo.png" alt="Do not add any 
schema-related information to the FlowFile." title="Do not add any 
schema-related information to the FlowFile."></img></li></ul></td><td 
id="description">Specifies how the schema for a Record should be added to the 
data.</td></tr><tr><td id="name"><strong>Schema Access 
Strategy</strong></td><td id="default-value">inherit-record-schema</td><td 
id="allowable-values"><ul><li>Use 'Schema Name' Property <img 
src="../../../../../html/images/iconInfo.png" alt="The name of the Schema to 
use is specified by the 'Schema Name' Property. The value of this property is 
used to lookup the Schema in the configured Schema Registry service." 
title="The name of the Schema to use is specified by the 'Schema Name' 
Property. The value of this property is used to lookup the Schema in the 
configured Schema Registry service."></img></li><li>Inherit Record Schema <img 
src="../../../../../html/images/iconInfo.png" alt="The schema used to write 
records will be the same schema that was given to the Record when 
the Record was created." title="The schema used to write records will be the 
same schema that was given to the Record when the Record was 
created."></img></li><li>Use 'Schema Text' Property <img 
src="../../../../../html/images/iconInfo.png" alt="The text of the Schema 
itself is specified by the 'Schema Text' Property. The value of this property 
must be a valid Avro Schema. If Expression Language is used, the value of the 
'Schema Text' property must be valid after substituting the expressions." 
title="The text of the Schema itself is specified by the 'Schema Text' 
Property. The value of this property must be a valid Avro Schema. If Expression 
Language is used, the value of the 'Schema Text' property must be valid after 
substituting the expressions."></img></li></ul></td><td 
id="description">Specifies how to obtain the schema that is to be used for 
interpreting the data.</td></tr><tr><td id="name">Schema 
Registry</td><td id="default-value"></td><td 
id="allowable-values"><strong>Controller Service API: 
</strong><br/>SchemaRegistry<br/><strong>Implementations: </strong><a 
href="../../../nifi-registry-nar/1.4.0/org.apache.nifi.schemaregistry.services.AvroSchemaRegistry/index.html">AvroSchemaRegistry</a><br/><a
 
href="../../../nifi-hwx-schema-registry-nar/1.4.0/org.apache.nifi.schemaregistry.hortonworks.HortonworksSchemaRegistry/index.html">HortonworksSchemaRegistry</a><br/><a
 
href="../../../nifi-confluent-platform-nar/1.4.0/org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry/index.html">ConfluentSchemaRegistry</a></td><td
 id="description">Specifies the Controller Service to use for the Schema 
Registry</td></tr><tr><td id="name">Schema Name</td><td 
id="default-value">${schema.name}</td><td id="allowable-values"></td><td 
id="description">Specifies the name of the schema to lookup in the Schema 
Registry property<br/><strong>Supports Expression Language: 
true</strong></td></tr><tr><td id="name">Schema Text</td><td 
id="default-value">${avro.schema}</td><td id="allowable-values"></td><td 
id="description">The text of an Avro-formatted Schema<br/><strong>Supports 
Expression Language: true</strong></td></tr><tr><td 
id="name"><strong>Compression Format</strong></td><td 
id="default-value">NONE</td><td 
id="allowable-values"><ul><li>BZIP2</li><li>DEFLATE</li><li>NONE</li><li>SNAPPY</li><li>LZO</li></ul></td><td 
id="description">Compression type to use when writing Avro files. Default is 
None.</td></tr></table><h3>State management: </h3>This component does not 
store state.<h3>Restricted: </h3>This component is not 
restricted.</body></html>
\ No newline at end of file
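
A note on the two content-encoded schema references described in the
AvroRecordSetWriter property table above: the byte layouts can be sketched in
a few lines of Python. This is only an illustration, not NiFi's
implementation. The function names are hypothetical, and the big-endian byte
order and the magic byte value of 0 are assumptions that the descriptions
above do not state.

```python
import struct


def hwx_schema_reference(protocol_version, schema_id, schema_version):
    """1 byte protocol version, 8 bytes schema identifier, 4 bytes schema version.

    Assumption: big-endian byte order (">" also disables struct padding).
    """
    return struct.pack(">BqI", protocol_version, schema_id, schema_version)


def confluent_schema_reference(schema_id, magic_byte=0):
    """A single magic byte followed by a 4-byte schema identifier.

    Assumption: the magic byte is 0 and the identifier is big-endian.
    """
    return struct.pack(">BI", magic_byte, schema_id)


def prepend_reference(reference, payload):
    """The reference is prepended to the FlowFile content."""
    return reference + payload


hwx_header = hwx_schema_reference(1, 42, 3)
confluent_header = confluent_schema_reference(7)
```

Under these assumptions the HWX header is always 13 bytes long and the
Confluent header is always 5 bytes long, so a reader can strip a fixed-size
prefix before decoding the remaining content.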


Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/additionalDetails.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/additionalDetails.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/additionalDetails.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/additionalDetails.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1,334 @@
+<!DOCTYPE html>
+<html lang="en">
+    <!--
+      Licensed to the Apache Software Foundation (ASF) under one or more
+      contributor license agreements.  See the NOTICE file distributed with
+      this work for additional information regarding copyright ownership.
+      The ASF licenses this file to You under the Apache License, Version 2.0
+      (the "License"); you may not use this file except in compliance with
+      the License.  You may obtain a copy of the License at
+          http://www.apache.org/licenses/LICENSE-2.0
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+      See the License for the specific language governing permissions and
+      limitations under the License.
+    -->
+    <head>
+        <meta charset="utf-8"/>
+        <title>CSVReader</title>
+        <link rel="stylesheet" href="../../../../../css/component-usage.css" 
type="text/css"/>
+    </head>
+
+    <body>
+        <p>
+               The CSVReader Controller Service expects input in such a way 
that the first line of a FlowFile specifies the name of
+               each column in the data. Following the first line, the rest of 
the FlowFile is expected to be valid CSV data from which
+               to form appropriate Records. The reader allows for 
customization of the CSV Format, such as which character should be used
+               to separate CSV fields, which character should be used for 
quoting and when to quote fields, which character should denote
+               a comment, etc.
+        </p>
+
+
+               <h2>Schemas and Type Coercion</h2>
+               
+               <p>
+                       When a record is parsed from incoming data, it is 
separated into fields. Each of these fields is then looked up against the
+                       configured schema (by field name) in order to determine 
what the type of the data should be. If the field is not present in
+                       the schema, that field is omitted from the Record. If 
the field is found in the schema, the data type of the received data
+                       is compared against the data type specified in the 
schema. If the types match, the value of that field is used as-is. If the
+                       schema indicates that the field should be of a 
different type, then the Controller Service will attempt to coerce the data
+                       into the type specified by the schema. If the field 
cannot be coerced into the specified type, an Exception will be thrown.
+               </p>
+               
+               <p>
+                       The following rules apply when attempting to coerce a 
field value from one data type to another:
+               </p>
+                       
+               <ul>
+                       <li>Any data type can be coerced into a String 
type.</li>
+                       <li>Any numeric data type (Byte, Short, Int, Long, 
Float, Double) can be coerced into any other numeric data type.</li>
+                       <li>Any numeric value can be coerced into a Date, Time, 
or Timestamp type, by assuming that the Long value is the number of
+                       milliseconds since epoch (Midnight GMT, January 1, 
1970).</li>
+                       <li>A String value can be coerced into a Date, Time, or 
Timestamp type, if its format matches the configured "Date Format," "Time 
Format,"
+                               or "Timestamp Format."</li>
+                       <li>A String value can be coerced into a numeric value 
if the value is of the appropriate type. For example, the String value
+                               <code>8</code> can be coerced into any numeric 
type. However, the String value <code>8.2</code> can be coerced into a Double 
or Float
+                               type but not an Integer.</li>
+                       <li>A String value of "true" or "false" (regardless of 
case) can be coerced into a Boolean value.</li>
+                       <li>A String value that is not empty can be coerced 
into a Char type. If the String contains more than 1 character, the first 
character is used
+                               and the rest of the characters are ignored.</li>
+                       <li>Any "date/time" type (Date, Time, Timestamp) can be 
coerced into any other "date/time" type.</li>
+                       <li>Any "date/time" type can be coerced into a Long 
type, representing the number of milliseconds since epoch (Midnight GMT, 
January 1, 1970).</li>
+                       <li>Any "date/time" type can be coerced into a String. 
The format of the String is whatever DateFormat is configured for the 
corresponding
+                               property (Date Format, Time Format, Timestamp 
Format property).</li>
+               </ul>
+               
+               <p>
+                       If none of the above rules apply when attempting to 
coerce a value from one data type to another, the coercion will fail and an 
Exception
+                       will be thrown.
+               </p>
+                       
+                       
+
+               <h2>Examples</h2>
+               
+               <h3>Example 1</h3>
+               
+        <p>
+               As an example, consider a FlowFile whose content consists of 
the following:
+        </p>
+
+<code>
+id, name, balance, join_date, notes<br />
+1, John, 48.23, 04/03/2007, "Our very<br />
+first customer!"<br />
+2, Jane, 1245.89, 08/22/2009,<br />
+3, Frank Franklin, "48481.29", 04/04/2016,<br />
+</code>
+        
+        <p>
+               Additionally, let's consider that this Controller Service is 
configured with the Schema Registry pointing to an AvroSchemaRegistry and the 
schema is
+               configured as the following:
+        </p>
+        
+<code>
+<pre>
+{
+  "namespace": "nifi",
+  "name": "balances",
+  "type": "record",
+  "fields": [
+    { "name": "id", "type": "int" },
+    { "name": "name", "type": "string" },
+    { "name": "balance", "type": "double" },
+    { "name": "join_date", "type": {
+      "type": "int",
+      "logicalType": "date"
+    }},
+    { "name": "notes", "type": "string" }
+  ]
+}
+</pre>
+</code>
+
+       <p>
+               In the example above, we see that the 'join_date' column is a 
Date type. In order for the CSV Reader to be able to properly parse a value as 
a date,
+               we need to provide the reader with the date format to use. In 
this example, we would configure the Date Format property to be 
<code>MM/dd/yyyy</code>
+               to indicate that it is a two-digit month, followed by a 
two-digit day, followed by a four-digit year - each separated by a slash.
+               In this case, the result will be that this FlowFile consists of 
3 different records. The first record will contain the following values:
+       </p>
+
+               <table>
+               <thead><tr>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </tr></thead>
+               <tbody>
+                       <tr>
+                               <td>id</td>
+                               <td>1</td>
+                       </tr>
+                       <tr>
+                               <td>name</td>
+                               <td>John</td>
+                       </tr>
+                       <tr>
+                               <td>balance</td>
+                               <td>48.23</td>
+                       </tr>
+                       <tr>
+                               <td>join_date</td>
+                               <td>04/03/2007</td>
+                       </tr>
+                       <tr>
+                               <td>notes</td>
+                               <td>Our very<br />first customer!</td>
+                       </tr>
+               </tbody>
+       </table>
+       
+       <p>
+               The second record will contain the following values:
+       </p>
+       
+               <table>
+               <thead><tr>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </tr></thead>
+               <tbody>
+                       <tr>
+                               <td>id</td>
+                               <td>2</td>
+                       </tr>
+                       <tr>
+                               <td>name</td>
+                               <td>Jane</td>
+                       </tr>
+                       <tr>
+                               <td>balance</td>
+                               <td>1245.89</td>
+                       </tr>
+                       <tr>
+                               <td>join_date</td>
+                               <td>08/22/2009</td>
+                       </tr>
+                       <tr>
+                               <td>notes</td>
+                               <td></td>
+                       </tr>
+               </tbody>
+       </table>
+       
+               <p>
+                       The third record will contain the following values:
+               </p>            
+       
+               <table>
+               <thead><tr>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </tr></thead>
+               <tbody>
+                       <tr>
+                               <td>id</td>
+                               <td>3</td>
+                       </tr>
+                       <tr>
+                               <td>name</td>
+                               <td>Frank Franklin</td>
+                       </tr>
+                       <tr>
+                               <td>balance</td>
+                               <td>48481.29</td>
+                       </tr>
+                       <tr>
+                               <td>join_date</td>
+                               <td>04/04/2016</td>
+                       </tr>
+                       <tr>
+                               <td>notes</td>
+                               <td></td>
+                       </tr>
+               </tbody>
+       </table>
+
+
+
+       <h3>Example 2 - Schema with CSV Header Line</h3>
+
+       <p>
+               When CSV data consists of a header line that outlines the 
column names, the reader provides
+               a couple of different properties for configuring how to handle 
these column names. The
+               "Schema Access Strategy" property as well as the associated 
properties ("Schema Registry," "Schema Text," and
+               "Schema Name" properties) can be used to specify how to obtain 
the schema. If the "Schema Access Strategy" is set
+               to "Use String Fields From Header" then the header line of the 
CSV will be used to determine the schema. Otherwise,
+               a schema will be referenced elsewhere. But what happens if a 
schema is obtained from a Schema Registry, for instance,
+               and the CSV Header indicates a different set of column names?
+       </p>
+       
+       <p>
+               For example, let's say that the following schema is obtained 
from the Schema Registry:
+       </p>
+
+<code>
+<pre>
+{
+  "namespace": "nifi",
+  "name": "balances",
+  "type": "record",
+  "fields": [
+    { "name": "id", "type": "int" },
+    { "name": "name", "type": "string" },
+    { "name": "balance", "type": "double" },
+    { "name": "memo", "type": "string" }
+  ]
+}
+</pre>
+</code>
+               
+               <p>
+                       And the CSV contains the following data:
+               </p>
+               
+<code>
+<pre>
+id, name, balance, notes
+1, John Doe, 123.45, First Customer
+</pre>
+</code>
+               
+               <p>
+               Note here that our schema indicates that the final column is 
named "memo" whereas the CSV Header indicates that it is named "notes."
+               </p>
+       
+       <p>
+       In this case, the reader will look at the "Ignore CSV Header Column 
Names" property. If this property is set to "true" then the column names
+       provided in the CSV will simply be ignored and the last column will be 
called "memo." However, if the "Ignore CSV Header Column Names" property
+       is set to "false" then the result will be that the last column will be 
named "notes" and each record will have a null value for the "memo" column.
+       </p>
+
+               <p>
+               With the "Ignore CSV Header Column Names" property set to 
"true":</p>
+               <table>
+               <thead><tr>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </tr></thead>
+               <tbody>
+                       <tr>
+                               <td>id</td>
+                               <td>1</td>
+                       </tr>
+                       <tr>
+                               <td>name</td>
+                               <td>John Doe</td>
+                       </tr>
+                       <tr>
+                               <td>balance</td>
+                               <td>123.45</td>
+                       </tr>
+                       <tr>
+                               <td>memo</td>
+                               <td>First Customer</td>
+                       </tr>
+               </tbody>
+       </table>
+               
+               
+               
+               <p>
+               With the "Ignore CSV Header Column Names" property set to 
"false":</p>
+                               <table>
+               <thead><tr>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </tr></thead>
+               <tbody>
+                       <tr>
+                               <td>id</td>
+                               <td>1</td>
+                       </tr>
+                       <tr>
+                               <td>name</td>
+                               <td>John Doe</td>
+                       </tr>
+                       <tr>
+                               <td>balance</td>
+                               <td>123.45</td>
+                       </tr>
+                       <tr>
+                               <td>notes</td>
+                               <td>First Customer</td>
+                       </tr>
+                       <tr>
+                               <td>memo</td>
+                               <td><code>null</code></td>
+                       </tr>
+               </tbody>
+       </table>
+               
+               
+    </body>
+</html>
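
A note on Example 1 and the coercion rules in additionalDetails.html above:
the behavior can be approximated with Python's standard csv module. This is a
rough sketch under stated assumptions, not how CSVReader is implemented. The
SCHEMA mapping is a hypothetical stand-in for the Avro schema, only a few of
the coercion rules are covered, and the sample data is written with a comma
separating every field.

```python
import csv
import io
from datetime import datetime

# Sample data modeled on Example 1; the notes field of the first row spans two lines.
CSV_TEXT = (
    "id, name, balance, join_date, notes\n"
    '1, John, 48.23, 04/03/2007, "Our very\n'
    'first customer!"\n'
    "2, Jane, 1245.89, 08/22/2009,\n"
    '3, Frank Franklin, "48481.29", 04/04/2016,\n'
)


def parse_date(text):
    # "%m/%d/%Y" is the strptime counterpart of the MM/dd/yyyy Date Format.
    return datetime.strptime(text, "%m/%d/%Y").date()


# Hypothetical stand-in for the schema: field name -> coercion function.
SCHEMA = {
    "id": int,
    "name": str,
    "balance": float,
    "join_date": parse_date,
    "notes": str,
}


def read_records(text, schema):
    # skipinitialspace drops the blank after each comma so quotes are honored.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        record = {}
        for name, raw in zip(header, row):
            if name in schema:  # fields missing from the schema are omitted
                record[name] = schema[name](raw)  # ValueError if coercion fails
        records.append(record)
    return records


records = read_records(CSV_TEXT, SCHEMA)
```

The quoted balance in the third row is still coerced to a floating-point
number, and the trailing comma in the second and third rows yields an empty
notes value, which matches the three tables in Example 1.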

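A note on Example 2 in additionalDetails.html above: the effect of the "Ignore
CSV Header Column Names" property, as that example's prose describes it, can
be sketched as follows. The function and parameter names are hypothetical,
values are left as strings, and this is an illustration rather than the
reader's actual logic.

```python
def name_fields(header, schema_fields, values, ignore_header_names):
    """Pair each value with a column name, then fill in missing schema fields.

    When ignore_header_names is True the schema's field names are used in
    column order; otherwise the names from the CSV header line are used.
    """
    names = schema_fields if ignore_header_names else header
    record = dict(zip(names, values))
    for field in schema_fields:
        record.setdefault(field, None)  # schema fields absent from the data are null
    return record


header = ["id", "name", "balance", "notes"]
schema_fields = ["id", "name", "balance", "memo"]
values = ["1", "John Doe", "123.45", "First Customer"]

ignored = name_fields(header, schema_fields, values, ignore_header_names=True)
honored = name_fields(header, schema_fields, values, ignore_header_names=False)
```

With the header ignored, the last column is stored under "memo". With the
header honored, it is stored under "notes" and "memo" is null.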
Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta 
charset="utf-8"></meta><title>CSVReader</title><link rel="stylesheet" 
href="../../../../../css/component-usage.css" 
type="text/css"></link></head><script type="text/javascript">window.onload = 
function(){if(self==top) { document.getElementById('nameHeader').style.display 
= "inherit"; } }</script><body><h1 id="nameHeader" style="display: 
none;">CSVReader</h1><h2>Description: </h2><p>Parses CSV-formatted data, 
returning each row in the CSV file as a separate record. This reader assumes 
that the first line in the content is the column names and all subsequent lines 
are the values. See Controller Service's Usage for further 
documentation.</p><p><a href="additionalDetails.html">Additional 
Details...</a></p><h3>Tags: </h3><p>csv, parse, record, row, reader, delimited, 
comma, separated, values</p><h3>Properties: </h3><p>In the list below, the 
names of required properties appear in <strong>bold</strong>. Any other 
properties (not in bold) are considered optional. The table also indicates 
any default values, and whether a 
property supports the <a 
href="../../../../../html/expression-language-guide.html">NiFi Expression 
Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default 
Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td 
id="name"><strong>Schema Access Strategy</strong></td><td 
id="default-value">csv-header-derived</td><td id="allowable-values"><ul><li>Use 
'Schema Name' Property <img src="../../../../../html/images/iconInfo.png" 
alt="The name of the Schema to use is specified by the 'Schema Name' Property. 
The value of this property is used to lookup the Schema in the configured 
Schema Registry service." title="The name of the Schema to use is specified by 
the 'Schema Name' Property. The value of this property is used to lookup the 
Schema in the configured Schema Registry service."></img></li><li>Use 'Schema 
Text' Property <img src="../../../../../html/images/iconInfo.png" alt="The text 
of the 
 Schema itself is specified by the 'Schema Text' Property. The value of this 
property must be a valid Avro Schema. If Expression Language is used, the value 
of the 'Schema Text' property must be valid after substituting the 
expressions." title="The text of the Schema itself is specified by the 'Schema 
Text' Property. The value of this property must be a valid Avro Schema. If 
Expression Language is used, the value of the 'Schema Text' property must be 
valid after substituting the expressions."></img></li><li>HWX Schema Reference 
Attributes <img src="../../../../../html/images/iconInfo.png" alt="The FlowFile 
contains 3 Attributes that will be used to lookup a Schema from the configured 
Schema Registry: 'schema.identifier', 'schema.version', and 
'schema.protocol.version'" title="The FlowFile contains 3 Attributes that will 
be used to lookup a Schema from the configured Schema Registry: 
'schema.identifier', 'schema.version', and 
'schema.protocol.version'"></img></li><li>HWX Content-Encoded Schema 
Reference <img src="../../../../../html/images/iconInfo.png" 
alt="The content of the FlowFile contains a reference to a schema in the Schema 
Registry service. The reference is encoded as a single byte indicating the 
'protocol version', followed by 8 bytes indicating the schema identifier, and 
finally 4 bytes indicating the schema version, as per the Hortonworks Schema 
Registry serializers and deserializers, found at 
https://github.com/hortonworks/registry" title="The content of the FlowFile 
contains a reference to a schema in the Schema Registry service. The reference 
is encoded as a single byte indicating the 'protocol version', followed by 8 
bytes indicating the schema identifier, and finally 4 bytes indicating the 
schema version, as per the Hortonworks Schema Registry serializers and 
deserializers, found at 
https://github.com/hortonworks/registry"></img></li><li>Confluent 
Content-Encoded Schema Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the 
FlowFile contains a reference to a schema in the Schema Registry 
service. The reference is encoded as a single 'Magic Byte' followed by 4 bytes 
representing the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This is based on version 3.2.x of the Confluent Schema Registry." title="The 
content of the FlowFile contains a reference to a schema in the Schema Registry 
service. The reference is encoded as a single 'Magic Byte' followed by 4 bytes 
representing the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This is based on version 3.2.x of the Confluent Schema 
Registry."></img></li><li>Use String Fields From Header <img 
src="../../../../../html/images/iconInfo.png" alt="The first non-comment line 
of the CSV file is a header line that contains the names of the columns. The 
schema will be derived by using the column names in the header and 
assuming that all columns are of type String." title="The first 
non-comment line of the CSV file is a header line that contains the names of 
the columns. The schema will be derived by using the column names in the header 
and assuming that all columns are of type String."></img></li></ul></td><td 
id="description">Specifies how to obtain the schema that is to be used for 
interpreting the data.</td></tr><tr><td id="name">Schema Registry</td><td 
id="default-value"></td><td id="allowable-values"><strong>Controller Service 
API: </strong><br/>SchemaRegistry<br/><strong>Implementations: </strong><a 
href="../../../nifi-registry-nar/1.4.0/org.apache.nifi.schemaregistry.services.AvroSchemaRegistry/index.html">AvroSchemaRegistry</a><br/><a
 
href="../../../nifi-hwx-schema-registry-nar/1.4.0/org.apache.nifi.schemaregistry.hortonworks.HortonworksSchemaRegistry/index.html">HortonworksSchemaRegistry</a><br/><a
 
href="../../../nifi-confluent-platform-nar/1.4.0/org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry/index.html">ConfluentSchemaRegistry</a></td><td 
id="description">Specifies the Controller Service to use for the Schema 
Registry</td></tr><tr><td id="name">Schema Name</td><td 
id="default-value">${schema.name}</td><td id="allowable-values"></td><td 
id="description">Specifies the name of the schema to look up in the Schema 
Registry property<br/><strong>Supports Expression Language: 
true</strong></td></tr><tr><td id="name">Schema Text</td><td 
id="default-value">${avro.schema}</td><td id="allowable-values"></td><td 
id="description">The text of an Avro-formatted Schema<br/><strong>Supports 
Expression Language: true</strong></td></tr><tr><td id="name">Date 
Format</td><td id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies the format to use when reading/writing Date fields. 
If not specified, Date fields will be assumed to be number of milliseconds 
since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the
  Java Simple Date Format (for example, MM/dd/yyyy for a two-digit month, 
followed by a two-digit day, followed by a four-digit year, all separated by 
'/' characters, as in 01/01/2017).</td></tr><tr><td id="name">Time 
Format</td><td id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies the format to use when reading/writing Time fields. 
If not specified, Time fields will be assumed to be number of milliseconds 
since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the 
Java Simple Date Format (for example, HH:mm:ss for a two-digit hour in 24-hour 
format, followed by a two-digit minute, followed by a two-digit second, all 
separated by ':' characters, as in 18:04:15).</td></tr><tr><td 
id="name">Timestamp Format</td><td id="default-value"></td><td 
id="allowable-values"></td><td id="description">Specifies the format to use 
when reading/writing Timestamp fields. If not specified, Timestamp fields will 
be assumed to be number of milliseconds since epoch (Midnight, Jan 1, 1970 
GMT). If specified, the value must 
match the Java Simple Date Format (for example, MM/dd/yyyy HH:mm:ss for a 
two-digit month, followed by a two-digit day, followed by a four-digit year, 
all separated by '/' characters; and then followed by a two-digit hour in 
24-hour format, followed by a two-digit minute, followed by a two-digit second, 
all separated by ':' characters, as in 01/01/2017 18:04:15).</td></tr><tr><td 
id="name"><strong>CSV Format</strong></td><td id="default-value">custom</td><td 
id="allowable-values"><ul><li>Custom Format <img 
src="../../../../../html/images/iconInfo.png" alt="The format of the CSV is 
configured by using the properties of this Controller Service, such as Value 
Separator" title="The format of the CSV is configured by using the properties 
of this Controller Service, such as Value Separator"></img></li><li>RFC 4180 
<img src="../../../../../html/images/iconInfo.png" alt="CSV data follows the 
RFC 4180 Specification defined at https://tools.ietf.org/html/rfc4180" 
title="CSV data follows the RFC 
4180 Specification defined at 
https://tools.ietf.org/html/rfc4180"></img></li><li>Microsoft Excel <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data follows the format 
used by Microsoft Excel" title="CSV data follows the format used by Microsoft 
Excel"></img></li><li>Tab-Delimited <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data is Tab-Delimited 
instead of Comma Delimited" title="CSV data is Tab-Delimited instead of Comma 
Delimited"></img></li><li>MySQL Format <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data follows the format 
used by MySQL" title="CSV data follows the format used by 
MySQL"></img></li><li>Informix Unload <img 
src="../../../../../html/images/iconInfo.png" alt="The format used by Informix 
when issuing the UNLOAD TO file_name command" title="The format used by 
Informix when issuing the UNLOAD TO file_name command"></img></li><li>Informix 
Unload Escape Disabled <img src="../../../../../html/images/iconInfo.png" alt="The 
format used by Informix when issuing the UNLOAD TO file_name command with 
escaping disabled" title="The format used by Informix when issuing the UNLOAD 
TO file_name command with escaping disabled"></img></li></ul></td><td 
id="description">Specifies which "format" the CSV data is in, or specifies if 
custom formatting should be used.</td></tr><tr><td id="name"><strong>Value 
Separator</strong></td><td id="default-value">,</td><td 
id="allowable-values"></td><td id="description">The character that is used to 
separate values/fields in a CSV Record</td></tr><tr><td id="name"><strong>Treat 
First Line as Header</strong></td><td id="default-value">false</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">Specifies whether or not the first line of CSV should be 
considered a Header or should be considered a record. If the Schema Access 
Strategy indicates that the columns must be defined in the header, then 
this property will be ignored, since the header must 
always be present and won't be processed as a Record. Otherwise, if 'true', 
then the first line of CSV data will not be processed as a record and if 
'false', then the first line will be interpreted as a record.</td></tr><tr><td 
id="name">Ignore CSV Header Column Names</td><td 
id="default-value">false</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">If the first line of a CSV is a header, and the configured 
schema does not match the fields named in the header line, this controls how 
the Reader will interpret the fields. If this property is true, then the field 
names mapped to each column are driven only by the configured schema and any 
fields not in the schema will be ignored. If this property is false, then the 
field names found in the CSV Header will be used as the names of the 
fields.</td></tr><tr><td id="name"><strong>Quote Character</strong></td><td 
id="default-value">"</td><td id="allowable-values"></td><td 
id="description">The character 
that is used to quote values so that escape characters do not have to be 
used</td></tr><tr><td id="name"><strong>Escape Character</strong></td><td 
id="default-value">\</td><td id="allowable-values"></td><td 
id="description">The character that is used to escape characters that would 
otherwise have a specific meaning to the CSV Parser.</td></tr><tr><td 
id="name">Comment Marker</td><td id="default-value"></td><td 
id="allowable-values"></td><td id="description">The character that is used to 
denote the start of a comment. Any line that begins with this character will be 
ignored.</td></tr><tr><td id="name">Null String</td><td 
id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies a String that, if present as a value in the CSV, 
should be considered a null field instead of using the literal 
value.</td></tr><tr><td id="name"><strong>Trim Fields</strong></td><td 
id="default-value">true</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">Whether or not white space should be removed from the 
beginning and end of fields</td></tr></table><h3>State management: </h3>This 
component does not store state.<h3>Restricted: </h3>This component is not 
restricted.</body></html>
\ No newline at end of file

Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta 
charset="utf-8"></meta><title>CSVRecordSetWriter</title><link rel="stylesheet" 
href="../../../../../css/component-usage.css" 
type="text/css"></link></head><script type="text/javascript">window.onload = 
function(){if(self==top) { document.getElementById('nameHeader').style.display 
= "inherit"; } }</script><body><h1 id="nameHeader" style="display: 
none;">CSVRecordSetWriter</h1><h2>Description: </h2><p>Writes the contents of a 
RecordSet as CSV data. The first line written will be the column names (unless 
the 'Include Header Line' property is false). All subsequent lines will be the 
values corresponding to the record fields.</p><h3>Tags: </h3><p>csv, result, 
set, recordset, record, writer, serializer, row, tsv, tab, separated, 
delimited</p><h3>Properties: </h3><p>In the list below, the names of required 
properties appear in <strong>bold</strong>. Any other properties (not in bold) 
are considered optional. The table also indicates any default values, 
and whether a property supports the <a 
href="../../../../../html/expression-language-guide.html">NiFi Expression 
Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default 
Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td 
id="name"><strong>Schema Write Strategy</strong></td><td 
id="default-value">schema-name</td><td id="allowable-values"><ul><li>Set 
'schema.name' Attribute <img src="../../../../../html/images/iconInfo.png" 
alt="The FlowFile will be given an attribute named 'schema.name' and this 
attribute will indicate the name of the schema in the Schema Registry. Note 
that if the schema for a record is not obtained from a Schema Registry, then no 
attribute will be added." title="The FlowFile will be given an attribute named 
'schema.name' and this attribute will indicate the name of the schema in the 
Schema Registry. Note that if the schema for a record is not obtained from a 
Schema Registry, then no attribute will be added."></img></li><li>Set 
'avro.schema' Attribute <img src="../../../../../html/images/iconInfo.png" alt="The 
FlowFile will be given an attribute named 'avro.schema' and this attribute will 
contain the Avro Schema that describes the records in the FlowFile. The 
contents of the FlowFile need not be Avro, but the text of the schema will be 
used." title="The FlowFile will be given an attribute named 'avro.schema' and 
this attribute will contain the Avro Schema that describes the records in the 
FlowFile. The contents of the FlowFile need not be Avro, but the text of the 
schema will be used."></img></li><li>HWX Schema Reference Attributes <img 
src="../../../../../html/images/iconInfo.png" alt="The FlowFile will be given a 
set of 3 attributes to describe the schema: 'schema.identifier', 
'schema.version', and 'schema.protocol.version'. Note that if the schema for a 
record does not contain the necessary identifier and version, an Exception will 
be thrown when attempting to write the data." title="The FlowFile will be 
given a set of 3 attributes to describe the schema: 'schema.identifier', 
'schema.version', and 'schema.protocol.version'. Note that if the schema for a 
record does not contain the necessary identifier and version, an Exception will 
be thrown when attempting to write the data."></img></li><li>HWX 
Content-Encoded Schema Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the FlowFile 
will contain a reference to a schema in the Schema Registry service. The 
reference is encoded as a single byte indicating the 'protocol version', 
followed by 8 bytes indicating the schema identifier, and finally 4 bytes 
indicating the schema version, as per the Hortonworks Schema Registry 
serializers and deserializers, as found at 
https://github.com/hortonworks/registry. This will be prepended to each 
FlowFile. Note that if the schema for a record does not contain the necessary 
identifier and version, an Exception will be thrown when attempting to write 
the data." title="The content of the FlowFile will contain a reference to a 
schema in the Schema 
Registry service. The reference is encoded as a single byte indicating the 
'protocol version', followed by 8 bytes indicating the schema identifier, and 
finally 4 bytes indicating the schema version, as per the Hortonworks Schema 
Registry serializers and deserializers, as found at 
https://github.com/hortonworks/registry. This will be prepended to each 
FlowFile. Note that if the schema for a record does not contain the necessary 
identifier and version, an Exception will be thrown when attempting to write 
the data."></img></li><li>Confluent Schema Registry Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the FlowFile 
will contain a reference to a schema in the Schema Registry service. The 
reference is encoded as a single 'Magic Byte' followed by 4 bytes representing 
the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This will be prepended to each FlowFile. Note that if the schema for a 
record does not contain the necessary identifier and version, an Exception will 
be thrown when attempting to write the data. This is based on the encoding used 
by version 3.2.x of the Confluent Schema Registry." title="The content of the 
FlowFile will contain a reference to a schema in the Schema Registry service. 
The reference is encoded as a single 'Magic Byte' followed by 4 bytes 
representing the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This will be prepended to each FlowFile. Note that if the schema for a record 
does not contain the necessary identifier and version, an Exception will be 
thrown when attempting to write the data. This is based on the encoding used by 
version 3.2.x of the Confluent Schema Registry."></img></li><li>Do Not Write 
Schema <img src="../../../../../html/images/iconInfo.png" alt="Do not add any 
schema-related information to the FlowFile." title="Do not add any schema-related 
information to the FlowFile."></img></li></ul></td><td 
id="description">Specifies how the schema for a Record should be added to the 
data.</td></tr><tr><td id="name"><strong>Schema Access 
Strategy</strong></td><td id="default-value">inherit-record-schema</td><td 
id="allowable-values"><ul><li>Use 'Schema Name' Property <img 
src="../../../../../html/images/iconInfo.png" alt="The name of the Schema to 
use is specified by the 'Schema Name' Property. The value of this property is 
used to look up the Schema in the configured Schema Registry service." 
title="The name of the Schema to use is specified by the 'Schema Name' 
Property. The value of this property is used to look up the Schema in the 
configured Schema Registry service."></img></li><li>Inherit Record Schema <img 
src="../../../../../html/images/iconInfo.png" alt="The schema used to write 
records will be the same schema that was given to the Record when the 
Record was created." title="The schema used to write records will be the same 
schema that was given to the Record when the Record was 
created."></img></li><li>Use 'Schema Text' Property <img 
src="../../../../../html/images/iconInfo.png" alt="The text of the Schema 
itself is specified by the 'Schema Text' Property. The value of this property 
must be a valid Avro Schema. If Expression Language is used, the value of the 
'Schema Text' property must be valid after substituting the expressions." 
title="The text of the Schema itself is specified by the 'Schema Text' 
Property. The value of this property must be a valid Avro Schema. If Expression 
Language is used, the value of the 'Schema Text' property must be valid after 
substituting the expressions."></img></li></ul></td><td 
id="description">Specifies how to obtain the schema that is to be used for 
interpreting the data.</td></tr><tr><td id="name">Schema Registry</td><td 
id="default-value"></td><td id="allowable-values"><strong>Controller 
 Service API: </strong><br/>SchemaRegistry<br/><strong>Implementations: 
</strong><a 
href="../../../nifi-registry-nar/1.4.0/org.apache.nifi.schemaregistry.services.AvroSchemaRegistry/index.html">AvroSchemaRegistry</a><br/><a
 
href="../../../nifi-hwx-schema-registry-nar/1.4.0/org.apache.nifi.schemaregistry.hortonworks.HortonworksSchemaRegistry/index.html">HortonworksSchemaRegistry</a><br/><a
 
href="../../../nifi-confluent-platform-nar/1.4.0/org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry/index.html">ConfluentSchemaRegistry</a></td><td
 id="description">Specifies the Controller Service to use for the Schema 
Registry</td></tr><tr><td id="name">Schema Name</td><td 
id="default-value">${schema.name}</td><td id="allowable-values"></td><td 
id="description">Specifies the name of the schema to look up in the Schema 
Registry property<br/><strong>Supports Expression Language: 
true</strong></td></tr><tr><td id="name">Schema Text</td><td 
id="default-value">${avro.schema}</td><td 
id="allowable-values"></td><td id="description">The text of an Avro-formatted 
Schema<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td 
id="name">Date Format</td><td id="default-value"></td><td 
id="allowable-values"></td><td id="description">Specifies the format to use 
when reading/writing Date fields. If not specified, Date fields will be assumed 
to be number of milliseconds since epoch (Midnight, Jan 1, 1970 GMT). If 
specified, the value must match the Java Simple Date Format (for example, 
MM/dd/yyyy for a two-digit month, followed by a two-digit day, followed by a 
four-digit year, all separated by '/' characters, as in 
01/01/2017).</td></tr><tr><td id="name">Time Format</td><td 
id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies the format to use when reading/writing Time fields. 
If not specified, Time fields will be assumed to be number of milliseconds 
since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match 
the Java Simple Date Format (for example, HH:mm:ss for a two-digit hour in 
24-hour format, followed by a two-digit minute, followed by a two-digit second, 
all separated by ':' characters, as in 18:04:15).</td></tr><tr><td 
id="name">Timestamp Format</td><td id="default-value"></td><td 
id="allowable-values"></td><td id="description">Specifies the format to use 
when reading/writing Timestamp fields. If not specified, Timestamp fields will 
be assumed to be number of milliseconds since epoch (Midnight, Jan 1, 1970 
GMT). If specified, the value must match the Java Simple Date Format (for 
example, MM/dd/yyyy HH:mm:ss for a two-digit month, followed by a two-digit 
day, followed by a four-digit year, all separated by '/' characters; and then 
followed by a two-digit hour in 24-hour format, followed by a two-digit minute, 
followed by a two-digit second, all separated by ':' characters, as in 
01/01/2017 18:04:15).</td></tr><tr><td id="name"><strong>CSV 
Format</strong></td><td id="default-value">custom</td><td 
id="allowable-values"><ul><li>Custom Format <img 
src="../../../../../html/images/iconInfo.png" alt="The format of the CSV is 
configured by using the properties of this Controller Service, such as Value 
Separator" title="The format of the CSV is configured by using the properties 
of this Controller Service, such as Value Separator"></img></li><li>RFC 4180 
<img src="../../../../../html/images/iconInfo.png" alt="CSV data follows the 
RFC 4180 Specification defined at https://tools.ietf.org/html/rfc4180" 
title="CSV data follows the RFC 4180 Specification defined at 
https://tools.ietf.org/html/rfc4180"></img></li><li>Microsoft Excel <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data follows the format 
used by Microsoft Excel" title="CSV data follows the format used by Microsoft 
Excel"></img></li><li>Tab-Delimited <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data is Tab-Delimited 
instead of Comma Delimited" title="CSV data is Tab-Delimited instead
  of Comma Delimited"></img></li><li>MySQL Format <img 
src="../../../../../html/images/iconInfo.png" alt="CSV data follows the format 
used by MySQL" title="CSV data follows the format used by 
MySQL"></img></li><li>Informix Unload <img 
src="../../../../../html/images/iconInfo.png" alt="The format used by Informix 
when issuing the UNLOAD TO file_name command" title="The format used by 
Informix when issuing the UNLOAD TO file_name command"></img></li><li>Informix 
Unload Escape Disabled <img src="../../../../../html/images/iconInfo.png" 
alt="The format used by Informix when issuing the UNLOAD TO file_name command 
with escaping disabled" title="The format used by Informix when issuing the 
UNLOAD TO file_name command with escaping disabled"></img></li></ul></td><td 
id="description">Specifies which "format" the CSV data is in, or specifies if 
custom formatting should be used.</td></tr><tr><td id="name"><strong>Value 
Separator</strong></td><td id="default-value">,</td><td 
id="allowable-values"></td><td id="description">The character that is used to separate 
values/fields in a CSV Record</td></tr><tr><td id="name"><strong>Include Header 
Line</strong></td><td id="default-value">true</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">Specifies whether or not the CSV column names should be 
written out as the first line.</td></tr><tr><td id="name"><strong>Quote 
Character</strong></td><td id="default-value">"</td><td 
id="allowable-values"></td><td id="description">The character that is used to 
quote values so that escape characters do not have to be used</td></tr><tr><td 
id="name"><strong>Escape Character</strong></td><td 
id="default-value">\</td><td id="allowable-values"></td><td 
id="description">The character that is used to escape characters that would 
otherwise have a specific meaning to the CSV Parser.</td></tr><tr><td 
id="name">Comment Marker</td><td id="default-value"></td><td 
id="allowable-values"></td><td id="description">The
  character that is used to denote the start of a comment. Any line that begins 
with this character will be ignored.</td></tr><tr><td id="name">Null 
String</td><td id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies a String that, if present as a value in the CSV, 
should be considered a null field instead of using the literal 
value.</td></tr><tr><td id="name"><strong>Trim Fields</strong></td><td 
id="default-value">true</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">Whether or not white space should be removed from the 
beginning and end of fields</td></tr><tr><td id="name"><strong>Quote 
Mode</strong></td><td id="default-value">MINIMAL</td><td 
id="allowable-values"><ul><li>Quote All Values <img 
src="../../../../../html/images/iconInfo.png" alt="All values will be quoted 
using the configured quote character." title="All values will be quoted using 
the configured quote character."></img></li><li>Quote Minimal 
<img src="../../../../../html/images/iconInfo.png" alt="Values will be quoted 
only if they contain special characters such as newline characters or field 
separators." title="Values will be quoted only if they contain special 
characters such as newline characters or field 
separators."></img></li><li>Quote Non-Numeric Values <img 
src="../../../../../html/images/iconInfo.png" alt="Values will be quoted unless 
the value is a number." title="Values will be quoted unless the value is a 
number."></img></li><li>Do Not Quote Values <img 
src="../../../../../html/images/iconInfo.png" alt="Values will not be quoted. 
Instead, all special characters will be escaped using the configured escape 
character." title="Values will not be quoted. Instead, all special characters 
will be escaped using the configured escape 
character."></img></li></ul></td><td id="description">Specifies how fields 
should be quoted when they are written</td></tr><tr><td 
id="name"><strong>Record Separator</strong></td><td 
id="default-value">\n</td><td id="allowable-values"></td><td 
id="description">Specifies the characters to use in order to separate CSV 
Records</td></tr><tr><td id="name"><strong>Include Trailing 
Delimiter</strong></td><td id="default-value">false</td><td 
id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td 
id="description">If true, a trailing delimiter will be added to each CSV Record 
that is written. If false, the trailing delimiter will be 
omitted.</td></tr></table><h3>State management: </h3>This component does not 
store state.<h3>Restricted: </h3>This component is not restricted.</body></html>
\ No newline at end of file

Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/additionalDetails.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1,405 @@
+<!DOCTYPE html>
+<html lang="en">
+    <!--
+      Licensed to the Apache Software Foundation (ASF) under one or more
+      contributor license agreements.  See the NOTICE file distributed with
+      this work for additional information regarding copyright ownership.
+      The ASF licenses this file to You under the Apache License, Version 2.0
+      (the "License"); you may not use this file except in compliance with
+      the License.  You may obtain a copy of the License at
+          http://www.apache.org/licenses/LICENSE-2.0
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+      See the License for the specific language governing permissions and
+      limitations under the License.
+    -->
+    <head>
+        <meta charset="utf-8"/>
+        <title>GrokReader</title>
+        <link rel="stylesheet" href="../../../../../css/component-usage.css" 
type="text/css"/>
+    </head>
+
+    <body>
+        <p>
+               The GrokReader Controller Service provides a means for parsing 
and structuring input that is
+               made up of unstructured text, such as log files. Grok allows 
users to add a naming construct to
+               Regular Expressions such that they can be composed in order to 
create expressions that are easier
+               to manage and work with. This Controller Service consists of 
one Required Property and a few Optional
+               Properties. The optional <code>Grok Pattern File</code> 
property specifies the filename of
+               a file that contains Grok Patterns that can be used for parsing 
log data. If not specified, a default
+               patterns file will be used. Its contents are provided below. 
There are also properties for specifying
+               the schema to use when parsing data. The schema is not 
required. However, when data is parsed
+               a Record is created that contains all of the fields present in 
the Grok Expression (explained below),
+               and all fields are of type String. If a schema is chosen, the 
field can be declared to be a different,
+               compatible type, such as number. Additionally, if the schema 
does not contain one of the fields in the
+               parsed data, that field will be ignored. This can be used to 
filter out fields that are not of interest.
+               </p>
+               
+               <p>
+               The Required Property is named <code>Grok Expression</code> and 
specifies how to parse each
+               incoming record. This is done by providing a Grok Expression 
such as:
+               <code>%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} 
\[%{DATA:thread}\] %{DATA:class} %{GREEDYDATA:message}</code>.
+               This Expression will parse Apache NiFi log messages. This is 
accomplished by specifying that a line begins
+               with the <code>TIMESTAMP_ISO8601</code> pattern (which is a 
Regular Expression defined in the default
+               Grok Patterns File). The value that matches this pattern is 
then given the name <code>timestamp</code>. As a result,
+               the value that matches this pattern will be assigned to a field 
named <code>timestamp</code> in the Record that
+               is produced by this Controller Service.
+        </p>
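+        <p>
+               The naming construct described above can be sketched in a few lines of Python. This is an
+               illustration only, not how the GrokReader is implemented: the <code>PATTERNS</code> table holds
+               simplified stand-ins for entries in a Grok Patterns File, and <code>grok_to_regex</code> and
+               <code>parse_line</code> are hypothetical helper names. Each <code>%{PATTERN:name}</code> reference
+               is replaced with a named capture group, so the matched value ends up in a field of that name.
+        </p>

```python
import re

# Simplified stand-ins for entries in a Grok patterns file. These are
# assumptions for illustration, not the definitions shipped with NiFi.
PATTERNS = {
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?",
    "LOGLEVEL": r"(?:TRACE|DEBUG|INFO|WARN|ERROR|FATAL)",
    "DATA": r".*?",
    "GREEDYDATA": r".*",
}

# Matches a reference of the form %{PATTERN:name}.
GROK_REF = re.compile(r"%\{(\w+):(\w+)\}")

# The Grok Expression quoted in the text above.
EXPRESSION = (r"%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} "
              r"\[%{DATA:thread}\] %{DATA:class} %{GREEDYDATA:message}")


def grok_to_regex(expression):
    """Replace each %{PATTERN:name} reference with a named capture group."""
    def expand(match):
        pattern_name, field_name = match.groups()
        return "(?P<%s>%s)" % (field_name, PATTERNS[pattern_name])
    return re.compile(GROK_REF.sub(expand, expression))


def parse_line(line):
    """Return a dict of field name to matched text, or None if no match."""
    match = grok_to_regex(EXPRESSION).match(line)
    return match.groupdict() if match else None
```

+        <p>
+               Under these assumptions, calling <code>parse_line</code> on a log line yields a mapping with the keys
+               <code>timestamp</code>, <code>level</code>, <code>thread</code>, <code>class</code>, and
+               <code>message</code>, all holding String values.
+        </p>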
+        
+        <p>
+               If a line is encountered in the FlowFile that does not match 
the configured Grok Expression, it is assumed that the line
+               is part of the previous message. If the line is the start of a 
stack trace, then the entire stack trace is read in and assigned
+               to a field named <code>STACK_TRACE</code>. Otherwise, the line 
is appended to the last field defined in the Grok Expression. This
+               is done because typically the last field is a 'message' type of 
field, which can span multiple lines.
+        </p>
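+        <p>
+               The handling of non-matching lines can be sketched as follows. This is a rough Python illustration
+               under stated assumptions: <code>LINE</code> is a hypothetical, simplified stand-in for a compiled
+               Grok Expression whose last field is <code>message</code>, and <code>STACK_LINE</code> is an assumed
+               heuristic for recognizing stack-trace lines; the detection rules the reader actually applies are not
+               described here.
+        </p>

```python
import re

# Hypothetical stand-in for a compiled Grok Expression; 'message' is its
# last field.
LINE = re.compile(r"(?P<level>INFO|WARN|ERROR) (?P<message>.*)")

# Assumed heuristic for spotting stack-trace lines (not the reader's own).
STACK_LINE = re.compile(r"^(\s+at \S+|Caused by: |\S+(Exception|Error)\b)")


def read_records(lines, last_field="message"):
    """Group raw lines into records, folding continuation lines into
    the previous record."""
    records = []
    for line in lines:
        match = LINE.match(line)
        if match:
            # The line matches the expression: it starts a new record.
            records.append(match.groupdict())
        elif not records:
            # No previous record to attach this line to; skip it.
            continue
        elif "STACK_TRACE" in records[-1] or STACK_LINE.match(line):
            # Start of (or continuation of) a stack trace: keep it together.
            previous = records[-1].get("STACK_TRACE")
            records[-1]["STACK_TRACE"] = (
                line if previous is None else previous + "\n" + line)
        else:
            # Otherwise append the line to the last field of the expression.
            records[-1][last_field] += "\n" + line
    return records
```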
+
+
+               <h2>Schemas and Type Coercion</h2>
+               
+               <p>
+                       When a record is parsed from incoming data, it is 
separated into fields. Each of these fields is then looked up against the
+                       configured schema (by field name) in order to determine 
what the type of the data should be. If the field is not present in
+                       the schema, that field is omitted from the Record. If 
the field is found in the schema, the data type of the received data
+                       is compared against the data type specified in the 
schema. If the types match, the value of that field is used as-is. If the
+                       schema indicates that the field should be of a 
different type, then the Controller Service will attempt to coerce the data
+                       into the type specified by the schema. If the field 
cannot be coerced into the specified type, an Exception will be thrown.
+               </p>
+               
+               <p>
+                       The following rules apply when attempting to coerce a 
field value from one data type to another:
+               </p>
+                       
+               <ul>
+                       <li>Any data type can be coerced into a String 
type.</li>
+                       <li>Any numeric data type (Byte, Short, Int, Long, 
Float, Double) can be coerced into any other numeric data type.</li>
+                       <li>Any numeric value can be coerced into a Date, Time, 
or Timestamp type, by treating the value as the number of
+                       milliseconds since epoch (Midnight GMT, January 1, 
1970).</li>
+                       <li>A String value can be coerced into a Date, Time, or 
Timestamp type, if its format matches the configured "Date Format," "Time 
Format,"
+                               or "Timestamp Format."</li>
+                       <li>A String value can be coerced into a numeric value 
if the String is a valid representation of the target type. For example, the String value
+                               <code>8</code> can be coerced into any numeric 
type. However, the String value <code>8.2</code> can be coerced into a Double 
or Float
+                               type but not into an Int.</li>
+                       <li>A String value of "true" or "false" (regardless of 
case) can be coerced into a Boolean value.</li>
+                       <li>A String value that is not empty can be coerced 
into a Char type. If the String contains more than 1 character, the first 
character is used
+                               and the rest of the characters are ignored.</li>
+                       <li>Any "date/time" type (Date, Time, Timestamp) can be 
coerced into any other "date/time" type.</li>
+                       <li>Any "date/time" type can be coerced into a Long 
type, representing the number of milliseconds since epoch (Midnight GMT, 
January 1, 1970).</li>
+                       <li>Any "date/time" type can be coerced into a String. 
The String is formatted according to the corresponding
+                               property (Date Format, Time Format, or Timestamp 
Format).</li>
+               </ul>
+               
+               <p>
+                       If none of the above rules apply when attempting to 
coerce a value from one data type to another, the coercion will fail and an 
Exception
+                       will be thrown.
+               </p>
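<p>A few of the coercion rules above can be sketched in Python. This is an illustrative approximation under stated assumptions, not the Controller Service's code: the function name <code>coerce</code>, the lower-case type names, and the <code>timestamp_format</code> parameter are all hypothetical, and only a subset of the types is covered.</p>

```python
from datetime import datetime, timezone


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def coerce(value, target, timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Coerce value to the named target type, or raise ValueError."""
    if target == "string":
        return str(value)  # any type can become a String
    if target == "int" and (_is_number(value) or isinstance(value, str)):
        return int(value)  # "8" works; "8.2" raises ValueError
    if target == "double" and (_is_number(value) or isinstance(value, str)):
        return float(value)
    if target == "boolean" and isinstance(value, str):
        if value.lower() in ("true", "false"):
            return value.lower() == "true"
    if target == "char" and isinstance(value, str) and value:
        return value[0]  # remaining characters are ignored
    if target == "timestamp":
        if _is_number(value):
            # Numeric values are milliseconds since the epoch.
            return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
        if isinstance(value, str):
            return datetime.strptime(value, timestamp_format)
    # No rule applied: the coercion fails.
    raise ValueError("cannot coerce %r to %s" % (value, target))
```

<p>For instance, <code>coerce("8", "int")</code> succeeds while <code>coerce("8.2", "int")</code> raises an error, mirroring the String-to-numeric rule.</p>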
+               
+               
+
+        <h2>
+               Examples
+               </h2>
+        
+        <p>
+               As an example, consider that this Controller Service is 
configured with the following properties:
+        </p>
+
+               <table>
+               <head>
+                       <th>Property Name</th>
+                       <th>Property Value</th>
+               </head>
+               <body>
+                       <tr>
+                               <td>Grok Expression</td>
+                               <td><code>%{TIMESTAMP_ISO8601:timestamp} 
%{LOGLEVEL:level} \[%{DATA:thread}\] %{DATA:class} 
%{GREEDYDATA:message}</code></td>
+                       </tr>
+               </body>
+       </table>
+
+        <p>
+               Additionally, let's consider a FlowFile whose content consists 
of the following:
+        </p>
+
+        <code><pre>
+2016-08-04 13:26:32,473 INFO [Leader Election Notification Thread-1] 
o.a.n.c.l.e.CuratorLeaderElectionManager 
org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1fa27ea5
 has been interrupted; no longer leader for role 'Cluster Coordinator'
+2016-08-04 13:26:32,474 ERROR [Leader Election Notification Thread-2] 
o.apache.nifi.controller.FlowController One
+Two
+Three
+org.apache.nifi.exception.UnitTestException: Testing to ensure we are able to 
capture stack traces
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_45]
+       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[na:1.8.0_45]
+        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_45]
+        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [na:1.8.0_45]
+        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_45]
+        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_45]
+        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
+Caused by: org.apache.nifi.exception.UnitTestException: Testing to ensure we 
are able to capture stack traces
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    ... 12 common frames omitted
+2016-08-04 13:26:35,475 WARN [Curator-Framework-0] 
org.apache.curator.ConnectionState Connection attempt unsuccessful after 3008 
(greater than max timeout of 3000). Resetting connection and trying again with 
a new connection.
+        </pre></code>
+       
+       <p>
+               In this case, the FlowFile will be parsed into 
3 records. The first record will contain the following values:
+       </p>
+
+               <table>
+               <head>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </head>
+               <body>
+                       <tr>
+                               <td>timestamp</td>
+                               <td>2016-08-04 13:26:32,473</td>
+                       </tr>
+                       <tr>
+                               <td>level</td>
+                               <td>INFO</td>
+                       </tr>
+                       <tr>
+                               <td>thread</td>
+                               <td>Leader Election Notification Thread-1</td>
+                       </tr>
+                       <tr>
+                               <td>class</td>
+                               
<td>o.a.n.c.l.e.CuratorLeaderElectionManager</td>
+                       </tr>
+                       <tr>
+                               <td>message</td>
+                               
<td>org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1fa27ea5
 has been interrupted; no longer leader for role 'Cluster Coordinator'</td>
+                       </tr>
+                       <tr>
+                               <td>stackTrace</td>
+                               <td><i>null</i></td>
+                       </tr>
+               </body>
+       </table>
+       
+       <p>
+               The second record will contain the following values:
+       </p>
+       
+               <table>
+               <head>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </head>
+               <body>
+                       <tr>
+                               <td>timestamp</td>
+                               <td>2016-08-04 13:26:32,474</td>
+                       </tr>
+                       <tr>
+                               <td>level</td>
+                               <td>ERROR</td>
+                       </tr>
+                       <tr>
+                               <td>thread</td>
+                               <td>Leader Election Notification Thread-2</td>
+                       </tr>
+                       <tr>
+                               <td>class</td>
+                               <td>o.apache.nifi.controller.FlowController</td>
+                       </tr>
+                       <tr>
+                               <td>message</td>
+                               <td>One<br />
+Two<br />
+Three</td>
+                       </tr>
+                       <tr>
+                               <td>stackTrace</td>
+                               <td>
+<pre>
+org.apache.nifi.exception.UnitTestException: Testing to ensure we are able to 
capture stack traces
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_45]
+       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[na:1.8.0_45]
+        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_45]
+        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [na:1.8.0_45]
+        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_45]
+        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_45]
+        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
+Caused by: org.apache.nifi.exception.UnitTestException: Testing to ensure we 
are able to capture stack traces
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    at 
org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.getElectedActiveCoordinatorAddress(NodeClusterCoordinator.java:185)
+    ... 12 common frames omitted
+</pre></td>
+                       </tr>
+               </body>
+       </table>
+       
+               <p>
+                       The third record will contain the following values:
+               </p>            
+       
+               <table>
+               <head>
+                       <th>Field Name</th>
+                       <th>Field Value</th>
+               </head>
+               <body>
+                       <tr>
+                               <td>timestamp</td>
+                               <td>2016-08-04 13:26:35,475</td>
+                       </tr>
+                       <tr>
+                               <td>level</td>
+                               <td>WARN</td>
+                       </tr>
+                       <tr>
+                               <td>thread</td>
+                               <td>Curator-Framework-0</td>
+                       </tr>
+                       <tr>
+                               <td>class</td>
+                               <td>org.apache.curator.ConnectionState</td>
+                       </tr>
+                       <tr>
+                               <td>message</td>
+                               <td>Connection attempt unsuccessful after 3008 
(greater than max timeout of 3000). Resetting connection and trying again with 
a new connection.</td>
+                       </tr>
+                       <tr>
+                               <td>stackTrace</td>
+                               <td><i>null</i></td>
+                       </tr>
+               </body>
+       </table>        
+
+               
+       
+       <h2>Default Patterns</h2>
+
+       <p>
+               The following patterns are available in the default Grok 
Pattern File:
+       </p>
+
+               <code>
+               <pre>
+# Log Levels
+LOGLEVEL 
([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)|FINE|FINER|FINEST|CONFIG
+
+# Syslog Dates: Month Day HH:MM:SS
+SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
+PROG (?:[\w._/%-]+)
+SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
+SYSLOGHOST %{IPORHOST}
+SYSLOGFACILITY <%{NONNEGINT:facility}.%{NONNEGINT:priority}>
+HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}
+
+# Months: January, Feb, 3, 03, 12, December
+MONTH 
\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
+MONTHNUM (?:0?[1-9]|1[0-2])
+MONTHNUM2 (?:0[1-9]|1[0-2])
+MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
+
+# Days: Monday, Tue, Thu, etc...
+DAY 
(?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)
+
+# Years?
+YEAR (?>\d\d){1,2}
+HOUR (?:2[0123]|[01]?[0-9])
+MINUTE (?:[0-5][0-9])
+# '60' is a leap second in most time standards and thus is valid.
+SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
+TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
+
+# datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)
+DATE_US_MONTH_DAY_YEAR %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
+DATE_US_YEAR_MONTH_DAY %{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY}
+DATE_US %{DATE_US_MONTH_DAY_YEAR}|%{DATE_US_YEAR_MONTH_DAY}
+DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
+ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
+ISO8601_SECOND (?:%{SECOND}|60)
+TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T 
]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
+DATE %{DATE_US}|%{DATE_EU}
+DATESTAMP %{DATE}[- ]%{TIME}
+TZ (?:[PMCE][SD]T|UTC)
+DATESTAMP_RFC822 %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
+DATESTAMP_RFC2822 %{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} 
%{ISO8601_TIMEZONE}
+DATESTAMP_OTHER %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}
+DATESTAMP_EVENTLOG %{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}
+
+
+POSINT \b(?:[1-9][0-9]*)\b
+NONNEGINT \b(?:[0-9]+)\b
+WORD \b\w+\b
+NOTSPACE \S+
+SPACE \s*
+DATA .*?
+GREEDYDATA .*
+QUOTEDSTRING 
(?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
+UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}
+
+USERNAME [a-zA-Z0-9._-]+
+USER %{USERNAME}
+INT (?:[+-]?(?:[0-9]+))
+BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
+NUMBER (?:%{BASE10NUM})
+BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
+BASE16FLOAT 
\b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b
+
+# Networking
+MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
+CISCOMAC (?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})
+WINDOWSMAC (?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})
+COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
+IPV6 
((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
+IPV4 
(?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
+IP (?:%{IPV6}|%{IPV4})
+HOSTNAME 
\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
+HOST %{HOSTNAME}
+IPORHOST (?:%{HOSTNAME}|%{IP})
+HOSTPORT %{IPORHOST}:%{POSINT}
+
+# paths
+PATH (?:%{UNIXPATH}|%{WINPATH})
+UNIXPATH (?>/(?>[\w_%!$@:.,-]+|\\.)*)+
+TTY (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
+WINPATH (?>[A-Za-z]+:|\\)(?:\\[^\\?*]*)+
+URIPROTO [A-Za-z]+(\+[A-Za-z+]+)?
+URIHOST %{IPORHOST}(?::%{POSINT:port})?
+# uripath comes loosely from RFC1738, but mostly from what Firefox
+# doesn't turn into %XX
+URIPATH (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_\-]*)+
+#URIPARAM 
\?(?:[A-Za-z0-9]+(?:=(?:[^&]*))?(?:&(?:[A-Za-z0-9]+(?:=(?:[^&]*))?)?)*)?
+URIPARAM \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]]*
+URIPATHPARAM %{URIPATH}(?:%{URIPARAM})?
+URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?
+
+# Shortcuts
+QS %{QUOTEDSTRING}
+
+# Log formats
+SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} 
)?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
+COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} 
\[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: 
HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} 
(?:%{NUMBER:bytes}|-)
+COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
+               </pre>
+               </code>
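<p>Many of the definitions above refer to other patterns through <code>%{NAME}</code>. The sketch below shows one way such references could be expanded recursively into a plain regular expression. It copies three definitions from the list above; the names <code>DEFINITIONS</code>, <code>REFERENCE</code>, and <code>expand</code>, and the field name <code>tz</code>, are hypothetical, and the approach is an assumption rather than a description of how the Controller Service compiles patterns.</p>

```python
import re

# Three definitions copied from the default patterns listed above.
DEFINITIONS = {
    "HOUR": r"(?:2[0123]|[01]?[0-9])",
    "MINUTE": r"(?:[0-5][0-9])",
    "ISO8601_TIMEZONE": r"(?:Z|[+-]%{HOUR}(?::?%{MINUTE}))",
}

# Matches %{NAME} and %{NAME:field}.
REFERENCE = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(expression):
    """Recursively replace pattern references with plain regex text."""
    def substitute(match):
        name, field = match.group(1), match.group(2)
        body = expand(DEFINITIONS[name])  # definitions may nest
        # A reference with a field name becomes a named capture group.
        return "(?P<%s>%s)" % (field, body) if field else body
    return REFERENCE.sub(substitute, expression)


timezone_pattern = re.compile(expand("%{ISO8601_TIMEZONE:tz}"))
print(timezone_pattern.fullmatch("+05:30").group("tz"))
```

<p>Some default patterns use regular expression syntax, such as atomic groups, that not every regex engine accepts, so this sketch deliberately sticks to definitions that Python's <code>re</code> module can compile.</p>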
+
+    </body>
+</html>

Added: 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html
URL: 
http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html?rev=1811008&view=auto
==============================================================================
--- 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html
 (added)
+++ 
nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html
 Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta 
charset="utf-8"></meta><title>GrokReader</title><link rel="stylesheet" 
href="../../../../../css/component-usage.css" 
type="text/css"></link></head><script type="text/javascript">window.onload = 
function(){if(self==top) { document.getElementById('nameHeader').style.display 
= "inherit"; } }</script><body><h1 id="nameHeader" style="display: 
none;">GrokReader</h1><h2>Description: </h2><p>Provides a mechanism for reading 
unstructured text data, such as log files, and structuring the data so that it 
can be processed. The service is configured using Grok patterns. The service 
reads from a stream of data and splits each message that it finds into a 
separate Record, each containing the fields that are configured. If a line in 
the input does not match the expected message pattern, the line of text is 
either considered to be part of the previous message or is skipped, depending 
on the configuration, with the exception of stack traces. A stack trace that 
is found at the end of a log message is considered to be part of the 
previous message but is added to the 'stackTrace' field of the Record. If a 
record has no stack trace, it will have a NULL value for the stackTrace field 
(assuming that the schema does in fact include a stackTrace field of type 
String). Assuming that the schema includes a '_raw' field of type String, the 
raw message will be included in the Record.</p><p><a 
href="additionalDetails.html">Additional Details...</a></p><h3>Tags: 
</h3><p>grok, logs, logfiles, parse, unstructured, text, record, reader, regex, 
pattern, logstash</p><h3>Properties: </h3><p>In the list below, the names of 
required properties appear in <strong>bold</strong>. Any other properties (not 
in bold) are considered optional. The table also indicates any default values, 
and whether a property supports the <a 
href="../../../../../html/expression-language-guide.html">NiFi Expression 
Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default 
 Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td 
id="name"><strong>Schema Access Strategy</strong></td><td 
id="default-value">string-fields-from-grok-expression</td><td 
id="allowable-values"><ul><li>Use String Fields From Grok Expression <img 
src="../../../../../html/images/iconInfo.png" alt="The schema will be derived 
by using the field names present in the Grok Expression. All fields will be 
assumed to be of type String. Additionally, a field will be included with a 
name of 'stackTrace' and a type of String." title="The schema will be derived 
by using the field names present in the Grok Expression. All fields will be 
assumed to be of type String. Additionally, a field will be included with a 
name of 'stackTrace' and a type of String."></img></li><li>Use 'Schema Name' 
Property <img src="../../../../../html/images/iconInfo.png" alt="The name of 
the Schema to use is specified by the 'Schema Name' Property. The value of this 
property is used to lookup the Schema in 
 the configured Schema Registry service." title="The name of the Schema to use 
is specified by the 'Schema Name' Property. The value of this property is used 
to lookup the Schema in the configured Schema Registry 
service."></img></li><li>Use 'Schema Text' Property <img 
src="../../../../../html/images/iconInfo.png" alt="The text of the Schema 
itself is specified by the 'Schema Text' Property. The value of this property 
must be a valid Avro Schema. If Expression Language is used, the value of the 
'Schema Text' property must be valid after substituting the expressions." 
title="The text of the Schema itself is specified by the 'Schema Text' 
Property. The value of this property must be a valid Avro Schema. If Expression 
Language is used, the value of the 'Schema Text' property must be valid after 
substituting the expressions."></img></li><li>HWX Schema Reference Attributes 
<img src="../../../../../html/images/iconInfo.png" alt="The FlowFile contains 3 
Attributes that will be used to lookup a Schema from the configured Schema 
Registry: 'schema.identifier', 
'schema.version', and 'schema.protocol.version'" title="The FlowFile contains 3 
Attributes that will be used to lookup a Schema from the configured Schema 
Registry: 'schema.identifier', 'schema.version', and 
'schema.protocol.version'"></img></li><li>HWX Content-Encoded Schema Reference 
<img src="../../../../../html/images/iconInfo.png" alt="The content of the 
FlowFile contains a reference to a schema in the Schema Registry service. The 
reference is encoded as a single byte indicating the 'protocol version', 
followed by 8 bytes indicating the schema identifier, and finally 4 bytes 
indicating the schema version, as per the Hortonworks Schema Registry 
serializers and deserializers, found at 
https://github.com/hortonworks/registry" title="The content of the FlowFile 
contains a reference to a schema in the Schema Registry service. The reference 
is encoded as a single byte indicating the 'protocol version', followed by 8
  bytes indicating the schema identifier, and finally 4 bytes indicating the 
schema version, as per the Hortonworks Schema Registry serializers and 
deserializers, found at 
https://github.com/hortonworks/registry"></img></li><li>Confluent 
Content-Encoded Schema Reference <img 
src="../../../../../html/images/iconInfo.png" alt="The content of the FlowFile 
contains a reference to a schema in the Schema Registry service. The reference 
is encoded as a single 'Magic Byte' followed by 4 bytes representing the 
identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This is based on version 3.2.x of the Confluent Schema Registry." title="The 
content of the FlowFile contains a reference to a schema in the Schema Registry 
service. The reference is encoded as a single 'Magic Byte' followed by 4 bytes 
representing the identifier of the schema, as outlined at 
http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html.
 This is based on version 3.2.x of the Confluent Schema 
Registry."></img></li></ul></td><td id="description">Specifies how to obtain 
the schema that is to be used for interpreting the data.</td></tr><tr><td 
id="name">Schema Registry</td><td id="default-value"></td><td 
id="allowable-values"><strong>Controller Service API: 
</strong><br/>SchemaRegistry<br/><strong>Implementations: </strong><a 
href="../../../nifi-registry-nar/1.4.0/org.apache.nifi.schemaregistry.services.AvroSchemaRegistry/index.html">AvroSchemaRegistry</a><br/><a
 
href="../../../nifi-hwx-schema-registry-nar/1.4.0/org.apache.nifi.schemaregistry.hortonworks.HortonworksSchemaRegistry/index.html">HortonworksSchemaRegistry</a><br/><a
 
href="../../../nifi-confluent-platform-nar/1.4.0/org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry/index.html">ConfluentSchemaRegistry</a></td><td
 id="description">Specifies the Controller Service to use for the Schema 
Registry</td></tr><tr><td id="name">Schema Name</td><td 
id="default-value">${schema.name}</td><td id="allowable-values"></td><td 
id="description">Specifies the name of the schema to lookup in the Schema 
Registry property<br/><strong>Supports Expression Language: 
true</strong></td></tr><tr><td id="name">Schema Text</td><td 
id="default-value">${avro.schema}</td><td id="allowable-values"></td><td 
id="description">The text of an Avro-formatted Schema<br/><strong>Supports 
Expression Language: true</strong></td></tr><tr><td id="name">Grok Pattern 
File</td><td id="default-value"></td><td id="allowable-values"></td><td 
id="description">Path to a file that contains Grok Patterns to use for parsing 
logs. If not specified, a built-in default Pattern file will be used. If 
specified, all patterns in the given pattern file will override the default 
patterns. See the Controller Service's Additional Details for a list of 
pre-defined patterns.<br/><strong>Supports Expression Language: 
true</strong></td></tr><tr><td id="name"><strong>Grok Expression</strong></td><td 
id="default-value"></td><td id="allowable-values"></td><td 
id="description">Specifies the format of a log line in Grok format. This allows 
the Record Reader to understand how to parse each log line. If a line in the 
log file does not match this pattern, the line will be assumed to belong to the 
previous log message.</td></tr><tr><td id="name"><strong>No Match 
Behavior</strong></td><td id="default-value">append-to-previous-message</td><td 
id="allowable-values"><ul><li>Append to Previous Message <img 
src="../../../../../html/images/iconInfo.png" alt="The line of text that does 
not match the Grok Expression will be appended to the last field of the prior 
message." title="The line of text that does not match the Grok Expression will 
be appended to the last field of the prior message."></img></li><li>Skip Line 
<img src="../../../../../html/images/iconInfo.png" alt="The line of text that 
does not match the Grok Expression will be skipped." title="The line of text 
that does not 
 match the Grok Expression will be skipped."></img></li></ul></td><td 
id="description">If a line of text is encountered and it does not match the 
given Grok Expression, and it is not part of a stack trace, this property 
specifies how the text should be processed.</td></tr></table><h3>State 
management: </h3>This component does not store state.<h3>Restricted: </h3>This 
component is not restricted.</body></html>
\ No newline at end of file
