This is an automated email from the ASF dual-hosted git repository.

exceptionfactory pushed a commit to branch support/nifi-1.x
in repository https://gitbox.apache.org/repos/asf/nifi.git


The following commit(s) were added to refs/heads/support/nifi-1.x by this push:
     new 6c99f1ed83 NIFI-13550 Added documentation about the ExcelReader Starting Row Strategy
6c99f1ed83 is described below

commit 6c99f1ed837c6e0b756b08ac646a846d9842d770
Author: dan-s1 <[email protected]>
AuthorDate: Mon Jul 15 19:27:11 2024 +0000

    NIFI-13550 Added documentation about the ExcelReader Starting Row Strategy
    
    (cherry picked from commit 1ff5ebd6fcd7fe4b312ed8dc8ddb5366535ecddf)
    
    Signed-off-by: David Handermann <[email protected]>
---
 .../additionalDetails.html                                 | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-services/src/main/resources/docs/org.apache.nifi.excel.ExcelReader/additionalDetails.html b/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-services/src/main/resources/docs/org.apache.nifi.excel.ExcelReader/additionalDetails.html
index 561d43afec..4cda682e0c 100644
--- a/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-services/src/main/resources/docs/org.apache.nifi.excel.ExcelReader/additionalDetails.html
+++ b/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-services/src/main/resources/docs/org.apache.nifi.excel.ExcelReader/additionalDetails.html
@@ -25,6 +25,9 @@
                The ExcelReader allows for interpreting input data as delimited Records. Each row in an Excel spreadsheet is a record
                        and each cell is considered a field. The reader allows for choosing which row to start from and which sheets
                        in a spreadsheet to ingest.
+                       When using the "Use Starting Row" strategy, the field names will be assumed to be the column names from the configured
+                       starting row. If there are any column(s) from the starting row which are blank, they are automatically assigned a field name
+                       using the cell number prefixed with "column_".
                        When using the "Infer Schema" strategy, the field names will be assumed to be the
                        cell numbers of each column prefixed with "column_". Otherwise, the names of fields can be supplied
                        when specifying the schema by using the Schema Text or looking up the schema in a Schema Registry.
@@ -70,13 +73,16 @@
                        will be thrown.
                </p>
 
-
-        <h2>Schema Inference</h2>
+        <h2>Use Starting Row and Schema Inference</h2>
 
         <p>
             While NiFi's Record API does require that each Record have a schema, it is often convenient to infer the schema based on the values in the data,
-            rather than having to manually create a schema. This is accomplished by selecting a value of "Infer Schema" for the "Schema Access Strategy" property.
-            When using this strategy, the Reader will determine the schema by first parsing all data in the FlowFile, keeping track of all fields that it has encountered
+            rather than having to manually create a schema. This is accomplished by selecting either value of "Use Starting Row" or "Infer Schema" for the
+                       "Schema Access Strategy" property. When using the "Use Starting Row" strategy, the Reader will determine the schema by parsing the first ten rows
+                       after the configured starting row of the data in the FlowFile all the while keeping track of all fields that it has encountered
+                       and the type of each field. A schema is then formed that encompasses all encountered fields. A schema can even be inferred if there are blank lines
+                       within those ten rows, but if they are all blank, then this strategy will fail to create a schema.
+            When using the "Infer Schema" strategy, the Reader will determine the schema by first parsing all data in the FlowFile, keeping track of all fields that it has encountered
             and the type of each field. Once all data has been parsed, a schema is formed that encompasses all fields that have been encountered.
         </p>
 

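Illustrative note (not part of the commit): the field-naming rule documented in the diff above for the "Use Starting Row" strategy can be sketched roughly as follows. This is a minimal Python sketch, not NiFi's implementation. The function name and parameters are hypothetical. The commit text does not say whether cell numbers start at 0 or 1, so the sketch exposes that as a parameter and defaults to 0 purely as an assumption. Treating whitespace-only cells as blank is likewise an assumption.

```python
from typing import List, Optional, Sequence


def field_names_from_starting_row(
    header_cells: Sequence[Optional[str]],
    first_cell_number: int = 0,  # assumption: numbering base is not stated in the commit
) -> List[str]:
    """Derive field names from the cells of the configured starting row.

    Non-blank cells are used as field names as-is (after trimming).
    Blank cells fall back to "column_" followed by the cell number.
    """
    names: List[str] = []
    for offset, cell in enumerate(header_cells):
        # Assumption: None and whitespace-only cells both count as blank.
        text = "" if cell is None else str(cell).strip()
        if text:
            names.append(text)
        else:
            names.append(f"column_{first_cell_number + offset}")
    return names
```

For example, a starting row with the cells "id", blank, blank, "name" would yield the names id, column_1, column_2, name under the zero-based assumption.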