[ 
https://issues.apache.org/jira/browse/NIFI-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701968#comment-17701968
 ] 

David Handermann commented on NIFI-11167:
-----------------------------------------

[~dstiegli1] It sounds like a row with all null values should be returned from 
the Reader. In general, the Reader should provide a record-oriented 
representation of the Excel document.
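
As a rough illustration of the null-row behavior, here is a minimal sketch in plain Java. It does not use the NiFi Record API or any Excel library; the class name {{BlankRowExample}} and the method {{toRecord}} are hypothetical, and a simple Map stands in for a Record. The assumption is that a blank row arrives as an empty list of cell values and is still returned, with every field set to null.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Map stands in for a NiFi Record.
public class BlankRowExample {

    // Maps one spreadsheet row to field name/value pairs.
    // Cells missing from the row become null values instead of the row being dropped.
    static Map<String, Object> toRecord(List<String> fieldNames, List<Object> cellValues) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.size(); i++) {
            Object value = i < cellValues.size() ? cellValues.get(i) : null;
            record.put(fieldNames.get(i), value);
        }
        return record;
    }

    public static void main(String[] args) {
        List<String> fields = List.of("id", "name");
        // A blank row: no cell values at all, still produces a record with null fields.
        System.out.println(toRecord(fields, List.of()));
    }
}
```

Keeping the blank row in the output leaves the decision to drop it to downstream components.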

Many of the properties in ConvertExcelToCSVProcessor are also specific to the 
CSV output format, so they would not apply to a new Excel Reader.

A property to select Excel Sheets to extract seems like something that should 
be supported. It may be better to implement that using a Regular Expression 
named something like Sheet Name Pattern, defaulted to {{.*}} to include all 
Sheets.
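
To make the Sheet Name Pattern idea concrete, here is a minimal sketch using only java.util.regex. The class name {{SheetNameFilter}}, the method {{selectSheets}}, and the sample sheet names are hypothetical; a real Reader would take the sheet names from the workbook and the pattern from a property descriptor. The sketch assumes the whole sheet name must match the pattern.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch of sheet selection by regular expression.
public class SheetNameFilter {

    // Assumed default for the suggested property: matches every sheet name.
    static final String DEFAULT_SHEET_NAME_PATTERN = ".*";

    // Returns the sheet names that fully match the configured pattern, in original order.
    static List<String> selectSheets(List<String> sheetNames, String sheetNamePattern) {
        Pattern pattern = Pattern.compile(sheetNamePattern);
        return sheetNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sheets = List.of("Sales 2022", "Sales 2023", "Notes");
        // Default pattern keeps all sheets.
        System.out.println(selectSheets(sheets, DEFAULT_SHEET_NAME_PATTERN));
        // A narrower pattern keeps only the matching sheets.
        System.out.println(selectSheets(sheets, "Sales.*"));
    }
}
```

A single pattern property covers both the include-everything default and selection of specific sheets, without needing a separate list of sheet names.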

> Add Excel Record Reader
> -----------------------
>
>                 Key: NIFI-11167
>                 URL: https://issues.apache.org/jira/browse/NIFI-11167
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: David Handermann
>            Assignee: Daniel Stieglitz
>            Priority: Minor
>
> A new Excel Record Reader should be implemented to support reading XLSX 
> spreadsheet rows as NiFi Records. This Reader will enable integration with 
> various record-oriented components, obviating the need for the narrowly 
> focused ConvertExcelToCSVProcessor. The initial version of the Excel Reader 
> should not support the legacy binary XLS format.
> The ExcelReader should use a library that supports reading from a stream of 
> rows to avoid consuming large amounts of heap memory during processing.
> The ExcelReader should support configurable properties to read selected 
> sheets. With Excel supporting typed field values, some amount of field type 
> mapping will be required. Additional input filtering properties should not be 
> implemented as existing Processors like QueryRecord support a wide variety of 
> filtering and projection use cases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
