[
https://issues.apache.org/jira/browse/NIFI-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707552#comment-17707552
]
David Handermann commented on NIFI-11167:
-----------------------------------------
[~dstiegli1] Regarding the styles, thanks for researching the details. Although
converting numbers to dates should not be a problem, it sounds like the styles
are necessary to help determine whether the column is a date. With that
background, it seems like reading styles should always be enabled, or perhaps
limited to when using the infer schema strategy.
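To illustrate why the styles matter, here is a minimal sketch using only the
JDK, not the reader implementation. In the workbook, a date cell holds a plain
serial number, so only the number format attached to the cell style indicates
that it is a date. The class name ExcelDateSketch and the looksLikeDateFormat
heuristic are hypothetical names made up for this example, and the sketch
assumes the 1900 date system; a reader built on Apache POI would more likely
rely on something like DateUtil.isCellDateFormatted.

```java
import java.time.LocalDate;

public class ExcelDateSketch {
    // Assumption: 1900 date system. Excel counts a non-existent 1900-02-29,
    // so for serial numbers of 61 and above the effective epoch is 1899-12-30.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // Converts the whole-day part of an Excel serial number to a date.
    static LocalDate toLocalDate(double serial) {
        return EPOCH.plusDays((long) Math.floor(serial));
    }

    // Hypothetical heuristic: treat a number format as a date format when it
    // contains year or day tokens once quoted literals and bracketed sections
    // such as [Red] are removed. Time-only formats like h:mm are not detected.
    static boolean looksLikeDateFormat(String formatString) {
        if (formatString == null) {
            return false;
        }
        String stripped = formatString
                .replaceAll("\"[^\"]*\"", "")
                .replaceAll("\\[[^\\]]*\\]", "")
                .toLowerCase();
        return stripped.contains("y") || stripped.contains("d");
    }

    public static void main(String[] args) {
        double raw = 45000.0; // the same stored value in both cases below

        // Without style information the value is just a number.
        System.out.println(looksLikeDateFormat("General"));    // false
        // With a date format on the cell style it can be mapped to a date.
        System.out.println(looksLikeDateFormat("yyyy-mm-dd")); // true
        System.out.println(toLocalDate(raw));                  // 2023-03-15
    }
}
```

The conversion itself is trivial; the stored value alone cannot say whether
45000 is a quantity or 2023-03-15, which is why inference needs the styles.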
Regarding the infer schema strategy, the
[CSVSchemaInference|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/CSVSchemaInference.java]
and
[CSVHeaderSchemaStrategy|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/CSVHeaderSchemaStrategy.java]
have some examples, but I am not aware of any other documentation aside from
related Jira issues.
> Add Excel Record Reader
> -----------------------
>
> Key: NIFI-11167
> URL: https://issues.apache.org/jira/browse/NIFI-11167
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: David Handermann
> Assignee: Daniel Stieglitz
> Priority: Minor
>
> A new Excel Record Reader should be implemented to support reading XLSX
> spreadsheet rows as NiFi Records. This Reader will enable integration with
> various record-oriented components, obviating the need for the narrowly
> focused ConvertExcelToCSVProcessor. The initial version of the Excel Reader
> should not support the legacy binary XLS format.
> The ExcelReader should use a library that supports reading from a stream of
> rows to avoid consuming large amounts of heap memory during processing.
> The ExcelReader should support configurable properties to read selected
> sheets. With Excel supporting typed field values, some amount of field type
> mapping will be required. Additional input filtering properties should not be
> implemented as existing Processors like QueryRecord support a wide variety of
> filtering and projection use cases.