[
https://issues.apache.org/jira/browse/NIFI-14579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17954772#comment-17954772
]
Daniel Stieglitz commented on NIFI-14579:
-----------------------------------------
[~exceptionfactory] I think I see it in InferSchemaAccessStrategy, on line 46:
{code:java}
contentStream.mark(1_000_000);{code}
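
For context, the sketch below shows how a read limit passed to {{mark}} can bound how far a stream may be read before {{reset}} stops working. It is only an illustration under assumptions: the comment does not say what kind of stream {{contentStream}} is, so a {{BufferedInputStream}} with a deliberately small 8-byte buffer stands in for it, and the class and method names are hypothetical. What happens once the limit is exceeded depends on the stream implementation.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of NiFi.
class MarkLimitDemo {

    // Marks the stream with the given read limit, reads bytesToRead bytes
    // one at a time, then reports whether reset() still succeeds.
    static boolean canResetAfter(int bytesToRead, int readLimit) throws IOException {
        byte[] data = new byte[64];
        // Assumption: a tiny 8-byte buffer, so the mark limit matters after a few reads.
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 8)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                if (in.read() < 0) {
                    break; // end of stream
                }
            }
            try {
                in.reset(); // rewinds to the marked position if the mark is still valid
                return true;
            } catch (IOException e) {
                return false; // the mark was invalidated by reading too far
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canResetAfter(16, 100)); // read stays within the limit
        System.out.println(canResetAfter(16, 4));   // read goes well past the limit
    }
}
```

If the stream in question behaves similarly, content larger than the marked limit could not be rewound after it has been read once.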
> Add parameter to configure number of rows used in schema inference with
> header in ExcelReader service
> -----------------------------------------------------------------------------------------------------
>
> Key: NIFI-14579
> URL: https://issues.apache.org/jira/browse/NIFI-14579
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Piotr Zalas
> Assignee: Daniel Stieglitz
> Priority: Major
>
> Currently the ExcelReader service allows the *Schema Access Strategy*
> parameter to be set to {*}Use Starting Row{*}. With this setting, only the
> first 10 rows are used to infer the schema of the columns in the sheet, as
> opposed to the *Infer Schema* strategy.
> My user requests that all rows in the sheet be used to infer the schema.
> Moreover, the service is used in the QueryRecord processor to limit the
> number of rows read (as described in NIFI-14427). The user wants to infer the
> schema only from the rows they read, not from all rows in the sheet.
> It would be great to add a parameter to ExcelReader that configures the
> number of rows read during schema inference (i.e. the
> {{ExcelHeaderSchemaStrategy#NUM_ROWS_TO_DETERMINE_TYPES}} variable). The
> parameter could be shown conditionally, based on the value of {*}Schema
> Access Strategy{*}. A value of 0 could have the special meaning that all rows
> in the sheet should be read. The default value could be 10 to preserve
> existing behavior.
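
The proposed semantics (a positive value caps the rows examined, 0 means every row) could be sketched roughly as follows. This is not NiFi's {{ExcelHeaderSchemaStrategy}}: the class, enum, and method names are hypothetical, rows are modeled as plain lists of strings rather than Excel cells, and the type widening is deliberately simplistic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and types are assumptions, not NiFi APIs.
class RowLimitedInference {

    enum FieldType { LONG, DOUBLE, STRING }

    // rowLimit == 0 means "examine every row"; a positive value caps the rows examined.
    static List<FieldType> inferTypes(List<List<String>> dataRows, int rowLimit) {
        if (rowLimit < 0) {
            throw new IllegalArgumentException("rowLimit must be >= 0");
        }
        int rowsToRead = (rowLimit == 0) ? dataRows.size() : Math.min(rowLimit, dataRows.size());
        List<FieldType> types = new ArrayList<>();
        for (int r = 0; r < rowsToRead; r++) {
            List<String> row = dataRows.get(r);
            for (int c = 0; c < row.size(); c++) {
                FieldType observed = classify(row.get(c));
                if (c >= types.size()) {
                    types.add(observed); // first value seen for this column
                } else {
                    types.set(c, widen(types.get(c), observed));
                }
            }
        }
        return types;
    }

    // Picks the narrowest type that can represent a single value.
    private static FieldType classify(String value) {
        try {
            Long.parseLong(value.trim());
            return FieldType.LONG;
        } catch (NumberFormatException ignored) {
            // not a whole number; try a floating-point value next
        }
        try {
            Double.parseDouble(value.trim());
            return FieldType.DOUBLE;
        } catch (NumberFormatException ignored) {
            return FieldType.STRING;
        }
    }

    // Combines two observed types into one that can hold both.
    private static FieldType widen(FieldType a, FieldType b) {
        if (a == b) {
            return a;
        }
        if (a == FieldType.STRING || b == FieldType.STRING) {
            return FieldType.STRING;
        }
        return FieldType.DOUBLE; // LONG combined with DOUBLE
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("1", "2.5", "a"),
                List.of("2", "3", "b"),
                List.of("x", "4", "c"));
        System.out.println(inferTypes(rows, 2)); // only the first two rows are examined
        System.out.println(inferTypes(rows, 0)); // all rows are examined
    }
}
```

With a limit of 2, the first column in this sample is inferred as LONG even though the third row holds a non-numeric value; with 0, the third row is also examined and the column widens to STRING. That difference is the kind of mismatch the request aims to avoid.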