[
https://issues.apache.org/jira/browse/NIFI-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537177#comment-16537177
]
ASF GitHub Bot commented on NIFI-1251:
--------------------------------------
Github user mattyb149 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2834#discussion_r201064263
--- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExecuteSQL.java ---
@@ -146,6 +162,31 @@
             .sensitive(false)
             .build();
+    public static final PropertyDescriptor MAX_ROWS_PER_FLOW_FILE = new PropertyDescriptor.Builder()
+            .name("esql-max-rows")
+            .displayName("Max Rows Per Flow File")
+            .description("The maximum number of result rows that will be included in a single FlowFile. This will allow you to break up very large "
+                    + "result sets into multiple FlowFiles. If the value specified is zero, then all rows are returned in a single FlowFile.")
+            .defaultValue("0")
+            .required(true)
+            .addValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR)
+            .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
+            .build();
+
+    public static final PropertyDescriptor OUTPUT_BATCH_SIZE = new PropertyDescriptor.Builder()
+            .name("esql-output-batch-size")
+            .displayName("Output Batch Size")
+            .description("The number of output FlowFiles to queue before committing the process session. When set to zero, the session will be committed when all result set rows "
+                    + "have been processed and the output FlowFiles are ready for transfer to the downstream relationship. For large result sets, this can cause a large burst of FlowFiles "
+                    + "to be transferred at the end of processor execution. If this property is set, then when the specified number of FlowFiles are ready for transfer, the session will "
+                    + "be committed, thus releasing the FlowFiles to the downstream relationship. NOTE: The maxvalue.* and fragment.count attributes will not be set on FlowFiles when this "
--- End diff ---
Minor issue: we don't write maxvalue.* attributes in ExecuteSQL, so I'll remove that reference while merging.
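To make the semantics of the two new properties in the diff concrete, here is a minimal plain-Java sketch of the chunking and commit-batching arithmetic they describe. This is an illustration only, not the actual NiFi processor code; the class `RowChunker` and its method names are hypothetical, and a real ProcessSession is replaced by simple counting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the MAX_ROWS_PER_FLOW_FILE and
// OUTPUT_BATCH_SIZE behaviour described in the property descriptions above.
public class RowChunker {

    /**
     * Splits totalRows result-set rows into chunks of at most
     * maxRowsPerFlowFile rows each. A value of zero means all rows go into a
     * single chunk, matching the documented default behaviour.
     */
    public static List<Integer> chunkSizes(int totalRows, int maxRowsPerFlowFile) {
        List<Integer> chunks = new ArrayList<>();
        if (maxRowsPerFlowFile <= 0) {
            chunks.add(totalRows);
            return chunks;
        }
        int remaining = totalRows;
        while (remaining > 0) {
            int size = Math.min(maxRowsPerFlowFile, remaining);
            chunks.add(size);
            remaining -= size;
        }
        return chunks;
    }

    /**
     * Counts how many session commits would occur if the session is committed
     * every outputBatchSize FlowFiles, with one final commit for any
     * remainder. A value of zero means a single commit at the very end.
     */
    public static int commitCount(int flowFileCount, int outputBatchSize) {
        if (outputBatchSize <= 0) {
            return 1;
        }
        return (flowFileCount + outputBatchSize - 1) / outputBatchSize;
    }
}
```

For example, 1,000,000 rows with Max Rows Per Flow File set to 1000 yields 1000 FlowFiles, and an Output Batch Size of 100 would then release them in 10 commits instead of one burst at the end.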
> Allow ExecuteSQL to send out large result sets in chunks
> --------------------------------------------------------
>
> Key: NIFI-1251
> URL: https://issues.apache.org/jira/browse/NIFI-1251
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Peter Wicks
> Priority: Major
>
> Currently, when using ExecuteSQL, if a result set is very large, it can take
> quite a long time to pull back all of the results. It would be nice to have
> the ability to specify the maximum number of records to put into a FlowFile,
> so that if we pull back say 1 million records we can configure it to create
> 1000 FlowFiles, each with 1000 records. This way, we can begin processing the
> first 1,000 records while the next 1000 are being pulled from the remote
> database.
> This suggestion comes from Vinay via the dev@ mailing list:
> Is there a way to have a streaming feature when a large result set is fetched
> from the database, basically to read data from the database in chunks of
> records instead of loading the full result set into memory?
> As part of ExecuteSQL, can a property be specified called "FetchSize" which
> indicates how many rows should be fetched from the result set?
> Since I am a bit new to using NiFi, can anyone guide me on the above.
> Thanks in advance
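The streaming behaviour asked about in the quoted question can be sketched in plain Java. This is a hypothetical illustration, not NiFi code: the `Iterator` stands in for a JDBC ResultSet cursor, and `ChunkedReader` is an invented helper. In real JDBC code you would additionally call `Statement.setFetchSize(n)` to hint the driver to stream rows from the server rather than buffering the whole result set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of reading rows in fixed-size chunks rather than
// materializing the full result set in memory.
public class ChunkedReader {

    /** Drains up to chunkSize rows from the cursor into a new list. */
    public static <T> List<T> nextChunk(Iterator<T> cursor, int chunkSize) {
        List<T> chunk = new ArrayList<>(chunkSize);
        while (chunk.size() < chunkSize && cursor.hasNext()) {
            chunk.add(cursor.next());
        }
        return chunk;
    }
}
```

With 2500 rows and a chunk size of 1000, repeated calls would yield chunks of 1000, 1000, and 500 rows, so downstream processing of the first chunk can begin while later rows are still being fetched.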
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)