[ 
https://issues.apache.org/jira/browse/NIFI-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537177#comment-16537177
 ] 

ASF GitHub Bot commented on NIFI-1251:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2834#discussion_r201064263
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExecuteSQL.java ---
    @@ -146,6 +162,31 @@
                 .sensitive(false)
                 .build();
     
    +    public static final PropertyDescriptor MAX_ROWS_PER_FLOW_FILE = new PropertyDescriptor.Builder()
    +            .name("esql-max-rows")
    +            .displayName("Max Rows Per Flow File")
    +            .description("The maximum number of result rows that will be included in a single FlowFile. This will allow you to break up very large "
    +                    + "result sets into multiple FlowFiles. If the value specified is zero, then all rows are returned in a single FlowFile.")
    +            .defaultValue("0")
    +            .required(true)
    +            .addValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR)
    +            .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
    +            .build();
    +
    +    public static final PropertyDescriptor OUTPUT_BATCH_SIZE = new PropertyDescriptor.Builder()
    +            .name("esql-output-batch-size")
    +            .displayName("Output Batch Size")
    +            .description("The number of output FlowFiles to queue before committing the process session. When set to zero, the session will be committed when all result set rows "
    +                    + "have been processed and the output FlowFiles are ready for transfer to the downstream relationship. For large result sets, this can cause a large burst of FlowFiles "
    +                    + "to be transferred at the end of processor execution. If this property is set, then when the specified number of FlowFiles are ready for transfer, the session will "
    +                    + "be committed, thus releasing the FlowFiles to the downstream relationship. NOTE: The maxvalue.* and fragment.count attributes will not be set on FlowFiles when this "
    --- End diff ---
    
    Minor issue: we don't write maxvalue.* attributes in ExecuteSQL; I'll remove that reference while merging.
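The commit behavior that the `OUTPUT_BATCH_SIZE` description spells out can be sketched without any NiFi dependencies. This is a minimal simulation, not the actual processor code: plain lists stand in for NiFi's `ProcessSession` and FlowFiles, and the class/method names (`BatchCommitSketch`, `flowFilesPerCommit`) are hypothetical. It records how many FlowFiles each session commit would release, assuming the semantics quoted in the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Simulates how ExecuteSQL could emit result rows in chunks of
// maxRowsPerFlowFile and commit the session every outputBatchSize
// FlowFiles. NiFi's session/FlowFile types are replaced by plain lists;
// each list entry is the number of FlowFiles released by one commit.
public class BatchCommitSketch {

    public static List<Integer> flowFilesPerCommit(int totalRows,
                                                   int maxRowsPerFlowFile,
                                                   int outputBatchSize) {
        List<Integer> commits = new ArrayList<>();
        int pending = 0;        // FlowFiles created since the last commit
        int rowsLeft = totalRows;
        while (rowsLeft > 0) {
            // A max of zero means "all remaining rows in one FlowFile".
            int rowsInThisFile = (maxRowsPerFlowFile == 0)
                    ? rowsLeft
                    : Math.min(maxRowsPerFlowFile, rowsLeft);
            rowsLeft -= rowsInThisFile;
            pending++;
            // Commit early once the configured batch size is reached.
            if (outputBatchSize > 0 && pending == outputBatchSize) {
                commits.add(pending);
                pending = 0;
            }
        }
        if (pending > 0) {
            commits.add(pending); // final commit releases the remainder
        }
        return commits;
    }

    public static void main(String[] args) {
        // 10 rows, 3 rows per FlowFile -> 4 FlowFiles; batch size 2
        // -> two commits of 2 FlowFiles each
        System.out.println(flowFilesPerCommit(10, 3, 2));
    }
}
```

The point of committing early is exactly what the description says: downstream processors can start consuming FlowFiles while the result set is still being read, instead of receiving one large burst at the end.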


> Allow ExecuteSQL to send out large result sets in chunks
> --------------------------------------------------------
>
>                 Key: NIFI-1251
>                 URL: https://issues.apache.org/jira/browse/NIFI-1251
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Peter Wicks
>            Priority: Major
>
> Currently, when using ExecuteSQL, if a result set is very large, it can take 
> quite a long time to pull back all of the results. It would be nice to have 
> the ability to specify the maximum number of records to put into a FlowFile, 
> so that if we pull back say 1 million records we can configure it to create 
> 1000 FlowFiles, each with 1000 records. This way, we can begin processing the 
> first 1,000 records while the next 1000 are being pulled from the remote 
> database.
> This suggestion comes from Vinay via the dev@ mailing list:
> Is there a way to have a streaming feature when a large result set is fetched
> from the database, basically to read data from the database in chunks of
> records instead of loading the full result set into memory?
> As part of ExecuteSQL, can a property be specified called "FetchSize" which
> indicates how many rows should be fetched from the result set?
> Since I am a bit new to using NiFi, can anyone guide me on the above.
> Thanks in advance
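The chunking arithmetic in the description (1,000,000 rows at 1,000 rows per FlowFile yielding 1,000 FlowFiles) is a ceiling division. A quick sketch, with illustrative names not taken from the NiFi codebase:

```java
public class ChunkCount {
    // Number of FlowFiles needed to hold totalRows at maxRows rows per
    // FlowFile. A maxRows of 0 means "everything in a single FlowFile",
    // matching the Max Rows Per Flow File property's default behavior.
    static long chunks(long totalRows, long maxRows) {
        if (maxRows == 0) {
            return 1;
        }
        return (totalRows + maxRows - 1) / maxRows; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(chunks(1_000_000, 1_000)); // prints 1000
    }
}
```

Note this is separate from the JDBC fetch size mentioned in the quote: the fetch size is a hint to the driver about how many rows to pull from the server per round trip, while Max Rows Per Flow File controls how many rows land in each output FlowFile.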



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
