[jira] [Commented] (NIFI-4473) Add support for large result sets and normalizing Avro names to SelectHiveQL

ASF GitHub Bot (JIRA) Tue, 17 Oct 2017 07:04:24 -0700

    [ https://issues.apache.org/jira/browse/NIFI-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207682#comment-16207682 ]


ASF GitHub Bot commented on NIFI-4473:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2212#discussion_r145140092
  
    --- Diff: nifi-nar-bundles/nifi-hive-bundle/nifi-hive-processors/src/main/java/org/apache/nifi/processors/hive/SelectHiveQL.java ---
    @@ -243,95 +284,152 @@ private void onTrigger(final ProcessContext context, final ProcessSession sessio
                 // If the query is not set, then an incoming flow file is required, and expected to contain a valid SQL select query.
                 // If there is no incoming connection, onTrigger will not be called as the processor will fail when scheduled.
                 final StringBuilder queryContents = new StringBuilder();
    -            session.read(fileToProcess, new InputStreamCallback() {
    -                @Override
    -                public void process(InputStream in) throws IOException {
    -                    queryContents.append(IOUtils.toString(in));
    -                }
    -            });
    +            session.read(fileToProcess, in -> queryContents.append(IOUtils.toString(in, charset)));
                 selectQuery = queryContents.toString();
             }
     
     
    +        final Integer fetchSize = context.getProperty(FETCH_SIZE).evaluateAttributeExpressions().asInteger();
    --- End diff --
    
    No reason; it was just a copy-paste from QueryDatabaseTable, which doesn't 
accept incoming connections. I will update that and any of the others. I'm not 
sure it's a valid use case either, but I can't see why we shouldn't support it, 
just in case.


> Add support for large result sets and normalizing Avro names to SelectHiveQL
> ----------------------------------------------------------------------------
>
>                 Key: NIFI-4473
>                 URL: https://issues.apache.org/jira/browse/NIFI-4473
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>
> A number of enhancements were made to processors like QueryDatabaseTable to 
> allow for such things as:
> - Splitting result sets into multiple flow files (i.e. Max Rows Per Flowfile 
> property)
> - Max number of splits/rows returned (Max fragments)
> - Normalizing names to be Avro-compatible
> The RDBMS processors also now support Avro logical types, but the version of 
> Avro needed by the current version of Hive (1.2.1) is Avro 1.7.7, which does 
> not support logical types.
> These enhancements were made to JdbcCommon, but not to HiveJdbcCommon (the 
> Hive version of the JDBC utils class). Since Hive queries can return even 
> larger result sets than traditional RDBMS, these properties/enhancements are 
> at least as valuable to have for SelectHiveQL.
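
A minimal sketch of two of the enhancements listed in the description above: 
normalizing a column name to be Avro-compatible, and splitting rows into 
fragments bounded by a maximum row count and a maximum fragment count. This is 
illustrative only and is not taken from JdbcCommon or HiveJdbcCommon. The class 
name ResultSetSketch, the method names, and the convention that 0 means "no 
limit" are assumptions made for the example. A real processor would stream rows 
from a JDBC ResultSet rather than hold them in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names and behavior are assumptions for illustration.
class ResultSetSketch {

    // Avro names must start with [A-Za-z_] and continue with [A-Za-z0-9_].
    // Replace every other character with '_' and prefix '_' to a leading digit.
    public static String normalizeNameForAvro(String name) {
        String normalized = name.replaceAll("[^A-Za-z0-9_]", "_");
        if (!normalized.isEmpty() && Character.isDigit(normalized.charAt(0))) {
            normalized = "_" + normalized;
        }
        return normalized;
    }

    // Split rows into fragments of at most maxRowsPerFlowFile rows each.
    // Assumption: maxRowsPerFlowFile <= 0 means "one fragment with all rows",
    // and maxFragments <= 0 means "no limit on the number of fragments".
    public static <T> List<List<T>> split(List<T> rows, int maxRowsPerFlowFile, int maxFragments) {
        List<List<T>> fragments = new ArrayList<>();
        if (maxRowsPerFlowFile <= 0) {
            fragments.add(new ArrayList<>(rows));
            return fragments;
        }
        for (int start = 0; start < rows.size(); start += maxRowsPerFlowFile) {
            if (maxFragments > 0 && fragments.size() >= maxFragments) {
                break; // remaining rows are not returned once the fragment limit is hit
            }
            int end = Math.min(start + maxRowsPerFlowFile, rows.size());
            fragments.add(new ArrayList<>(rows.subList(start, end)));
        }
        return fragments;
    }

    public static void main(String[] args) {
        System.out.println(normalizeNameForAvro("2017-total.count"));
        System.out.println(split(Arrays.asList(1, 2, 3, 4, 5), 2, 0));
        System.out.println(split(Arrays.asList(1, 2, 3, 4, 5), 2, 2));
    }
}
```

With five rows and a maximum of two rows per fragment, the sketch produces 
three fragments; adding a fragment limit of two drops the final partial 
fragment.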



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
