[ https://issues.apache.org/jira/browse/IMPALA-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785744#comment-17785744 ]
ASF subversion and git services commented on IMPALA-12377:
----------------------------------------------------------
Commit d318f1c99208196abdf0d0f57490439ddb421cab in impala's branch
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d318f1c99 ]
IMPALA-12377: Improve count(*) performance for jdbc external table
The backend function DataSourceScanNode::GetNext() handles count queries
inefficiently. Even when the external data source returns no column
data, it still tries to materialize rows and add them to the RowBatch
one by one, up to the row count. It also calls GetNextInputBatch()
multiple times (count / batch_size), and each GetNextInputBatch() call
invokes a JNI function in the external data source.
This patch improves DataSourceScanNode::GetNext() and
JdbcDataSource.getNext() to avoid the unnecessary function calls.
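The cost difference described above can be pictured with a small model. This is only a sketch in Python, not Impala code: FakeExternalSource, get_next_batch(), get_count(), count_row_by_row() and count_in_one_call() are hypothetical names invented for illustration, and the single-call shortcut is an assumption about the general shape of the optimization, not a description of the actual patch.

```python
class FakeExternalSource:
    """Hypothetical stand-in for an external data source reached through JNI."""

    def __init__(self, total_rows, batch_size):
        self.remaining = total_rows
        self.batch_size = batch_size
        self.calls = 0  # number of simulated cross-boundary calls

    def get_next_batch(self):
        """Return (num_rows, eos) for one batch; models one JNI round trip."""
        self.calls += 1
        n = min(self.batch_size, self.remaining)
        self.remaining -= n
        return n, self.remaining == 0

    def get_count(self):
        """Return (num_rows, eos) covering all remaining rows in one call."""
        self.calls += 1
        n = self.remaining
        self.remaining = 0
        return n, True


def count_row_by_row(source):
    """Model of the slow path: fetch batch by batch, add rows one at a time."""
    rows_added = 0
    eos = False
    while not eos:
        n, eos = source.get_next_batch()
        for _ in range(n):
            rows_added += 1  # stands in for materializing and adding an empty row
    return rows_added


def count_in_one_call(source):
    """Model of the fast path: no column data is needed, so take the count directly."""
    n, _ = source.get_count()
    return n


slow = FakeExternalSource(total_rows=10_000, batch_size=1024)
fast = FakeExternalSource(total_rows=10_000, batch_size=1024)
print(count_row_by_row(slow), slow.calls)
print(count_in_one_call(fast), fast.calls)
```

Both functions return the same count; the difference is the number of simulated calls into the source, which grows with count / batch_size in the first case and stays constant in the second.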
Testing:
- Ran query_test/test_ext_data_sources.py, which includes count
queries on JDBC external tables.
- Passed core-tests.
Change-Id: I9953dca949eb773022f1d6dcf48d8877857635d6
Reviewed-on: http://gerrit.cloudera.org:8080/20653
Reviewed-by: Abhishek Rawat <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Improve count star performance for external data source
> -------------------------------------------------------
>
> Key: IMPALA-12377
> URL: https://issues.apache.org/jira/browse/IMPALA-12377
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend, Frontend
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Major
>
> The code that handles count(*) queries in the backend function
> DataSourceScanNode::GetNext() is not efficient. Even when the external data
> source returns no column data, it still tries to materialize rows and add them
> to the RowBatch one by one, up to the row count. It also calls
> GetNextInputBatch() multiple times (count / batch_size), and each
> GetNextInputBatch() call invokes a JNI function.