Wenzhe Zhou has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/20653 )

Change subject: IMPALA-12377: Improve count(*) performance for jdbc external table
......................................................................

IMPALA-12377: Improve count(*) performance for jdbc external table

The backend function DataSourceScanNode::GetNext() handles count queries
inefficiently. Even though no column data is returned from the external
data source, it still tries to materialize rows and adds them to the
RowBatch one by one, up to the row count. It also calls
GetNextInputBatch() multiple times (count / batch_size), and each
GetNextInputBatch() call invokes a JNI function in the external data source.

This patch improves DataSourceScanNode::GetNext() and
JdbcDataSource.getNext() to avoid these unnecessary function calls.
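
The cost difference described above can be illustrated with a small,
self-contained sketch. It is not the code in this patch: ToyRowBatch,
ScanStats, CountPerRow() and CountBulk() are hypothetical names, and the
bulk path assumes the data source can report the total count in a single
call, which may differ from what the patch actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a row batch; it only tracks how many rows it holds.
struct ToyRowBatch {
  explicit ToyRowBatch(int64_t cap) : capacity(cap), num_rows(0) {}
  int64_t capacity;
  int64_t num_rows;
};

// Hypothetical counters used to compare the two strategies.
struct ScanStats {
  int64_t fetch_calls = 0;    // simulated calls across the JNI boundary
  int64_t row_appends = 0;    // simulated per-row materialization steps
  int64_t rows_returned = 0;  // rows handed to the parent operator
};

// Per-row strategy: one simulated fetch per input batch, and every row is
// appended to the output batch individually even though it carries no data.
ScanStats CountPerRow(int64_t total_rows, int64_t batch_size) {
  assert(batch_size > 0 && total_rows >= 0);
  ScanStats stats;
  int64_t remaining = total_rows;
  while (remaining > 0) {
    ++stats.fetch_calls;
    const int64_t in_batch = std::min(remaining, batch_size);
    ToyRowBatch batch(batch_size);
    for (int64_t i = 0; i < in_batch; ++i) {
      ++batch.num_rows;
      ++stats.row_appends;
    }
    stats.rows_returned += batch.num_rows;
    remaining -= in_batch;
  }
  return stats;
}

// Bulk strategy (assumption): a single simulated fetch reports the total
// count, and each output batch gets its row count set in one step.
ScanStats CountBulk(int64_t total_rows, int64_t batch_size) {
  assert(batch_size > 0 && total_rows >= 0);
  ScanStats stats;
  ++stats.fetch_calls;
  int64_t remaining = total_rows;
  while (remaining > 0) {
    ToyRowBatch batch(batch_size);
    batch.num_rows = std::min(remaining, batch.capacity);
    stats.rows_returned += batch.num_rows;
    remaining -= batch.num_rows;
  }
  return stats;
}
```

Under these assumptions, both functions return the same number of rows for
a given input, but CountBulk() performs one simulated fetch and no per-row
appends, while CountPerRow() performs one fetch per batch and one append
per row.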

Testing:
 - Ran query_test/test_ext_data_sources.py, which includes count
   queries for jdbc external tables.
 - Passed core-tests.

Change-Id: I9953dca949eb773022f1d6dcf48d8877857635d6
---
M be/src/exec/data-source-scan-node.cc
M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
2 files changed, 33 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/20653/4
--
To view, visit http://gerrit.cloudera.org:8080/20653
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9953dca949eb773022f1d6dcf48d8877857635d6
Gerrit-Change-Number: 20653
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
