Wenzhe Zhou created IMPALA-12377:
------------------------------------
Summary: Improve 'select count(*)' for external data source
Key: IMPALA-12377
URL: https://issues.apache.org/jira/browse/IMPALA-12377
Project: IMPALA
Issue Type: Sub-task
Components: Backend, Frontend
Reporter: Wenzhe Zhou
The code that handles 'select count(*)' in the backend function
DataSourceScanNode::GetNext() is not efficient. Even when the external data
source returns no column data, it still tries to materialize rows and adds them
to the RowBatch one by one, up to the row count. It also calls
GetNextInputBatch() multiple times (count / batch_size), and each
GetNextInputBatch() call invokes a JNI function.
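
To illustrate the cost described above, here is a minimal, self-contained
sketch. It is not Impala code: FakeDataSource, ScanStats, CountRowByRow and
CountInBulk are hypothetical names, and the "bulk" path is only one possible
direction for an improvement, assuming the data source can report the whole
count in a single call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an external data source (not Impala's real class).
struct FakeDataSource {
  int64_t total_rows;       // rows the source reports for count(*)
  int64_t rows_served = 0;  // rows handed out so far
  int64_t jni_calls = 0;    // simulated JNI round trips

  explicit FakeDataSource(int64_t n) : total_rows(n) {}

  // Models one GetNextInputBatch()-style call: returns up to batch_size rows.
  int64_t NextBatch(int64_t batch_size) {
    assert(batch_size > 0);
    ++jni_calls;
    int64_t n = std::min(batch_size, total_rows - rows_served);
    rows_served += n;
    return n;
  }

  // Models a single call that returns the entire remaining count (assumption).
  int64_t AllRemaining() {
    ++jni_calls;
    int64_t n = total_rows - rows_served;
    rows_served = total_rows;
    return n;
  }
};

struct ScanStats {
  int64_t rows_counted = 0;
  int64_t row_adds = 0;   // per-row "add to RowBatch" operations
  int64_t jni_calls = 0;  // simulated JNI round trips used
};

// Behavior as described in this issue: fetch batch by batch and add the
// (empty) rows one at a time.
ScanStats CountRowByRow(FakeDataSource src, int64_t batch_size) {
  ScanStats s;
  while (src.rows_served < src.total_rows) {
    int64_t n = src.NextBatch(batch_size);
    for (int64_t i = 0; i < n; ++i) {
      ++s.row_adds;
      ++s.rows_counted;
    }
  }
  s.jni_calls = src.jni_calls;
  return s;
}

// Possible improvement (assumption): one call, and the row count is set
// directly without materializing any rows.
ScanStats CountInBulk(FakeDataSource src) {
  ScanStats s;
  s.rows_counted = src.AllRemaining();
  s.jni_calls = src.jni_calls;
  return s;
}
```

In this toy model, counting 10000 rows with a batch size of 1024 takes 10
simulated JNI calls and 10000 per-row adds on the row-by-row path, versus 1
call and no per-row adds on the bulk path.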