Wenzhe Zhou created IMPALA-12377:
------------------------------------

             Summary: Improve 'select count(*)' for external data source
                 Key: IMPALA-12377
                 URL: https://issues.apache.org/jira/browse/IMPALA-12377
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend, Frontend
            Reporter: Wenzhe Zhou


The code that handles 'select count(*)' in the backend function 
DataSourceScanNode::GetNext() is not efficient. Even when no column data is 
returned from the external data source, it still tries to materialize rows and 
adds them to the RowBatch one by one, up to the row count. It also calls 
GetNextInputBatch() multiple times (count / batch_size), and each 
GetNextInputBatch() call invokes a JNI function.
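
To illustrate the per-row cost described above, here is a minimal sketch that 
contrasts adding counted rows one at a time with adding them in bulk. It is 
not Impala code: SketchBatch, FillPerRow and FillBulk are hypothetical names 
invented for this example, and the bulk approach is only one possible 
direction, not necessarily the fix chosen for this issue.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a row batch. This is NOT Impala's RowBatch class;
// it only tracks how many rows the batch currently holds.
struct SketchBatch {
  int capacity;
  int num_rows = 0;
  explicit SketchBatch(int cap) : capacity(cap) { assert(cap > 0); }
  bool AtCapacity() const { return num_rows >= capacity; }
};

// Per-row pattern: one loop iteration for every counted row, even though
// there is no column data to materialize. Returns the number of iterations.
int64_t FillPerRow(SketchBatch* batch, int64_t* remaining) {
  int64_t steps = 0;
  while (*remaining > 0 && !batch->AtCapacity()) {
    ++batch->num_rows;
    --*remaining;
    ++steps;
  }
  return steps;
}

// Bulk pattern (an assumption about a possible improvement): add as many
// rows as fit in a single step. Returns the number of steps taken (0 or 1).
int64_t FillBulk(SketchBatch* batch, int64_t* remaining) {
  int64_t free_slots = static_cast<int64_t>(batch->capacity) - batch->num_rows;
  int64_t n = std::min(*remaining, free_slots);
  if (n <= 0) return 0;
  batch->num_rows += static_cast<int>(n);
  *remaining -= n;
  return 1;
}
```

Both functions leave the batch and the remaining count in the same state; 
only the amount of work per batch differs. The repeated JNI round trips 
mentioned above are a separate cost that this sketch does not model.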



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
