Wenzhe Zhou created IMPALA-12376:
------------------------------------
Summary: DataSourceScanNode drops some returned rows if
FLAGS_data_source_batch_size is greater than the default value
Key: IMPALA-12376
URL: https://issues.apache.org/jira/browse/IMPALA-12376
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Reporter: Wenzhe Zhou
Assignee: Wenzhe Zhou
The backend DataSourceScanNode (be/src/exec/data-source-scan-node.cc) does not
handle eos properly in DataSourceScanNode::GetNext(). Rows returned from the
external data source can be dropped if FLAGS_data_source_batch_size is set to a
value greater than the default of 1024.
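In the following code:
    if (row_batch->AtCapacity() || input_batch_->eos || ReachedLimit()) {
      *eos = input_batch_->eos || ReachedLimit();
the branch can be entered only because row_batch->AtCapacity() returns true.
If input_batch_->eos is already set at that point, *eos becomes true even
though some rows in the input batch have not been processed yet, and those
rows are never returned.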
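
The effect can be reproduced with a small standalone model. This is only a
sketch, not Impala code: ToyScanNode, CountReturnedRows and the strict_eos
switch are hypothetical names, the row limit (ReachedLimit()) is left out, and
the output batch capacity of 1024 used in the note below is an assumption. The
stricter condition shown here is one possible fix, not necessarily the one
that was committed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy stand-in for a scan node: a single input batch from the external
// source is drained into fixed-capacity output batches.
struct ToyScanNode {
  std::size_t input_rows;  // rows in the current input batch
  bool input_eos;          // source reported that no further batches follow
  std::size_t next_row;    // index of the next unprocessed input row

  // Moves up to `capacity` rows to the output and returns how many were moved.
  // strict_eos == false mirrors the reported condition (eos follows the input
  // batch's eos flag); strict_eos == true additionally requires that the
  // input batch has been fully consumed.
  std::size_t GetNext(std::size_t capacity, bool strict_eos, bool* eos) {
    std::size_t n = std::min(capacity, input_rows - next_row);
    next_row += n;
    bool input_drained = (next_row == input_rows);
    // We stop either because the output is full or the input is empty.
    assert(n == capacity || input_drained);
    *eos = strict_eos ? (input_drained && input_eos) : input_eos;
    return n;
  }
};

// Calls GetNext() until eos and returns the total number of rows delivered.
std::size_t CountReturnedRows(std::size_t input_rows, std::size_t capacity,
                              bool strict_eos) {
  ToyScanNode node{input_rows, /*input_eos=*/true, /*next_row=*/0};
  std::size_t total = 0;
  bool eos = false;
  while (!eos) total += node.GetNext(capacity, strict_eos, &eos);
  return total;
}
```

In this model, CountReturnedRows(2000, 1024, false) delivers only 1024 rows
and silently loses the remaining 976, while CountReturnedRows(2000, 1024, true)
delivers all 2000. With an input batch of 1024 rows or fewer, both variants
return every row, which matches the observation that the problem only shows up
when the batch size exceeds the default.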