Alex Behm has uploaded a new patch set (#5).

Change subject: IMPALA-3905: Add HdfsScanner::GetNext() interface and implementation for Parquet.
......................................................................

IMPALA-3905: Add HdfsScanner::GetNext() interface and implementation for Parquet.

This is a first step towards making our scan node single-threaded, since
we are moving to an execution model where multi-threading is done at the
fragment level.

This patch adds a new synchronous HdfsScanner::GetNext() interface and
implements it for the Parquet scanner. Asynchronous execution via
HdfsScanner::ProcessSplit() is still supported; it is now implemented by
repeatedly calling GetNext() so that both paths share the same code.

I have not yet added a single-threaded scan node that uses GetNext(). The
new code will be exercised by the existing scan node and tests.

Testing: I ran an exhaustive private build, which passed. I also ran a
microbenchmark on a big TPCH lineitem table and saw no significant
difference in scan performance.

Change-Id: Iab50770bac05afcda4d3404fb4f53a0104931eb0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M be/src/exec/parquet-column-readers.h
18 files changed, 506 insertions(+), 365 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/3801/5
--
To view, visit http://gerrit.cloudera.org:8080/3801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iab50770bac05afcda4d3404fb4f53a0104931eb0
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
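
Illustrative note: the commit message says that ProcessSplit() is implemented by repeatedly calling GetNext(). The sketch below shows that general pattern only. It is not the code from this patch: Status, RowBatch, SketchScanner, the sink callback, and both method signatures are hypothetical stand-ins chosen so the example is self-contained, and the real HdfsScanner interface may differ in every detail.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; not Impala's Status class.
struct Status {
  bool ok = true;
  std::string msg;
  static Status OK() { return Status{}; }
};

// Hypothetical stand-in for a batch of rows; rows are plain ints here.
struct RowBatch {
  std::vector<int> rows;
};

// Hypothetical scanner over an in-memory "split" of ints.
class SketchScanner {
 public:
  SketchScanner(std::vector<int> data, std::size_t batch_size)
      : data_(std::move(data)),
        // Guard against a zero batch size, which would never make progress.
        batch_size_(batch_size == 0 ? 1 : batch_size) {}

  // Synchronous interface: fills 'batch' with up to batch_size_ rows and
  // sets *eos to true once the split has been fully consumed.
  Status GetNext(RowBatch* batch, bool* eos) {
    batch->rows.clear();
    while (pos_ < data_.size() && batch->rows.size() < batch_size_) {
      batch->rows.push_back(data_[pos_++]);
    }
    *eos = pos_ >= data_.size();
    return Status::OK();
  }

  // Driver for the older "process the whole split" style: it loops over
  // GetNext() and hands each non-empty batch to 'sink', so both entry
  // points share the same row-producing code.
  template <typename Sink>
  Status ProcessSplit(Sink&& sink) {
    bool eos = false;
    while (!eos) {
      RowBatch batch;
      Status status = GetNext(&batch, &eos);
      if (!status.ok) return status;
      if (!batch.rows.empty()) sink(std::move(batch));
    }
    return Status::OK();
  }

 private:
  std::vector<int> data_;
  std::size_t batch_size_;
  std::size_t pos_ = 0;
};
```

In this sketch a caller that wants the synchronous model calls GetNext() directly, while a caller that wants the older model passes a callback to ProcessSplit(), for example one that enqueues each batch for a consumer thread.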
