Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-3905: Add HdfsScanner::GetNext() interface and 
implementation for Parquet.
......................................................................

IMPALA-3905: Add HdfsScanner::GetNext() interface and implementation for 
Parquet.

This is a first step towards making our scan node single-threaded, since we are
moving to an execution model where multi-threading is done at the fragment
level.

This patch adds a new synchronous HdfsScanner::GetNext() interface and
implements it for the Parquet scanner. The async execution via
HdfsScanner::ProcessSplit() is still supported and is implemented by repeatedly
calling GetNext() for code sharing purposes.
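
As a rough illustration of that structure only (this is not the code in this
patch), the relationship between the two entry points could be sketched as
below. ScannerSketch, Status, RowBatch and every signature here are
hypothetical stand-ins chosen for the example; the real Impala classes and
signatures differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not Impala's Status or RowBatch.
struct Status {
  bool ok = true;
  std::string msg;
  static Status OK() { return Status{}; }
};

struct RowBatch {
  std::vector<int> rows;
};

// Toy scanner over an in-memory vector. It only demonstrates the control flow
// "ProcessSplit() is a loop over GetNext()"; it does no real I/O or decoding.
class ScannerSketch {
 public:
  ScannerSketch(std::vector<int> data, std::size_t batch_size)
      : data_(std::move(data)), batch_size_(batch_size) {
    assert(batch_size_ > 0);
  }

  // Synchronous interface: fills at most one batch per call and sets *eos
  // once the input is exhausted.
  Status GetNext(RowBatch* batch, bool* eos) {
    assert(batch != nullptr && eos != nullptr);
    batch->rows.clear();
    while (pos_ < data_.size() && batch->rows.size() < batch_size_) {
      batch->rows.push_back(data_[pos_++]);
    }
    *eos = pos_ >= data_.size();
    return Status::OK();
  }

  // Whole-split entry point, expressed as repeated GetNext() calls so that
  // both paths share the same scanning logic. Non-empty batches are appended
  // to *out (a stand-in for handing batches to the scan node).
  Status ProcessSplit(std::vector<RowBatch>* out) {
    assert(out != nullptr);
    bool eos = false;
    while (!eos) {
      RowBatch batch;
      Status status = GetNext(&batch, &eos);
      if (!status.ok) return status;
      if (!batch.rows.empty()) out->push_back(std::move(batch));
    }
    return Status::OK();
  }

 private:
  std::vector<int> data_;
  std::size_t batch_size_;
  std::size_t pos_ = 0;
};
```

With this shape, a future single-threaded scan node could call GetNext()
directly, while the existing multi-threaded path keeps using ProcessSplit().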

I have not yet added a single-threaded scan node that uses GetNext().
The new code will be exercised by the existing scan node and tests.

Testing: I locally ran the scanner tests and TPCDS tests on core.
I am still in the process of validating the performance of the existing
multi-threaded scan node.

Change-Id: Iab50770bac05afcda4d3404fb4f53a0104931eb0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/ImpalaInternalService.thrift
10 files changed, 359 insertions(+), 259 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/32/3732/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3732
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iab50770bac05afcda4d3404fb4f53a0104931eb0
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
