Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/24190

to look at the new patch set (#21).

Change subject: IMPALA-14882: part1: Convert arrow record batch to impala tuple batch
......................................................................

IMPALA-14882: part1: Convert arrow record batch to impala tuple batch

Added an efficient conversion from an Arrow record batch to a batch of
Impala tuples.
Previously, the Paimon JNI reader converted Arrow arrays row by row. The
new conversion works column by column: for each column/field/slot, it
writes all elements of the Arrow array into the corresponding slots of
the Impala tuples.
The fixed-size parts of the tuples have to be pre-allocated in
contiguous memory so that columnar writes are efficient (i.e. a tuple's
address can be calculated as base_ptr + index * tuple_size).
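
As an illustration of the addressing scheme above, here is a minimal
sketch of a columnar write into contiguous fixed-size tuples. It is not
the code from arrow-converter.cc: Int32Column, SlotLayout, WriteColumn
and ReadInt32 are hypothetical names, the column is a plain vector
instead of a real arrow array, and the convention that a set null bit
means NULL is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one arrow array of int32 values plus validity.
struct Int32Column {
  std::vector<int32_t> values;
  std::vector<bool> valid;  // true = value present, false = NULL
};

// Hypothetical description of where one slot lives inside a tuple.
struct SlotLayout {
  size_t value_offset;      // byte offset of the value within the tuple
  size_t null_byte_offset;  // byte offset of the null-indicator byte
  uint8_t null_bit_mask;    // bit to set when the value is NULL (assumed)
};

// Writes every element of one column into the matching slot of each tuple.
// Tuples are contiguous, so tuple i starts at base_ptr + i * tuple_size.
void WriteColumn(const Int32Column& col, const SlotLayout& slot,
    uint8_t* base_ptr, size_t tuple_size) {
  assert(col.values.size() == col.valid.size());
  for (size_t i = 0; i < col.values.size(); ++i) {
    uint8_t* tuple = base_ptr + i * tuple_size;
    if (col.valid[i]) {
      std::memcpy(tuple + slot.value_offset, &col.values[i], sizeof(int32_t));
    } else {
      tuple[slot.null_byte_offset] |= slot.null_bit_mask;
    }
  }
}

// Reads back the int32 slot of the given row, for inspection.
int32_t ReadInt32(const uint8_t* base_ptr, size_t tuple_size, size_t row,
    size_t value_offset) {
  int32_t v;
  std::memcpy(&v, base_ptr + row * tuple_size + value_offset, sizeof(v));
  return v;
}
```

The caller is expected to zero-initialize a buffer of
num_rows * tuple_size bytes and call WriteColumn once per column, so
the inner loop touches one arrow array at a time with a fixed stride.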

The Paimon JNI scan node was modified to first convert the Arrow record
batch into a scratch batch and then transfer the rows that pass
filtering to the output batch, similar to how file readers work (e.g.
HdfsParquetScanner).
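
The two-step flow above can be sketched roughly as follows. This is a
simplified illustration, not the scan node's code: SelectTuples is a
hypothetical helper, the predicate stands in for whatever filtering the
scan node applies, and representing the output batch as a vector of
pointers into the scratch memory is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Walks the contiguous scratch tuples and appends a pointer to every tuple
// that passes the predicate. Returns the number of surviving rows.
size_t SelectTuples(const uint8_t* scratch, size_t num_rows, size_t tuple_size,
    const std::function<bool(const uint8_t*)>& keep,
    std::vector<const uint8_t*>* out_rows) {
  assert(out_rows != nullptr);
  size_t kept = 0;
  for (size_t i = 0; i < num_rows; ++i) {
    const uint8_t* tuple = scratch + i * tuple_size;
    if (!keep(tuple)) continue;  // row filtered out, skip it
    out_rows->push_back(tuple);
    ++kept;
  }
  return kept;
}
```

Because the output rows only point into the scratch memory in this
sketch, the scratch buffer has to stay alive for as long as the output
rows are in use.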

This patch does not handle nested types (array/list, struct, map).

Testing:
-added arrow-converter-test.cc with tests for converting different types
-ran paimon tests locally

Measurement:
-added arrow-converter-benchmark for converting different types

Change-Id: Iea544b3c71d9211c893f0fec3527ebe84155ebcd
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/arrow-converter-benchmark.cc
M be/src/exec/CMakeLists.txt
A be/src/exec/arrow-converter-test.cc
A be/src/exec/arrow-converter.cc
A be/src/exec/arrow-converter.h
M be/src/exec/paimon/CMakeLists.txt
D be/src/exec/paimon/paimon-jni-row-reader.cc
D be/src/exec/paimon/paimon-jni-row-reader.h
M be/src/exec/paimon/paimon-jni-scan-node.cc
M be/src/exec/paimon/paimon-jni-scan-node.h
M be/src/exec/scratch-tuple-batch.h
12 files changed, 1,129 insertions(+), 515 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/24190/21
--
To view, visit http://gerrit.cloudera.org:8080/24190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea544b3c71d9211c893f0fec3527ebe84155ebcd
Gerrit-Change-Number: 24190
Gerrit-PatchSet: 21
Gerrit-Owner: Balazs Hevele <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
