Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21452


Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
......................................................................

IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

With this patch IcebergDeleteBuilder checks how many probe threads
are actually blocked on the builder. Let's assume the following plan:

    UNION ALL
   /         \
  /           \
 /             \
SCAN all    ANTI JOIN
datafiles  /         \
without   /           \
deletes  SCAN         SCAN
         datafiles    deletes
         with deletes

In that case UNION ALL, and the two "SCAN datafiles" operators are in
the same fragment, while the builder of the ANTI JOIN is in a different
fragment. This means that "SCAN datafiles without deletes" can run in
parallel with the builder. But once that SCAN is exhausted, the UNION
ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN
operator, but that operator depends on the join builder output.

This means in some cases the SCAN fragments are busy, while in other
cases the SCAN fragments are blocked. It depends on how much work
they need to do, and how much work the build-side needs to do. So to
handle all cases, we dynamically check how many build fragments are
blocked on the builder, then spin up as many threads to parellelize
the final sort.

The also works well when we have the following plan:

    ANTI JOIN
   /         \
  /           \
 SCAN         SCAN
 datafiles    deletes
 with deletes

The above plan is created when all data files have corresponding
deletes, or when we are running a simple count(*) query. In that
case all "SCAN datafiles" fragments are blocked on the builder,
so we can use that many threads to sort the build results.

A new field "ThreadCountInFinalBuild" was added, so we can check the
query profile about how many threads were used for the final
sorting in the builders.

Measurements:
In a table with 1 Trillion data records and 68.5 Billion delete records
it lowered "IcebergDeletePositionSortTimer" from ~1 minute to
8-10 seconds, in an environment with 40 executors and MT_DOP=12.

TODO:
 * e2e tests that check counter "ThreadCountInFinalBuild"

Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
4 files changed, 102 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/1
--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>

Reply via email to