[ https://issues.apache.org/jira/browse/IMPALA-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989357#comment-16989357 ]
ASF subversion and git services commented on IMPALA-9126:
---------------------------------------------------------
Commit 17e534e3164a88c4f1da85b39e8245d1ef079bd6 in impala's branch
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=17e534e ]
IMPALA-9126: part 4: hash join builder manages spilling
This is the final patch for IMPALA-9126.
This will allow the many:1 relationship of probe:build
partitions that we need for the shared join build.
Key changes:
* Builder picks the next spilled partition to process.
* Partitions are identified by a unique ID so that they can be
decoupled between build and probe.
* unique_ptr is used to manage build partitions. This
helps document the lifecycle of the partitions better,
particularly when they are handed off to
PartitionedHashJoinNode.
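The ownership handoff described in the key changes above could look roughly
like the sketch below. This is only an illustration, not the code from the
patch: BuildPartition, Builder, ProbeSide, AddPartition,
TakeNextSpilledPartition and AcceptBuildPartition are hypothetical names
made up for the example, and the "first spilled partition wins" selection
policy is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a build partition; not Impala's actual class.
struct BuildPartition {
  int id;           // unique ID that both build and probe sides can refer to
  bool is_spilled;  // true if the partition's rows were spilled
};

// Hypothetical builder that owns the build partitions via unique_ptr.
class Builder {
 public:
  // Creates a partition, assigns it the next unique ID, and returns that ID.
  int AddPartition(bool is_spilled) {
    int id = next_id_++;
    partitions_.push_back(
        std::make_unique<BuildPartition>(BuildPartition{id, is_spilled}));
    return id;
  }

  // Picks the next spilled partition (assumed policy: first one found) and
  // transfers ownership to the caller. Returns nullptr if none remain.
  std::unique_ptr<BuildPartition> TakeNextSpilledPartition() {
    auto it = std::find_if(
        partitions_.begin(), partitions_.end(),
        [](const std::unique_ptr<BuildPartition>& p) { return p->is_spilled; });
    if (it == partitions_.end()) return nullptr;
    std::unique_ptr<BuildPartition> result = std::move(*it);
    partitions_.erase(it);
    return result;
  }

  std::size_t num_partitions() const { return partitions_.size(); }

 private:
  int next_id_ = 0;
  std::vector<std::unique_ptr<BuildPartition>> partitions_;
};

// Hypothetical probe side that takes over ownership of one build partition.
class ProbeSide {
 public:
  void AcceptBuildPartition(std::unique_ptr<BuildPartition> partition) {
    assert(partition != nullptr);
    current_ = std::move(partition);
  }

  // ID of the build partition currently being probed, or -1 if none.
  int current_build_id() const { return current_ ? current_->id : -1; }

 private:
  std::unique_ptr<BuildPartition> current_;
};
```

Because the handoff goes through std::move on a unique_ptr, the type system
records which side owns a partition at any point, and the probe side only
needs the partition ID to match its own data to the build partition.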
Testing:
* Ran exhaustive tests.
* Ran a single node TPC-H and TPC-DS stress test with 1000 queries.
Perf:
Ran a single node TPC-H 30 test against master from
before IMPALA-9126 changes. No significant perf
change.
Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Reviewed-on: http://gerrit.cloudera.org:8080/14790
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Cleanly separate build and probe state in hash join node
> --------------------------------------------------------
>
> Key: IMPALA-9126
> URL: https://issues.apache.org/jira/browse/IMPALA-9126
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: multithreading
>
> As a precursor to IMPALA-4224, we should clean up the hash join
> implementation so that the build and probe state is better separated. The
> builder should not deal with probe side data structures (like the probe
> streams that it allocates) and all accesses to the build-side data structures
> should go through as narrow APIs as possible.
> The nested loop join is already pretty clean.