[
https://issues.apache.org/jira/browse/IMPALA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-4224.
-----------------------------------
Fix Version/s: Impala 3.4.0
Resolution: Fixed
> Add backend support for join build sinks in parallel plans
> ----------------------------------------------------------
>
> Key: IMPALA-4224
> URL: https://issues.apache.org/jira/browse/IMPALA-4224
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.8.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: multithreading
> Fix For: Impala 3.4.0
>
>
> Now that IMPALA-3567 is solved, the next step is to add the plumbing to have
> a join builder as the sink of a plan fragment to implement the parallel plans
> added in http://gerrit.cloudera.org:8080/2846
> This JIRA tracks making the plans executable, without sharing of the join
> build for broadcast join.
> Steps required:
> * Enable the join build sink in the planner
> * Update planner to include all required state in the thrift objects (the
> join build sinks are missing various required info).
> * Update planner resource requirement calculations - join build fragment
> needs real resource estimates
> * Update scheduler to schedule join build fragments co-located with their
> parent fragments. This depends on the build plans being sent pre-order. Pass
> the source fragment instance id into the join nodes so they can locate the
> input fragment instance.
> * Update scheduler to correctly handle multiple build plans.
> * Instantiate the join builders as input sinks to the plan. This requires
> getting some data from the thrift structs instead of having it passed in
> from the PHJNode.
> * Ensure the join builders function correctly as plan sinks (e.g. add an
> indefinite wait to the join node to prevent it from crashing, ensure that the
> builder consumes the whole input). Initially we probably want to have the
> build thread block in Close().
> * Update the join node so that in the non-subplan mt_dop > 0 case, it looks
> up the input fragment instance and waits for it to finish the build (with
> cancellation). Need to find all the places it looks for the right child.
> * After that the join node "owns" the builder so the control flow should be
> mostly the same. The main difference is that the buffer pool client and
> memory tracking are set up differently. Maybe need to change the Close() call
> as well?
> * Figure out any resource management, etc, issues across the build and probe
> (threads, memory, etc). Fix up the builder thread behaviour so that Close()
> doesn't block and the thread is released.
> This, I think, needs to be one change because the intermediate states aren't
> testable or functional.
> Testing:
> * Existing mt join tests are useful and will exercise the new behaviour
> * Ensure spilling is tested with multithreading (new dimension to spilling
> tests?)
> * Ensure cancellation is tested.
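
For illustration only, the "wait for the build to finish (with cancellation)" step described above could look roughly like the sketch below. BuildHandoff and its methods are hypothetical names made up for this example, not Impala classes; the sketch only assumes one builder thread and one join node thread sharing a standard mutex and condition variable.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper, not part of Impala: lets a join node block until a
// separately running join build sink has finished, while still waking up
// if the query is cancelled.
class BuildHandoff {
 public:
  // Called by the build side once it has consumed its whole input.
  void MarkBuildDone() {
    {
      std::lock_guard<std::mutex> l(lock_);
      build_done_ = true;
    }
    cv_.notify_all();
  }

  // Called when the query is cancelled, to wake up any waiting join node.
  void Cancel() {
    {
      std::lock_guard<std::mutex> l(lock_);
      cancelled_ = true;
    }
    cv_.notify_all();
  }

  // Called by the join node. Blocks until the build finishes or the query
  // is cancelled. Returns true only if the build finished and the query
  // was not cancelled.
  bool WaitForBuild() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return build_done_ || cancelled_; });
    return build_done_ && !cancelled_;
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  bool build_done_ = false;
  bool cancelled_ = false;
};
```

In this sketch the join node would call WaitForBuild() before it starts probing and would skip the probe phase when the call returns false. The predicate form of wait() is used so that spurious wakeups, or a notification that arrives before the wait begins, do not cause a missed or early return.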