[ 
https://issues.apache.org/jira/browse/IMPALA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-4224.
-----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

> Add backend support for join build sinks in parallel plans
> ----------------------------------------------------------
>
>                 Key: IMPALA-4224
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4224
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.8.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: multithreading
>             Fix For: Impala 3.4.0
>
>
> Now that IMPALA-3567 is solved, the next step is to add the plumbing to have 
> a join builder as the sink of a plan fragment to implement the parallel plans 
> added in http://gerrit.cloudera.org:8080/2846
> This JIRA tracks making the plans executable, without sharing of the join 
> build for broadcast join.
> Steps required:
> * Enable the join build sink in the planner
> * Update planner to include all required state in the thrift objects (the 
> join build sinks are missing various required info).
> * Update planner resource requirement calculations - join build fragment 
> needs real resource estimates
> * Update scheduler to schedule join build fragments co-located with their 
> parent fragments. This depends on the build plans being sent in pre-order. 
> Pass the source fragment instance id into the join nodes so they can locate 
> the input fragment instance.
> * Update scheduler to correctly handle multiple build plans.
> * Instantiate the join builders as input sinks to the plan. This requires 
> getting some data from the thrift structs instead of having it passed in 
> from the PHJNode.
> * Ensure the join builders function correctly as plan sinks (e.g. add an 
> indefinite wait to the join node to prevent it from crashing, ensure that the 
> builder consumes the whole input). Initially we probably want to have the 
> build thread block in Close().
> * Update the join node so that in the non-subplan mt_dop > 0 case, it looks 
> up the input fragment instance and waits for it to finish the build (with 
> cancellation). Need to find all the places it looks for the right child.
> * After that the join node "owns" the builder, so the control flow should be 
> mostly the same. The main difference is that the buffer pool client and 
> memory tracking are set up differently. Maybe need to change the Close() call 
> as well?
> * Figure out any resource management, etc, issues across the build and probe 
> (threads, memory, etc). Fix up the builder thread behaviour so that Close() 
> doesn't block and the thread is released.
> This, I think, needs to be one change because the intermediate states aren't 
> testable or functional.
> Testing:
> * Existing mt join tests are useful and will exercise the new behaviour
> * Ensure spilling is tested with multithreading (new dimension to spilling 
> tests?)
> * Ensure cancellation is tested.
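
The step where the join node waits for its input fragment instance to finish the build, with cancellation, could look roughly like the sketch below. This is only an illustration under stated assumptions: BuildSinkHandle, SignalBuildDone(), Cancel() and WaitForBuild() are hypothetical names invented here, not classes or methods from the Impala backend, and the sketch uses plain standard-library synchronization rather than whatever the actual change uses.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state shared between a join build sink and the
// join node that consumes its output. None of these names come from Impala.
class BuildSinkHandle {
 public:
  enum class WaitResult { kBuildDone, kCancelled };

  // Called by the build fragment instance once it has consumed its whole input.
  void SignalBuildDone() {
    {
      std::lock_guard<std::mutex> l(lock_);
      build_done_ = true;
    }
    cv_.notify_all();
  }

  // Called when the query is cancelled; wakes up any waiting join node.
  void Cancel() {
    {
      std::lock_guard<std::mutex> l(lock_);
      cancelled_ = true;
    }
    cv_.notify_all();
  }

  // Called by the join node. Blocks until the build has finished or the query
  // has been cancelled. Cancellation takes priority if both have happened.
  WaitResult WaitForBuild() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return build_done_ || cancelled_; });
    return cancelled_ ? WaitResult::kCancelled : WaitResult::kBuildDone;
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  bool build_done_ = false;   // Guarded by lock_.
  bool cancelled_ = false;    // Guarded by lock_.
};
```

In this sketch the join node would call WaitForBuild() before probing and return early on kCancelled, so the wait is indefinite but still interruptible.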



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
