[ 
https://issues.apache.org/jira/browse/IMPALA-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206846#comment-17206846
 ] 

ASF subversion and git services commented on IMPALA-9711:
---------------------------------------------------------

Commit e60292fb3bd71f25b90119d0d48292f4c49e158f in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e60292f ]

IMPALA-9711: incrementally update aggregate profile

To avoid additional work in the default mode, we still compute
the average only once per instance, either when the instance
completes or when the query finishes.

When --gen_experimental_profile=true, we update the aggregated
profile for each status report, so that the live profile
can be viewed as the query executes.
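
The difference between the two modes can be illustrated with a minimal
sketch. This is not Impala's code: AveragedCounter, on_status_report and
the "incremental" field are hypothetical names, with "incremental" standing
in for --gen_experimental_profile=true. The sketch also omits the
"when the query finishes" path of the default mode. The key point is that
an incremental update has to replace an instance's previous contribution
instead of adding to it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AveragedCounter:
    """Hypothetical stand-in for one counter in an aggregated profile."""

    incremental: bool  # models --gen_experimental_profile=true (assumption)
    _latest: Dict[str, int] = field(default_factory=dict)  # instance id -> last folded value
    _total: int = 0

    def on_status_report(self, instance_id: str, value: int, done: bool) -> None:
        # Default mode: skip the work until the instance has completed.
        if not self.incremental and not done:
            return
        # Replace the instance's previous contribution rather than counting it twice.
        self._total += value - self._latest.get(instance_id, 0)
        self._latest[instance_id] = value

    def average(self) -> float:
        # Average over the instances folded in so far.
        return self._total / len(self._latest) if self._latest else 0.0


live = AveragedCounter(incremental=True)
lazy = AveragedCounter(incremental=False)
for counter in (live, lazy):
    counter.on_status_report("a", 10, done=False)
    counter.on_status_report("b", 30, done=False)
# At this point only `live` has a usable average; `lazy` waits for done=True.
```

With this shape, a reader of the live profile sees intermediate values in
incremental mode, and both modes agree once every instance has sent its
final report.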

The implications of this are as follows:
* More work is done on the KRPC control service RPC thread
  (although this is largely moot after part 2 of IMPALA-9382,
  where we merge into the aggregated profile directly and so
  avoid the extra update).
* For complex multi-stage queries, the profile merging
  work is done earlier, as each stage completes, so
  the critical path of the query is shortened.
* Multiple RPC threads may be merging profiles concurrently.
* Multiple threads may be calling AggregatedRuntimeProfile::Update()
  on the same profile, whereas previously all merging was done by
  a single thread. I looked through the locking in that function to
  check correctness.
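
The concurrency concern in the last two points can be sketched as follows.
This is an illustration under stated assumptions, not the locking used by
AggregatedRuntimeProfile::Update(): AggregatedCounter and simulate_reports
are hypothetical names, and a single coarse lock is the simplest scheme
that keeps the read-modify-write step atomic when several threads update
the same aggregate.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AggregatedCounter:
    """Hypothetical aggregate guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest = {}  # instance id -> last reported value
        self._total = 0

    def update(self, instance_id: str, value: int) -> None:
        # Reading the old contribution and writing the new total must be
        # atomic with respect to other callers, so the whole step holds the lock.
        with self._lock:
            self._total += value - self._latest.get(instance_id, 0)
            self._latest[instance_id] = value

    def total(self) -> int:
        with self._lock:
            return self._total


def simulate_reports(num_instances: int, reports_per_instance: int) -> int:
    """Send status reports from a pool of threads, as RPC handlers might."""
    agg = AggregatedCounter()

    def report_all(instance: int) -> None:
        # Each instance reports a growing counter value: 1, 2, ..., N.
        for value in range(1, reports_per_instance + 1):
            agg.update(f"instance-{instance}", value)

    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(report_all, range(num_instances)))
    return agg.total()
```

Because each instance's last report carries the value reports_per_instance,
the final total should equal num_instances * reports_per_instance no matter
how the threads interleave.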

Testing:
Ran core tests.

Ran a subset of the Python tests under TSAN and confirmed that
no races were introduced in this code.

Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a
Reviewed-on: http://gerrit.cloudera.org:8080/15931
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Incrementally compute averaged profile
> --------------------------------------
>
>                 Key: IMPALA-9711
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9711
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> The parent showed that updating the averaged profile can take a bit of time 
> and executes on the RPC handler of the fetch RPC. We should consider how to 
> optimise this.
> I think the most impactful change, from the point of view of query latency, 
> is to apply the profile updates when each instance finishes instead of doing 
> the computation on the RPC handler.
> For IMPALA-9382, we also need the averaged profile to be kept up-to-date when 
> the profile is accessed, because it will be the only part of the profile that 
> is exposed to clients.



