[ 
https://issues.apache.org/jira/browse/IMPALA-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697467#comment-16697467
 ] 

ASF subversion and git services commented on IMPALA-4063:
---------------------------------------------------------

Commit d4019be2a259b6f5bc14da3261fac36e07f771ad in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d4019be ]

IMPALA-7852: Fix some flakiness in test_hash_join_timer.py

test_hash_join_timer.py aims to verify the timers in the
join nodes are functioning correctly. It does so by parsing
the query profile for certain patterns after the query has
finished.

Before IMPALA-4063, each individual fragment instance will
post its profile to the coordinator upon completion. After
IMPALA-4063, the profiles of all fragment instances are sent
together periodically and the final profile is sent once all
fragment instances on a backend are done.

The problem with the existing implementation of the test is
that it doesn't actually fetch results before closing the
query. As a result of it, the coordinator fragment never gets
a chance to complete as it will block forever when inserting
into the plan root sink. The lack of completion of the
coordinator fragment causes the final profiles of fragment
instances on the coordinator to be not sent before the query
is closed. As a result, the profile of a fragment instance
on the coordinator could be stale if it completes between
two periodic updates, leading to random test failure.

This change fixes the flankiness by always fetching results
before closing the query. Ideally, if we fix IMPALA-539 and
wait for all backends' final profiles before completing query
unregistration, we should get the final profile from the
coordinator fragment too.

Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Reviewed-on: http://gerrit.cloudera.org:8080/11964
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Make fragment instance reports per-query (or per-host) instead of 
> per-fragment instance.
> ----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-4063
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4063
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.7.0
>            Reporter: Sailesh Mukil
>            Assignee: Michael Ho
>            Priority: Major
>              Labels: impala-scalability-sprint-08-13-2018, performance
>             Fix For: Impala 3.2.0
>
>
> Currently we send a report per-fragment instance to the coordinator every 5 
> seconds (by default; modifiable via query option 'status_report_interval').
> For queries with a large number of fragment instances, this generates 
> tremendous amounts  of network traffic to the coordinator, which will only be 
> aggravated with higher a DOP.
> We should instead queue per-fragment instance reports and send out a 
> per-query report to the coordinator instead.
> For code references, see:
> PlanFragmentExecutor:: ReportProfile()
> PlanFragmentExecutor:: SendReport()
> FragmentExecState:: ReportStatusCb()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to