[ 
https://issues.apache.org/jira/browse/DRILL-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862152#comment-15862152
 ] 

Paul Rogers commented on DRILL-5195:
------------------------------------

I like the idea. However, the term "busy" is probably not exactly right. 
Because of the Volcano-style structure of Drill, each fragment will be, in 
aggregate, up to 100% "busy." But each operator will be "busy" for only some 
slice of that percentage. (Operators run sequentially, not in parallel.)

I've found it useful to display the percent of time an operator takes within 
its fragment. Maybe:

* Screen: 0%
* Selection vector remover: 5%
* Sort: 70%
* Scanner: 25%

That is, all operators together sum to 100%. One cannot make the SVR, say, any 
more "busy" without making the others less "busy". So, perhaps a better name is 
"% CPU".
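
As a rough illustration of the calculation, here is a minimal Python sketch. 
The function name and the per-operator times (in milliseconds) are made up 
for the example and are not taken from an actual Drill profile; they are 
chosen only to reproduce the percentages listed above.

```python
def percent_of_fragment(op_times):
    """Return each operator's share of its fragment's total time, in percent."""
    total = sum(op_times.values())
    if total == 0:
        # Nothing ran (or nothing was measured): report 0% everywhere.
        return {name: 0.0 for name in op_times}
    return {name: 100.0 * t / total for name, t in op_times.items()}


# Hypothetical processing times in ms for the operators of one fragment.
fragment = {
    "Screen": 0,
    "Selection vector remover": 50,
    "Sort": 700,
    "Scanner": 250,
}

shares = percent_of_fragment(fragment)
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")
```

Because every share is divided by the same fragment total, the values always 
sum to 100%, which is why one operator's share cannot grow without another's 
shrinking.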

A question arises when the query has more than one fragment. In this case, the 
operator times within each fragment still sum to 100%, but the fragments 
themselves might account for, say, 40% and 60% of the query's CPU time. Would 
it then make sense to display the % CPU relative to the fragment or to the 
entire query? For debugging, % of fragment is most useful. That is, we want 
to reduce fragment run time and must do that per fragment; the length of time 
spent in other fragments has no impact on the performance of our target 
fragment.

For the customer, % of total query run time might be useful. Customers just 
want to know where time goes, regardless of our parallelization/serialization 
rules.
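
To make the two views concrete, here is a small Python sketch that computes 
both percentages side by side. The fragment names, operator times (in 
milliseconds) and the cpu_shares helper are hypothetical, picked so that the 
two fragments take 40% and 60% of the total; it also simply adds times across 
fragments, which ignores how fragments actually overlap at run time.

```python
def cpu_shares(fragments):
    """Return (fragment, operator, % of fragment, % of query) rows."""
    query_total = sum(sum(ops.values()) for ops in fragments.values())
    rows = []
    for frag, ops in fragments.items():
        frag_total = sum(ops.values())
        for op, t in ops.items():
            # Guard against empty measurements to avoid dividing by zero.
            pct_frag = 100.0 * t / frag_total if frag_total else 0.0
            pct_query = 100.0 * t / query_total if query_total else 0.0
            rows.append((frag, op, pct_frag, pct_query))
    return rows


# Hypothetical times in ms: fragment 0 totals 400, fragment 1 totals 600.
fragments = {
    "fragment 0": {"Screen": 0, "Selection vector remover": 40, "Sort": 360},
    "fragment 1": {"Selection vector remover": 150, "Scanner": 450},
}

rows = cpu_shares(fragments)
for frag, op, pct_frag, pct_query in rows:
    print(f"{frag} / {op}: {pct_frag:.0f}% of fragment, {pct_query:.0f}% of query")
```

With these numbers, Sort is 90% of its own fragment but only 36% of the whole 
query, which is the difference between the debugging view and the customer 
view.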

> Publish Operator and MajorFragment Stats in Profile page
> --------------------------------------------------------
>
>                 Key: DRILL-5195
>                 URL: https://issues.apache.org/jira/browse/DRILL-5195
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Web Server
>    Affects Versions: 1.9.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>
> Currently, we show runtimes for major fragments, and min,max,avg times for 
> setup, processing and waiting for various operators.
> It would be worthwhile to have additional stats for the following:
> MajorFragment
>   %Busy - % of the active time for all the minor fragments within each major 
> fragment that they were busy. 
> Operator Profile
>   %Busy - % of the active time for all the fragments within each operator 
> that they were busy. 
>   Records - Total number of records propagated out by that operator.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
