[
https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832702#comment-16832702
]
ASF GitHub Bot commented on DRILL-7222:
---------------------------------------
kkhatua commented on issue #1779: DRILL-7222: Visualize estimated and actual
row counts for a query
URL: https://github.com/apache/drill/pull/1779#issuecomment-489185185
@arina-ielchiieva
The motivation for this PR comes from the need for engineers to analyze
queries as plans change due to the introduction of statistics. An initial
thought was to add an additional column, but I think we already have a lot of
columns. I've tried to figure out which columns to trim, but almost all seem
relevant. I know we might come back to doing similar things with Resource
Management as well, where we'll again need to work on estimates vs. actuals.
So adding additional columns is not practical.
Showing the estimates based on whether a planning decision was made using
statistics is not possible unless the profile JSON itself carries some hint
that statistics were used.
Also, I added the toggle button to provide a mechanism to hide the estimates
by default (another reason not to add a column). I'm worried that
users will get the impression that there are issues with Drill because of
estimates being wildly off. Even if they are sufficiently accurate (like
NDV-based estimates vs. actuals), most users don't have insight into how the
stats are being used.
Users who have insight into such things can make use of the estimates to
tune parameters (e.g. broadcast or selectivity thresholds) to force changes in
plans that are sub-optimal. Based on this, I thought we should go with the
parentheses option for showing the estimated row counts.
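
As a rough illustration of the parentheses option combined with the toggle, a
cell could render as "actual (estimated)" only when estimates are switched on.
This is a minimal sketch, not the code in the PR: the function name
`format_row_count` and the `show_estimate` flag are hypothetical, and the real
change would live in the web UI templates rather than in Python.

```python
def format_row_count(actual, estimated=None, show_estimate=False):
    """Render a row-count cell as 'actual' or 'actual (estimated)'.

    Hypothetical helper: `show_estimate` stands in for the toggle button,
    which is assumed to be off by default so estimates stay hidden.
    """
    cell = f"{actual:,}"
    if show_estimate and estimated is not None:
        # Estimates may be fractional, so round before formatting.
        cell += f" ({round(estimated):,})"
    return cell


if __name__ == "__main__":
    print(format_row_count(1000, 1200.4))                      # toggle off
    print(format_row_count(1000, 1200.4, show_estimate=True))  # toggle on
```

With the toggle off, the overview looks the same as it does today; no extra
column is needed in either mode.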
> Visualize estimated and actual row counts for a query
> -----------------------------------------------------
>
> Key: DRILL-7222
> URL: https://issues.apache.org/jira/browse/DRILL-7222
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.16.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Labels: doc-impacting, user-experience
> Fix For: 1.17.0
>
>
> With statistics in place, it would be useful to have the *estimated* rowcount
> alongside the *actual* rowcount in the query profile's operator overview.
> We can extract this from the Physical Plan section of the profile.