[
https://issues.apache.org/jira/browse/PHOENIX-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156182#comment-15156182
]
James Taylor commented on PHOENIX-2702:
---------------------------------------
We go part way down this route already with the explain plan. For example:
{code}
CLIENT 12-CHUNK PARALLEL 1-WAY RANGE SCAN OVER T
{code}
means that 12 guideposts were traversed without a merge sort being necessary
(since it says 1-WAY).
Another example:
{code}
CLIENT 50-CHUNK PARALLEL 10-WAY RANGE SCAN OVER T
{code}
means that 50 guideposts were traversed, partially combined, and then merge
sorted 10 ways. This could happen if the table had 10 salt buckets and each
of these buckets had 5 guideposts (which may or may not span region
boundaries).
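As a rough illustration of how to read those two numbers, here is a minimal Python sketch that pulls the chunk count and the merge-sort width out of a plan line. The pattern is inferred only from the two example lines above and is not a full grammar of the explain plan; the name parse_plan_line is hypothetical and not part of Phoenix.

```python
import re

# Pattern inferred only from the two example plan lines in this comment;
# it is an assumption, not a complete grammar of Phoenix explain output.
PLAN_LINE = re.compile(r"CLIENT (\d+)-CHUNK PARALLEL (\d+)-WAY")


def parse_plan_line(line):
    """Return (chunks, ways) from an explain-plan line, or None if no match."""
    match = PLAN_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


chunks, ways = parse_plan_line("CLIENT 50-CHUNK PARALLEL 10-WAY RANGE SCAN OVER T")
# A merge sort is only needed when more than one way has to be combined.
needs_merge_sort = ways > 1
```

With the first example line the same function yields a width of 1, so no merge sort is implied.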
There's definitely room for improvement here. Having the bytes and row estimates
would be more readable (and IMO would replace the chunk count information).
It'd be good to take a holistic view, document our existing explain plan
(PHOENIX-1481), and come up with a better way to present it. Another
consideration is how this information is surfaced in Calcite ([~julianhyde])
and whether there are any "standard" ways of presenting an explain plan
(de facto or otherwise).
> Show estimate rows and bytes touched for explain plan.
> ------------------------------------------------------
>
> Key: PHOENIX-2702
> URL: https://issues.apache.org/jira/browse/PHOENIX-2702
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
>
> We can already estimate the size of a table (both rows and uncompressed
> bytes) with a query like this:
> {code}
> SELECT physical_name AS table_name, SUM(guide_posts_row_count) AS est_rows,
> SUM(guide_posts_width) AS est_size FROM SYSTEM.STATS GROUP BY physical_name;
> {code}
> During the planning phase we have more information, though. So we can report
> the actual numbers for a query during an explain since we have that info
> there anyway (we filtered the guideposts already with the key info provided in
> the query).
> I might whip up a quick patch for this.
> (Could also go further and add an est_count, est_size UDF for this, but that
> would be a bit harder to get hooked up at the right places, I think, and the
> meaning would be ambiguous)
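The per-query estimate described in the issue could be sketched roughly as follows: once the guideposts have been filtered by the query's key range, sum their row counts and byte widths. This is only an illustration under stated assumptions. The GuidePost class, its fields, and estimate_scan are hypothetical names, and treating the scan as a half-open key range is an assumption, not how Phoenix necessarily implements it.

```python
from dataclasses import dataclass


@dataclass
class GuidePost:
    # Hypothetical in-memory view of one guidepost; field names are assumptions.
    key: bytes       # row key at which this guidepost sits
    row_count: int   # estimated rows attributed to this guidepost
    byte_width: int  # estimated uncompressed bytes attributed to this guidepost


def estimate_scan(guideposts, start_key, stop_key):
    """Sum estimates over guideposts with start_key <= key < stop_key."""
    selected = [g for g in guideposts if start_key <= g.key < stop_key]
    est_rows = sum(g.row_count for g in selected)
    est_bytes = sum(g.byte_width for g in selected)
    return est_rows, est_bytes


posts = [
    GuidePost(b"a", 100, 1000),
    GuidePost(b"m", 200, 2000),
    GuidePost(b"t", 300, 3000),
]
# Only the guideposts at b"m" and b"t" fall inside [b"b", b"z").
est_rows, est_bytes = estimate_scan(posts, b"b", b"z")
```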