Alok Singh created PHOENIX-2306:
-----------------------------------
Summary: Expose additional statistics in the explain plan to allow
better cost estimation
Key: PHOENIX-2306
URL: https://issues.apache.org/jira/browse/PHOENIX-2306
Project: Phoenix
Issue Type: New Feature
Environment: 4.5.1
Reporter: Alok Singh
Priority: Minor
In a mailing list conversation, James described the Phoenix APIs that can be
used to derive cost estimates:
{noformat}
Yes, you could calculate an estimate for this information, but it isn't
currently exposed through JDBC or through the explain plan (which would be a
good place for it to live). You'd need to dip down to the implementation to get
it. Something like this:

    PhoenixStatement statement =
        connection.createStatement().unwrap(PhoenixStatement.class);
    ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
    QueryPlan plan = statement.getQueryPlan();
    List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges will be going over a configurable amount of bytes
(determined by phoenix.stats.guidepost.width and/or
phoenix.stats.guidepost.per.region), so a simple worst case estimate would be
to multiply the ranges.size() by this config value (using a default of
QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or 300MB). If the
query is a point lookup (which you can check with
plan.getContext().getScanRanges().isPointLookup()), then the cost would be
ranges.size() * average_row_size.
Since these aren't exposed APIs, they're subject to change. Please file a JIRA
if you're interested in helping figure out what the "official" APIs for this
should be.
{noformat}
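The arithmetic in the quoted reply can be sketched roughly as follows. This is
only an illustration: ScanCostEstimator, estimateBytes and the constant are
hypothetical names, not Phoenix APIs, and the caller is assumed to have already
obtained the range count (ranges.size()), the point-lookup flag, the configured
guidepost width and an average row size by whatever means are available. The
300MB figure is taken from the reply and interpreted here as 300 * 1024 * 1024
bytes, which is an assumption.

```java
// Sketch only: this class and its names are hypothetical, not part of Phoenix.
final class ScanCostEstimator {

    // Assumed fallback, based on the "300MB" default mentioned in the thread.
    // The real value should come from the cluster configuration.
    static final long ASSUMED_GUIDEPOST_WIDTH_BYTES = 300L * 1024 * 1024;

    private ScanCostEstimator() {
    }

    /**
     * Worst-case estimate of the bytes a query will read.
     *
     * @param rangeCount          number of KeyRanges reported for the plan
     * @param pointLookup         true if the plan is a point lookup
     * @param guidepostWidthBytes configured guidepost width in bytes
     * @param averageRowSizeBytes caller-supplied average row size in bytes
     */
    static long estimateBytes(int rangeCount,
                              boolean pointLookup,
                              long guidepostWidthBytes,
                              long averageRowSizeBytes) {
        if (rangeCount < 0 || guidepostWidthBytes < 0 || averageRowSizeBytes < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        // Point lookup: each range touches one row.
        // Otherwise: each range may cover up to one guidepost width of data.
        long bytesPerRange = pointLookup ? averageRowSizeBytes : guidepostWidthBytes;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact((long) rangeCount, bytesPerRange);
    }
}
```

For a range scan, a caller might pass ranges.size() together with
ASSUMED_GUIDEPOST_WIDTH_BYTES when no width is configured; for a point lookup,
the guidepost width argument is ignored.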
Ideally, these statistics should be returned as part of the explain plan. That
would allow end users of Phoenix to retrieve this information with standard
JDBC tooling, without depending on internal classes.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)