Alok Singh created PHOENIX-2306:
-----------------------------------

             Summary: Expose additional statistics in the explain plan to allow better cost estimation
                 Key: PHOENIX-2306
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2306
             Project: Phoenix
          Issue Type: New Feature
         Environment: 4.5.1
            Reporter: Alok Singh
            Priority: Minor


In a mailing list conversation, James described the Phoenix APIs that can be 
used to derive cost estimates:
{noformat}
Yes, you could calculate an estimate for this information, but it isn't 
currently exposed through JDBC or through the explain plan (which would be a 
good place for it to live). You'd need to dip down to the implementation to get 
it. Something like this:

PhoenixStatement statement = connection.createStatement().unwrap(PhoenixStatement.class);
ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
QueryPlan plan = statement.getQueryPlan();
List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges will be going over a configurable amount of bytes 
(determined by phoenix.stats.guidepost.width and/or 
phoenix.stats.guidepost.per.region), so a simple worst case estimate would be 
to multiply the ranges.size() by this config value (using a default of 
QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or 300MB). If the 
query is a point lookup (which you can check with 
plan.getContext().getScanRanges().isPointLookup()), then the cost would be 
ranges.size() * average_row_size.

Since these aren't exposed APIs, they're subject to change. Please file a JIRA 
if you're interested in helping figure out what the "official" APIs for this 
should be.
{noformat}
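
For illustration, the arithmetic in the quoted message could be sketched as a small standalone helper. This is only a sketch under stated assumptions, not a Phoenix API: the class and method names are hypothetical, the split count is passed in directly (in practice it would presumably come from plan.getSplits().size()), and "300MB" is assumed to mean 300 * 1024 * 1024 bytes.

```java
// Hypothetical helper, not part of Phoenix: it only reproduces the
// worst-case arithmetic described in the quoted message.
final class ScanCostEstimate {

    // Assumption: "300MB" in the quoted message means 300 * 1024 * 1024 bytes.
    static final long ASSUMED_GUIDEPOST_WIDTH_BYTES = 300L * 1024 * 1024;

    private ScanCostEstimate() {}

    /**
     * Worst-case number of bytes a query may read.
     *
     * @param splitCount          number of key ranges in the plan
     * @param guidepostWidthBytes configured guidepost width in bytes
     * @param pointLookup         true if the query is a point lookup
     * @param averageRowSizeBytes caller-supplied average row size in bytes
     */
    static long estimateBytes(int splitCount,
                              long guidepostWidthBytes,
                              boolean pointLookup,
                              long averageRowSizeBytes) {
        if (splitCount < 0 || guidepostWidthBytes < 0 || averageRowSizeBytes < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        // A point lookup touches one row per range; any other scan may read
        // up to a full guidepost width per range.
        long bytesPerRange = pointLookup ? averageRowSizeBytes : guidepostWidthBytes;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact((long) splitCount, bytesPerRange);
    }

    public static void main(String[] args) {
        // Range scan over 4 splits using the assumed default guidepost width.
        System.out.println(estimateBytes(4, ASSUMED_GUIDEPOST_WIDTH_BYTES, false, 0L));
        // Point lookup over 4 keys with an assumed 120-byte average row.
        System.out.println(estimateBytes(4, ASSUMED_GUIDEPOST_WIDTH_BYTES, true, 120L));
    }
}
```

The average row size is an input here because the quoted message does not say where that value would come from. If these numbers were exposed through the explain plan, callers would not need a helper like this at all.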

Ideally, these statistics should be returned as part of the explain plan. That 
would allow end users of Phoenix to use standard JDBC tooling to get at this 
information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
