[
https://issues.apache.org/jira/browse/PHOENIX-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140323#comment-14140323
]
ramkrishna.s.vasudevan commented on PHOENIX-180:
------------------------------------------------
I get the following error when I run the IT tests using mvn install:
{code}
testGetLowerUnboundSplits(org.apache.phoenix.end2end.DefaultParallelIteratorsRegionSplitterIT)  Time elapsed: 4.804 sec  <<< ERROR!
org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=SYSTEM.STATS
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:386)
    at org.apache.phoenix.iterate.DefaultParallelIteratorRegionSplitter.getAllRegions(DefaultParallelIteratorRegionSplitter.java:75)
    at org.apache.phoenix.iterate.DefaultParallelIteratorRegionSplitter.getSplits(DefaultParallelIteratorRegionSplitter.java:173)
    at org.apache.phoenix.iterate.ParallelIterators.getSplits(ParallelIterators.java:236)
    at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:109)
    at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:110)
    at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:216)
    at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:150)
    at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:217)
    at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:208)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:207)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:996)
    at org.apache.phoenix.schema.MetaDataClient.updateStatistics(MetaDataClient.java:505)
    at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpdateStatisticsStatement$1.execute(PhoenixStatement.java:691)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:257)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:249)
{code}
This happens for all the test cases that run ANALYZE. Individually they all pass. Is
something different happening when we run them using mvn install?
> Use stats to guide query parallelization
> ----------------------------------------
>
> Key: PHOENIX-180
> URL: https://issues.apache.org/jira/browse/PHOENIX-180
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
> Labels: enhancement
> Attachments: Phoenix-180_V1.patch, Phoenix-180_WIP.patch
>
>
> We're currently not using stats, beyond a table-wide min key/max key cached
> per client connection, to guide parallelization. If a query targets just a
> few regions, we don't know how to evenly divide the work among threads,
> because we don't know the data distribution. This other [issue]
> (https://github.com/forcedotcom/phoenix/issues/64) is targeting gathering and
> maintaining the stats, while this issue is focused on using the stats.
> The main changes are:
> 1. Create a PTableStats interface that encapsulates the stats information
> (and implements the Writable interface so that it can be serialized back from
> the server).
> 2. Add a stats member variable off of PTable to hold this.
> 3. From MetaDataEndPointImpl, look up the stats row for the table in the stats
> table. If the stats have changed, return a new PTable with the updated stats
> information. We may want to cache the stats row and have the stats gatherer
> invalidate the cache row when updated so we don't have to always do a scan
> for it. Additionally, it would be ideal if we could use the same split policy
> on the stats table that we use on the system table to guarantee co-location
> of data (for the sake of caching).
> 4. Modify the client-side parallelization (ParallelIterators.getSplits()) to
> use this information to guide how to chunk up the scans at query time.
> This should help boost query performance, especially in cases where the data
> is highly skewed. It's likely the cause for the slowness reported in this
> issue: https://github.com/forcedotcom/phoenix/issues/47.
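
The idea in the description above, cutting a scan into chunks based on the data distribution rather than on region boundaries alone, can be sketched roughly as follows. This is a minimal illustration and not the Phoenix implementation: `TableStats`, `guidePosts()`, `KeyRange` and `split()` are hypothetical names, keys are plain strings instead of byte[] row keys, and the stats are assumed to be a sorted set of keys that divide the table into chunks of roughly equal size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustrative only: none of these types are Phoenix classes.
final class StatsGuidedSplits {

    /** Hypothetical stats holder: sorted keys that cut the table into chunks of roughly equal size. */
    interface TableStats {
        NavigableSet<String> guidePosts();
    }

    /** Half-open key range [start, end). */
    static final class KeyRange {
        final String start;
        final String end;

        KeyRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /** Cuts the scan range [start, end) at every guide post that falls strictly inside it. */
    static List<KeyRange> split(String start, String end, TableStats stats) {
        List<KeyRange> ranges = new ArrayList<>();
        String lower = start;
        // subSet with both bounds exclusive skips guide posts equal to start or end,
        // so no empty range is produced.
        for (String post : stats.guidePosts().subSet(start, false, end, false)) {
            ranges.add(new KeyRange(lower, post));
            lower = post;
        }
        ranges.add(new KeyRange(lower, end));
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed guide posts for an imaginary table.
        TableStats stats = () -> new TreeSet<String>(Arrays.asList("c", "f", "m", "t"));
        System.out.println(split("b", "n", stats));
    }
}
```

With skewed data the guide posts would cluster where rows are dense, so each resulting range would cover a similar amount of work even when a query touches only a few regions.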
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)