[
https://issues.apache.org/jira/browse/PHOENIX-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655747#comment-16655747
]
Andrew Purtell commented on PHOENIX-4974:
-----------------------------------------
Sounds about right [~lhofhansl]
At some point we should think about optimizing
{{getRegionLocator(...).getAllRegionLocations()}} for high-scale installations.
For example, we could support incremental queries: given a timestamp, return
only the changes in region location since that time. The client can then
initialize its cache once with an expensive full retrieval and refresh it
from there with lighter-weight incremental queries. This requires reimplementing
the equivalent of the old Region Historian over in HBase, though. We could keep
historical locations in META with a fairly short TTL. We will need splittable
META for high scale anyhow.
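To make the incremental idea above concrete, here is a minimal sketch in plain
Java. It is not HBase or Phoenix code: IncrementalRegionCacheSketch,
InMemoryHistory, ClientCache, and changesSince are hypothetical names, the
in-memory log merely stands in for history rows kept in META, and only region
moves are modeled (splits and merges would need some form of tombstone).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these types exist in HBase or Phoenix.
final class IncrementalRegionCacheSketch {

    /** One recorded assignment: region start key, hosting server, change time. */
    static final class LocationChange {
        final String startKey;
        final String server;
        final long timestamp;

        LocationChange(String startKey, String server, long timestamp) {
            this.startKey = startKey;
            this.server = server;
            this.timestamp = timestamp;
        }
    }

    /** Stand-in for the server side; assumes changes are recorded in time order. */
    static final class InMemoryHistory {
        private final List<LocationChange> log = new ArrayList<>();

        void record(String startKey, String server, long timestamp) {
            log.add(new LocationChange(startKey, server, timestamp));
        }

        /** Expensive path: the latest known location of every region. */
        List<LocationChange> allLocations() {
            Map<String, LocationChange> latest = new TreeMap<>();
            for (LocationChange c : log) {
                LocationChange prev = latest.get(c.startKey);
                if (prev == null || c.timestamp >= prev.timestamp) {
                    latest.put(c.startKey, c);
                }
            }
            return new ArrayList<>(latest.values());
        }

        /** Cheap path: only changes recorded strictly after the given timestamp. */
        List<LocationChange> changesSince(long timestamp) {
            List<LocationChange> delta = new ArrayList<>();
            for (LocationChange c : log) {
                if (c.timestamp > timestamp) {
                    delta.add(c);
                }
            }
            return delta;
        }
    }

    /** Client cache: one full load, then incremental refreshes. */
    static final class ClientCache {
        private final InMemoryHistory source;
        private final TreeMap<String, String> serverByStartKey = new TreeMap<>();
        private long lastSeen = Long.MIN_VALUE;

        ClientCache(InMemoryHistory source) {
            this.source = source;
        }

        /** Initial, expensive retrieval of every region location. */
        void initialize() {
            apply(source.allLocations());
        }

        /** Fetches only changes newer than the last one seen; returns how many were applied. */
        int refresh() {
            List<LocationChange> delta = source.changesSince(lastSeen);
            apply(delta);
            return delta.size();
        }

        private void apply(List<LocationChange> changes) {
            for (LocationChange c : changes) {
                serverByStartKey.put(c.startKey, c.server);
                lastSeen = Math.max(lastSeen, c.timestamp);
            }
        }

        String serverFor(String startKey) {
            return serverByStartKey.get(startKey);
        }
    }
}
```

In this sketch the client calls initialize() once and refresh() afterwards, so
the cost of a refresh scales with the number of recent changes rather than with
the number of regions.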
> Gets all regions uses get requests is extremely slows for big table
> --------------------------------------------------------------------
>
> Key: PHOENIX-4974
> URL: https://issues.apache.org/jira/browse/PHOENIX-4974
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.14.0, 5.0.0
> Reporter: Jaanai
> Assignee: Jaanai
> Priority: Blocker
> Attachments: PHOENIX-4974-master.patch, performance.png
>
>
> When the first query is executed after the client starts (SQLLine or a newly
> initialized JDBC client), region locations must be loaded into the client
> cache. The key part of the current implementation is:
> {code:java}
> List<HRegionLocation> locations = Lists.newArrayList();
> byte[] currentKey = HConstants.EMPTY_START_ROW;
> do {
>     HRegionLocation regionLocation =
>         connection.getRegionLocation(
>             TableName.valueOf(tableName), currentKey, reload);
>     locations.add(regionLocation);
>     currentKey = regionLocation.getRegionInfo().getEndKey();
> } while (!Bytes.equals(currentKey, HConstants.EMPTY_END_ROW));
> {code}
> For big tables with more than ten thousand regions, this procedure is
> extremely slow.
> Running a point lookup query on a table with 10000 regions right after
> starting the client takes 25+ seconds.