[
https://issues.apache.org/jira/browse/HBASE-27228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697305#comment-17697305
]
chaijunjie commented on HBASE-27228:
------------------------------------
Good idea. We have found that the first scan of a table is slow; maybe this
issue can solve it.
> Client connection warming API
> -----------------------------
>
> Key: HBASE-27228
> URL: https://issues.apache.org/jira/browse/HBASE-27228
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Priority: Major
>
> In high-performance APIs or low-latency stream workers, you often do not
> want to incur costs on the first few requests. In these cases, you want to
> warm connections before ever adding the process to the load balancer or
> processing group.
> Upon first creating a Connection, there are two areas that can slow down the
> first few requests:
> * Fetching region locations
> * Creating the initial connection to each RegionServer, which sends
> connection headers, possibly does auth handshakes, etc.
> A user can easily work around the first source of slowness by calling
> Table.getRegionLocator().getAllRegionLocations().
> It's more challenging for a user to warm the actual RegionServer connections.
> One way we have done this is to use a RegionLocator to fetch all locations
> for a table, reduce that down to 1 region per server, and then issue a Get to
> each row. We end up repeating this for every table that a process may connect
> to, because at the level we do this we can't easily tell which servers have
> already been warmed. We also have run into various bugs over time, for
> example where an empty startkey causes a Get to fail.
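The workaround described above can be sketched roughly as follows. This is a minimal sketch using only the JDK, not HBase client code: RegionLoc and WarmupPlanner are hypothetical names, server names are plain strings standing in for ServerName, and the single zero byte used as a fallback probe row is an assumption chosen only to avoid issuing a Get with an empty row.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a region location: hosting server plus region start key.
final class RegionLoc {
    final String serverName;
    final byte[] startKey;

    RegionLoc(String serverName, byte[] startKey) {
        this.serverName = serverName;
        this.startKey = startKey;
    }
}

final class WarmupPlanner {
    private WarmupPlanner() {}

    // Reduce all locations to one region per server. If the region kept so far
    // has an empty start key (first region of the table), prefer a later region
    // on the same server whose start key is non-empty.
    static Map<String, RegionLoc> oneRegionPerServer(List<RegionLoc> locations) {
        Map<String, RegionLoc> byServer = new LinkedHashMap<>();
        for (RegionLoc loc : locations) {
            byServer.merge(loc.serverName, loc,
                (kept, next) -> kept.startKey.length == 0 ? next : kept);
        }
        return byServer;
    }

    // Row key to use for the warming Get. An empty start key cannot be used as
    // a row, so fall back to a single zero byte (assumption: that row still
    // falls inside the table's first region).
    static byte[] probeRow(RegionLoc region) {
        return region.startKey.length == 0 ? new byte[] {0} : region.startKey.clone();
    }
}
```

A caller would then issue one Get per entry of oneRegionPerServer(...), using probeRow(...) as the row. Nothing here remembers which servers were already warmed by another table, which is the limitation the description points out.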
> We can make this easier for users by providing an API which uses
> Connection internals to warm these connections as cheaply as possible. I'd
> propose we add the following:
> New Table/AsyncTable method {{warmConnections()}}. This would do the
> following:
> * use region locator to fetch all locations (with caching)
> * reduce returned locations to unique ServerNames
> * for each ServerName (with lock):
> ** if already warmed, skip
> ** otherwise, get a connection to that server and send an initial request to
> trigger socket creation/connection header/etc
> With this API, if someone is connecting to multiple tables, they could warm
> each Table in parallel and we'd only create connections to each
> server once.
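The "warm each server only once" bookkeeping proposed above could look something like the sketch below. It is JDK-only and does not use HBase internals: ConnectionWarmer and warmAction are hypothetical names, server names are plain strings standing in for ServerName, and warmAction stands in for "get a connection to that server and send an initial request". ConcurrentHashMap.computeIfAbsent plays the role of the per-server lock here; a real implementation would likely avoid doing network I/O inside it.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical per-Connection registry of servers that have already been warmed.
final class ConnectionWarmer {
    private final ConcurrentHashMap<String, Boolean> warmed = new ConcurrentHashMap<>();
    private final Consumer<String> warmAction;

    ConnectionWarmer(Consumer<String> warmAction) {
        this.warmAction = warmAction;
    }

    // Warms every distinct server in the collection that has not been warmed yet
    // and returns how many servers this call actually contacted.
    int warm(Collection<String> regionServerNames) {
        int[] newlyWarmed = {0};
        // Reduce the locations to unique server names first.
        for (String server : new LinkedHashSet<>(regionServerNames)) {
            // computeIfAbsent runs the function at most once per key; if the
            // warm action throws, no entry is recorded and a later call retries.
            warmed.computeIfAbsent(server, s -> {
                warmAction.accept(s);
                newlyWarmed[0]++;
                return Boolean.TRUE;
            });
        }
        return newlyWarmed[0];
    }
}
```

If one ConnectionWarmer is shared by every table on a connection, a second table whose regions live on the same servers triggers no extra warm-up requests.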
--
This message was sent by Atlassian Jira
(v8.20.10#820010)