[ 
https://issues.apache.org/jira/browse/HBASE-27228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697305#comment-17697305
 ] 

chaijunjie commented on HBASE-27228:
------------------------------------

Good idea. We have found that the first scan of a table is slow; maybe this 
issue can solve it.

> Client connection warming API
> -----------------------------
>
>                 Key: HBASE-27228
>                 URL: https://issues.apache.org/jira/browse/HBASE-27228
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> In a high-performance API or low-latency stream worker, you often do not 
> want to incur costs on the first few requests. In these cases, you want to 
> warm connections before ever adding the process to the load balancer or 
> processing group.
> Upon first creating a Connection, there are two areas that can slow down the 
> first few requests:
>  * Fetching region locations
>  * Creating the initial connection to each RegionServer, which sends 
> connection headers, possibly does auth handshakes, etc.
> A user can easily work around the first slowness by calling 
> Table.getRegionLocator().getAllRegionLocations().
> It's more challenging for a user to warm the actual RegionServer connections. 
> One way we have done this is to use a RegionLocator to fetch all locations 
> for a table, reduce that down to 1 region per server, and then issue a Get to 
> each row. We end up repeating this for every table that a process may connect 
> to, because at the level we do this we can't easily tell which servers have 
> already been warmed. We also have run into various bugs over time, for 
> example where an empty startkey causes a Get to fail.
> We can make this easier for the users by providing an API which uses 
> Connection internals to as cheaply as possible warm these connections. I'd 
> propose we add the following:
> New Table/AsyncTable method {{warmConnections()}}. This would do the 
> following:
>  * use region locator to fetch all locations (with caching)
>  * reduce returned locations to unique ServerNames
>  * for each ServerName (with lock):
>  ** if already warmed, skip
>  ** otherwise, get a connection to that server and send an initial request to 
> trigger socket creation/connection header/etc
> With this API, if someone is connecting to multiple tables, they could warm 
> each Table in parallel and we'd only create connections to each 
> server once.
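
The steps proposed in the quoted description (reduce a table's region locations to unique servers, then warm each server at most once, even across tables) can be sketched in plain Java. This is only an illustration under stated assumptions, not HBase code: the class ConnectionWarmerSketch, the region-to-server map, and the warmAction hook are all hypothetical stand-ins, and the real warmConnections() proposed in the issue may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Plain-Java model of the proposed warming logic. None of these names are HBase APIs. */
public class ConnectionWarmerSketch {
    // Servers already warmed; shared by all tables so each server is warmed once.
    private final ConcurrentHashMap<String, Boolean> warmed = new ConcurrentHashMap<>();
    // Hypothetical hook standing in for "get a connection and send an initial request".
    private final Consumer<String> warmAction;

    public ConnectionWarmerSketch(Consumer<String> warmAction) {
        this.warmAction = warmAction;
    }

    /** Reduce a table's locations (region name -> server name) to unique server names. */
    static Set<String> uniqueServers(Map<String, String> regionToServer) {
        return new LinkedHashSet<>(regionToServer.values());
    }

    /** Warm every not-yet-warmed server of one table; returns how many were newly warmed. */
    public int warmConnections(Map<String, String> regionToServer) {
        AtomicInteger newlyWarmed = new AtomicInteger();
        for (String server : uniqueServers(regionToServer)) {
            // computeIfAbsent runs the function at most once per key, and concurrent
            // callers for the same server wait until it finishes (the "with lock" step).
            warmed.computeIfAbsent(server, s -> {
                warmAction.accept(s);
                newlyWarmed.incrementAndGet();
                return Boolean.TRUE;
            });
        }
        return newlyWarmed.get();
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        ConnectionWarmerSketch warmer = new ConnectionWarmerSketch(calls::add);

        // Two made-up tables whose regions overlap on server "rs2".
        Map<String, String> tableA = Map.of("a1", "rs1", "a2", "rs1", "a3", "rs2");
        Map<String, String> tableB = Map.of("b1", "rs2", "b2", "rs3");

        System.out.println(warmer.warmConnections(tableA)); // prints 2
        System.out.println(warmer.warmConnections(tableB)); // prints 1
        System.out.println(calls.size());                   // prints 3
    }
}
```

Because the warmed set lives at the connection level rather than per table, warming a second table only touches servers the first table did not cover. If the warm action throws, computeIfAbsent records nothing for that server, so a later call would retry it.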



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
