Sanjeet Malhotra created HBASE-30161:
----------------------------------------

             Summary: Add paginated, single-RPC RegionLocator.getRegionLocations(startKey, limit) API for bulk meta-cache warmup
                 Key: HBASE-30161
                 URL: https://issues.apache.org/jira/browse/HBASE-30161
             Project: HBase
          Issue Type: Improvement
            Reporter: Sanjeet Malhotra
            Assignee: Sanjeet Malhotra


`RegionLocator.getAllRegionLocations()` is currently the only bulk API
to fetch all region locations of a table. Internally it opens a
`ResultScanner` against `hbase:meta` via
`MetaTableAccessor.scanMetaForTableRegions(...)` and drives
`scanner.next()` in a loop, so the number of RPCs is
`ceil(numRegions / hbase.meta.scanner.caching)`.
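
To make the scaling concrete, here is a small arithmetic sketch of the formula above. The class and method names are illustrative only, and the caching value of 100 is an assumption chosen for the example, not a value read from any cluster configuration.

```java
// Illustrative arithmetic only; not HBase code.
final class MetaScanRpcEstimate {

  /** Returns ceil(numRegions / caching) using integer arithmetic. */
  static long rpcCount(long numRegions, long caching) {
    if (numRegions < 0 || caching <= 0) {
      throw new IllegalArgumentException("numRegions must be >= 0 and caching > 0");
    }
    return (numRegions + caching - 1) / caching;
  }

  public static void main(String[] args) {
    long caching = 100; // assumed value of hbase.meta.scanner.caching for this example
    for (long regions : new long[] {10, 10_000}) {
      System.out.println(regions + " regions -> " + rpcCount(regions, caching) + " scanner RPCs");
    }
  }
}
```

Under that assumed caching value, a 10-region table needs 1 scanner RPC and a 10000-region table needs 100, all inside one logical call.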
               
                                                                                
               
This is a problem for clients (e.g. Phoenix) that want to perform a
*bulk warmup* of their region-location cache after a fresh JVM start
while serializing meta access. The natural pattern is to wrap the call
in a lock, mirroring what `ConnectionImplementation.locateRegionInMeta`
already does for single-region lookups via `userRegionLock`. But
because `getAllRegionLocations()` does N RPCs under one logical call:
               
                                                                                
               
* The lock-timeout budget has to grow with table size. There is no
  sensible fixed value that works for both 10-region and 10000-region
  tables.
* A single slow RPC inside the loop blocks all other meta lookups for
  as long as the lock is held.
* Per-call duration is no longer constant with respect to data volume;
  it grows with the number of regions.
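
As a rough illustration of why a paginated, single-RPC call fits the locking pattern better, the sketch below takes and releases a lock once per page, so a fixed lock timeout only has to cover one fetch. Everything in it is hypothetical: `PageFetcher` is a stand-in for a paginated lookup and is not the signature proposed for `RegionLocator`, integer keys replace real `byte[]` start keys, and `fakeMeta` simulates `hbase:meta` in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names are HBase APIs.
final class PagedWarmupSketch {

  /** Stand-in for one single-RPC page fetch: at most `limit` keys >= startKey. */
  interface PageFetcher {
    List<Integer> fetch(int startKey, int limit);
  }

  private final ReentrantLock metaLock = new ReentrantLock();

  /** Holds the lock for one page at a time, so the timeout budgets a single fetch. */
  List<Integer> warmup(PageFetcher fetcher, int limit, long lockTimeoutMs) {
    List<Integer> all = new ArrayList<>();
    int startKey = 0;
    while (true) {
      boolean locked;
      try {
        locked = metaLock.tryLock(lockTimeoutMs, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        throw new IllegalStateException("interrupted while waiting for meta lock", e);
      }
      if (!locked) {
        throw new IllegalStateException("timed out waiting for meta lock");
      }
      List<Integer> page;
      try {
        page = fetcher.fetch(startKey, limit);
      } finally {
        metaLock.unlock(); // other lookups can interleave between pages
      }
      all.addAll(page);
      if (page.size() < limit) {
        return all; // a short (or empty) page means the table is exhausted
      }
      startKey = page.get(page.size() - 1) + 1;
    }
  }

  /** In-memory fake of a meta table whose regions are keyed 0..numRegions-1. */
  static PageFetcher fakeMeta(int numRegions) {
    return (startKey, limit) -> {
      List<Integer> page = new ArrayList<>();
      for (int k = startKey; k < numRegions && page.size() < limit; k++) {
        page.add(k);
      }
      return page;
    };
  }
}
```

For example, `new PagedWarmupSketch().warmup(PagedWarmupSketch.fakeMeta(250), 100, 1000L)` collects all 250 simulated locations in three page fetches, with the lock released between each one.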



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
