[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545890#comment-16545890
 ] 

stack commented on HBASE-20697:
-------------------------------

[~zghaobac] Thanks for the backport.

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20697
>                 URL: https://issues.apache.org/jira/browse/HBASE-20697
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1, 1.2.6, 2.0.1
>            Reporter: zhaoyuan
>            Assignee: zhaoyuan
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
>
>         Attachments: HBASE-20697.branch-1.2.001.patch, 
> HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, 
> HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, 
> HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, 
> HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, some operations time out. The timeouts are expected, 
> because after a restart the application holds no cached region locations 
> and must contact ZooKeeper and the meta RegionServer to look them up.
> We want to avoid these timeouts, so we do warm-up work. As far as I 
> understood, the method table.getRegionLocator().getAllRegionLocations() 
> should fetch all region locations and cache them. However, it did not work 
> well: there were still many timeouts, which confused me. 
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte [] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It collects all regions into one RegionLocations object and caches it only 
> under the start key of the first non-null region location. Then, when we put 
> to or get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte [] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>     getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It takes the first location in the cached entry as possibleRegion, so the 
> end-key check may not match the requested row.
> Did I overlook something, or am I wrong somewhere? If this is indeed a bug, 
> I think it would not be hard to fix.
> I hope the committers and PMC can review this!
>  
>  
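
The behavior described in the report can be illustrated with a small, self-contained model. The sketch below is not HBase code: MetaCacheSketch, Region, cacheAsOneEntry and cachePerRegion are hypothetical names, and row keys are plain Strings rather than byte arrays. It assumes, as the quoted getCachedLocation() does, that a lookup takes the floor entry for the row and checks only the first region in that entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a start-key-indexed location cache; not the HBase MetaCache.
class MetaCacheSketch {
    /** Stand-in for a region covering [startKey, endKey); an empty endKey marks the last region. */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;

        Region(String name, String startKey, String endKey) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Cache keyed by start key; each value plays the role of a RegionLocations object.
    private final TreeMap<String, List<Region>> cache = new TreeMap<>();

    /** Models the reported behavior: the whole list is stored once, under the first start key. */
    void cacheAsOneEntry(List<Region> regions) {
        cache.putIfAbsent(regions.get(0).startKey, regions);
    }

    /** Assumed alternative: every region gets its own entry under its own start key. */
    void cachePerRegion(List<Region> regions) {
        for (Region r : regions) {
            cache.putIfAbsent(r.startKey, Collections.singletonList(r));
        }
    }

    /** Floor lookup followed by an end-key check on the first region of the entry. */
    Region lookup(String row) {
        Map.Entry<String, List<Region>> e = cache.floorEntry(row);
        if (e == null) {
            return null; // nothing cached at or before this row
        }
        Region candidate = e.getValue().get(0);
        if (candidate.endKey.isEmpty() || candidate.endKey.compareTo(row) > 0) {
            return candidate; // row falls inside the candidate region
        }
        return null; // row lies past the candidate's end key: cache miss
    }

    private static String describe(Region r) {
        return r == null ? "miss" : r.name;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("r1", "", "g"),
            new Region("r2", "g", "p"),
            new Region("r3", "p", ""));

        MetaCacheSketch single = new MetaCacheSketch();
        single.cacheAsOneEntry(regions);
        MetaCacheSketch perRegion = new MetaCacheSketch();
        perRegion.cachePerRegion(regions);

        for (String row : new String[] {"a", "k", "z"}) {
            System.out.println(row + ": oneEntry=" + describe(single.lookup(row))
                + " perRegion=" + describe(perRegion.lookup(row)));
        }
    }
}
```

With regions r1 ["", g), r2 [g, p) and r3 [p, ""), the single-entry cache resolves row "a" but misses "k" and "z", while the per-region cache resolves all three. Caching each region under its own start key is one possible direction for a fix; it is stated here as an assumption, not as a description of the attached patches.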



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
