[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

zhaoyuan (JIRA) Thu, 07 Jun 2018 02:02:22 -0700


     [ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


zhaoyuan updated HBASE-20697:
-----------------------------
    Description: 
When we upgrade and restart an application that reads from and writes to 
HBase, we see some operation timeouts. The timeouts are expected, because 
after a restart the application holds no cached region locations and has to 
contact ZooKeeper and the RegionServer hosting the meta table to look them 
up.

We want to avoid these timeouts, so we do some warm-up work. As far as I 
understand, the method table.getRegionLocator().getAllRegionLocations() should 
fetch all region locations and cache them. However, it did not work well: 
there were still a lot of timeouts, which confused me. 

I dug into the source code and found the following:
{code:java}
public List<HRegionLocation> getAllRegionLocations() throws IOException {
  TableName tableName = getName();
  NavigableMap<HRegionInfo, ServerName> locations =
      MetaScanner.allTableRegions(this.connection, tableName);
  ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
  for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
    regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
  }
  if (regions.size() > 0) {
    connection.cacheLocation(tableName, new RegionLocations(regions));
  }
  return regions;
}

// In MetaCache:
public void cacheLocation(final TableName tableName, final RegionLocations locations) {
  byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
  ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
  RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
  boolean isNewCacheEntry = (oldLocation == null);
  if (isNewCacheEntry) {
    if (LOG.isTraceEnabled()) {
      LOG.trace("Cached location: " + locations);
    }
    addToCachedServers(locations);
    return;
  }
  // ...
}
{code}
getAllRegionLocations() collects all regions into one RegionLocations object, 
and cacheLocation() stores that object under a single key: the start key of 
its first non-null region location. Then, when we put to or get from HBase, 
the client calls getCachedLocation():
{code:java}
public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
  ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
    getTableLocations(tableName);

  Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
  if (e == null) {
    if (metrics != null) metrics.incrMetaCacheMiss();
    return null;
  }
  RegionLocations possibleRegion = e.getValue();

  // make sure that the end key is greater than the row we're looking
  // for, otherwise the row actually belongs in the next region, not
  // this one. the exception case is when the endkey is
  // HConstants.EMPTY_END_ROW, signifying that the region we're
  // checking is actually the last region in the table.
  byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
  if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
      getRowComparator(tableName).compareRows(
          endKey, 0, endKey.length, row, 0, row.length) > 0) {
    if (metrics != null) metrics.incrMetaCacheHit();
    return possibleRegion;
  }

  // Passed all the way through, so we got nothing - complete cache miss
  if (metrics != null) metrics.incrMetaCacheMiss();
  return null;
}
{code}
The end-key check uses only the first location of possibleRegion, so for a 
row that belongs to any other region the check will probably fail and the 
lookup ends in a cache miss.
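
The mismatch can be reproduced outside HBase with a minimal sketch. It does not use the HBase classes: `SingleEntryCacheDemo`, `Region`, and the String row keys are simplified stand-ins invented for illustration (HBase uses byte[] keys and its own comparator), and the two methods only mirror the shape of the cacheLocation() and getCachedLocation() code quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Simplified model of the client-side cache; NOT the real HBase MetaCache. */
public class SingleEntryCacheDemo {
  /** Hypothetical region covering [startKey, endKey). */
  public static final class Region {
    public final String startKey;
    public final String endKey; // "" marks the last region, like HConstants.EMPTY_END_ROW

    public Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** Start key -> all locations that were cached under that key. */
  public static final ConcurrentSkipListMap<String, List<Region>> cache =
      new ConcurrentSkipListMap<>();

  /** Like the quoted cacheLocation(): the key comes from the first location only. */
  public static void cacheLocation(List<Region> locations) {
    cache.putIfAbsent(locations.get(0).startKey, locations);
  }

  /** Like the quoted getCachedLocation(): floor lookup, then end-key check. */
  public static Region getCachedLocation(String row) {
    Map.Entry<String, List<Region>> e = cache.floorEntry(row);
    if (e == null) {
      return null;
    }
    Region possible = e.getValue().get(0); // only the first location is inspected
    if (possible.endKey.isEmpty() || possible.endKey.compareTo(row) > 0) {
      return possible;
    }
    return null; // the row lies past the first region's end key: cache miss
  }

  public static void main(String[] args) {
    // Three regions of one table, all put into a single cache entry.
    cacheLocation(Arrays.asList(
        new Region("", "g"), new Region("g", "p"), new Region("p", "")));
    System.out.println(cache.size());                   // prints 1
    System.out.println(getCachedLocation("a") != null); // prints true  (first region)
    System.out.println(getCachedLocation("k") != null); // prints false (second region)
  }
}
```

In this model only rows of the first region hit the cache. Every other row misses, which matches the timeouts seen after the warm-up.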

So did I miss something, or am I wrong somewhere? If this is indeed a bug, I 
think it would not be hard to fix.
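
One possible direction, sketched here only as an assumption and not as an actual HBase patch, is to cache every region under its own start key instead of wrapping all of them in one entry. `PerRegionCacheSketch`, `Region`, and `cacheAll` are hypothetical names, and String keys again stand in for the byte[] row keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Simplified model of per-region caching; NOT the real HBase MetaCache. */
public class PerRegionCacheSketch {
  /** Hypothetical region covering [startKey, endKey). */
  public static final class Region {
    public final String startKey;
    public final String endKey; // "" marks the last region of the table

    public Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** Start key -> the single region that begins at that key. */
  public static final ConcurrentSkipListMap<String, Region> cache =
      new ConcurrentSkipListMap<>();

  /** Assumed fix: one cache entry per region, keyed by that region's start key. */
  public static void cacheAll(List<Region> regions) {
    for (Region region : regions) {
      cache.putIfAbsent(region.startKey, region);
    }
  }

  /** Same floor lookup and end-key check as in the quoted getCachedLocation(). */
  public static Region getCachedLocation(String row) {
    Map.Entry<String, Region> e = cache.floorEntry(row);
    if (e == null) {
      return null;
    }
    Region possible = e.getValue();
    if (possible.endKey.isEmpty() || possible.endKey.compareTo(row) > 0) {
      return possible;
    }
    return null;
  }

  public static void main(String[] args) {
    cacheAll(Arrays.asList(
        new Region("", "g"), new Region("g", "p"), new Region("p", "")));
    System.out.println(cache.size());                     // prints 3
    System.out.println(getCachedLocation("k").startKey);  // prints g
    System.out.println(getCachedLocation("z").startKey);  // prints p
  }
}
```

With one entry per region, the floor lookup lands on the region that actually contains the row, so the end-key check passes for every row in the table.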

I hope the committers and PMC can review this!

 

 


> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20697
>                 URL: https://issues.apache.org/jira/browse/HBASE-20697
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1, 1.2.6
>            Reporter: zhaoyuan
>            Assignee: zhaoyuan
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
