ankitsinghal commented on code in PR #5341:
URL: https://github.com/apache/hbase/pull/5341#discussion_r1292944551
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1358,16 +1365,37 @@ private void verifyCapacityAndClasses(long
capacitySize, String ioclass, String
}
private void parsePB(BucketCacheProtos.BucketCacheEntry proto) throws
IOException {
+ backingMap = BucketProtoUtils.fromPB(proto.getDeserializersMap(),
proto.getBackingMap(),
+ this::createRecycler);
+ prefetchCompleted.clear();
+ prefetchCompleted.putAll(proto.getPrefetchedFilesMap());
if (proto.hasChecksum()) {
- ((PersistentIOEngine)
ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
- algorithm);
+ try {
+ ((PersistentIOEngine)
ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
+ algorithm);
+ } catch (IOException e) {
+ LOG.warn("Checksum for cache file failed. "
+ + "We need to validate each cache key in the backing map. This may
take some time...");
+ long startTime = EnvironmentEdgeManager.currentTime();
+ int totalKeysOriginally = backingMap.size();
+ for (Map.Entry<BlockCacheKey, BucketEntry> keyEntry :
backingMap.entrySet()) {
+ try {
+ ((FileIOEngine) ioEngine).checkCacheTime(keyEntry.getValue());
+ } catch (IOException e1) {
+ LOG.debug("Check for key {} failed. Removing it from map.",
keyEntry.getKey());
+ backingMap.remove(keyEntry.getKey());
+ prefetchCompleted.remove(keyEntry.getKey().getHfileName());
Review Comment:
In today's current implementation, the performance will be significantly
compromised if we rerun the prefetch even if 99% of blocks are cached, as
during the prefetch, if the block is present in the cache. we retrieve the
entire data from the bucket cache, proceed to uncompress it, and then promptly
discard it.
Options are :
1. Fix the prefetch to avoid loading the block from the file cache if we
find it in backingMap.
2. Or, Hfile from prefetchlist can be discarded on the basis of percentage ,
If the number of missing blocks falls below a certain threshold (x%), the
prefetch process would be triggered again. Users also have the flexibility to
set this threshold to 100% if they wish to rerun the prefetch when even a
single block is absent.
I would prefer option-1 if possible.
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1358,16 +1365,37 @@ private void verifyCapacityAndClasses(long
capacitySize, String ioclass, String
}
private void parsePB(BucketCacheProtos.BucketCacheEntry proto) throws
IOException {
+ backingMap = BucketProtoUtils.fromPB(proto.getDeserializersMap(),
proto.getBackingMap(),
+ this::createRecycler);
+ prefetchCompleted.clear();
+ prefetchCompleted.putAll(proto.getPrefetchedFilesMap());
if (proto.hasChecksum()) {
- ((PersistentIOEngine)
ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
- algorithm);
+ try {
+ ((PersistentIOEngine)
ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
+ algorithm);
+ } catch (IOException e) {
+ LOG.warn("Checksum for cache file failed. "
+ + "We need to validate each cache key in the backing map. This may
take some time...");
+ long startTime = EnvironmentEdgeManager.currentTime();
+ int totalKeysOriginally = backingMap.size();
+ for (Map.Entry<BlockCacheKey, BucketEntry> keyEntry :
backingMap.entrySet()) {
+ try {
+ ((FileIOEngine) ioEngine).checkCacheTime(keyEntry.getValue());
Review Comment:
This will regress with size, can we list down the scenarios in which we can
go inconsistent like system error or something?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]