wchevreuil commented on code in PR #5341:
URL: https://github.com/apache/hbase/pull/5341#discussion_r1290648257
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/PersistentIOEngine.java:
##########
@@ -70,6 +70,7 @@ protected byte[] calculateChecksum(String algorithm) {
sb.append(getFileSize(filePath));
sb.append(file.lastModified());
}
+ LOG.debug("Checksum for persistence cache: {}", sb);
Review Comment:
indeed
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1358,16 +1365,37 @@ private void verifyCapacityAndClasses(long capacitySize, String ioclass, String
 }
 private void parsePB(BucketCacheProtos.BucketCacheEntry proto) throws IOException {
+  backingMap = BucketProtoUtils.fromPB(proto.getDeserializersMap(), proto.getBackingMap(),
+    this::createRecycler);
+  prefetchCompleted.clear();
+  prefetchCompleted.putAll(proto.getPrefetchedFilesMap());
   if (proto.hasChecksum()) {
-    ((PersistentIOEngine) ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
-      algorithm);
+    try {
+      ((PersistentIOEngine) ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
+        algorithm);
+    } catch (IOException e) {
+      LOG.warn("Checksum for cache file failed. "
+        + "We need to validate each cache key in the backing map. This may take some time...");
+      long startTime = EnvironmentEdgeManager.currentTime();
+      int totalKeysOriginally = backingMap.size();
+      for (Map.Entry<BlockCacheKey, BucketEntry> keyEntry : backingMap.entrySet()) {
+        try {
+          ((FileIOEngine) ioEngine).checkCacheTime(keyEntry.getValue());
Review Comment:
I have thought about it, as performance is not optimal (in my tests, a 190GB
cache with 2M blocks took around 105s to complete validation). The problem is
that we have BucketAllocator, which needs to configure its indexes to buckets
and offsets for the cache file according to the offset and length of each block
in the backing map. If we don't do this validation, the backing map might be
inconsistent, with more than one block for the same offset; then the bucket
allocator initialisation will fail and we'll not be able to access the blocks
in the cache.
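The concern above can be sketched in isolation. This is a hedged, hypothetical
example (the class `OffsetConsistencyCheck`, the `CacheEntry` record, and the
method `findConflictingKeys` are illustrative only, not part of HBase's API):
it shows how two backing-map entries claiming the same file offset are exactly
the kind of inconsistency a bucket-allocator rebuild cannot tolerate, which is
why each key must be validated when the whole-file checksum fails.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OffsetConsistencyCheck {
  // Simplified stand-in for a BucketEntry: just an offset and length in the cache file.
  record CacheEntry(long offset, int length) {}

  // Return every key involved in an offset collision. In a real rebuild,
  // conflicting keys would have to be validated or dropped before the
  // allocator can derive a consistent bucket/offset layout.
  static Set<String> findConflictingKeys(Map<String, CacheEntry> backingMap) {
    Map<Long, String> seenOffsets = new HashMap<>();
    Set<String> conflicts = new HashSet<>();
    for (Map.Entry<String, CacheEntry> e : backingMap.entrySet()) {
      String prev = seenOffsets.putIfAbsent(e.getValue().offset(), e.getKey());
      if (prev != null) { // another key already claims this offset
        conflicts.add(e.getKey());
        conflicts.add(prev);
      }
    }
    return conflicts;
  }

  public static void main(String[] args) {
    Map<String, CacheEntry> backingMap = new HashMap<>();
    backingMap.put("blockA", new CacheEntry(0L, 65536));
    backingMap.put("blockB", new CacheEntry(65536L, 65536));
    backingMap.put("blockC", new CacheEntry(0L, 32768)); // collides with blockA
    System.out.println(findConflictingKeys(backingMap)); // contains blockA and blockC
  }
}
```

With no collisions the returned set is empty and the layout can be rebuilt
directly; any non-empty result means the persisted map and the cache file have
diverged.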
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]