wchevreuil commented on code in PR #5341:
URL: https://github.com/apache/hbase/pull/5341#discussion_r1294865802


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1358,16 +1365,37 @@ private void verifyCapacityAndClasses(long capacitySize, String ioclass, String
   }
 
  private void parsePB(BucketCacheProtos.BucketCacheEntry proto) throws IOException {
+    backingMap = BucketProtoUtils.fromPB(proto.getDeserializersMap(), proto.getBackingMap(),
+      this::createRecycler);
+    prefetchCompleted.clear();
+    prefetchCompleted.putAll(proto.getPrefetchedFilesMap());
     if (proto.hasChecksum()) {
-      ((PersistentIOEngine) ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
-        algorithm);
+      try {
+        ((PersistentIOEngine) ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(),
+          algorithm);
+      } catch (IOException e) {
+        LOG.warn("Checksum for cache file failed. "
+          + "We need to validate each cache key in the backing map. This may take some time...");
+        long startTime = EnvironmentEdgeManager.currentTime();
+        int totalKeysOriginally = backingMap.size();
+        for (Map.Entry<BlockCacheKey, BucketEntry> keyEntry : backingMap.entrySet()) {
+          try {
+            ((FileIOEngine) ioEngine).checkCacheTime(keyEntry.getValue());

Review Comment:
   On RS-induced aborts, we sync the backing map when shutting down, so there is no need to validate the blocks when restarting and reloading the cache. I've done a few tests where I induced an RS abort after HDFS/ZK errors during a write load that updates the cache; on restart this validation wasn't needed (the backing map was consistent with the cache state).
   
   So I think the validation would only happen in cases where the RS crashes abruptly, e.g. from a JVM fatal error or an OS SIGKILL. For the record, I tested crashing an RS with a 1.6TB cache, and the validation then took around 35 minutes to complete. That is not great, since the RS is useless during this time. One thought here is to allow bucket cache initialisation in the background, so that meanwhile reads would go to the FS, but I would rather work on that in a separate jira/PR.
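
   To illustrate the background-initialisation idea, here is a minimal standalone sketch (not HBase code): `Engine`, `validateAsync`, and the string-keyed map are hypothetical stand-ins for `FileIOEngine.checkCacheTime` and the real `backingMap`. The point is that per-entry validation can run off the startup path, evicting invalid entries, while reads fall through to the FS until it finishes.

   ```java
   import java.util.Map;
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.ConcurrentHashMap;

   public class BackgroundCacheValidation {
     // Hypothetical stand-in for the per-entry check (e.g. FileIOEngine.checkCacheTime).
     interface Engine {
       boolean isValid(int entry);
     }

     // Validate every backing-map entry asynchronously; entries that fail the
     // check are evicted, so subsequent reads for them miss and go to the FS.
     static CompletableFuture<Integer> validateAsync(Map<String, Integer> backingMap,
         Engine engine) {
       return CompletableFuture.supplyAsync(() -> {
         int evicted = 0;
         for (Map.Entry<String, Integer> e : backingMap.entrySet()) {
           if (!engine.isValid(e.getValue())) {
             // ConcurrentHashMap iterators are weakly consistent, so removal
             // during iteration is safe here.
             backingMap.remove(e.getKey());
             evicted++;
           }
         }
         return evicted;
       });
     }

     public static void main(String[] args) {
       Map<String, Integer> map = new ConcurrentHashMap<>();
       map.put("block-0", 0);
       map.put("block-1", 1);
       map.put("block-2", 2);
       // Pretend odd "offsets" fail validation.
       int evicted = validateAsync(map, entry -> entry % 2 == 0).join();
       System.out.println("evicted=" + evicted + " remaining=" + map.size());
     }
   }
   ```

   The map must be a concurrent one for this to work, since readers and the validator touch it simultaneously; in the real code the same property would be needed of `backingMap` for as long as validation is still running.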



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
