wchevreuil commented on code in PR #5371: URL: https://github.com/apache/hbase/pull/5371#discussion_r1321378222
########## hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java: ##########

@@ -1285,12 +1320,104 @@ void persistToFile() throws IOException {
       LOG.warn("Failed to commit cache persistent file. We might lose cached blocks if "
         + "RS crashes/restarts before we successfully checkpoint again.");
     }
+    LOG.debug("Saving current state of bucket cache index map took {}ms.",
+      EnvironmentEdgeManager.currentTime() - startTime);
+  }
+
+  private void recordTransaction(BlockCacheKey key, BucketEntry bucketEntry,
+    BucketCacheProtos.TransactionType type) {
+    if (persistencePath != null) {
+      File path = new File(persistencePath + "tx-" + System.nanoTime());
+      long startTime = EnvironmentEdgeManager.currentTime();
+      try (FileOutputStream fos = new FileOutputStream(path, false)) {
+        fos.write(ProtobufMagic.PB_MAGIC);
+        BucketProtoUtils.toPB(this, key, bucketEntry, type).writeDelimitedTo(fos);
+        txsCount.incrementAndGet();
+        fos.flush();
+      } catch (Exception e) {
+        LOG.error("Failed to record cache transaction {} for key {}. In the event of a crash, "
+          + "this key would require a re-cache.", type.name(), key, e);
+      }
+      LOG.debug("Cache transaction recording took {}ms",
+        EnvironmentEdgeManager.currentTime() - startTime);
+    }

Review Comment:
> we have this in the critical path of the reads, having it synchronous could delay the performance of the reads during both cases of eviction or addition blocks.

For additions, the recording always happens on the BucketCache.WriterThread, so it's async to the read. The same goes for evictions due to the cache being full, since those also happen in the WriterThread. Evictions are also triggered in exceptional scenarios on the read path, such as a compression mismatch or a checksum failure; in those cases, yes, there would be some overhead on the client operation.

> This will also generate a large amount of files as we are creating new files for every transaction.
Yes, and it adds some IO pressure that can impact the cache read/write performance, so it's better to place these files on a separate disk from the cache itself. The number of files is bounded by the periodic checkpoints. Ideally we should do something similar to the WAL writing approach, but that would over-complicate this PR, so it may be worth addressing in a separate ticket.
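The scheme under discussion, one file per transaction plus a periodic checkpoint that bounds the file count, can be sketched roughly as below. This is a minimal standalone illustration, not HBase code: `TxFileSketch`, `recordTransaction`, and `checkpoint` are made-up names, and plain strings stand in for the protobuf-serialized cache entries.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Sketch of the recovery-log scheme discussed above: every cache mutation is
 * written as its own "tx-<nanos>" file, and a periodic checkpoint (which
 * persists the full index) deletes the accumulated tx files, so the number of
 * files on disk stays bounded between checkpoints.
 */
public class TxFileSketch {
  private final Path dir;

  public TxFileSketch(Path dir) {
    this.dir = dir;
  }

  /** One file per transaction, named with a nanosecond timestamp. */
  public void recordTransaction(String key, String type) throws IOException {
    Path tx = dir.resolve("tx-" + System.nanoTime());
    Files.writeString(tx, type + ":" + key + System.lineSeparator());
  }

  /** Checkpoint: once the full state is persisted, drop all tx files. */
  public int checkpoint() throws IOException {
    try (Stream<Path> files = Files.list(dir)) {
      List<Path> txs = files
        .filter(p -> p.getFileName().toString().startsWith("tx-"))
        .collect(Collectors.toList());
      for (Path p : txs) {
        Files.delete(p);
      }
      return txs.size();
    }
  }
}
```

On recovery, any tx files found alongside the last checkpoint would be replayed on top of it; keys whose tx record failed to write would simply need a re-cache, matching the error handling in the diff.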
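The "WAL writing approach" mentioned above could look roughly like this hypothetical sketch (again not HBase code; `WalStyleTxLog`, `append`, and `replay` are invented names): length-delimited records appended to a single log file, avoiding the per-transaction file creation and open/close churn.

```java
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical WAL-style alternative: append all transactions to one file. */
public class WalStyleTxLog implements Closeable {
  private final DataOutputStream out;

  public WalStyleTxLog(Path file) throws IOException {
    this.out = new DataOutputStream(new BufferedOutputStream(
      Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)));
  }

  /** Length-delimited append; one flush per record, no per-record file churn. */
  public synchronized void append(byte[] record) throws IOException {
    out.writeInt(record.length);
    out.write(record);
    out.flush();
  }

  @Override
  public void close() throws IOException {
    out.close();
  }

  /** Replay on startup: read length-delimited records until EOF. */
  public static List<byte[]> replay(Path file) throws IOException {
    List<byte[]> records = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
      while (true) {
        int len;
        try {
          len = in.readInt();  // EOF here means a clean end of log
        } catch (EOFException eof) {
          break;
        }
        byte[] rec = new byte[len];
        in.readFully(rec);
        records.add(rec);
      }
    }
    return records;
  }
}
```

A production version would also need log rolling and truncation at each checkpoint, plus handling of a partially written tail record after a crash, which is exactly the extra complexity the comment suggests deferring to a separate ticket.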