wchevreuil commented on code in PR #5371:
URL: https://github.com/apache/hbase/pull/5371#discussion_r1321378222
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1285,12 +1320,104 @@ void persistToFile() throws IOException {
LOG.warn("Failed to commit cache persistent file. We might lose cached blocks if "
+ "RS crashes/restarts before we successfully checkpoint again.");
}
+ LOG.debug("Saving current state of bucket cache index map took {}ms.",
+ EnvironmentEdgeManager.currentTime() - startTime);
+ }
+
+ private void recordTransaction(BlockCacheKey key, BucketEntry bucketEntry,
+ BucketCacheProtos.TransactionType type) {
+ if (persistencePath != null) {
+ File path = new File(persistencePath + "tx-" + System.nanoTime());
+ long startTime = EnvironmentEdgeManager.currentTime();
+ try (FileOutputStream fos = new FileOutputStream(path, false)) {
+ fos.write(ProtobufMagic.PB_MAGIC);
+ BucketProtoUtils.toPB(this, key, bucketEntry, type).writeDelimitedTo(fos);
+ txsCount.incrementAndGet();
+ fos.flush();
+ } catch (Exception e) {
+ LOG.error("Failed to record cache transaction {} for key {}. In the event of a crash, "
+ + "this key would require a re-cache.", type.name(), key, e);
+ }
+ LOG.debug("Cache transaction recording took {}ms",
+ EnvironmentEdgeManager.currentTime() - startTime);
+ }
Review Comment:
> we have this in the critical path of the reads, having it synchronous could delay the performance of the reads during both cases of eviction or addition blocks.
For additions, the recording always happens on the BucketCache.WriterThread, so it's async to the read. The same goes for evictions due to the cache being full, since those also happen in the WriterThread. Evictions are also triggered in exceptional scenarios on the read path, such as a compression mismatch or a checksum failure; in those cases, yes, there would be some overhead on the client operation.
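
To illustrate the threading point above, here is a minimal, self-contained sketch (not HBase code; the class and method names are hypothetical) of why additions don't delay reads: the read path only enqueues the block, while a dedicated writer thread drains the queue and performs the slow transaction recording off the read path.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the WriterThread pattern: readers enqueue cheaply,
// a single daemon thread does the expensive recording asynchronously.
public class AsyncWriterSketch {
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private final AtomicLong recorded = new AtomicLong();

  public AsyncWriterSketch() {
    Thread writerThread = new Thread(() -> {
      try {
        while (true) {
          String key = queue.take();  // drain entries enqueued by readers
          recordTransaction(key);     // slow file IO would happen here, off the read path
        }
      } catch (InterruptedException ignored) {
        // shut down quietly in this sketch
      }
    });
    writerThread.setDaemon(true);
    writerThread.start();
  }

  // Called from the read path: a cheap, non-blocking enqueue.
  public void cacheBlock(String key) {
    queue.offer(key);
  }

  // Stand-in for the per-transaction file write in the PR.
  private void recordTransaction(String key) {
    recorded.incrementAndGet();
  }

  public long recordedCount() {
    return recorded.get();
  }
}
```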
> This will also generate a large amount of files as we are creating new files for every transaction.
Yes, and it adds some IO pressure that can impact cache read/write performance, so it's better to place these files on a separate disk from the cache itself. The number of files is bounded by the periodic checkpoints. Ideally we should do something similar to the WAL writing approach, but that would overcomplicate this PR, so it may be worth addressing in a separate ticket.
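
To make the WAL-like alternative concrete, here is a minimal sketch (all names are hypothetical, not HBase or this PR's API): instead of one file per transaction, every mutation is appended, length-prefixed, to a single open log file, and the periodic checkpoint rotates to a fresh file and deletes the obsolete one, which bounds the file count to roughly one.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical append-only transaction log with checkpoint truncation.
public class TxLog {
  private final File dir;
  private File current;
  private DataOutputStream out;
  private long txCount = 0;

  public TxLog(File dir) throws IOException {
    this.dir = dir;
    rotate();
  }

  // Append one serialized transaction record (length-prefixed) to the open log.
  public synchronized void append(byte[] record) throws IOException {
    out.writeInt(record.length);
    out.write(record);
    out.flush(); // a real implementation would batch/group-commit syncs
    txCount++;
  }

  // Once the full index map has been persisted, the old log is obsolete:
  // start a fresh file and delete the previous one.
  public synchronized void checkpoint() throws IOException {
    File old = current;
    rotate();
    old.delete();
    txCount = 0;
  }

  private void rotate() throws IOException {
    if (out != null) {
      out.close();
    }
    current = new File(dir, "txlog-" + System.nanoTime());
    out = new DataOutputStream(new FileOutputStream(current, true));
  }

  public synchronized long txCount() {
    return txCount;
  }

  public static void main(String[] args) throws IOException {
    File dir = new File(System.getProperty("java.io.tmpdir"), "txlog-demo-" + System.nanoTime());
    dir.mkdirs();
    TxLog log = new TxLog(dir);
    log.append("ADD key1".getBytes(StandardCharsets.UTF_8));
    log.append("EVICT key1".getBytes(StandardCharsets.UTF_8));
    System.out.println("txs=" + log.txCount());
    log.checkpoint();
    System.out.println("after-checkpoint=" + log.txCount());
  }
}
```

The trade-off versus the per-transaction-file approach in the PR is that recovery must replay one log sequentially rather than scan a directory, but steady-state IO becomes a single sequential append stream.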
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]