[GitHub] [hbase] jacob-leblanc commented on a change in pull request #707: HBASE-23066 Allow cache on write during compactions when prefetching …

GitBox Tue, 15 Oct 2019 19:15:54 -0700

jacob-leblanc commented on a change in pull request #707: HBASE-23066 Allow cache on write during compactions when prefetching …
URL: https://github.com/apache/hbase/pull/707#discussion_r335251644


 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
 ##########
 @@ -319,6 +337,13 @@ public boolean shouldPrefetchOnOpen() {
     return this.prefetchOnOpen;
   }
 
+  /**
+   * @return true if blocks should be cached while writing during compaction, false if not
+   */
+  public boolean shouldCacheCompactedBlocksOnWrite() {
+    return this.prefetchCompactedDataOnWrite && this.prefetchOnOpen;
 
 Review comment:
   Thanks for looking at this. My understanding is that when prefetch is 
enabled, the new file is read into the cache after compaction completes anyway, 
so the cache size requirements are the same with this new setting enabled. That 
is why I limited cache-on-write to cases where prefetching is enabled: it is 
simply a more efficient way to load the cache, populating it while we write the 
data out rather than reading it back after compaction is done, which I've found 
to be very expensive when the data is in S3.
   
   As far as the name goes, I struggled to come up with something intuitive - 
how do I explain in the name alone that this only applies when prefetching is 
on? I tried to convey "when prefetching, do the prefetch of compacted data on 
write." I'm not in love with the name and I'm open to suggestions. I didn't 
want to give the false impression that all compacted data is going to be cached 
on write. Maybe "cacheCompactedDataOnWriteIfPrefetching"? Is that too wordy?
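
   To make the gating concrete, here is a minimal standalone sketch of the 
condition in the diff above. The class `CacheConfigSketch` and its constructor 
are hypothetical stand-ins for illustration, not the real `CacheConfig`; only 
the two field names and the method names are taken from the diff.

```java
// Hypothetical stand-in for illustration only; this is NOT HBase's CacheConfig.
final class CacheConfigSketch {
  private final boolean prefetchOnOpen;
  private final boolean prefetchCompactedDataOnWrite;

  // Assumed constructor: the real class reads these values from configuration.
  CacheConfigSketch(boolean prefetchOnOpen, boolean prefetchCompactedDataOnWrite) {
    this.prefetchOnOpen = prefetchOnOpen;
    this.prefetchCompactedDataOnWrite = prefetchCompactedDataOnWrite;
  }

  boolean shouldPrefetchOnOpen() {
    return this.prefetchOnOpen;
  }

  // Same condition as the diff: the new flag has no effect unless prefetching is on.
  boolean shouldCacheCompactedBlocksOnWrite() {
    return this.prefetchCompactedDataOnWrite && this.prefetchOnOpen;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    // Print the result for every combination of the two flags.
    for (boolean prefetch : values) {
      for (boolean onWrite : values) {
        CacheConfigSketch conf = new CacheConfigSketch(prefetch, onWrite);
        System.out.println("prefetchOnOpen=" + prefetch
            + " prefetchCompactedDataOnWrite=" + onWrite
            + " -> cacheOnWrite=" + conf.shouldCacheCompactedBlocksOnWrite());
      }
    }
  }
}
```

   Only the combination with both flags set caches compacted blocks on write, 
which is the "only if prefetching" behavior the name is trying to convey.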

