wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##########
@@ -694,7 +700,7 @@ private void
refreshStoreFilesInternal(Collection<StoreFileInfo> newFiles) throw
refreshStoreSizeAndTotalBytes();
}
- protected HStoreFile createStoreFileAndReader(final Path p) throws
IOException {
+ public HStoreFile createStoreFileAndReader(final Path p) throws IOException {
Review comment:
> Although, I also see that HStore#openStoreFiles is using one instance
of createStoreFileAndReader, so we couldn't completely move this out of HStore.
Is it better to lift one method into Store and leave other instances "internal"
to HStore? That might help the smell a little, but an obvious solution isn't
jumping out at me.
Yes, there are many other usages, pretty much everything that needs access
to hfiles would end up accessing this method (indirectly).
In terms of responsibility, it does seem right to me to have HStore the
central point for store file accesses. Moving it to Store would still require
some code duplication there, or exposing the two methods, which is what we are
trying to avoid here.
Maybe a better solution is to define a lambda function in
Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader
when calling Compactor.commitCompaction().
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##########
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite)
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
- public void enableCacheOnWrite() {
+ public void enableCacheOnWriteForCompactions() {
this.cacheDataOnWrite = true;
this.cacheIndexesOnWrite = true;
this.cacheBloomsOnWrite = true;
}
+ /**
+ * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+ * 'totalCompactedFilesSize' is lower than
'cacheCompactedDataOnWriteThreshold',
+ * enables cache on write for below properties:
+ * - cacheDataOnWrite
+ * - cacheIndexesOnWrite
+ * - cacheBloomsOnWrite
+ *
+ * Otherwise, sets 'cacheDataOnWrite' only to false.
+ *
+ * @param totalCompactedFilesSize the total size of compacted files.
+ * @return true if the checks mentioned above pass and the cache is enabled,
false otherwise.
+ */
+ public boolean enableCacheOnWriteForCompactions(long
totalCompactedFilesSize) {
Review comment:
The javadoc had already been updated to mention the
cachecompactedblocksonwrite config property.
>Can always leave this old name as "deprecated" which calls a better-named
version, if you think it would help alleviate confusion.
This was not a method move/rename, just a logic that was previously embedded
in `HStore.createWriterInTmp` that I decided to move here in its own method for
reuse. The naming seems right for me, as it should only be called for
compactions, not for other types of writes. I add this to javadoc.
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##########
@@ -278,6 +280,21 @@ protected final StoreFileWriter
createTmpWriter(FileDetails fd, boolean shouldDr
HConstants.EMPTY_STRING);
}
+ /**
+ * Default method for initializing a StoreFileWriter in the compaction
process, this creates the
Review comment:
>Is creating them in the temp directory really "default"? Or is it
"default" just because the direct-insert stuff isn't available yet? ;)
It is the default now, if we don't explicitly specify
`hbase.hstore.defaultengine.compactor.class` config property to set
`DirectStoreCompactor` implementation to be used, the previous existing
`DefaultCompactor` will be loaded, which implements this logic.
> When you say other compactors may not use it, it makes me wonder if it
makes sense to provide it here.
What I meant is that other compactors may overwrite the method with a
different logic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]