[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r674801406 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -558,18 +559,19 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, * @param cr the compaction request. * @param newFiles the new files created by this compaction under a temp dir. * @param user the running user. + * @param fileAcessor a lambda expression with logic for loading a HStoreFile given a Path. * @return A list of the resulting store files already placed in the store dir and loaded into the * store cache. * @throws IOException if the commit fails. */ - public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) - throws IOException { + public List commitCompaction(CompactionRequestImpl cr, List newFiles, + User user, Function fileAcessor) throws IOException { Review comment: Definitely, and can make its method throw IOException too, leaving the HStore code cleaner. Had just pushed a new commit with it, for now, created as public inner class of Compactor itself, as it's strictly related to Compactor. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r674797181 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java ## @@ -1930,4 +1959,21 @@ public boolean add(T e) { @Override public List subList(int fromIndex, int toIndex) {return delegatee.subList(fromIndex, toIndex);} } + + private static class DummyCompactor extends DefaultCompactor { + +static CountDownLatch countDownLatch; + +public DummyCompactor(Configuration conf, HStore store) { Review comment: It actually matters, Compactor implementations are loaded via reflection, in this case via DefaultStoreEngine.createCompactor, where it expects a constructor to have exactly these two params. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r674791718 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java ## @@ -0,0 +1,123 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.regionserver.compactions; + +import static junit.framework.TestCase.assertEquals; +import static org.mockito.Mockito.mock; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.HBaseClassTestRule; +import org.apache.hadoop.hbase.HBaseTestingUtility; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.client.Put; +import org.apache.hadoop.hbase.client.Table; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.hadoop.hbase.testclassification.MediumTests; +import org.apache.hadoop.hbase.testclassification.RegionServerTests; +import org.apache.hadoop.hbase.util.Bytes; +import org.junit.After; +import org.junit.Before; +import org.junit.ClassRule; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.rules.TestName; + +/** + * Test class for DefaultCompactor. + */ +@Category({ RegionServerTests.class, MediumTests.class }) +public class TestDefaultCompactor { + + @ClassRule + public static final HBaseClassTestRule CLASS_RULE = +HBaseClassTestRule.forClass(TestDefaultCompactor.class); + + @Rule + public TestName name = new TestName(); + + private final Configuration config = new Configuration(); + private HStore store; + private final String cfName = "cf"; + private Compactor.FileDetails mockFileDetails; + + private HBaseTestingUtility UTIL = new HBaseTestingUtility(); + private TableName table; + + @Before + public void setup() throws Exception { Review comment: I am doing an individual TableName instance per test method, so setup has to run on each test initialisation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r674682958 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user. + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException if the commit fails. + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); + // Move the file into the right spot + HStoreFile sf = createFileInStoreDir(newFile); + if (this.store.getCoprocessorHost() != null) { +this.store.getCoprocessorHost().postCompact(this.store, sf, cr.getTracker(), cr, user); + } + assert sf != null; + sfs.add(sf); +} +return sfs; + } + + /** + * Assumes new file was created initially on a temp folder. Review comment: I honestly had not looked further into these two. Am only worried to not break current logic used by those. Was planning to take care of those later on another PR, if we are to make it compatible with direct store/renameless writing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r673471477 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user. + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException if the commit fails. + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); + // Move the file into the right spot + HStoreFile sf = createFileInStoreDir(newFile); + if (this.store.getCoprocessorHost() != null) { +this.store.getCoprocessorHost().postCompact(this.store, sf, cr.getTracker(), cr, user); + } + assert sf != null; + sfs.add(sf); +} +return sfs; + } + + /** + * Assumes new file was created initially on a temp folder. Review comment: Actually, when I tried to move it out of `Compactor` by making it abstract, it immediately broke `StripeCompactor` and `DateTieredCompactor`, which are relying in this logic, but not extending `DefaultCompactor`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r673437645 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user. + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException if the commit fails. + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); + // Move the file into the right spot + HStoreFile sf = createFileInStoreDir(newFile); + if (this.store.getCoprocessorHost() != null) { +this.store.getCoprocessorHost().postCompact(this.store, sf, cr.getTracker(), cr, user); + } + assert sf != null; + sfs.add(sf); +} +return sfs; + } + + /** + * Assumes new file was created initially on a temp folder. Review comment: Originally, the logic of creating files in temp dir was already implemented in the abstract `Compactor`, not on `DefaultCompactor`. That's why I left it here. I can leave this method abstract here and move the implementation to `DefaultCompactor`, though. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r673229935 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java ## @@ -0,0 +1,123 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.regionserver.compactions; + +import static junit.framework.TestCase.assertEquals; +import static org.mockito.Mockito.mock; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.HBaseClassTestRule; +import org.apache.hadoop.hbase.HBaseTestingUtility; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.client.Put; +import org.apache.hadoop.hbase.client.Table; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.hadoop.hbase.testclassification.MediumTests; +import org.apache.hadoop.hbase.testclassification.RegionServerTests; +import org.apache.hadoop.hbase.util.Bytes; +import org.junit.After; +import org.junit.Before; +import org.junit.ClassRule; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.rules.TestName; + +/** + * Test class for DefaultCompactor. + */ +@Category({ RegionServerTests.class, MediumTests.class }) +public class TestDefaultCompactor { + + @ClassRule + public static final HBaseClassTestRule CLASS_RULE = +HBaseClassTestRule.forClass(TestDefaultCompactor.class); + + @Rule + public TestName name = new TestName(); + + private final Configuration config = new Configuration(); + private HStore store; + private final String cfName = "cf"; + private Compactor.FileDetails mockFileDetails; + + private HBaseTestingUtility UTIL = new HBaseTestingUtility(); + private TableName table; + + @Before + public void setup() throws Exception { Review comment: You mean, as BeforeClass and reuse cluster between tests? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r673228785 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDirectStoreCompactor.java ## @@ -0,0 +1,135 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.regionserver.compactions; + +import static junit.framework.TestCase.assertEquals; +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.ArgumentMatchers.anyBoolean; +import static org.mockito.ArgumentMatchers.isNull; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.permission.FsPermission; +import org.apache.hadoop.hbase.CellComparator; +import org.apache.hadoop.hbase.HBaseClassTestRule; +import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor; +import org.apache.hadoop.hbase.client.RegionInfo; +import org.apache.hadoop.hbase.io.ByteBuffAllocator; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.crypto.Encryption; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.BloomType; +import org.apache.hadoop.hbase.regionserver.HRegion; +import org.apache.hadoop.hbase.regionserver.HRegionFileSystem; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreContext; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.hadoop.hbase.testclassification.MediumTests; +import org.apache.hadoop.hbase.testclassification.RegionServerTests; +import org.apache.hadoop.hbase.util.ChecksumType; +import org.junit.Before; +import org.junit.ClassRule; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.rules.TestName; +import org.mockito.ArgumentCaptor; + +/** + * Test class for DirectStoreCompactor. + */ +@Category({ RegionServerTests.class, MediumTests.class }) +public class TestDirectStoreCompactor { Review comment: I normally favour using mocks as it makes tests more lightweight to run. I agree that the amount of mocking here is just too much, hard to understand/maintain and also may be not really testing functionality as a whole. Let change this to use HBaseTestingUtility, instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672923935 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: The javadoc had already been updated to mention the cachecompactedblocksonwrite config property. >Can always leave this old name as "deprecated" which calls a better-named version, if you think it would help alleviate confusion. This was not a method move/rename, just a logic that was previously embedded in `HStore.createWriterInTmp` that I decided to move here in its own method for reuse. The naming seems right for me, as it should only be called for compactions, not for other types of writes. I will add this to javadoc. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -694,7 +700,7 @@ private void refreshStoreFilesInternal(Collection newFiles) throw refreshStoreSizeAndTotalBytes(); } - protected HStoreFile createStoreFileAndReader(final Path p) throws IOException { + public HStoreFile createStoreFileAndReader(final Path p) throws IOException { Review comment: > Although, I also see that HStore#openStoreFiles is using one instance of createStoreFileAndReader, so we couldn't completely move this out of HStore. Is it better to lift one method into Store and leave other instances "internal" to HStore? That might help the smell a little, but an obvious solution isn't jumping out at me. Yes, there are many other usages, pretty much everything that needs access to hfiles would end up accessing this method (indirectly). In terms of responsibility, it does seem right to me to have HStore the central point for store file accesses. Moving it to Store would still require some code duplication there, or exposing the two methods, which is what we are trying to avoid here. Maybe a better solution is to define a lambda function in Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader when calling Compactor.commitCompaction(). ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: The javadoc had already been updated to mention the cachecompactedblocksonwrite config property. >Can always leave this old name as "deprecated" which calls a better-named version, if you think it would help alleviate confusion. This was not a method move/rename, just a logic that was previously embedded in `HStore.createWriterInTmp` that I decided to move here in its own method for reuse. The naming seems right for me, as it should only be called for compactions, not for other types of writes. I add this to javadoc. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -278,6 +280,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr HConstants.EMPTY_STRING); } + /** + * Default method for initializing a StoreFileWriter in the compaction process, this creates the Review comment: >Is creating them in the temp directory really "default"? Or is it "default" just because the direct-insert stuff isn't available yet? ;) It is the default now, if we don't explicitly specify `hbase.hstore.defaultengine.compactor.class` config property to set `DirectStoreCompactor` implementation to be used, the previous existing `DefaultCompactor` will be loaded, which implements this logic. > When you say other compactors may not use it, it makes me wonder if it makes sense to provide it here. What I meant is that other compactors may overwrite the method with a different logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -694,7 +700,7 @@ private void refreshStoreFilesInternal(Collection newFiles) throw refreshStoreSizeAndTotalBytes(); } - protected HStoreFile createStoreFileAndReader(final Path p) throws IOException { + public HStoreFile createStoreFileAndReader(final Path p) throws IOException { Review comment: > Although, I also see that HStore#openStoreFiles is using one instance of createStoreFileAndReader, so we couldn't completely move this out of HStore. Is it better to lift one method into Store and leave other instances "internal" to HStore? That might help the smell a little, but an obvious solution isn't jumping out at me. Yes, there are many other usages, pretty much everything that needs access to hfiles would end up accessing this method (indirectly). In terms of responsibility, it does seem right to me to have HStore the central point for store file accesses. Moving it to Store would still require some code duplication there, or exposing the two methods, which is what we are trying to avoid here. Maybe a better solution is to define a lambda function in Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader when calling Compactor.commitCompaction(). ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: The javadoc had already been updated to mention the cachecompactedblocksonwrite config property. >Can always leave this old name as "deprecated" which calls a better-named version, if you think it would help alleviate confusion. This was not a method move/rename, just a logic that was previously embedded in `HStore.createWriterInTmp` that I decided to move here in its own method for reuse. The naming seems right for me, as it should only be called for compactions, not for other types of writes. I add this to javadoc. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -278,6 +280,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr HConstants.EMPTY_STRING); } + /** + * Default method for initializing a StoreFileWriter in the compaction process, this creates the Review comment: >Is creating them in the temp directory really "default"? Or is it "default" just because the direct-insert stuff isn't available yet? ;) It is the default now, if we don't explicitly specify `hbase.hstore.defaultengine.compactor.class` config property to set `DirectStoreCompactor` implementation to be used, the previous existing `DefaultCompactor` will be loaded, which implements this logic. > When you say other compactors may not use it, it makes me wonder if it makes sense to provide it here. What I meant is that other compactors may overwrite the method with a different logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672932031 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -278,6 +280,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr HConstants.EMPTY_STRING); } + /** + * Default method for initializing a StoreFileWriter in the compaction process, this creates the Review comment: >Is creating them in the temp directory really "default"? Or is it "default" just because the direct-insert stuff isn't available yet? ;) It is the default now, if we don't explicitly specify `hbase.hstore.defaultengine.compactor.class` config property to set `DirectStoreCompactor` implementation to be used, the previous existing `DefaultCompactor` will be loaded, which implements this logic. > When you say other compactors may not use it, it makes me wonder if it makes sense to provide it here. What I meant is that other compactors may overwrite the method with a different logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672923935 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: The javadoc had already been updated to mention the cachecompactedblocksonwrite config property. >Can always leave this old name as "deprecated" which calls a better-named version, if you think it would help alleviate confusion. This was not a method move/rename, just a logic that was previously embedded in `HStore.createWriterInTmp` that I decided to move here in its own method for reuse. The naming seems right for me, as it should only be called for compactions, not for other types of writes. I add this to javadoc. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -694,7 +700,7 @@ private void refreshStoreFilesInternal(Collection newFiles) throw refreshStoreSizeAndTotalBytes(); } - protected HStoreFile createStoreFileAndReader(final Path p) throws IOException { + public HStoreFile createStoreFileAndReader(final Path p) throws IOException { Review comment: > Although, I also see that HStore#openStoreFiles is using one instance of createStoreFileAndReader, so we couldn't completely move this out of HStore. Is it better to lift one method into Store and leave other instances "internal" to HStore? That might help the smell a little, but an obvious solution isn't jumping out at me. Yes, there are many other usages, pretty much everything that needs access to hfiles would end up accessing this method (indirectly). In terms of responsibility, it does seem right to me to have HStore the central point for store file accesses. Moving it to Store would still require some code duplication there, or exposing the two methods, which is what we are trying to avoid here. Maybe a better solution is to define a lambda function in Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader when calling Compactor.commitCompaction(). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663780661 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: Yeah, this logic was moved here from `HStore.createWriterInTmp` (as you noticed later on your following comment). Because now we override the writer creation in DirectStoreCompactor to not use this `HStore.createWriterInTmp`, we would need to duplicate this logic there, but since `CacheConfig` already has all the info required to decide on this, thought about moving it to `CacheConfig`. I reckon naming and cohesion is not been great, but I was just trying to concentrate on the original problem here of allowing compacting files getting created in the store directory, without the need for tmp dirs and renames. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: BTW, let me fix the javadoc description. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectStoreFlushContext.java ## @@ -30,14 +30,14 @@ * directly in the store dir, and therefore, doesn't perform a rename from tmp dir * into the store dir. * - * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. + * To be used only when DirectStoreFlushContext is configured as the StoreEngine implementation. Review comment: Thanks! ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1145,16 +1151,13 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm // if data blocks are to be cached on write // during compaction, we should forcefully // cache index and bloom blocks as well - if (cacheCompactedBlocksOnWrite && totalCompactedFilesSize <= cacheConf -.getCacheCompactedBlocksOnWriteThreshold()) { -writerCacheConf.enableCacheOnWrite(); + if (writerCacheConf.enableCacheOnWriteForCompactions(totalCompactedFilesSize)) { Review comment: Yep, see my reply above. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1167,7 +1170,7 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm } else { final boolean shouldCacheDataOnWrite = cacheConf.shouldCacheDataOnWrite(); if (shouldCacheDataOnWrite) { -writerCacheConf.enableCacheOnWrite(); +writerCacheConf.enableCacheOnWriteForCompactions(); Review comment: Yeah, should be only `enableCacheOnWrite`, as here we are calling it for other writes than compactions. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { *
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663979746 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java ## @@ -0,0 +1,85 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectStoreCompactor extends DefaultCompactor { + public DirectStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, + shouldDropBehind, fd.totalCompactedFilesSize); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, Review comment: Will do. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663979457 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java ## @@ -0,0 +1,85 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectStoreCompactor extends DefaultCompactor { Review comment: Will do. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663974669 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user. + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException if the commit fails. + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); + // Move the file into the right spot + HStoreFile sf = createFileInStoreDir(newFile); + if (this.store.getCoprocessorHost() != null) { +this.store.getCoprocessorHost().postCompact(this.store, sf, cr.getTracker(), cr, user); + } + assert sf != null; + sfs.add(sf); +} +return sfs; + } + + /** + * Assumes new file was created initially on a temp folder. Review comment: We still need to vary the logic of "where" the store file is originally created. Default behaviour is to create it under temp (that's done by the call from lines #591/#592), whilst `DirectStoreCompactor` assumes the passed path is already in the store dir itself, so it just needs to create the reader over the passed path. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663966127 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -278,6 +280,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr HConstants.EMPTY_STRING); } + /** + * Default method for initializing a StoreFileWriter in the compaction process, this creates the Review comment: StoreEngine delegates the work to a `Compactor` implementation, where `DefaultCompator`, the default Compactor implementation, uses this one to create the file writer. So that's why we say it's the "default" method. Alternatives compactors may not use it, like the `DirectStoreCompactor` that implements its own logic that creates the file on the store directory, rather than under temp. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663958206 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -278,6 +280,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr HConstants.EMPTY_STRING); } + /** + * Default method for initializing a StoreFileWriter in the compaction process, this creates the + * resulting files on a temp directory. Therefore, upon compaction commit time, these files + * should be renamed into the actual store dir. + * @param fd the file details. + * @param shouldDropBehind boolean for the drop-behind output stream cache settings. + * @param major if compaction is major. + * @return Writer for a new StoreFile in the tmp dir. + * @throws IOException if it fails to initialise the writer. + */ + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +return this.createTmpWriter(fd, shouldDropBehind, major); Review comment: Noted. Removing 'this' on next commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663957101 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -135,7 +135,7 @@ public CompactionProgress getProgress() { /** Min SeqId to keep during a major compaction **/ public long minSeqIdToKeep = 0; /** Total size of the compacted files **/ -private long totalCompactedFilesSize = 0; +public long totalCompactedFilesSize = 0; Review comment: This was the only marked as private here. Unfortunately needed exposing that to `DirectStoreCompactor`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663949806 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1192,7 +1195,7 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm return builder.build(); } - HFileContext createFileContext(Compression.Algorithm compression, + public HFileContext createFileContext(Compression.Algorithm compression, Review comment: This one in particular, could had logic duplicated wherever it's needed, but it's more of a helper method, so didn't think making it public would be harmful. I can revert it back to package private and re-implement this in the `DirectStoreCompactor`, if you think this shouldn't be exposed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663949806 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1192,7 +1195,7 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm return builder.build(); } - HFileContext createFileContext(Compression.Algorithm compression, + public HFileContext createFileContext(Compression.Algorithm compression, Review comment: This one in particular, could had logic duplicated wherever it's needed, but it's more of a helper method, so didn't think making it public would be harmful. I can revert it back to package private and re-implement this in the `DirectStoreCompactor`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663947613 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -694,7 +700,7 @@ private void refreshStoreFilesInternal(Collection newFiles) throw refreshStoreSizeAndTotalBytes(); } - protected HStoreFile createStoreFileAndReader(final Path p) throws IOException { + public HStoreFile createStoreFileAndReader(final Path p) throws IOException { Review comment: Prior tho this, the "commit compaction" logic was implemented straight inside `HStore.doCompaction -> moveCompactedFilesIntoPlace -> moveFileIntoPlace` which was where `createStoreFileAndReader` was getting called. I believe this is "compaction specific" logic, and if we are aiming for pluggable compaction behaviour, it shouldn't be implemented in the HStore. That's why here we changed it to be `HStore.doCompaction -> storeEngine.compactor.commitCompaction`, so we had to eventually call it from the compactors. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663837361 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -431,7 +437,7 @@ public static long determineTTLFromFamily(final ColumnFamilyDescriptor family) { return ttl; } - StoreContext getStoreContext() { + public StoreContext getStoreContext() { Review comment: Honestly, I can't remember now why I had duplicated `cryptoContext` on `HStore` here, when it's already defined on `StoreContext`. It doesn't seem correct to me now, let me remove. The need to expose `getStoreContext` as `public` stands, though, as we need that from the `DirectStoreCompactor.createWriterInFamilyDir`, which is in a separate package. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663828425 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java ## @@ -1372,7 +1372,7 @@ public RegionInfo getRegionInfo() { * @return Instance of {@link RegionServerServices} used by this HRegion. * Can be null. */ - RegionServerServices getRegionServerServices() { + public RegionServerServices getRegionServerServices() { Review comment: Not needed after the changes addressing some of @z-york suggestions. Let me revert this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663781091 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: BTW, let me fix the javadoc description. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663822974 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1167,7 +1170,7 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm } else { final boolean shouldCacheDataOnWrite = cacheConf.shouldCacheDataOnWrite(); if (shouldCacheDataOnWrite) { -writerCacheConf.enableCacheOnWrite(); +writerCacheConf.enableCacheOnWriteForCompactions(); Review comment: Yeah, should be only `enableCacheOnWrite`, as here we are calling it for other writes than compactions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663822333 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java ## @@ -1145,16 +1151,13 @@ public StoreFileWriter createWriterInTmp(long maxKeyCount, Compression.Algorithm // if data blocks are to be cached on write // during compaction, we should forcefully // cache index and bloom blocks as well - if (cacheCompactedBlocksOnWrite && totalCompactedFilesSize <= cacheConf -.getCacheCompactedBlocksOnWriteThreshold()) { -writerCacheConf.enableCacheOnWrite(); + if (writerCacheConf.enableCacheOnWriteForCompactions(totalCompactedFilesSize)) { Review comment: Yep, see my reply above. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663781334 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectStoreFlushContext.java ## @@ -30,14 +30,14 @@ * directly in the store dir, and therefore, doesn't perform a rename from tmp dir * into the store dir. * - * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. + * To be used only when DirectStoreFlushContext is configured as the StoreEngine implementation. Review comment: Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663781091 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: BTW, let me fix the javadoc description. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r663780661 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) { * cacheIndexesOnWrite * cacheBloomsOnWrite */ - public void enableCacheOnWrite() { + public void enableCacheOnWriteForCompactions() { this.cacheDataOnWrite = true; this.cacheIndexesOnWrite = true; this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWriteForCompactions(long totalCompactedFilesSize) { Review comment: Yeah, this logic was moved here from `HStore.createWriterInTmp` (as you noticed later on your following comment). Because now we override the writer creation in DirectStoreCompactor to not use this `HStore.createWriterInTmp`, we would need to duplicate this logic there, but since `CacheConfig` already has all the info required to decide on this, thought about moving it to `CacheConfig`. I reckon naming and cohesion is not been great, but I was just trying to concentrate on the original problem here of allowing compacting files getting created in the store directory, without the need for tmp dirs and renames. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r657484035 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java ## @@ -0,0 +1,90 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectInStoreCompactor extends DefaultCompactor { + public DirectInStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, + Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean includesTag, +boolean shouldDropBehind) throws IOException { +final CacheConfig writerCacheConf; +// Don't cache data on write on compactions. +writerCacheConf = new CacheConfig(store.getCacheConfig()); +writerCacheConf.setCacheDataOnWrite(false); + +InetSocketAddress[] favoredNodes = null; +if (store.getHRegion().getRegionServerServices() != null) { + favoredNodes = store.getHRegion().getRegionServerServices().getFavoredNodesForRegion( +store.getHRegion().getRegionInfo().getEncodedName()); Review comment: Wasn't finding `store.getStoreContext()` before, since it was package private. Made it public on HStore and am calling it there now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r657483340 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java ## @@ -0,0 +1,90 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectStoreCompactor extends DefaultCompactor { + public DirectStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, + shouldDropBehind, fd.totalCompactedFilesSize); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, + Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean includesTag, +boolean shouldDropBehind, long totalCompactedFilesSize) throws IOException { +final CacheConfig writerCacheConf; +// Don't cache data on write on compactions. Review comment: Removed now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r657483258 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java ## @@ -297,6 +297,30 @@ public void enableCacheOnWrite() { this.cacheBloomsOnWrite = true; } + /** + * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and + * 'totalCompactedFilesSize' is lower than 'cacheCompactedDataOnWriteThreshold', + * enables cache on write for below properties: + * - cacheDataOnWrite + * - cacheIndexesOnWrite + * - cacheBloomsOnWrite + * + * Otherwise, sets 'cacheDataOnWrite' only to false. + * + * @param totalCompactedFilesSize the total size of compacted files. + * @return true if the checks mentioned above pass and the cache is enabled, false otherwise. + */ + public boolean enableCacheOnWrite(long totalCompactedFilesSize) { Review comment: Renamed it on last commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java ## @@ -33,11 +33,11 @@ * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. */ @InterfaceAudience.Private -public class PersistedStoreFlushContext extends DefaultStoreFlushContext { +public class DirectInStoreFlushContext extends DefaultStoreFlushContext { Review comment: Changed name after comments from @saintstack on HBASE-25391. I can do "DirectStore" as suggested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656100941 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java ## @@ -33,11 +33,11 @@ * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. Review comment: Yes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656098099 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java ## @@ -0,0 +1,90 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectInStoreCompactor extends DefaultCompactor { + public DirectInStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, + Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean includesTag, +boolean shouldDropBehind) throws IOException { +final CacheConfig writerCacheConf; +// Don't cache data on write on compactions. +writerCacheConf = new CacheConfig(store.getCacheConfig()); +writerCacheConf.setCacheDataOnWrite(false); + +InetSocketAddress[] favoredNodes = null; +if (store.getHRegion().getRegionServerServices() != null) { + favoredNodes = store.getHRegion().getRegionServerServices().getFavoredNodesForRegion( +store.getHRegion().getRegionInfo().getEncodedName()); +} +HFileContext hFileContext = store.createFileContext(compression, includeMVCCReadpoint, + includesTag, store.getCryptoContext()); +Path familyDir = new Path(store.getRegionFileSystem().getRegionDir(), + store.getColumnFamilyDescriptor().getNameAsString()); +StoreFileWriter.Builder builder = new StoreFileWriter.Builder(conf, writerCacheConf, + store.getFileSystem()) + .withOutputDir(familyDir) + .withBloomType(store.getColumnFamilyDescriptor().getBloomFilterType()) + .withMaxKeyCount(maxKeyCount) + .withFavoredNodes(favoredNodes) + .withFileContext(hFileContext) + .withShouldDropCacheBehind(shouldDropBehind) + .withCompactedFilesSupplier(() -> store.getCompactedFiles()); Review comment: Ack. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656045500 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java ## @@ -0,0 +1,90 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectInStoreCompactor extends DefaultCompactor { + public DirectInStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, + Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean includesTag, +boolean shouldDropBehind) throws IOException { +final CacheConfig writerCacheConf; +// Don't cache data on write on compactions. +writerCacheConf = new CacheConfig(store.getCacheConfig()); +writerCacheConf.setCacheDataOnWrite(false); + +InetSocketAddress[] favoredNodes = null; +if (store.getHRegion().getRegionServerServices() != null) { + favoredNodes = store.getHRegion().getRegionServerServices().getFavoredNodesForRegion( +store.getHRegion().getRegionInfo().getEncodedName()); Review comment: I already need an instance of HStore, as you noted on your comment below. This HStore instance need is not exclusive from this Compactor extension, it is already defined on the parent classes single declared constructors, so it's used for other stuff in the hierarchy, not only the filereader creation. I don't think there's much value in create an extra StoreContext here when we have everything we need in HStore instance itself. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656019402 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java ## @@ -0,0 +1,90 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.regionserver.compactions; + +import java.io.IOException; +import java.net.InetSocketAddress; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.io.compress.Compression; +import org.apache.hadoop.hbase.io.hfile.CacheConfig; +import org.apache.hadoop.hbase.io.hfile.HFileContext; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.yetus.audience.InterfaceAudience; + +@InterfaceAudience.Private +public class DirectInStoreCompactor extends DefaultCompactor { + public DirectInStoreCompactor(Configuration conf, HStore store) { +super(conf, store); + } + + @Override + protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) +throws IOException { +// When all MVCC readpoints are 0, don't write them. +// See HBASE-8166, HBASE-12600, and HBASE-13389. +return createWriterInFamilyDir(fd.maxKeyCount, + major ? majorCompactionCompression : minorCompactionCompression, + fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind); + } + + private StoreFileWriter createWriterInFamilyDir(long maxKeyCount, + Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean includesTag, +boolean shouldDropBehind) throws IOException { +final CacheConfig writerCacheConf; +// Don't cache data on write on compactions. +writerCacheConf = new CacheConfig(store.getCacheConfig()); +writerCacheConf.setCacheDataOnWrite(false); Review comment: Yeah, missed this while converting my original PoC to this PR. Let me try avoid duplication with `HStore.createWriterInTmp`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java ## @@ -33,11 +33,11 @@ * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. */ @InterfaceAudience.Private -public class PersistedStoreFlushContext extends DefaultStoreFlushContext { +public class DirectInStoreFlushContext extends DefaultStoreFlushContext { Review comment: Changed name after comments from @saintstack on HBASE-25391. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java ## @@ -33,11 +33,11 @@ * To be used only when PersistedStoreEngine is configured as the StoreEngine implementation. */ @InterfaceAudience.Private -public class PersistedStoreFlushContext extends DefaultStoreFlushContext { +public class DirectInStoreFlushContext extends DefaultStoreFlushContext { Review comment: Name change suggestion came after comments from @saintstack on HBASE-25391. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, List s * @param fd The file details. * @return Writer for a new StoreFile in the tmp dir. * @throws IOException if creation failed + * @deprecated Use initWriter instead. Review comment: You mean the one in line #271? ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, List s * @param fd The file details. * @return Writer for a new StoreFile in the tmp dir. * @throws IOException if creation failed + * @deprecated Use initWriter instead. Review comment: You mean the one in line #271? Original commit didn't have it, but I had added on the second commit from last week. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +549,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user/ + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); Review comment: This was already been called before, indirectly, in HStore, via `doCompaction -> moveCompactedFilesIntoPlace -> moveFileIntoPlace -> validateStoreFile`. I had just merged some of the methods and moved this new method in the Compactor class itself. ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java ## @@ -0,0 +1,123 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.regionserver.compactions; + +import static junit.framework.TestCase.assertEquals; +import static org.mockito.Mockito.mock; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.HBaseClassTestRule; +import org.apache.hadoop.hbase.HBaseTestingUtility; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.client.Put; +import org.apache.hadoop.hbase.client.Table; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.hadoop.hbase.testclassification.MediumTests; +import org.apache.hadoop.hbase.testclassification.RegionServerTests; +import org.apache.hadoop.hbase.util.Bytes; +import org.junit.After; +import org.junit.Before; +import org.junit.ClassRule; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.rules.TestName; + +/** + * Test class for DirectInStoreCompactor. + */ Review comment: Indeed, addressing it on next commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r655657711 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java ## @@ -0,0 +1,123 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.regionserver.compactions; + +import static junit.framework.TestCase.assertEquals; +import static org.mockito.Mockito.mock; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hbase.HBaseClassTestRule; +import org.apache.hadoop.hbase.HBaseTestingUtility; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.client.Put; +import org.apache.hadoop.hbase.client.Table; +import org.apache.hadoop.hbase.regionserver.HStore; +import org.apache.hadoop.hbase.regionserver.HStoreFile; +import org.apache.hadoop.hbase.regionserver.StoreFileWriter; +import org.apache.hadoop.hbase.testclassification.MediumTests; +import org.apache.hadoop.hbase.testclassification.RegionServerTests; +import org.apache.hadoop.hbase.util.Bytes; +import org.junit.After; +import org.junit.Before; +import org.junit.ClassRule; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.rules.TestName; + +/** + * Test class for DirectInStoreCompactor. + */ Review comment: Indeed, addressing it on next commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r655657218 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -533,4 +549,46 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, dropDeletesFromRow, dropDeletesToRow); } + + /** + * Default implementation for committing store files created after a compaction. Assumes new files + * had been created on a temp directory, so it renames those files into the actual store dir, + * then create a reader and cache it into the store. + * @param cr the compaction request. + * @param newFiles the new files created by this compaction under a temp dir. + * @param user the running user/ + * @return A list of the resulting store files already placed in the store dir and loaded into the + * store cache. + * @throws IOException + */ + public List commitCompaction(CompactionRequestImpl cr, List newFiles, User user) + throws IOException { +List sfs = new ArrayList<>(newFiles.size()); +for (Path newFile : newFiles) { + assert newFile != null; + this.store.validateStoreFile(newFile); Review comment: This was already been called before, indirectly, in HStore, via `doCompaction -> moveCompactedFilesIntoPlace -> moveFileIntoPlace -> validateStoreFile`. I had just merged some of the methods and moved this new method in the Compactor class itself. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, List s * @param fd The file details. * @return Writer for a new StoreFile in the tmp dir. * @throws IOException if creation failed + * @deprecated Use initWriter instead. Review comment: You mean the one in line #271? Original commit didn't have it, but I had added on the second commit from last week. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.
wchevreuil commented on a change in pull request #3389: URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java ## @@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, List s * @param fd The file details. * @return Writer for a new StoreFile in the tmp dir. * @throws IOException if creation failed + * @deprecated Use initWriter instead. Review comment: You mean the one in line #271? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org