[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r674801406



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -558,18 +559,19 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
* @param cr the compaction request.
* @param newFiles the new files created by this compaction under a temp dir.
* @param user the running user.
+   * @param fileAcessor a lambda expression with logic for loading a 
HStoreFile given a Path.
* @return A list of the resulting store files already placed in the store 
dir and loaded into the
* store cache.
* @throws IOException if the commit fails.
*/
-  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
-  throws IOException {
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles,
+  User user, Function fileAcessor) throws IOException {

Review comment:
   Definitely, and can make its method throw IOException too, leaving the 
HStore code cleaner. Had just pushed a new commit with it, for now, created as 
public inner class of Compactor itself, as it's strictly related to Compactor.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r674797181



##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java
##
@@ -1930,4 +1959,21 @@ public boolean add(T e) {
 @Override
 public List subList(int fromIndex, int toIndex) {return 
delegatee.subList(fromIndex, toIndex);}
   }
+
+  private static class DummyCompactor extends DefaultCompactor {
+
+static CountDownLatch countDownLatch;
+
+public DummyCompactor(Configuration conf, HStore store) {

Review comment:
   It actually matters, Compactor implementations are loaded via 
reflection, in this case via DefaultStoreEngine.createCompactor, where it 
expects a constructor to have exactly these two params.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r674791718



##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java
##
@@ -0,0 +1,123 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.mockito.Mockito.mock;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.testclassification.RegionServerTests;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+
+/**
+ * Test class for DefaultCompactor.
+ */
+@Category({ RegionServerTests.class, MediumTests.class })
+public class TestDefaultCompactor {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+HBaseClassTestRule.forClass(TestDefaultCompactor.class);
+
+  @Rule
+  public TestName name = new TestName();
+
+  private final Configuration config = new Configuration();
+  private HStore store;
+  private final String cfName = "cf";
+  private Compactor.FileDetails mockFileDetails;
+
+  private HBaseTestingUtility UTIL = new HBaseTestingUtility();
+  private TableName table;
+
+  @Before
+  public void setup() throws Exception {

Review comment:
   I am doing an individual TableName instance per test method, so setup 
has to run on each test initialisation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r674682958



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user.
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException if the commit fails.
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);
+  // Move the file into the right spot
+  HStoreFile sf = createFileInStoreDir(newFile);
+  if (this.store.getCoprocessorHost() != null) {
+this.store.getCoprocessorHost().postCompact(this.store, sf, 
cr.getTracker(), cr, user);
+  }
+  assert sf != null;
+  sfs.add(sf);
+}
+return sfs;
+  }
+
+  /**
+   * Assumes new file was created initially on a temp folder.

Review comment:
   I honestly had not looked further into these two. Am only worried to not 
break current logic used by those. Was planning to take care of those later on 
another PR, if we are to make it compatible with direct store/renameless 
writing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r673471477



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user.
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException if the commit fails.
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);
+  // Move the file into the right spot
+  HStoreFile sf = createFileInStoreDir(newFile);
+  if (this.store.getCoprocessorHost() != null) {
+this.store.getCoprocessorHost().postCompact(this.store, sf, 
cr.getTracker(), cr, user);
+  }
+  assert sf != null;
+  sfs.add(sf);
+}
+return sfs;
+  }
+
+  /**
+   * Assumes new file was created initially on a temp folder.

Review comment:
   Actually, when I tried to move it out of `Compactor` by making it 
abstract, it immediately  broke `StripeCompactor` and `DateTieredCompactor`, 
which are relying in this logic, but not extending `DefaultCompactor`. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r673437645



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user.
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException if the commit fails.
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);
+  // Move the file into the right spot
+  HStoreFile sf = createFileInStoreDir(newFile);
+  if (this.store.getCoprocessorHost() != null) {
+this.store.getCoprocessorHost().postCompact(this.store, sf, 
cr.getTracker(), cr, user);
+  }
+  assert sf != null;
+  sfs.add(sf);
+}
+return sfs;
+  }
+
+  /**
+   * Assumes new file was created initially on a temp folder.

Review comment:
   Originally, the logic of creating files in temp dir was already 
implemented in the abstract `Compactor`, not on `DefaultCompactor`. That's why 
I left it here. I can leave this method abstract here and move the 
implementation to `DefaultCompactor`, though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r673229935



##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java
##
@@ -0,0 +1,123 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.mockito.Mockito.mock;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.testclassification.RegionServerTests;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+
+/**
+ * Test class for DefaultCompactor.
+ */
+@Category({ RegionServerTests.class, MediumTests.class })
+public class TestDefaultCompactor {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+HBaseClassTestRule.forClass(TestDefaultCompactor.class);
+
+  @Rule
+  public TestName name = new TestName();
+
+  private final Configuration config = new Configuration();
+  private HStore store;
+  private final String cfName = "cf";
+  private Compactor.FileDetails mockFileDetails;
+
+  private HBaseTestingUtility UTIL = new HBaseTestingUtility();
+  private TableName table;
+
+  @Before
+  public void setup() throws Exception {

Review comment:
   You mean, as BeforeClass and reuse cluster between tests?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r673228785



##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDirectStoreCompactor.java
##
@@ -0,0 +1,135 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.ArgumentMatchers.isNull;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.hbase.CellComparator;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.io.ByteBuffAllocator;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.crypto.Encryption;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.BloomType;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionFileSystem;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreContext;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.testclassification.RegionServerTests;
+import org.apache.hadoop.hbase.util.ChecksumType;
+import org.junit.Before;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+import org.mockito.ArgumentCaptor;
+
+/**
+ * Test class for DirectStoreCompactor.
+ */
+@Category({ RegionServerTests.class, MediumTests.class })
+public class TestDirectStoreCompactor {

Review comment:
   I normally favour using mocks as it makes tests more lightweight to run. 
I agree that the amount of mocking here is just too much, hard to 
understand/maintain and also may be not really testing functionality as a 
whole. Let change this to use HBaseTestingUtility, instead.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672923935



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   The javadoc had already been updated to mention the 
cachecompactedblocksonwrite config property.
   
   >Can always leave this old name as "deprecated" which calls a better-named 
version, if you think it would help  alleviate confusion.
   
   This was not a method move/rename, just a logic that was previously embedded 
in `HStore.createWriterInTmp` that I decided to move here in its own method for 
reuse. The naming seems right for me, as it should only be called for 
compactions, not for other types of writes. I will add this to javadoc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -694,7 +700,7 @@ private void 
refreshStoreFilesInternal(Collection newFiles) throw
 refreshStoreSizeAndTotalBytes();
   }
 
-  protected HStoreFile createStoreFileAndReader(final Path p) throws 
IOException {
+  public HStoreFile createStoreFileAndReader(final Path p) throws IOException {

Review comment:
   > Although, I also see that HStore#openStoreFiles is using one instance 
of createStoreFileAndReader, so we couldn't completely move this out of HStore. 
Is it better to lift one method into Store and leave other instances "internal" 
to HStore? That might help the smell a little, but an obvious solution isn't 
jumping out at me.
   
   Yes, there are many other usages, pretty much everything that needs access 
to hfiles would end up accessing this method (indirectly). 
   
   In terms of responsibility, it does seem right to me to have HStore the 
central point for store file accesses. Moving it to Store would still require 
some code duplication there, or exposing the two methods, which is what we are 
trying to avoid here.
   
   Maybe a better solution is to define a lambda function in 
Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader 
when calling Compactor.commitCompaction().

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   The javadoc had already been updated to mention the 
cachecompactedblocksonwrite config property.
   
   >Can always leave this old name as "deprecated" which calls a better-named 
version, if you think it would help  alleviate confusion.
   
   This was not a method move/rename, just a logic that was previously embedded 
in `HStore.createWriterInTmp` that I decided to move here in its own method for 
reuse. The naming seems right for me, as it should only be called for 
compactions, not for other types of writes. I add this to javadoc.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -278,6 +280,21 @@ protected final StoreFileWriter 
createTmpWriter(FileDetails fd, boolean shouldDr
 HConstants.EMPTY_STRING);
   }
 
+  /**
+   * Default method for initializing a StoreFileWriter in the compaction 
process, this creates the

Review comment:
   >Is creating them in the temp directory really "default"? Or is it 
"default" just because the direct-insert stuff isn't available yet? ;)
   
   It is the default now, if we don't explicitly specify 
`hbase.hstore.defaultengine.compactor.class` config property to set 
`DirectStoreCompactor` implementation to be used, the previous existing 
`DefaultCompactor` will be loaded, which implements this logic.
   
   > When you say other compactors may not use it, it makes me wonder if it 
makes sense to provide it here.
   
   What I meant is that other compactors may overwrite the method with a 
different logic.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -694,7 +700,7 @@ private void 
refreshStoreFilesInternal(Collection newFiles) throw
 refreshStoreSizeAndTotalBytes();
   }
 
-  protected HStoreFile createStoreFileAndReader(final Path p) throws 
IOException {
+  public HStoreFile createStoreFileAndReader(final Path p) throws IOException {

Review comment:
   > Although, I also see that HStore#openStoreFiles is using one instance 
of createStoreFileAndReader, so we couldn't completely move this out of HStore. 
Is it better to lift one method into Store and leave other instances "internal" 
to HStore? That might help the smell a little, but an obvious solution isn't 
jumping out at me.
   
   Yes, there are many other usages, pretty much everything that needs access 
to hfiles would end up accessing this method (indirectly). 
   
   In terms of responsibility, it does seem right to me to have HStore the 
central point for store file accesses. Moving it to Store would still require 
some code duplication there, or exposing the two methods, which is what we are 
trying to avoid here.
   
   Maybe a better solution is to define a lambda function in 
Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader 
when calling Compactor.commitCompaction().

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   The javadoc had already been updated to mention the 
cachecompactedblocksonwrite config property.
   
   >Can always leave this old name as "deprecated" which calls a better-named 
version, if you think it would help  alleviate confusion.
   
   This was not a method move/rename, just a logic that was previously embedded 
in `HStore.createWriterInTmp` that I decided to move here in its own method for 
reuse. The naming seems right for me, as it should only be called for 
compactions, not for other types of writes. I add this to javadoc.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -278,6 +280,21 @@ protected final StoreFileWriter 
createTmpWriter(FileDetails fd, boolean shouldDr
 HConstants.EMPTY_STRING);
   }
 
+  /**
+   * Default method for initializing a StoreFileWriter in the compaction 
process, this creates the

Review comment:
   >Is creating them in the temp directory really "default"? Or is it 
"default" just because the direct-insert stuff isn't available yet? ;)
   
   It is the default now, if we don't explicitly specify 
`hbase.hstore.defaultengine.compactor.class` config property to set 
`DirectStoreCompactor` implementation to be used, the previous existing 
`DefaultCompactor` will be loaded, which implements this logic.
   
   > When you say other compactors may not use it, it makes me wonder if it 
makes sense to provide it here.
   
   What I meant is that other compactors may overwrite the method with a 
different logic.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672932031



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -278,6 +280,21 @@ protected final StoreFileWriter 
createTmpWriter(FileDetails fd, boolean shouldDr
 HConstants.EMPTY_STRING);
   }
 
+  /**
+   * Default method for initializing a StoreFileWriter in the compaction 
process, this creates the

Review comment:
   >Is creating them in the temp directory really "default"? Or is it 
"default" just because the direct-insert stuff isn't available yet? ;)
   
   It is the default now, if we don't explicitly specify 
`hbase.hstore.defaultengine.compactor.class` config property to set 
`DirectStoreCompactor` implementation to be used, the previous existing 
`DefaultCompactor` will be loaded, which implements this logic.
   
   > When you say other compactors may not use it, it makes me wonder if it 
makes sense to provide it here.
   
   What I meant is that other compactors may overwrite the method with a 
different logic.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-20 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672923935



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   The javadoc had already been updated to mention the 
cachecompactedblocksonwrite config property.
   
   >Can always leave this old name as "deprecated" which calls a better-named 
version, if you think it would help  alleviate confusion.
   
   This was not a method move/rename, just a logic that was previously embedded 
in `HStore.createWriterInTmp` that I decided to move here in its own method for 
reuse. The naming seems right for me, as it should only be called for 
compactions, not for other types of writes. I add this to javadoc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-19 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r672308643



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -694,7 +700,7 @@ private void 
refreshStoreFilesInternal(Collection newFiles) throw
 refreshStoreSizeAndTotalBytes();
   }
 
-  protected HStoreFile createStoreFileAndReader(final Path p) throws 
IOException {
+  public HStoreFile createStoreFileAndReader(final Path p) throws IOException {

Review comment:
   > Although, I also see that HStore#openStoreFiles is using one instance 
of createStoreFileAndReader, so we couldn't completely move this out of HStore. 
Is it better to lift one method into Store and leave other instances "internal" 
to HStore? That might help the smell a little, but an obvious solution isn't 
jumping out at me.
   
   Yes, there are many other usages, pretty much everything that needs access 
to hfiles would end up accessing this method (indirectly). 
   
   In terms of responsibility, it does seem right to me to have HStore the 
central point for store file accesses. Moving it to Store would still require 
some code duplication there, or exposing the two methods, which is what we are 
trying to avoid here.
   
   Maybe a better solution is to define a lambda function in 
Compactor.commitCompaction, so that HStore can wrap up createStoreFileAndReader 
when calling Compactor.commitCompaction().




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-06 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663780661



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   Yeah, this logic was moved here from `HStore.createWriterInTmp` (as you 
noticed later on your following comment). Because now we override the writer 
creation in DirectStoreCompactor to not use this `HStore.createWriterInTmp`, we 
would need to duplicate this logic there, but since `CacheConfig` already has 
all the info required to decide on this, thought about moving it to 
`CacheConfig`. 
   
   I reckon naming and cohesion is not been great, but I was just trying to 
concentrate on the original problem here of allowing compacting files getting 
created in the store directory, without the need for tmp dirs and renames.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   BTW, let me fix the javadoc description.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectStoreFlushContext.java
##
@@ -30,14 +30,14 @@
  * directly in the store dir, and therefore, doesn't perform a rename from tmp 
dir
  * into the store dir.
  *
- * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.
+ * To be used only when DirectStoreFlushContext is configured as the 
StoreEngine implementation.

Review comment:
   Thanks!

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1145,16 +1151,13 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
   // if data blocks are to be cached on write
   // during compaction, we should forcefully
   // cache index and bloom blocks as well
-  if (cacheCompactedBlocksOnWrite && totalCompactedFilesSize <= cacheConf
-.getCacheCompactedBlocksOnWriteThreshold()) {
-writerCacheConf.enableCacheOnWrite();
+  if 
(writerCacheConf.enableCacheOnWriteForCompactions(totalCompactedFilesSize)) {

Review comment:
   Yep, see my reply above.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1167,7 +1170,7 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
 } else {
   final boolean shouldCacheDataOnWrite = 
cacheConf.shouldCacheDataOnWrite();
   if (shouldCacheDataOnWrite) {
-writerCacheConf.enableCacheOnWrite();
+writerCacheConf.enableCacheOnWriteForCompactions();

Review comment:
   Yeah, should be only `enableCacheOnWrite`, as here we are calling it for 
other writes than compactions.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* 

[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663979746



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java
##
@@ -0,0 +1,85 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectStoreCompactor extends DefaultCompactor {
+  public DirectStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0,
+  shouldDropBehind, fd.totalCompactedFilesSize);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,

Review comment:
   Will do.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663979457



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java
##
@@ -0,0 +1,85 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectStoreCompactor extends DefaultCompactor {

Review comment:
   Will do.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663974669



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +550,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user.
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException if the commit fails.
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);
+  // Move the file into the right spot
+  HStoreFile sf = createFileInStoreDir(newFile);
+  if (this.store.getCoprocessorHost() != null) {
+this.store.getCoprocessorHost().postCompact(this.store, sf, 
cr.getTracker(), cr, user);
+  }
+  assert sf != null;
+  sfs.add(sf);
+}
+return sfs;
+  }
+
+  /**
+   * Assumes new file was created initially on a temp folder.

Review comment:
   We still need to vary the logic of "where" the store file is originally 
created. Default behaviour is to create it under temp (that's done by the call 
from lines #591/#592), whilst `DirectStoreCompactor` assumes the passed path is 
already in the store dir itself, so it just needs to create the reader over the 
passed path. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663966127



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -278,6 +280,21 @@ protected final StoreFileWriter 
createTmpWriter(FileDetails fd, boolean shouldDr
 HConstants.EMPTY_STRING);
   }
 
+  /**
+   * Default method for initializing a StoreFileWriter in the compaction 
process, this creates the

Review comment:
   StoreEngine delegates the work to a `Compactor` implementation, where 
`DefaultCompator`, the default Compactor implementation, uses this one to 
create the file writer. So that's why we say it's the "default" method. 
Alternatives compactors may not use it, like the `DirectStoreCompactor` that 
implements its own logic that creates the file on the store directory, rather 
than under temp.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663958206



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -278,6 +280,21 @@ protected final StoreFileWriter 
createTmpWriter(FileDetails fd, boolean shouldDr
 HConstants.EMPTY_STRING);
   }
 
+  /**
+   * Default method for initializing a StoreFileWriter in the compaction 
process, this creates the
+   * resulting files on a temp directory. Therefore, upon compaction commit 
time, these files
+   * should be renamed into the actual store dir.
+   * @param fd the file details.
+   * @param shouldDropBehind boolean for the drop-behind output stream cache 
settings.
+   * @param major if compaction is major.
+   * @return Writer for a new StoreFile in the tmp dir.
+   * @throws IOException if it fails to initialise the writer.
+   */
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+return this.createTmpWriter(fd, shouldDropBehind, major);

Review comment:
   Noted. Removing 'this' on next commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663957101



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -135,7 +135,7 @@ public CompactionProgress getProgress() {
 /** Min SeqId to keep during a major compaction **/
 public long minSeqIdToKeep = 0;
 /** Total size of the compacted files **/
-private long totalCompactedFilesSize = 0;
+public long totalCompactedFilesSize = 0;

Review comment:
   This was the only marked as private here. Unfortunately needed exposing 
that to `DirectStoreCompactor`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663949806



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1192,7 +1195,7 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
 return builder.build();
   }
 
-  HFileContext createFileContext(Compression.Algorithm compression,
+  public HFileContext createFileContext(Compression.Algorithm compression,

Review comment:
   This one in particular, could had logic duplicated wherever it's needed, 
but it's more of a helper method, so didn't think making it public would be 
harmful. I can revert it back to package private and re-implement this in the 
`DirectStoreCompactor`, if you think this shouldn't be exposed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663949806



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1192,7 +1195,7 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
 return builder.build();
   }
 
-  HFileContext createFileContext(Compression.Algorithm compression,
+  public HFileContext createFileContext(Compression.Algorithm compression,

Review comment:
   This one in particular, could had logic duplicated wherever it's needed, 
but it's more of a helper method, so didn't think making it public would be 
harmful. I can revert it back to package private and re-implement this in the 
`DirectStoreCompactor`. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663947613



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -694,7 +700,7 @@ private void 
refreshStoreFilesInternal(Collection newFiles) throw
 refreshStoreSizeAndTotalBytes();
   }
 
-  protected HStoreFile createStoreFileAndReader(final Path p) throws 
IOException {
+  public HStoreFile createStoreFileAndReader(final Path p) throws IOException {

Review comment:
   Prior tho this, the "commit compaction" logic was implemented straight 
inside `HStore.doCompaction -> moveCompactedFilesIntoPlace -> 
moveFileIntoPlace` which was where `createStoreFileAndReader` was getting 
called. I believe this is "compaction specific" logic, and if we are aiming for 
pluggable compaction behaviour, it shouldn't be implemented in the HStore. 
That's why here we changed it to be `HStore.doCompaction -> 
storeEngine.compactor.commitCompaction`, so we had to eventually call it from 
the compactors. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663837361



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -431,7 +437,7 @@ public static long determineTTLFromFamily(final 
ColumnFamilyDescriptor family) {
 return ttl;
   }
 
-  StoreContext getStoreContext() {
+  public StoreContext getStoreContext() {

Review comment:
   Honestly, I can't remember now why I had duplicated `cryptoContext` on 
`HStore` here, when it's already defined on `StoreContext`. It doesn't seem 
correct to me now, let me remove. The need to expose `getStoreContext` as 
`public` stands, though, as we need that from the 
`DirectStoreCompactor.createWriterInFamilyDir`, which is in a separate package.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663828425



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##
@@ -1372,7 +1372,7 @@ public RegionInfo getRegionInfo() {
* @return Instance of {@link RegionServerServices} used by this HRegion.
* Can be null.
*/
-  RegionServerServices getRegionServerServices() {
+  public RegionServerServices getRegionServerServices() {

Review comment:
   Not needed after the changes addressing some of @z-york suggestions. Let 
me revert this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663781091



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   BTW, let me fix the javadoc description.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663822974



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1167,7 +1170,7 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
 } else {
   final boolean shouldCacheDataOnWrite = 
cacheConf.shouldCacheDataOnWrite();
   if (shouldCacheDataOnWrite) {
-writerCacheConf.enableCacheOnWrite();
+writerCacheConf.enableCacheOnWriteForCompactions();

Review comment:
   Yeah, should be only `enableCacheOnWrite`, as here we are calling it for 
other writes than compactions.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663822333



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
##
@@ -1145,16 +1151,13 @@ public StoreFileWriter createWriterInTmp(long 
maxKeyCount, Compression.Algorithm
   // if data blocks are to be cached on write
   // during compaction, we should forcefully
   // cache index and bloom blocks as well
-  if (cacheCompactedBlocksOnWrite && totalCompactedFilesSize <= cacheConf
-.getCacheCompactedBlocksOnWriteThreshold()) {
-writerCacheConf.enableCacheOnWrite();
+  if 
(writerCacheConf.enableCacheOnWriteForCompactions(totalCompactedFilesSize)) {

Review comment:
   Yep, see my reply above.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663781334



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectStoreFlushContext.java
##
@@ -30,14 +30,14 @@
  * directly in the store dir, and therefore, doesn't perform a rename from tmp 
dir
  * into the store dir.
  *
- * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.
+ * To be used only when DirectStoreFlushContext is configured as the 
StoreEngine implementation.

Review comment:
   Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663781091



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   BTW, let me fix the javadoc description.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-07-05 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r663780661



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -291,12 +291,36 @@ public void setCacheDataOnWrite(boolean cacheDataOnWrite) 
{
* cacheIndexesOnWrite
* cacheBloomsOnWrite
*/
-  public void enableCacheOnWrite() {
+  public void enableCacheOnWriteForCompactions() {
 this.cacheDataOnWrite = true;
 this.cacheIndexesOnWrite = true;
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWriteForCompactions(long 
totalCompactedFilesSize) {

Review comment:
   Yeah, this logic was moved here from `HStore.createWriterInTmp` (as you 
noticed later on your following comment). Because now we override the writer 
creation in DirectStoreCompactor to not use this `HStore.createWriterInTmp`, we 
would need to duplicate this logic there, but since `CacheConfig` already has 
all the info required to decide on this, thought about moving it to 
`CacheConfig`. 
   
   I reckon naming and cohesion is not been great, but I was just trying to 
concentrate on the original problem here of allowing compacting files getting 
created in the store directory, without the need for tmp dirs and renames.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-23 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r657484035



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectInStoreCompactor extends DefaultCompactor {
+  public DirectInStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,
+  Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean 
includesTag,
+boolean shouldDropBehind) throws IOException {
+final CacheConfig writerCacheConf;
+// Don't cache data on write on compactions.
+writerCacheConf = new CacheConfig(store.getCacheConfig());
+writerCacheConf.setCacheDataOnWrite(false);
+
+InetSocketAddress[] favoredNodes = null;
+if (store.getHRegion().getRegionServerServices() != null) {
+  favoredNodes = 
store.getHRegion().getRegionServerServices().getFavoredNodesForRegion(
+store.getHRegion().getRegionInfo().getEncodedName());

Review comment:
   Wasn't finding `store.getStoreContext()` before, since it was package 
private. Made it public on HStore and am calling it there now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-23 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r657483340



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectStoreCompactor.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectStoreCompactor extends DefaultCompactor {
+  public DirectStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0,
+  shouldDropBehind, fd.totalCompactedFilesSize);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,
+  Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean 
includesTag,
+boolean shouldDropBehind, long totalCompactedFilesSize) throws 
IOException {
+final CacheConfig writerCacheConf;
+// Don't cache data on write on compactions.

Review comment:
   Removed now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-23 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r657483258



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
##
@@ -297,6 +297,30 @@ public void enableCacheOnWrite() {
 this.cacheBloomsOnWrite = true;
   }
 
+  /**
+   * If hbase.rs.cachecompactedblocksonwrite configuration is set to true and
+   * 'totalCompactedFilesSize' is lower than 
'cacheCompactedDataOnWriteThreshold',
+   * enables cache on write for below properties:
+   * - cacheDataOnWrite
+   * - cacheIndexesOnWrite
+   * - cacheBloomsOnWrite
+   *
+   * Otherwise, sets 'cacheDataOnWrite' only to false.
+   *
+   * @param totalCompactedFilesSize the total size of compacted files.
+   * @return true if the checks mentioned above pass and the cache is enabled, 
false otherwise.
+   */
+  public boolean enableCacheOnWrite(long totalCompactedFilesSize) {

Review comment:
   Renamed it on last commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java
##
@@ -33,11 +33,11 @@
  * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.
  */
 @InterfaceAudience.Private
-public class PersistedStoreFlushContext extends DefaultStoreFlushContext {
+public class DirectInStoreFlushContext extends DefaultStoreFlushContext {

Review comment:
   Changed name after comments from @saintstack on HBASE-25391. I can do 
"DirectStore" as suggested.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656100941



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java
##
@@ -33,11 +33,11 @@
  * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.

Review comment:
   Yes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656098099



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectInStoreCompactor extends DefaultCompactor {
+  public DirectInStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,
+  Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean 
includesTag,
+boolean shouldDropBehind) throws IOException {
+final CacheConfig writerCacheConf;
+// Don't cache data on write on compactions.
+writerCacheConf = new CacheConfig(store.getCacheConfig());
+writerCacheConf.setCacheDataOnWrite(false);
+
+InetSocketAddress[] favoredNodes = null;
+if (store.getHRegion().getRegionServerServices() != null) {
+  favoredNodes = 
store.getHRegion().getRegionServerServices().getFavoredNodesForRegion(
+store.getHRegion().getRegionInfo().getEncodedName());
+}
+HFileContext hFileContext = store.createFileContext(compression, 
includeMVCCReadpoint,
+  includesTag, store.getCryptoContext());
+Path familyDir = new Path(store.getRegionFileSystem().getRegionDir(),
+  store.getColumnFamilyDescriptor().getNameAsString());
+StoreFileWriter.Builder builder = new StoreFileWriter.Builder(conf, 
writerCacheConf,
+  store.getFileSystem())
+  .withOutputDir(familyDir)
+  .withBloomType(store.getColumnFamilyDescriptor().getBloomFilterType())
+  .withMaxKeyCount(maxKeyCount)
+  .withFavoredNodes(favoredNodes)
+  .withFileContext(hFileContext)
+  .withShouldDropCacheBehind(shouldDropBehind)
+  .withCompactedFilesSupplier(() -> store.getCompactedFiles());

Review comment:
   Ack.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656045500



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectInStoreCompactor extends DefaultCompactor {
+  public DirectInStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,
+  Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean 
includesTag,
+boolean shouldDropBehind) throws IOException {
+final CacheConfig writerCacheConf;
+// Don't cache data on write on compactions.
+writerCacheConf = new CacheConfig(store.getCacheConfig());
+writerCacheConf.setCacheDataOnWrite(false);
+
+InetSocketAddress[] favoredNodes = null;
+if (store.getHRegion().getRegionServerServices() != null) {
+  favoredNodes = 
store.getHRegion().getRegionServerServices().getFavoredNodesForRegion(
+store.getHRegion().getRegionInfo().getEncodedName());

Review comment:
   I already need an instance of HStore, as you noted on your comment 
below. This HStore instance need is not exclusive from this Compactor 
extension, it is already defined on the parent classes single declared 
constructors, so it's used for other stuff in the hierarchy, not only the 
filereader creation. I don't think there's much value in create an extra 
StoreContext here when we have everything we need in HStore instance itself. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656019402



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DirectInStoreCompactor.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.io.compress.Compression;
+import org.apache.hadoop.hbase.io.hfile.CacheConfig;
+import org.apache.hadoop.hbase.io.hfile.HFileContext;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Private
+public class DirectInStoreCompactor extends DefaultCompactor {
+  public DirectInStoreCompactor(Configuration conf, HStore store) {
+super(conf, store);
+  }
+
+  @Override
+  protected StoreFileWriter initWriter(FileDetails fd, boolean 
shouldDropBehind, boolean major)
+throws IOException {
+// When all MVCC readpoints are 0, don't write them.
+// See HBASE-8166, HBASE-12600, and HBASE-13389.
+return createWriterInFamilyDir(fd.maxKeyCount,
+  major ? majorCompactionCompression : minorCompactionCompression,
+  fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0, shouldDropBehind);
+  }
+
+  private StoreFileWriter createWriterInFamilyDir(long maxKeyCount,
+  Compression.Algorithm compression, boolean includeMVCCReadpoint, boolean 
includesTag,
+boolean shouldDropBehind) throws IOException {
+final CacheConfig writerCacheConf;
+// Don't cache data on write on compactions.
+writerCacheConf = new CacheConfig(store.getCacheConfig());
+writerCacheConf.setCacheDataOnWrite(false);

Review comment:
   Yeah, missed this while converting my original PoC to this PR. Let me 
try avoid duplication with `HStore.createWriterInTmp`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java
##
@@ -33,11 +33,11 @@
  * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.
  */
 @InterfaceAudience.Private
-public class PersistedStoreFlushContext extends DefaultStoreFlushContext {
+public class DirectInStoreFlushContext extends DefaultStoreFlushContext {

Review comment:
   Changed name after comments from @saintstack on HBASE-25391.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r656009419



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DirectInStoreFlushContext.java
##
@@ -33,11 +33,11 @@
  * To be used only when PersistedStoreEngine is configured as the StoreEngine 
implementation.
  */
 @InterfaceAudience.Private
-public class PersistedStoreFlushContext extends DefaultStoreFlushContext {
+public class DirectInStoreFlushContext extends DefaultStoreFlushContext {

Review comment:
   Name change suggestion came after comments from @saintstack on 
HBASE-25391.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-22 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, 
List s
* @param fd The file details.
* @return Writer for a new StoreFile in the tmp dir.
* @throws IOException if creation failed
+   * @deprecated Use initWriter instead.

Review comment:
   You mean the one in line #271?

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, 
List s
* @param fd The file details.
* @return Writer for a new StoreFile in the tmp dir.
* @throws IOException if creation failed
+   * @deprecated Use initWriter instead.

Review comment:
   You mean the one in line #271? Original commit didn't have it, but I had 
added on the second commit from last week.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +549,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user/
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);

Review comment:
   This was already been called before, indirectly, in HStore, via 
`doCompaction -> moveCompactedFilesIntoPlace -> moveFileIntoPlace -> 
validateStoreFile`. I had just merged some of the methods and moved this new 
method in the Compactor class itself.

##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java
##
@@ -0,0 +1,123 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.mockito.Mockito.mock;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.testclassification.RegionServerTests;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+
+/**
+ * Test class for DirectInStoreCompactor.
+ */

Review comment:
   Indeed, addressing it on next commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For 

[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-21 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r655657711



##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestDefaultCompactor.java
##
@@ -0,0 +1,123 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.regionserver.compactions;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.mockito.Mockito.mock;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.regionserver.HStore;
+import org.apache.hadoop.hbase.regionserver.HStoreFile;
+import org.apache.hadoop.hbase.regionserver.StoreFileWriter;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.testclassification.RegionServerTests;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+
+/**
+ * Test class for DirectInStoreCompactor.
+ */

Review comment:
   Indeed, addressing it on next commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-21 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r655657218



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -533,4 +549,46 @@ protected InternalScanner createScanner(HStore store, 
ScanInfo scanInfo,
 return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, 
earliestPutTs,
 dropDeletesFromRow, dropDeletesToRow);
   }
+
+  /**
+   * Default implementation for committing store files created after a 
compaction. Assumes new files
+   * had been created on a temp directory, so it renames those files into the 
actual store dir,
+   * then create a reader and cache it into the store.
+   * @param cr the compaction request.
+   * @param newFiles the new files created by this compaction under a temp dir.
+   * @param user the running user/
+   * @return A list of the resulting store files already placed in the store 
dir and loaded into the
+   * store cache.
+   * @throws IOException
+   */
+  public List commitCompaction(CompactionRequestImpl cr, 
List newFiles, User user)
+  throws IOException {
+List sfs = new ArrayList<>(newFiles.size());
+for (Path newFile : newFiles) {
+  assert newFile != null;
+  this.store.validateStoreFile(newFile);

Review comment:
   This was already been called before, indirectly, in HStore, via 
`doCompaction -> moveCompactedFilesIntoPlace -> moveFileIntoPlace -> 
validateStoreFile`. I had just merged some of the methods and moved this new 
method in the Compactor class itself.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-21 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, 
List s
* @param fd The file details.
* @return Writer for a new StoreFile in the tmp dir.
* @throws IOException if creation failed
+   * @deprecated Use initWriter instead.

Review comment:
   You mean the one in line #271? Original commit didn't have it, but I had 
added on the second commit from last week.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] wchevreuil commented on a change in pull request #3389: HBASE-25392 Direct insert compacted HFiles into data directory.

2021-06-21 Thread GitBox


wchevreuil commented on a change in pull request #3389:
URL: https://github.com/apache/hbase/pull/3389#discussion_r655650248



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java
##
@@ -266,6 +266,7 @@ public InternalScanner createScanner(ScanInfo scanInfo, 
List s
* @param fd The file details.
* @return Writer for a new StoreFile in the tmp dir.
* @throws IOException if creation failed
+   * @deprecated Use initWriter instead.

Review comment:
   You mean the one in line #271?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org