joshelser commented on a change in pull request #3684:
URL: https://github.com/apache/hbase/pull/3684#discussion_r709584776
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientSideRegionScanner.java
##########
@@ -60,6 +62,14 @@ public ClientSideRegionScanner(Configuration conf, FileSystem fs,
     region = HRegion.newHRegion(CommonFSUtils.getTableDir(rootDir, htd.getTableName()), null, fs,
       conf, hri, htd, null);
     region.setRestoredRegion(true);
+    // non RS process does not have a block cache, and this is a client side scanner,
+    // create one for MapReduce jobs to cache the INDEX block
+    conf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
+      conf.getFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY,
+        HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_DEFAULT));
Review comment:
I guess we could actually do this by doing the opposite of the `MemorySizeUtil.getOnHeapCacheSize` method:
1. Take a discrete value from the user (e.g. 32MB).
2. Compute 32MB / the max heap size reported by `MemorySizeUtil.safeGetHeapMemoryUsage()`.
3. Set the ratio from step 2 as the block cache size (see the sketch below).
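Something like this (untested; the `hbase.client.scanner.blockcache.bytes` key is made up purely for illustration — only the `MemorySizeUtil` and `HConstants` pieces are real):
```java
import java.lang.management.MemoryUsage;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.util.MemorySizeUtil;

public final class ClientBlockCacheSizing {
  private ClientBlockCacheSizing() {
  }

  /** Convert a discrete byte budget into the heap ratio the block cache config expects. */
  static void setBlockCacheRatioFromBytes(Configuration conf) {
    // Hypothetical key: lets the user ask for a discrete size, e.g. 32MB.
    long requestedBytes = conf.getLong("hbase.client.scanner.blockcache.bytes", 32L * 1024 * 1024);
    // The opposite of MemorySizeUtil.getOnHeapCacheSize: bytes -> fraction of max heap.
    MemoryUsage heap = MemorySizeUtil.safeGetHeapMemoryUsage();
    conf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY, (float) requestedBytes / heap.getMax());
  }
}
```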
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientSideRegionScanner.java
##########
@@ -60,6 +62,14 @@ public ClientSideRegionScanner(Configuration conf, FileSystem fs,
     region = HRegion.newHRegion(CommonFSUtils.getTableDir(rootDir, htd.getTableName()), null, fs,
       conf, hri, htd, null);
     region.setRestoredRegion(true);
+    // non RS process does not have a block cache, and this is a client side scanner,
+    // create one for MapReduce jobs to cache the INDEX block
+    conf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
+      conf.getFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY,
+        HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_DEFAULT));
Review comment:
I wish we had the ability to set a discrete value here. For clients, I don't think the assumption that this blockcache should be sized as a percentage of total heap is as valid as it is server-side. Not sure if this is something we could easily implement.
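For reference, with the change as written, the manual workaround is to compute the fraction yourself before constructing the scanner (the 32MB figure is just an example):
```java
// Express a desired 32MB cache as a fraction of this JVM's max heap.
long maxHeap = Runtime.getRuntime().maxMemory();
conf.setFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY,
  (float) (32L * 1024 * 1024) / maxHeap);
```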
##########
File path:
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientSideRegionScanner.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertTrue;
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtil;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.io.hfile.BlockCache;
+import org.apache.hadoop.hbase.io.hfile.LruBlockCache;
+import org.apache.hadoop.hbase.testclassification.ClientTests;
+import org.apache.hadoop.hbase.testclassification.SmallTests;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category({ SmallTests.class, ClientTests.class })
+public class TestClientSideRegionScanner {
+ @ClassRule
+ public static final HBaseClassTestRule CLASS_RULE =
+ HBaseClassTestRule.forClass(TestClientSideRegionScanner.class);
+
+ private final static HBaseTestingUtil TEST_UTIL = new HBaseTestingUtil();
+
+ private Configuration conf;
+ private Path rootDir;
+ private FileSystem fs;
+ private TableDescriptor htd;
+ private RegionInfo hri;
+ private Scan scan;
+
+ @BeforeClass
+ public static void setUpBeforeClass() throws Exception {
+ TEST_UTIL.startMiniCluster(1);
+ }
+
+ @AfterClass
+ public static void tearDownAfterClass() throws Exception {
+ TEST_UTIL.shutdownMiniCluster();
+ }
+
+ @Before
+ public void setup() throws IOException {
+ conf = TEST_UTIL.getConfiguration();
+ rootDir = TEST_UTIL.getDefaultRootDirPath();
+ fs = TEST_UTIL.getTestFileSystem();
+ htd = TEST_UTIL.getAdmin().getDescriptor(TableName.META_TABLE_NAME);
+ hri = TEST_UTIL.getAdmin().getRegions(TableName.META_TABLE_NAME).get(0);
+ scan = new Scan();
+ }
+
+ @Test
+ public void testDefaultBlockCache() throws IOException {
+ Configuration copyConf = new Configuration(conf);
+ ClientSideRegionScanner clientSideRegionScanner =
+ new ClientSideRegionScanner(copyConf, fs, rootDir, htd, hri, scan, null);
+
+ BlockCache blockCache = clientSideRegionScanner.getRegion().getBlockCache();
+ assertNotNull(blockCache);
+ assertTrue(blockCache instanceof LruBlockCache);
+
+ float actualBlockCacheRatio = copyConf
+ .getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
+ copyConf.getFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY,
+ HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_DEFAULT));
+
+ assertTrue(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_DEFAULT == actualBlockCacheRatio);
+ }
+
+ @Test
+ public void testConfiguredBlockCache() throws IOException {
+ Configuration copyConf = new Configuration(conf);
+ float blockCacheRatio = 0.05f;
+ copyConf.setFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY, blockCacheRatio);
+ ClientSideRegionScanner clientSideRegionScanner =
+ new ClientSideRegionScanner(copyConf, fs, rootDir, htd, hri, scan, null);
+
+ BlockCache blockCache = clientSideRegionScanner.getRegion().getBlockCache();
+ assertNotNull(blockCache);
+ assertTrue(blockCache instanceof LruBlockCache);
Review comment:
Related to the above comment: the `LruBlockCache` here will cache all blocks from the HFile, not just the index blocks, which was the goal.
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientSideRegionScanner.java
##########
@@ -60,6 +62,14 @@ public ClientSideRegionScanner(Configuration conf, FileSystem fs,
     region = HRegion.newHRegion(CommonFSUtils.getTableDir(rootDir, htd.getTableName()), null, fs,
       conf, hri, htd, null);
     region.setRestoredRegion(true);
+    // non RS process does not have a block cache, and this is a client side scanner,
+    // create one for MapReduce jobs to cache the INDEX block
+    conf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
+      conf.getFloat(HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_KEY,
+        HConstants.HBASE_CLIENT_SCANNER_BLOCK_CACHE_SIZE_DEFAULT));
+    // don't allow L2 bucket cache for non RS process to avoid unexpected disk usage.
+    conf.unset(HConstants.BUCKET_CACHE_IOENGINE_KEY);
Review comment:
Agree with Anoop -- no need for anything but an L1 on-heap blockcache (and specifically an L1 that holds only index blocks, not data blocks, which is how we normally do it when we use the bucketcache).
However, because we unset the bucketcache, we don't use `CombinedBlockCache` but only `LruBlockCache` directly in the `ClientSideRegionScanner`. This means the block cache will also be caching data blocks, which I don't think was the spirit of this change. The problem is that I don't see a way for `LruBlockCache` to cache only index blocks.
I feel like we'd have to update `CombinedBlockCache` to allow for a null `L2` and add a special `BlockCacheFactory` call for this special, client-side case; a rough sketch of one alternative is below.
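For the record, here's the other direction I could imagine: an `LruBlockCache` subclass that simply refuses to cache data blocks. The class name and the subclassing approach are hypothetical, not something this PR contains:
```java
package org.apache.hadoop.hbase.io.hfile;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.io.hfile.BlockType.BlockCategory;

/**
 * Sketch: an on-heap L1 cache that retains only non-DATA blocks (index,
 * bloom, meta), so a client-side scanner never fills its heap with data blocks.
 */
public class IndexOnlyLruBlockCache extends LruBlockCache {

  public IndexOnlyLruBlockCache(long maxSize, long blockSize, boolean evictionThread,
      Configuration conf) {
    super(maxSize, blockSize, evictionThread, conf);
  }

  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    // Drop DATA blocks on the floor; cache everything else (index/bloom/meta).
    BlockType type = buf.getBlockType();
    if (type != null && type.getCategory() != BlockCategory.DATA) {
      super.cacheBlock(cacheKey, buf, inMemory);
    }
  }
}
```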
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]