[jira] [Commented] (HBASE-22887) HFileOutputFormat2 split a lot of HFile by roll once per rowkey
[ https://issues.apache.org/jira/browse/HBASE-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935761#comment-16935761 ] Liu Shaohui commented on HBASE-22887: - [~langdamao] Encountered the same problem in a job converting cube data to HFiles in Kylin. Thanks for your explanation~

> HFileOutputFormat2 split a lot of HFile by roll once per rowkey
> ---------------------------------------------------------------
>
>             Key: HBASE-22887
>             URL: https://issues.apache.org/jira/browse/HBASE-22887
>         Project: HBase
>      Issue Type: Bug
>      Components: mapreduce
> Affects Versions: 2.0.0
>     Environment: HBase 2.0.0
>        Reporter: langdamao
>        Priority: Major
>
> When I use HFileOutputFormat2 in an MR job to build HFiles, the reducer creates
> lots of files. Here is the log:
> {code:java}
> 2019-08-16 14:42:51,988 INFO [main] org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
> Writer=hdfs://hfile/_temporary/1/_temporary/attempt_1558444096078_519332_r_16_0/F1/06f3b0e9f0644ee782b7cf4469f44a70, wrote=893827310
> Writer=hdfs://hfile/_temporary/1/_temporary/attempt_1558444096078_519332_r_16_0/F1/1454ea148f1547499209a266ad25387f, wrote=61
> Writer=hdfs://hfile/_temporary/1/_temporary/attempt_1558444096078_519332_r_16_0/F1/9d35446634154b4ca4be56f361b57c8b, wrote=55
> ... {code}
> It keeps writing a new file every time a new rowkey arrives.
> I then added more detailed logging and found the problem. Code is here:
> [GitHub|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java#L289]
> {code:java}
> if (wl != null && wl.written + length >= maxsize) {
>   this.rollRequested = true;
> }
> // This can only happen once a row is finished though
> if (rollRequested && Bytes.compareTo(this.previousRow, rowKey) != 0) {
>   rollWriters(wl);
> }{code}
> In my case, I have two families, F1 and F2, and the writer of F2 reaches
> maxsize, so rollRequested becomes true; but its rowkey is the same as
> previousRow, so the writer is not rolled.
> When the next rowkey arrives with family F1, both rollRequested and
> Bytes.compareTo(this.previousRow, rowKey) != 0 are true, so the writer of F1
> is rolled and a new HFile is created. Then the same rowkey arrives with
> family F2 and sets rollRequested to true again, and the next rowkey with
> family F1 rolls the F1 writer once more.
> So a new HFile is created for every rowkey of family F1, while F2 is never
> rolled until the job ends.
>
> Here are my questions and partial solutions:
> Q1. Does HBase 2.0.0 allow different families of the same table to have
> different rowkey cut points? That is, rowkeyA may be written to the first
> HFile of F1 but to the second HFile of F2. HBase 1.x does not allow this, so
> it rolls all the writers and never hits this problem. I guess the answer is
> "yes, it is allowed", so we go to Q2.
> Q2. Do we allow the same rowkey with the same family to reach
> HFileOutputFormat2.write?
> If not, can we fix it this way, since this rowKey will never equal
> previousRow:
> {code:java}
> if (wl != null && wl.written + length >= maxsize) {
>   rollWriters(wl);
> }{code}
> If yes, we may need a Map to record the previous row per family:
> {code:java}
> private final Map<byte[], byte[]> previousRows = new TreeMap<>(Bytes.BYTES_COMPARATOR);
>
> if (wl != null && wl.written + length >= maxsize &&
>     Bytes.compareTo(this.previousRows.get(family), rowKey) != 0) {
>   rollWriters(wl);
> }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
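The per-family bookkeeping proposed in Q2 can be modeled outside HBase. Below is a minimal, hypothetical sketch of the roll decision, not the actual HFileOutputFormat2 code: it uses `java.util.Arrays::compare` in place of `Bytes.BYTES_COMPARATOR`, plain `HashMap`s keyed by a family name string, and a made-up `write` method, so a writer that crosses maxsize rolls as soon as its own family's row changes, independently of the other family.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy model of the per-family roll decision from Q2 (hypothetical, not HBase code).
public class RollDecision {
    static final long MAXSIZE = 100;

    // Bytes written and previous row, tracked per family.
    static final Map<String, Long> written = new HashMap<>();
    static final Map<String, byte[]> previousRows = new HashMap<>();

    /** Returns true if the writer for this family rolls before this write. */
    static boolean write(String family, byte[] rowKey, long length) {
        long w = written.getOrDefault(family, 0L);
        byte[] prev = previousRows.get(family);
        boolean roll = w + length >= MAXSIZE
            && prev != null
            && Arrays.compare(prev, rowKey) != 0; // roll only once this family's row changes
        if (roll) {
            w = 0; // simulate rollWriters(wl): new file, byte counter reset
        }
        written.put(family, w + length);
        previousRows.put(family, rowKey);
        return roll;
    }

    public static void main(String[] args) {
        byte[] r1 = {1}, r2 = {2}, r3 = {3};
        // F2 reaches maxsize on row r1: no roll yet, same row.
        if (write("F2", r1, 99)) throw new AssertionError();
        if (write("F2", r1, 10)) throw new AssertionError();
        // The next row for F2 rolls F2's writer; F1 is unaffected.
        if (!write("F2", r2, 10)) throw new AssertionError();
        if (write("F1", r2, 10)) throw new AssertionError();
        // F1 is far below maxsize, so a new row does not roll it.
        if (write("F1", r3, 10)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

With this bookkeeping, F2 rolls on its own row boundary instead of forcing a roll of F1 on every new rowkey.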
[jira] [Updated] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15420:
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> TestCacheConfig failed after HBASE-15338
> ----------------------------------------
>
>         Key: HBASE-15420
>         URL: https://issues.apache.org/jira/browse/HBASE-15420
>     Project: HBase
>  Issue Type: Test
>  Components: test
>    Reporter: Liu Shaohui
>    Assignee: Liu Shaohui
>    Priority: Minor
>     Fix For: 2.0.0
>
> Attachments: HBASE-15420-v1.diff
>
>
> TestCacheConfig failed after HBASE-15338.
> Fix it in this issue~

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184673#comment-15184673 ] Liu Shaohui commented on HBASE-15420: - [~Apache9] [~anoop.hbase] Thanks for your review~ If there are no objections, I will commit tomorrow.
[jira] [Commented] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184668#comment-15184668 ] Liu Shaohui commented on HBASE-15420: - [~anoop.hbase]
{quote}
BTW do we have tests in CacheConfig which considers configs specified within an HCD also? If not will be good to add some
{quote}
Yes. See TestCacheConfig#269:
{code:java}
...
conf.setBoolean(CacheConfig.CACHE_DATA_ON_READ_KEY, true);
conf.setBoolean(CacheConfig.CACHE_BLOCKS_ON_WRITE_KEY, false);
HColumnDescriptor family = new HColumnDescriptor("testDisableCacheDataBlock");
family.setBlockCacheEnabled(false);
cacheConfig = new CacheConfig(conf, family);
assertFalse(cacheConfig.shouldCacheBlockOnRead(BlockCategory.DATA));
assertFalse(cacheConfig.shouldCacheCompressed(BlockCategory.DATA));
assertFalse(cacheConfig.shouldCacheDataCompressed());
assertFalse(cacheConfig.shouldCacheDataOnWrite());
assertFalse(cacheConfig.shouldCacheDataOnRead());
assertTrue(cacheConfig.shouldCacheBlockOnRead(BlockCategory.INDEX));
assertFalse(cacheConfig.shouldCacheBlockOnRead(BlockCategory.META));
assertTrue(cacheConfig.shouldCacheBlockOnRead(BlockCategory.BLOOM));
assertTrue(cacheConfig.shouldCacheBloomsOnWrite());
assertTrue(cacheConfig.shouldCacheIndexesOnWrite());
{code}
[jira] [Commented] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184286#comment-15184286 ] Liu Shaohui commented on HBASE-15420: - [~Apache9] [~stack] [~anoop.hbase] Could you please help review this patch?
[jira] [Commented] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184282#comment-15184282 ] Liu Shaohui commented on HBASE-15420: - The test failed because of patch v7 in HBASE-15338. In that version, the meta blocks don't need to be cached if cache-on-read is disabled, so just changing the assertTrue to assertFalse fixes the failing test~
[jira] [Updated] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15420: Attachment: HBASE-15420-v1.diff Fix the failed test~
[jira] [Updated] (HBASE-15420) TestCacheConfig failed after HBASE-15338
[ https://issues.apache.org/jira/browse/HBASE-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15420: Status: Patch Available (was: Open)
[jira] [Created] (HBASE-15420) TestCacheConfig failed after HBASE-15338
Liu Shaohui created HBASE-15420:
-----------------------------------

             Summary: TestCacheConfig failed after HBASE-15338
                 Key: HBASE-15420
                 URL: https://issues.apache.org/jira/browse/HBASE-15420
             Project: HBase
          Issue Type: Test
          Components: test
            Reporter: Liu Shaohui
            Assignee: Liu Shaohui
            Priority: Minor
             Fix For: 2.0.0

TestCacheConfig failed after HBASE-15338.
Fix it in this issue~
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184271#comment-15184271 ] Liu Shaohui commented on HBASE-15338: - Sorry for that. I have opened an issue to fix it.

> Add a option to disable the data block cache for testing the performance of
> underlying file system
> --------------------------------------------------------------------------
>
>         Key: HBASE-15338
>         URL: https://issues.apache.org/jira/browse/HBASE-15338
>     Project: HBase
>  Issue Type: Improvement
>  Components: integration tests
>    Reporter: Liu Shaohui
>    Assignee: Liu Shaohui
>    Priority: Minor
>     Fix For: 2.0.0
>
> Attachments: HBASE-15338-trunk-v1.diff, HBASE-15338-trunk-v2.diff,
> HBASE-15338-trunk-v3.diff, HBASE-15338-trunk-v4.diff,
> HBASE-15338-trunk-v5.diff, HBASE-15338-trunk-v6.diff,
> HBASE-15338-trunk-v7.diff
>
>
> When testing and comparing the performance of different file systems (HDFS,
> Azure Blob Storage, AWS S3 and so on) for HBase, it's better to avoid the
> effect of the HBase BlockCache and get the actual random read latency when
> a data block is read from the underlying file system. (Usually, the index
> blocks and meta blocks should be cached in memory during the testing.)
> So we add an option in CacheConfig to disable the data block cache.
> Suggestions are welcomed~ Thanks
[jira] [Created] (HBASE-15409) TestHFileBackedByBucketCache failed randomly on jdk8
Liu Shaohui created HBASE-15409:
-----------------------------------

             Summary: TestHFileBackedByBucketCache failed randomly on jdk8
                 Key: HBASE-15409
                 URL: https://issues.apache.org/jira/browse/HBASE-15409
             Project: HBase
          Issue Type: Test
            Reporter: Liu Shaohui
            Assignee: Liu Shaohui
            Priority: Minor

When running the small tests, we found that TestHFileBackedByBucketCache failed randomly:
{code}
mvn clean package install -DrunSmallTests -Dtest=TestHFileBackedByBucketCache

Running org.apache.hadoop.hbase.io.hfile.TestHFileBackedByBucketCache
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.262 sec <<< FAILURE! - in org.apache.hadoop.hbase.io.hfile.TestHFileBackedByBucketCache
testBucketCacheCachesAndPersists(org.apache.hadoop.hbase.io.hfile.TestHFileBackedByBucketCache)  Time elapsed: 0.69 sec  <<< FAILURE!
java.lang.AssertionError: expected:<5> but was:<4>
	at org.apache.hadoop.hbase.io.hfile.TestHFileBackedByBucketCache.testBucketCacheCachesAndPersists(TestHFileBackedByBucketCache.java:161)
{code}
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182749#comment-15182749 ] Liu Shaohui commented on HBASE-15338: - Committed to master~
[jira] [Closed] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui closed HBASE-15338.
[jira] [Updated] (HBASE-15391) Avoid too large "deleted from META" info log
[ https://issues.apache.org/jira/browse/HBASE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15391: Attachment: HBASE-15391-trunk-v2.diff Updated for Duo Zhang's review.

> Avoid too large "deleted from META" info log
> --------------------------------------------
>
>         Key: HBASE-15391
>         URL: https://issues.apache.org/jira/browse/HBASE-15391
>     Project: HBase
>  Issue Type: Improvement
>    Reporter: Liu Shaohui
>    Assignee: Liu Shaohui
>    Priority: Minor
>     Fix For: 2.0.0
>
> Attachments: HBASE-15391-trunk-v1.diff, HBASE-15391-trunk-v2.diff
>
>
> When deleting a large table in HBase, there will be a very large info log in
> the HMaster:
> {code}
> 2016-02-29,05:58:45,920 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
> Deleted [{ENCODED => 4b54572150941cd03f5addfdeab0a754, NAME =>
> 'YCSBTest,,1453186492932.4b54572150941cd03f5addfdeab0a754.', STARTKEY => '',
> ENDKEY => 'user01'}, {ENCODED => 715e142bcd6a31d7842abf286ef8a5fe, NAME =>
> 'YCSBTest,user01,1453186492933.715e142bcd6a31d7842abf286ef8a5fe.', STARTKEY
> => 'user01', ENDKEY => 'user02'}, {ENCODED =>
> 5f9cef5714973f13baa63fba29a68d70, NAME =>
> 'YCSBTest,user02,1453186492933.5f9cef5714973f13baa63fba29a68d70.', STARTKEY
> => 'user02', ENDKEY => 'user03'}, {ENCODED =>
> 86cf3fa4c0a6b911275512c1d4b78533, NAME => 'YCSBTest,user0...
> {code}
> The reason is that MetaTableAccessor logs all the regions when deleting them
> from meta. See MetaTableAccessor.java#deleteRegions:
> {code}
> public static void deleteRegions(Connection connection,
>     List<HRegionInfo> regionsInfo, long ts) throws IOException {
>   List<Delete> deletes = new ArrayList<>(regionsInfo.size());
>   for (HRegionInfo hri: regionsInfo) {
>     Delete e = new Delete(hri.getRegionName());
>     e.addFamily(getCatalogFamily(), ts);
>     deletes.add(e);
>   }
>   deleteFromMetaTable(connection, deletes);
>   LOG.info("Deleted " + regionsInfo);
> }
> {code}
> Just change that info log to debug and add an info log with the number of
> deleted regions.
> Other suggestions are welcome~
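The proposed change (full region list at debug level, count at info level) can be sketched in isolation. The class below is a hypothetical, self-contained model using java.util.logging instead of HBase's actual logging setup, with region names reduced to plain strings:

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the proposed logging change: full list at debug, count at info.
public class DeleteRegionsLogging {
    static final Logger LOG = Logger.getLogger(DeleteRegionsLogging.class.getName());

    static String logDeleted(List<String> regionsInfo) {
        if (LOG.isLoggable(Level.FINE)) {        // was: LOG.info("Deleted " + regionsInfo)
            LOG.fine("Deleted " + regionsInfo);  // potentially huge; debug only
        }
        String msg = "Deleted " + regionsInfo.size() + " regions from META";
        LOG.info(msg);                           // small, bounded message
        return msg;
    }

    public static void main(String[] args) {
        String msg = logDeleted(Arrays.asList("region-a", "region-b", "region-c"));
        if (!msg.equals("Deleted 3 regions from META")) throw new AssertionError();
        System.out.println(msg);
    }
}
```

The guard around the debug line keeps the large string concatenation from running at all when debug logging is off.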
[jira] [Commented] (HBASE-15391) Avoid too large "deleted from META" info log
[ https://issues.apache.org/jira/browse/HBASE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182620#comment-15182620 ] Liu Shaohui commented on HBASE-15391: - [~Apache9] Sorry about this typo. Fixed in patch v2. Thanks~
[jira] [Commented] (HBASE-15391) Avoid too large "deleted from META" info log
[ https://issues.apache.org/jira/browse/HBASE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182602#comment-15182602 ] Liu Shaohui commented on HBASE-15391: - [~stack] [~Apache9] Could you please help review this patch? Thanks~
[jira] [Updated] (HBASE-15391) Avoid too large "deleted from META" info log
[ https://issues.apache.org/jira/browse/HBASE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15391: Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-15391) Avoid too large "deleted from META" info log
[ https://issues.apache.org/jira/browse/HBASE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15391: Attachment: HBASE-15391-trunk-v1.diff Simple patch to master
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179372#comment-15179372 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] Sorry, the patch has not been committed yet~ I just added the release note first, and will commit it if there are no objections. Any more suggestions about patch v7? Thanks~
[jira] [Created] (HBASE-15391) Avoid too large "deleted from META" info log
Liu Shaohui created HBASE-15391:
-----------------------------------

             Summary: Avoid too large "deleted from META" info log
                 Key: HBASE-15391
                 URL: https://issues.apache.org/jira/browse/HBASE-15391
             Project: HBase
          Issue Type: Improvement
            Reporter: Liu Shaohui
            Assignee: Liu Shaohui
            Priority: Minor
             Fix For: 2.0.0

When deleting a large table in HBase, there will be a very large info log in the HMaster:
{code}
2016-02-29,05:58:45,920 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => 4b54572150941cd03f5addfdeab0a754, NAME => 'YCSBTest,,1453186492932.4b54572150941cd03f5addfdeab0a754.', STARTKEY => '', ENDKEY => 'user01'}, {ENCODED => 715e142bcd6a31d7842abf286ef8a5fe, NAME => 'YCSBTest,user01,1453186492933.715e142bcd6a31d7842abf286ef8a5fe.', STARTKEY => 'user01', ENDKEY => 'user02'}, {ENCODED => 5f9cef5714973f13baa63fba29a68d70, NAME => 'YCSBTest,user02,1453186492933.5f9cef5714973f13baa63fba29a68d70.', STARTKEY => 'user02', ENDKEY => 'user03'}, {ENCODED => 86cf3fa4c0a6b911275512c1d4b78533, NAME => 'YCSBTest,user0...
{code}
The reason is that MetaTableAccessor logs all the regions when deleting them from meta. See MetaTableAccessor.java#deleteRegions:
{code}
public static void deleteRegions(Connection connection,
    List<HRegionInfo> regionsInfo, long ts) throws IOException {
  List<Delete> deletes = new ArrayList<>(regionsInfo.size());
  for (HRegionInfo hri: regionsInfo) {
    Delete e = new Delete(hri.getRegionName());
    e.addFamily(getCatalogFamily(), ts);
    deletes.add(e);
  }
  deleteFromMetaTable(connection, deletes);
  LOG.info("Deleted " + regionsInfo);
}
{code}
Just change that info log to debug and add an info log with the number of deleted regions. Other suggestions are welcome~
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v7.diff Updated the patch for [~jingcheng...@intel.com]'s suggestion~
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179343#comment-15179343 ] Liu Shaohui commented on HBASE-15338: - [~jingcheng...@intel.com] The meta blocks don't need to be cached if cache-on-read is disabled. This is the behavior expected by the current implementation (without this patch). I will remove the line "category == BlockCategory.META" and keep this behavior.
[jira] [Updated] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Resolution: Fixed Hadoop Flags: Reviewed Release Note: Add a new config, hbase.block.data.cacheonread, which is a global switch for caching data blocks on read. The default value of this switch is true: data blocks will be cached on read if the block cache is enabled for the family and the cacheBlocks flag is set to true for the Get or Scan operation. If this global switch is set to false, data blocks won't be cached even if the block cache is enabled for the family and the cacheBlocks flag of the Get or Scan is set to true. Bloom blocks and index blocks are always cached as long as the block cache of the regionserver is enabled. One use of this switch is performance testing the extreme case where all data-block reads miss the cache and every data block is read from the underlying file system. Status: Resolved (was: Patch Available)
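The interaction of the three flags described in the release note can be modeled with a small standalone sketch. This is plain Java with illustrative names, not the actual CacheConfig API; only the config key hbase.block.data.cacheonread comes from the release note itself:

```java
// Standalone model of the data-block caching decision described in the
// release note. Names are illustrative; the real logic lives in HBase's
// CacheConfig and HFileReaderImpl.
public class CacheOnReadModel {

    static boolean decide(boolean globalCacheDataOnRead,   // hbase.block.data.cacheonread
                          boolean familyBlockCacheEnabled, // per-family BLOCKCACHE setting
                          boolean requestCacheBlocks) {    // cacheBlocks flag on the Get/Scan
        // A data block is cached on read only when every switch agrees;
        // index and bloom blocks are not governed by the global switch.
        return globalCacheDataOnRead && familyBlockCacheEnabled && requestCacheBlocks;
    }

    public static void main(String[] args) {
        // Defaults: all three switches true -> the data block is cached.
        if (!decide(true, true, true)) throw new AssertionError();
        // Flipping only the global switch disables data-block caching even
        // though the family and the Scan still request it.
        if (decide(false, true, true)) throw new AssertionError();
        System.out.println("global=false disables data block caching");
    }
}
```

Under this model, the global switch acts as a hard override, which matches the "very dangerous but useful for perf tests" characterization in the comments below.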
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179220#comment-15179220 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] Currently, if the per-family switch for read caching is set to false, data blocks will not be cached even if the cacheBlocks flag of a Get or Scan is set to true. See HFileReaderImpl#1536. I think the behavior of the global switch is consistent with that of the per-family switch.
{code}
// Cache the block if necessary
if (cacheBlock && cacheConf.shouldCacheBlockOnRead(category)) {
  cacheConf.getBlockCache().cacheBlock(cacheKey,
      cacheConf.shouldCacheCompressed(category) ? hfileBlock : unpacked,
      cacheConf.isInMemory(), this.cacheConf.isCacheDataInL1());
}
{code}
[jira] [Resolved] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
[ https://issues.apache.org/jira/browse/HBASE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui resolved HBASE-15385. - Resolution: Invalid Fix Version/s: (was: 3.0.0) > A failed atomic folder rename operation can never recover when the > destination file is deleted in the Wasb filesystem > - > > Key: HBASE-15385 > URL: https://issues.apache.org/jira/browse/HBASE-15385 > Project: HBase > Issue Type: Bug > Components: hadoop-azure >Reporter: Liu Shaohui >Priority: Critical > > When using the Wasb file system, we found that a failed atomic folder rename operation can never recover after the destination file has been deleted. > {quote} > ls: Attempting to complete rename of file > hbase/azurtst-xiaomi/data/default/YCSBTest/.tabledesc during folder rename > redo, and file was not found in source or destination. > {quote} > The reason is that the file was renamed to the destination before the crash, and the destination file was then deleted by another process after the crash. So the recovery is blocked while finishing the rename operation for this file, because neither the source nor the destination file exists. > See: NativeAzureFileSystem.java #finishSingleFileRename > Another serious problem is that the redo of an atomic rename operation may delete a newly created file with the same name as the source file, because the file system doesn't check whether there are rename operations that need to be redone. > Suggestions are welcomed~
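The stuck-recovery scenario can be sketched as a minimal standalone model. The class and method names below are hypothetical; the real redo logic lives in Hadoop's NativeAzureFileSystem#finishSingleFileRename, which this only caricatures:

```java
// Hypothetical model of redoing a single-file rename from a pending rename
// journal: the redo inspects which of the two paths still exists.
public class RenameRedoSketch {

    static String redoSingleFileRename(boolean sourceExists, boolean destExists) {
        if (sourceExists) return "rename source -> destination";
        if (destExists)   return "rename already completed, nothing to do";
        // The problematic branch from the issue: the rename completed before
        // the crash and another process then deleted the destination, so
        // neither path exists and the recovery cannot make progress.
        return "STUCK: file not found in source or destination";
    }

    public static void main(String[] args) {
        // Normal redo: the crash happened after the rename finished.
        System.out.println(redoSingleFileRename(false, true));
        // The reported failure: destination was deleted after the crash.
        System.out.println(redoSingleFileRename(false, false));
    }
}
```

The sketch makes the gap visible: the redo has no way to distinguish "rename finished and destination was legitimately deleted" from a genuinely corrupt state, so it reports the file missing in both source and destination.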
[jira] [Closed] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
[ https://issues.apache.org/jira/browse/HBASE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui closed HBASE-15385. ---
[jira] [Commented] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
[ https://issues.apache.org/jira/browse/HBASE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177837#comment-15177837 ] Liu Shaohui commented on HBASE-15385: - Sorry, this is an issue for Hadoop. Closing it~
[jira] [Commented] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
[ https://issues.apache.org/jira/browse/HBASE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177822#comment-15177822 ] Liu Shaohui commented on HBASE-15385: - [~cnauroth]
[jira] [Updated] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
[ https://issues.apache.org/jira/browse/HBASE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15385: Summary: A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem (was: A failed atomic folder rename operation can never recover for a destination file deleted in the Wasb filesystem)
[jira] [Created] (HBASE-15385) A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem
Liu Shaohui created HBASE-15385: --- Summary: A failed atomic folder rename operation can never recover when the destination file is deleted in the Wasb filesystem Key: HBASE-15385 URL: https://issues.apache.org/jira/browse/HBASE-15385 Project: HBase Issue Type: Bug Components: hadoop-azure Reporter: Liu Shaohui Priority: Critical Fix For: 3.0.0
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177472#comment-15177472 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] {quote} And what abt enabled on Scan level? It is not set to true but Scan sets as cache we wont cache still right? {quote} Yes. If the global switch for read caching is set to false, we won't cache data blocks even if the cacheBlocks flag of a Get or Scan is set to true. It's a very dangerous switch, but it's useful in some cases like performance tests.
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177399#comment-15177399 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] [~stack] [~jingcheng...@intel.com] Any other suggestions? I will commit it tomorrow if there are no objections. Thanks~
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175187#comment-15175187 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] {quote} + conf.getBoolean(CACHE_DATA_ON_READ_KEY, DEFAULT_CACHE_DATA_ON_READ) + && family.isBlockCacheEnabled(), For other configs we have || condition btw global one and family specific. Why this is different? There was one issue with this discussion. I forgot which issue and whether we have closed that or not. {quote} By default, the global switch for CACHE_DATA_ON_READ is true, and the per-family switch is true too. If we want to disable the data cache on read globally or per family, we only need to flip one switch when using the && condition. If we used the || condition, changing only one switch would have no effect.
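The point about `&&` versus `||` reduces to a two-line truth check. This is a standalone sketch, not HBase code; it only demonstrates the boolean argument made above:

```java
// Contrast the two ways of combining the global switch with the per-family
// switch when an operator disables caching for a single family.
public class SwitchCombination {
    public static void main(String[] args) {
        boolean global = true;   // default value of the global switch
        boolean family = false;  // family-level caching disabled

        boolean withAnd = global && family; // false: the one disabled switch wins
        boolean withOr  = global || family; // true: flipping one switch alone did nothing

        if (withAnd) throw new AssertionError("&& must honor the single disabled switch");
        if (!withOr) throw new AssertionError("|| should still be true here");
        System.out.println("&&=" + withAnd + " ||=" + withOr);
    }
}
```

With `&&`, either switch alone can disable caching; with `||`, both switches would have to be flipped before the combined value changes, which is why the patch uses `&&`.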
[jira] [Closed] (HBASE-15312) Update the pom dependencies for the mini cluster in the HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui closed HBASE-15312. --- > Update the pom dependencies for the mini cluster in the HBase Book > --- > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15312-trunk-v1.diff, HBASE-15312-trunk-v2.diff > > > In the HBase Book, the pom dependencies for the mini cluster are outdated after version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster
[jira] [Commented] (HBASE-15312) Update the pom dependencies for the mini cluster in the HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175178#comment-15175178 ] Liu Shaohui commented on HBASE-15312: - Committed the addendum to master. Thanks all for reviewing~
[jira] [Commented] (HBASE-15312) Update the pom dependencies for the mini cluster in the HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171363#comment-15171363 ] Liu Shaohui commented on HBASE-15312: - Thanks, [~stack]. I will revert patch v1 and commit patch v2 tomorrow if there are no objections.
[jira] [Updated] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v6.diff Fix the checkstyle error.
[jira] [Updated] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v5.diff Fix the checkstyle errors.
[jira] [Updated] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v4.diff Updated the patch~ Thanks to [~jingcheng...@intel.com] and [~anoop.hbase].
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168733#comment-15168733 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] {quote} Why we need this new config? Why can not we rely on HCD setting? {quote} I think it's better to have a global switch. {quote} This may be the issue you are saying? This is called from getMetaBlock(). As per the comment, when we read meta blocks, we must cache it. As we do not pass any type we seems may not do that.. That is a bug IMO.. So we better correct that bug (Any other?) and test ur case with HCD setting? And ya as per Jingcheng suggestion, we need consider META block category as well? {quote} Thanks for pointing that out. I will update the patch according to Jingcheng's suggestion and add more tests.
[jira] [Commented] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168728#comment-15168728 ] Liu Shaohui commented on HBASE-15338: - [~jingcheng...@intel.com] {quote} So the meta blocks don't need to be cached if cache on read is disabled? Is it done on purpose? {quote} No. I will update the patch later according to your advice.
[jira] [Commented] (HBASE-15327) Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled
[ https://issues.apache.org/jira/browse/HBASE-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168643#comment-15168643 ] Liu Shaohui commented on HBASE-15327: - LGTM~ Thanks [~cuijianwei] > Canary will always invoke admin.balancer() in each sniffing period when > writeSniffing is enabled > > > Key: HBASE-15327 > URL: https://issues.apache.org/jira/browse/HBASE-15327 > Project: HBase > Issue Type: Bug > Components: canary >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15327-trunk.patch > > > When Canary#writeSniffing is enabled, Canary#checkWriteTableDistribution makes sure the regions of the write table are distributed across all region servers, as follows: > {code} > int numberOfServers = admin.getClusterStatus().getServers().size(); > .. > int numberOfCoveredServers = serverSet.size(); > if (numberOfCoveredServers < numberOfServers) { > admin.balancer(); > } > {code} > The master also works as a regionserver, so ClusterStatus#getServers will contain the master. On the other hand, the write table of the Canary will never be assigned to the master, making numberOfCoveredServers always smaller than numberOfServers, so admin.balancer() is invoked in every sniffing period. This may cause frequent region moves. A simple fix is to exclude the master from numberOfServers.
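The proposed fix, excluding the master from the server count before comparing coverage, can be sketched with plain collections. The variable names follow the snippet in the issue description; the server names and everything else are illustrative, not the actual Canary code:

```java
import java.util.HashSet;
import java.util.Set;

public class CanaryCoverageSketch {
    public static void main(String[] args) {
        // Servers reported by ClusterStatus#getServers: per the issue, the
        // master also appears in this list.
        Set<String> clusterServers = new HashSet<>();
        clusterServers.add("master:16000");
        clusterServers.add("rs1:16020");
        clusterServers.add("rs2:16020");

        // Servers actually hosting regions of the Canary write table; the
        // write table is never assigned to the master.
        Set<String> serverSet = new HashSet<>();
        serverSet.add("rs1:16020");
        serverSet.add("rs2:16020");

        // Buggy comparison: the master inflates numberOfServers, so the
        // balancer would be invoked on every sniffing period.
        int numberOfServers = clusterServers.size();
        boolean buggyTriggersBalancer = serverSet.size() < numberOfServers;

        // Fixed comparison: exclude the master before comparing.
        clusterServers.remove("master:16000");
        boolean fixedTriggersBalancer = serverSet.size() < clusterServers.size();

        if (!buggyTriggersBalancer) throw new AssertionError("buggy path should trigger");
        if (fixedTriggersBalancer) throw new AssertionError("fixed path should not trigger");
        System.out.println("balancer triggered: buggy=true, fixed=false");
    }
}
```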
[jira] [Updated] (HBASE-15338) Add an option to disable the data block cache for testing the performance of the underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v3.diff Making cache-on-read in CacheConfig configurable, per [~jingcheng...@intel.com] and [~anoop.hbase]'s suggestions. Thanks
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168616#comment-15168616 ] Liu Shaohui commented on HBASE-15338: - [~jingcheng...@intel.com] [~anoop.hbase] Agreed with you. Making cache on read in CacheConfig configurable will work. Thanks for your suggestions~ I will update the patch later~ > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff, HBASE-15338-trunk-v2.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v2.diff Update the comments > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff, HBASE-15338-trunk-v2.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168564#comment-15168564 ] Liu Shaohui commented on HBASE-15338: - Thanks [~anoop.hbase] When a table is created with data block caching set to false in the CF config, the index and meta blocks will not be cached either, according to the code. This is not what we want. > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168556#comment-15168556 ] Liu Shaohui commented on HBASE-15338: - Thanks [~jingcheng...@intel.com] {quote} I meant just making cache on read in CacheConfig configurable (a false value for cache on read) can work as well instead of add a new global switch. {quote} If we make cache on read in CacheConfig configurable and set it to false, the index and meta blocks will not be cached. This is not what we expect. Of course, we can update the meaning of this config by changing the code. And the data block may also be cached if we set hbase.rs.cacheblocksonwrite to be true. > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168533#comment-15168533 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] {quote} ut my thinking was when we specify cache blocks as false in Get/Scan, we will consider that for DATA blocks only. The index blocks will get cached any way. Did not see the code now.. Can check {quote} After checking the code in HFileReaderImpl.java#1533, the index blocks will also not be cached when we specify cache blocks as false in Get/Scan, because the var cacheBlock is false. What's more, we don't want to change the client code when using 3rd-party benchmark tools like YCSB. {code} // Cache the block if necessary if (cacheBlock && cacheConf.shouldCacheBlockOnRead(category)) { cacheConf.getBlockCache().cacheBlock(cacheKey, cacheConf.shouldCacheCompressed(category) ? hfileBlock : unpacked, cacheConf.isInMemory(), this.cacheConf.isCacheDataInL1()); } {code} > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
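The direction the discussion takes here — a switch that skips caching only DATA blocks while index and meta blocks stay cacheable — can be sketched as a small decision function. The names (shouldCacheOnRead, cacheDataOnRead) are illustrative assumptions, not the actual CacheConfig API:

```java
class DataBlockCacheSwitch {
    enum BlockCategory { DATA, INDEX, META }

    // Hypothetical sketch of the option proposed in HBASE-15338: a single
    // CacheConfig-style flag that disables caching of DATA blocks only, so
    // index/meta blocks keep serving from memory and reads measure the raw
    // file-system latency of data blocks.
    static boolean shouldCacheOnRead(BlockCategory category, boolean cacheDataOnRead) {
        if (category == BlockCategory.DATA) {
            return cacheDataOnRead; // only DATA blocks are affected by the switch
        }
        return true; // index and meta blocks stay cached for realistic latency
    }

    public static void main(String[] args) {
        // With the switch off, only DATA blocks skip the cache.
        System.out.println(shouldCacheOnRead(BlockCategory.DATA, false));  // prints: false
        System.out.println(shouldCacheOnRead(BlockCategory.INDEX, false)); // prints: true
    }
}
```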
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168522#comment-15168522 ] Liu Shaohui commented on HBASE-15338: - [~chenheng] [~jingcheng...@intel.com] {quote} what's the difference if we just set hfile.block.cache.size to be 0. {quote} In this case, index blocks and meta blocks will not be cached either and will need to be read from the file system for every get/scan. The latency will be very bad and will not reflect the usual state of an HBase cluster, where the amount of data is much larger than the total memory. I think tests for this case are not meaningful. > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Status: Patch Available (was: Open) > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Fix Version/s: 2.0.0 > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168498#comment-15168498 ] Liu Shaohui commented on HBASE-15338: - [~anoop.hbase] {quote} During these tests, we can specify cache blocks as false in Scan/Get and achieve what you want? {quote} Not exactly. If we specify cache blocks as false, the index blocks and meta blocks need to be read from the underlying file system for every Scan/Get, which is not consistent with the actual state of the HBase cluster. Usually the index block and meta block can be cached in the BlockCache. What we want to test is the latency in the bad case when a data block is not in the cache and needs to be read from the file system for a get. > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Attachment: HBASE-15338-trunk-v1.diff Patch for master branch > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-15338-trunk-v1.diff > > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
Liu Shaohui created HBASE-15338: --- Summary: Add a option to disable the data block cache for testing the performance of underlying file system Key: HBASE-15338 URL: https://issues.apache.org/jira/browse/HBASE-15338 Project: HBase Issue Type: Improvement Components: integration tests Reporter: Liu Shaohui Assignee: Liu Shaohui When testing and comparing the performance of different file systems (HDFS, Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the effect of the HBase BlockCache and get the actual random read latency when a data block is read from the underlying file system. (Usually, the index block and meta block should be cached in memory during the testing). So we add an option in CacheConfig to disable the data block cache. Suggestions are welcome~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15338) Add a option to disable the data block cache for testing the performance of underlying file system
[ https://issues.apache.org/jira/browse/HBASE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15338: Priority: Minor (was: Major) > Add a option to disable the data block cache for testing the performance of > underlying file system > -- > > Key: HBASE-15338 > URL: https://issues.apache.org/jira/browse/HBASE-15338 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > > When testing and comparing the performance of different file systems(HDFS, > Azure blob storage, AWS S3 and so on) for HBase, it's better to avoid the > affect of the HBase BlockCache and get the actually random read latency when > data block is read from underlying file system. (Usually, the index block and > meta block should be cached in memory in the testing). > So we add a option in CacheConfig to disable the data block cache. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168425#comment-15168425 ] Liu Shaohui commented on HBASE-15312: - @Appy [~saint@gmail.com] [~eclark] Could you help to review the patch v2? Thanks~ > Update the dependences of pom for mini cluster in HBase Book > > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15312-trunk-v1.diff, HBASE-15312-trunk-v2.diff > > > In HBase book, the dependences of pom for mini cluster is outdated after > version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15312: Attachment: HBASE-15312-trunk-v2.diff Update the patch according to [~eclark]'s suggestion~ > Update the dependences of pom for mini cluster in HBase Book > > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15312-trunk-v1.diff, HBASE-15312-trunk-v2.diff > > > In HBase book, the dependences of pom for mini cluster is outdated after > version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166587#comment-15166587 ] Liu Shaohui commented on HBASE-15312: - [~eclark] {quote} hbase-testing-util is supposed to be the answer to all of this ? Does that not work ? {quote} Yes. After testing, hbase-testing-util works for the mini cluster tests. Thanks for the reminder. [~appy] [~saint@gmail.com] I will update the patch according to [~eclark]'s suggestion. > Update the dependences of pom for mini cluster in HBase Book > > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15312-trunk-v1.diff > > > In HBase book, the dependences of pom for mini cluster is outdated after > version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
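Per [~eclark]'s suggestion above, the long dependency list can likely be replaced by the single hbase-testing-util artifact. A minimal sketch of the pom fragment (the version shown is illustrative; match it to your HBase release):

```xml
<!-- Sketch only: pulls in the mini-cluster test utilities and their
     transitive dependencies; pick the version matching your cluster. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-testing-util</artifactId>
  <version>2.0.0</version>
  <scope>test</scope>
</dependency>
```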
[jira] [Commented] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160082#comment-15160082 ] Liu Shaohui commented on HBASE-15312: - [~appy] Thanks for your review~ {quote} 1. hbase-hadoop-compat is also a dependency in hbase-server/pom.xml. If it'll always be included transitively, maybe remove it from here? {quote} Yes, hbase-hadoop-compat is also a dependency in hbase-server/pom.xml. But the scope of the test-jar of hbase-hadoop-compat is test, so it will not be included transitively. {quote} 2. You are removing hadoop-hdfs' (non test-jar) from dependency, is it really not needed? {quote} hadoop-hdfs is also a dependency in hbase-server/pom.xml and it will be included transitively. > Update the dependences of pom for mini cluster in HBase Book > > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-15312-trunk-v1.diff > > > In HBase book, the dependences of pom for mini cluster is outdated after > version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
[ https://issues.apache.org/jira/browse/HBASE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-15312: Attachment: HBASE-15312-trunk-v1.diff Simple patch to update the dependences for Mini-Cluster. And test it in a java maven project > Update the dependences of pom for mini cluster in HBase Book > > > Key: HBASE-15312 > URL: https://issues.apache.org/jira/browse/HBASE-15312 > Project: HBase > Issue Type: Improvement >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-15312-trunk-v1.diff > > > In HBase book, the dependences of pom for mini cluster is outdated after > version 0.96. > See: > http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15312) Update the dependences of pom for mini cluster in HBase Book
Liu Shaohui created HBASE-15312: --- Summary: Update the dependences of pom for mini cluster in HBase Book Key: HBASE-15312 URL: https://issues.apache.org/jira/browse/HBASE-15312 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor In HBase book, the dependences of pom for mini cluster is outdated after version 0.96. See: http://hbase.apache.org/book.html#_integration_testing_with_an_hbase_mini_cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11393) Replication TableCfs should be a PB object rather than a string
[ https://issues.apache.org/jira/browse/HBASE-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978103#comment-14978103 ] Liu Shaohui commented on HBASE-11393: - [~chenheng] See ReplicationAdmin.java line 370/421. But these usages are only inside HBase, so there is no compatibility problem. > Replication TableCfs should be a PB object rather than a string > --- > > Key: HBASE-11393 > URL: https://issues.apache.org/jira/browse/HBASE-11393 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar > Fix For: 2.0.0 > > > We concatenate the list of tables and column families in format > "table1:cf1,cf2;table2:cfA,cfB" in zookeeper for table-cf to replication peer > mapping. > This results in ugly parsing code. We should do this as a PB object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui closed HBASE-14591. --- > Region with reference hfile may split after a forced split in > IncreasingToUpperBoundRegionSplitPolicy > - > > Key: HBASE-14591 > URL: https://issues.apache.org/jira/browse/HBASE-14591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: HBASE-14591-v1.patch > > > In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having > hfile reference may split after a forced split. This will break many > assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14591: Resolution: Fixed Fix Version/s: 0.98.16 1.1.3 1.0.3 1.3.0 1.2.0 Status: Resolved (was: Patch Available) > Region with reference hfile may split after a forced split in > IncreasingToUpperBoundRegionSplitPolicy > - > > Key: HBASE-14591 > URL: https://issues.apache.org/jira/browse/HBASE-14591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: HBASE-14591-v1.patch > > > In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having > hfile reference may split after a forced split. This will break many > assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
Liu Shaohui created HBASE-14591: --- Summary: Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy Key: HBASE-14591 URL: https://issues.apache.org/jira/browse/HBASE-14591 Project: HBase Issue Type: Bug Affects Versions: 0.98.15 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having hfile reference may split after a forced split. This will break many assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14591: Attachment: HBASE-14591-v1.patch Simple patch to fix this and add a test > Region with reference hfile may split after a forced split in > IncreasingToUpperBoundRegionSplitPolicy > - > > Key: HBASE-14591 > URL: https://issues.apache.org/jira/browse/HBASE-14591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14591-v1.patch > > > In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having > hfile reference may split after a forced split. This will break many > assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
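The guard the patch presumably adds can be sketched as a small predicate: a region must never be reported as splittable, forced or not, while any of its stores still contains reference hfiles from a previous split. Names here are hypothetical, not the actual patch:

```java
class SplitGuardSketch {
    // Hypothetical sketch of the HBASE-14591 guard in
    // IncreasingToUpperBoundRegionSplitPolicy: reference hfiles must be
    // compacted away before the region can be considered for another split,
    // regardless of size thresholds or a pending forced-split request.
    static boolean shouldSplit(boolean anyStoreHasReferences, boolean sizeExceedsThreshold) {
        if (anyStoreHasReferences) {
            return false; // daughter regions must first compact away references
        }
        return sizeExceedsThreshold;
    }

    public static void main(String[] args) {
        // A store still holding references blocks the split even over threshold.
        System.out.println(shouldSplit(true, true));  // prints: false
        System.out.println(shouldSplit(false, true)); // prints: true
    }
}
```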
[jira] [Updated] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14591: Status: Patch Available (was: Open) > Region with reference hfile may split after a forced split in > IncreasingToUpperBoundRegionSplitPolicy > - > > Key: HBASE-14591 > URL: https://issues.apache.org/jira/browse/HBASE-14591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14591-v1.patch > > > In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having > hfile reference may split after a forced split. This will break many > assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14591) Region with reference hfile may split after a forced split in IncreasingToUpperBoundRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954237#comment-14954237 ] Liu Shaohui commented on HBASE-14591: - Thanks for [~yuzhih...@gmail.com] and [~stack]'s reviews~ I will commit it to all 0.98+ branches if there is no objection by tomorrow. > Region with reference hfile may split after a forced split in > IncreasingToUpperBoundRegionSplitPolicy > - > > Key: HBASE-14591 > URL: https://issues.apache.org/jira/browse/HBASE-14591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14591-v1.patch > > > In the IncreasingToUpperBoundRegionSplitPolicy, a region with a store having > hfile reference may split after a forced split. This will break many > assumptions of design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14517: Attachment: HBASE-14517-v2.patch Fix checkstyle errors > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff, HBASE-14517-v1.patch, > HBASE-14517-v2.patch > > > In production env, regionservers may be removed from the cluster for hardware > problems and rejoined the cluster after the repair. There is a potential risk > that the version of rejoined regionserver may diff from others because the > cluster has been upgraded through many versions. > To solve this, we can show the all regionservers' version in the server list > of master's status page, and highlight the regionserver when its version is > different from the master's version, similar to HDFS-3245 > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14517: Attachment: HBASE-14517-v1.patch Same patch to trigger the ci build. > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff, HBASE-14517-v1.patch > > > In production env, regionservers may be removed from the cluster for hardware > problems and rejoined the cluster after the repair. There is a potential risk > that the version of rejoined regionserver may diff from others because the > cluster has been upgraded through many versions. > To solve this, we can show the all regionservers' version in the server list > of master's status page, and highlight the regionserver when its version is > different from the master's version, similar to HDFS-3245 > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947990#comment-14947990 ] Liu Shaohui commented on HBASE-14517: - [~stack] {quote} Why you move the VersionInfo from RPC to HBase protos? {quote} Because RPC.proto already depends on HBase.proto, there would be a cyclic dependency if VersionInfo were put in RPC.proto. > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff > > > In a production env, regionservers may be removed from the cluster for hardware > problems and rejoin the cluster after repair. There is a potential risk > that the version of a rejoined regionserver may differ from the others because the > cluster has been upgraded through many versions. > To solve this, we can show all regionservers' versions in the server list > of the master's status page, and highlight a regionserver when its version > differs from the master's version, similar to HDFS-3245. > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
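The dependency argument above can be sketched as a hypothetical fragment (message and field names here are illustrative, not the actual file contents):

```protobuf
// RPC.proto already imports HBase.proto. If VersionInfo lived in RPC.proto,
// any message in HBase.proto that wanted to carry a VersionInfo would have
// to import RPC.proto back, creating an import cycle. Keeping VersionInfo
// in HBase.proto keeps the import graph acyclic.

// --- HBase.proto ---
message VersionInfo {
  required string version = 1;
  required string revision = 2;
}

// --- RPC.proto ---
// import "HBase.proto";   // existing direction: RPC.proto -> HBase.proto
message ConnectionHeader {
  optional VersionInfo version_info = 1;
}
```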
[jira] [Updated] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14517: Attachment: HBASE-14517-v1.diff Patch for master. > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff > > > In a production env, regionservers may be removed from the cluster for hardware > problems and rejoin the cluster after repair. There is a potential risk > that the version of a rejoined regionserver may differ from the others because the > cluster has been upgraded through many versions. > To solve this, we can show all regionservers' versions in the server list > of the master's status page, and highlight a regionserver when its version > differs from the master's version, similar to HDFS-3245. > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14517: Fix Version/s: 2.0.0 Status: Patch Available (was: Open) > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff > > > In a production env, regionservers may be removed from the cluster for hardware > problems and rejoin the cluster after repair. There is a potential risk > that the version of a rejoined regionserver may differ from the others because the > cluster has been upgraded through many versions. > To solve this, we can show all regionservers' versions in the server list > of the master's status page, and highlight a regionserver when its version > differs from the master's version, similar to HDFS-3245. > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14517) Show regionserver's version in master status page
Liu Shaohui created HBASE-14517: --- Summary: Show regionserver's version in master status page Key: HBASE-14517 URL: https://issues.apache.org/jira/browse/HBASE-14517 Project: HBase Issue Type: Improvement Components: monitoring Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor In a production env, regionservers may be removed from the cluster for hardware problems and rejoin the cluster after repair. There is a potential risk that the version of a rejoined regionserver may differ from the others because the cluster has been upgraded through many versions. To solve this, we can show all regionservers' versions in the server list of the master's status page, and highlight a regionserver when its version differs from the master's version, similar to HDFS-3245. Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui reopened HBASE-14404: - [~apurtell] There are typos in patch v2, which made the tests fail. All failing tests pass with patch v3. You can see the diff of v2 and v3 in the file v3-v2.diff > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell > Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which, if set to "true", tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is that it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine; we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14404: Attachment: (was: HBASE-14404-0.98-v3.diff) > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Liu Shaohui > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98-v3.patch, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14404: Attachment: v3-v2.diff The diff of patch v2 and v3 > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell > Attachments: HBASE-14404-0.98-v3.diff, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14404: Attachment: HBASE-14404-0.98-v3.diff Patch v3 > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell > Attachments: HBASE-14404-0.98-v3.diff, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906199#comment-14906199 ] Liu Shaohui commented on HBASE-14404: - The build failed because TestBytes was killed. {quote} Running org.apache.hadoop.hbase.util.TestBytes Killed {quote} > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Liu Shaohui > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98-v3.patch, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which, if set to "true", tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is that it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine; we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14404: Assignee: Liu Shaohui Fix Version/s: 0.98.15 Status: Patch Available (was: Reopened) > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Liu Shaohui > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98-v3.diff, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14404: Attachment: HBASE-14404-0.98-v3.patch Patch v3 > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Liu Shaohui > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98-v3.diff, HBASE-14404-0.98-v3.patch, > HBASE-14404-0.98.patch, HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907394#comment-14907394 ] Liu Shaohui commented on HBASE-14404: - Thanks [~apurtell]. It's a very nice feature to have in 0.98. > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Liu Shaohui > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98-v3.patch, HBASE-14404-0.98.patch, > HBASE-14404-0.98.patch, v3-v2.diff > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
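For reference, once the backport lands, the two toggles named in the issue could be set explicitly in hbase-site.xml. A sketch (property names are from the issue text; the comments describe the defaults the issue says it changes):

```xml
<!-- Drop OS page-cache pages behind compaction writes.
     On by default where HBASE-14098 was committed; the 0.98 backport
     would default it to off, so enable it explicitly if wanted. -->
<property>
  <name>hbase.hfile.drop.behind.compaction</name>
  <value>true</value>
</property>

<!-- Use separate (private) store file readers for compactions.
     The backport flips this default from "false" to "true". -->
<property>
  <name>hbase.regionserver.compaction.private.readers</name>
  <value>true</value>
</property>
```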
[jira] [Commented] (HBASE-14373) PerformanceEvaluation tool should support huge number of rows beyond int range
[ https://issues.apache.org/jira/browse/HBASE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14732190#comment-14732190 ] Liu Shaohui commented on HBASE-14373: - This issue duplicates HBASE-13319, which already has a patch. Maybe we should push that issue forward? > PerformanceEvaluation tool should support huge number of rows beyond int range > -- > > Key: HBASE-14373 > URL: https://issues.apache.org/jira/browse/HBASE-14373 > Project: HBase > Issue Type: Improvement > Components: test >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar >Priority: Minor > > We have the test tool “org.apache.hadoop.hbase.PerformanceEvaluation” to evaluate > HBase performance and scalability. > > Suppose this script is executed as below: > {noformat} > hbase org.apache.hadoop.hbase.PerformanceEvaluation --presplit=120 > --rows=1000 randomWrite 500 > {noformat} > Here there are 500 clients in total, and each client has 1000 rows. > As per the code, > {code} > opts.totalRows = opts.perClientRunRows * opts.numClientThreads > {code} > opts.totalRows is an int, so the product of perClientRunRows and numClientThreads can be out of range. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
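The overflow is easy to reproduce in isolation. A minimal sketch (method names mirror the snippet above; the row counts are illustrative, not the exact ones from the report):

```java
public class TotalRowsOverflow {
    // Mirrors opts.totalRows = opts.perClientRunRows * opts.numClientThreads.
    // With int arithmetic the product silently wraps past Integer.MAX_VALUE.
    static int totalRowsAsInt(int perClientRunRows, int numClientThreads) {
        return perClientRunRows * numClientThreads; // 32-bit multiply: may wrap
    }

    // The fix: widen one operand to long BEFORE multiplying.
    static long totalRowsAsLong(int perClientRunRows, int numClientThreads) {
        return (long) perClientRunRows * numClientThreads;
    }

    public static void main(String[] args) {
        int rowsPerClient = 10_000_000; // illustrative values
        int clients = 500;              // true product is 5_000_000_000
        System.out.println(totalRowsAsInt(rowsPerClient, clients));  // prints 705032704 (wrapped)
        System.out.println(totalRowsAsLong(rowsPerClient, clients)); // prints 5000000000
    }
}
```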
[jira] [Commented] (HBASE-14346) Typo in FamilyFilter
[ https://issues.apache.org/jira/browse/HBASE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14726623#comment-14726623 ] Liu Shaohui commented on HBASE-14346: - LGTM~ Thanks [~lars_francke] > Typo in FamilyFilter > > > Key: HBASE-14346 > URL: https://issues.apache.org/jira/browse/HBASE-14346 > Project: HBase > Issue Type: Bug > Components: documentation >Reporter: Joshua Batson >Assignee: Lars Francke >Priority: Trivial > Attachments: HBASE-14346.patch > > Original Estimate: 5m > Remaining Estimate: 5m > > I think there's a typo. "qualifier name" should read "column family name" > Family Filter > This filter takes a compare operator and a comparator. It compares each > qualifier name with the comparator using the compare operator and if the > comparison returns true, it returns all the key-values in that column. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-7519) Support level compaction
[ https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725304#comment-14725304 ] Liu Shaohui commented on HBASE-7519: Any progress on this feature? I'd like to contribute some time to this issue if needed. > Support level compaction > > > Key: HBASE-7519 > URL: https://issues.apache.org/jira/browse/HBASE-7519 > Project: HBase > Issue Type: New Feature > Components: Compaction >Reporter: Jimmy Xiang >Assignee: Sergey Shelukhin > Attachments: level-compaction.pdf, level-compactions-notes.txt, > level-compactions-notes.txt > > > The level compaction algorithm may help HBase for some use cases, for > example, read-heavy loads (especially when just one version is used) and > relatively small key spaces updated frequently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14316) On truncate table command, the hbase doesn't maintain the pre-defined splits
[ https://issues.apache.org/jira/browse/HBASE-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712714#comment-14712714 ] Liu Shaohui commented on HBASE-14316: - Please use the truncate_preserve command, which will maintain the predefined splits. But the ACLs of the table will be removed. See HBASE-5525. On truncate table command, the hbase doesn't maintain the pre-defined splits Key: HBASE-14316 URL: https://issues.apache.org/jira/browse/HBASE-14316 Project: HBase Issue Type: Bug Reporter: debarshi basak On the truncate table command, HBase doesn't maintain the pre-defined splits. It simply drops and recreates the table. It should have some mechanism to maintain the predefined splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14254) Wrong error message when throwing NamespaceNotFoundException in shell
[ https://issues.apache.org/jira/browse/HBASE-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708893#comment-14708893 ] Liu Shaohui commented on HBASE-14254: - @mbertozzi [~tedyu] Could you help review this small patch? Thanks~ Wrong error message when throwing NamespaceNotFoundException in shell - Key: HBASE-14254 URL: https://issues.apache.org/jira/browse/HBASE-14254 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-14254-v001.diff Wrong error message when throwing NamespaceNotFoundException in shell {code} hbase(main):004:0> create 'ns:t1', {NAME => 'f1'} ERROR: Unknown namespace ns:t1! {code} The namespace should be {color:red}ns {color}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14247: Attachment: HBASE-14247-v003.diff Fixed the checkstyle problems. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Components: wal Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, HBASE-14247-v003.diff Currently all old WALs of regionservers are archived into the single directory oldWALs. In big clusters, because of long WAL TTLs or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which can make the HBase cluster crash. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcome~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708910#comment-14708910 ] Liu Shaohui commented on HBASE-14247: - [~davelatham] {quote} One concern with this change is that the OldLogCleaner as it is implemented will now have to run through the cleaner checks for every regionserver's subdirectory instead of doing them all in one batch. It will likely make the cleaner chore much slower and may not be able to keep up for large clusters. {quote} I don't think this will be a problem. Currently the HFileCleaner has to run through the cleaner checks for every store's subdirectory of every region of every table, and we haven't seen efficiency problems there. So we don't need to worry about the OldLogCleaner with the new layout. What's more, we can shorten the cleaner period via the config hbase.master.cleaner.interval. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are archived into the single directory oldWALs. In big clusters, because of long WAL TTLs or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which can make the HBase cluster crash. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcome~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
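The proposed layout change amounts to inserting the WAL's server name as an extra path component. A minimal sketch with plain strings (the real code would build org.apache.hadoop.fs.Path objects; the method names and the example WAL file name are illustrative):

```java
public class OldWalLayout {
    // Current layout: every archived WAL sits directly under oldWALs,
    // so a single directory's item count grows with the cluster's total
    // archived-WAL count and can hit HDFS's max-directory-items limit.
    static String flatPath(String oldLogDir, String walFile) {
        return oldLogDir + "/" + walFile;
    }

    // Proposed layout: one subdirectory per regionserver, so each
    // directory's item count is bounded by that server's own WAL count.
    static String perServerPath(String oldLogDir, String serverName, String walFile) {
        return oldLogDir + "/" + serverName + "/" + walFile;
    }

    public static void main(String[] args) {
        String server = "rs1.example.com,16020,1440000000000";
        String wal = "rs1.example.com%2C16020%2C1440000000000.1440000012345";
        System.out.println(flatPath("/hbase/.oldlogs", wal));
        System.out.println(perServerPath("/hbase/.oldlogs", server, wal));
    }
}
```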
[jira] [Updated] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14247: Component/s: wal Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Components: wal Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are archived into the single directory oldWALs. In big clusters, because of long WAL TTLs or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which can make the HBase cluster crash. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcome~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14277) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
[ https://issues.apache.org/jira/browse/HBASE-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14277: Fix Version/s: 1.1.3 1.0.3 1.2.1 0.98.15 TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name - Key: HBASE-14277 URL: https://issues.apache.org/jira/browse/HBASE-14277 Project: HBase Issue Type: Test Components: test Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0, 0.98.15, 1.2.1, 1.0.3, 1.1.3 Attachments: HBASE-14277-v001.diff, HBASE-14277-v002.diff After HBASE-13995, hostnames will be converted to lower case in ServerName. This may cause the test TestRegionServerHostname.testRegionServerHostname to fail on a host with a case-sensitive name. Just fix it in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14277) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
[ https://issues.apache.org/jira/browse/HBASE-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14277: Fix Version/s: (was: 1.1.3) (was: 1.0.3) (was: 1.2.1) (was: 0.98.15) 1.1.2 1.2.0 TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name - Key: HBASE-14277 URL: https://issues.apache.org/jira/browse/HBASE-14277 Project: HBase Issue Type: Test Components: test Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14277-v001.diff, HBASE-14277-v002.diff After HBASE-13995, hostnames will be converted to lower case in ServerName. This may cause the test TestRegionServerHostname.testRegionServerHostname to fail on a host with a case-sensitive name. Just fix it in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14277) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
[ https://issues.apache.org/jira/browse/HBASE-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708747#comment-14708747 ] Liu Shaohui commented on HBASE-14277: - Pushed to master, branch-1.1, and branch-1.2. Not needed for 0.98 and branch-1.0 because they do not have this test. TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name - Key: HBASE-14277 URL: https://issues.apache.org/jira/browse/HBASE-14277 Project: HBase Issue Type: Test Components: test Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14277-v001.diff, HBASE-14277-v002.diff After HBASE-13995, hostnames will be converted to lower case in ServerName. This may cause the test TestRegionServerHostname.testRegionServerHostname to fail on a host with a case-sensitive name. Just fix it in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14277) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
[ https://issues.apache.org/jira/browse/HBASE-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-14277: Resolution: Fixed Status: Resolved (was: Patch Available) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name - Key: HBASE-14277 URL: https://issues.apache.org/jira/browse/HBASE-14277 Project: HBase Issue Type: Test Components: test Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14277-v001.diff, HBASE-14277-v002.diff After HBASE-13995, hostnames will be converted to lower case in ServerName. This may cause the test TestRegionServerHostname.testRegionServerHostname to fail on a host with a case-sensitive name. Just fix it in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13995) ServerName is not fully case insensitive
[ https://issues.apache.org/jira/browse/HBASE-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-13995: Fix Version/s: (was: 1.0.2) (was: 0.98.14) ServerName is not fully case insensitive Key: HBASE-13995 URL: https://issues.apache.org/jira/browse/HBASE-13995 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0, 1.2.0, 0.98.12.1, 1.0.1.1, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-13995-v0.patch, HBASE-13995-v0.patch We ended up with two ServerNames with different cases, AAA and aaa. Trying to create a table, every once in a while, we ended up with a region lost and not assigned. BaseLoadBalancer.roundRobinAssignment() goes through each server and creates a map with what to assign to them. We had two servers on the list, AAA and aaa, which are the same machine; the problem is that the round robin then assigns an empty list to one of the two, so depending on the order we ended up with a region not assigned. ServerName's equals() does a case-insensitive comparison but hashCode() is computed on the case-sensitive server name, so the Map in ServerManager will never hit the item and compare it using equals(), and we end up with two entries for the same server. A similar thing happens in ServerName.isSameHostnameAndPort(), where we don't check cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13995) ServerName is not fully case insensitive
[ https://issues.apache.org/jira/browse/HBASE-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-13995: Fix Version/s: 0.98.14 1.0.2 ServerName is not fully case insensitive Key: HBASE-13995 URL: https://issues.apache.org/jira/browse/HBASE-13995 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0, 1.2.0, 0.98.12.1, 1.0.1.1, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-13995-v0.patch, HBASE-13995-v0.patch We ended up with two ServerNames with different cases, AAA and aaa. Trying to create a table, every once in a while, we ended up with a region lost and not assigned. BaseLoadBalancer.roundRobinAssignment() goes through each server and creates a map with what to assign to them. We had two servers on the list, AAA and aaa, which are the same machine; the problem is that the round robin then assigns an empty list to one of the two, so depending on the order we ended up with a region not assigned. ServerName's equals() does a case-insensitive comparison but hashCode() is computed on the case-sensitive server name, so the Map in ServerManager will never hit the item and compare it using equals(), and we end up with two entries for the same server. A similar thing happens in ServerName.isSameHostnameAndPort(), where we don't check cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
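The equals/hashCode contract violation described in the issue can be reproduced with a toy class. A sketch, not HBase's actual ServerName:

```java
import java.util.HashMap;
import java.util.Map;

public class CaseBugDemo {
    // Toy stand-in for ServerName: equals() ignores case, hashCode() does not.
    // This violates the equals/hashCode contract, so hash-based maps can end
    // up holding two entries for names that compare equal.
    static final class Name {
        final String host;
        Name(String host) { this.host = host; }
        @Override public boolean equals(Object o) {
            return o instanceof Name && host.equalsIgnoreCase(((Name) o).host);
        }
        @Override public int hashCode() { return host.hashCode(); } // case-sensitive: the bug
    }

    public static void main(String[] args) {
        Map<Name, String> servers = new HashMap<>();
        servers.put(new Name("AAA"), "first");
        servers.put(new Name("aaa"), "second");
        // equals() says they are the same server...
        System.out.println(new Name("AAA").equals(new Name("aaa"))); // prints true
        // ...but the map kept both, because their hash codes differ.
        System.out.println(servers.size()); // prints 2
    }
}
```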
[jira] [Updated] (HBASE-13996) Add write sniffing in canary
[ https://issues.apache.org/jira/browse/HBASE-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liu Shaohui updated HBASE-13996:
--------------------------------
    Attachment: HBASE-13996-v004.diff

Fix the checkstyle errors.

> Add write sniffing in canary
> ----------------------------
>
>                 Key: HBASE-13996
>                 URL: https://issues.apache.org/jira/browse/HBASE-13996
>             Project: HBase
>          Issue Type: Improvement
>          Components: canary
>    Affects Versions: 0.98.13, 1.1.0.1
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>             Fix For: 2.0.0, 1.3.0, 0.98.15
>
>         Attachments: HBASE-13996-v001.diff, HBASE-13996-v002.diff, HBASE-13996-v003.diff, HBASE-13996-v004.diff
>
> Currently the canary tool only sniffs read operations, which makes it hard to find problems in the write path. To support write sniffing, we create a system table named '_canary_' in the canary tool. The tool makes sure that the number of regions is larger than the number of regionservers and that the regions are distributed onto all regionservers. Periodically, the tool puts data into these regions to calculate the write availability of HBase and sends alerts if needed.
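As a rough illustration of the bookkeeping such a write sniff implies (the class and method names below are hypothetical, not the actual canary code), write availability is the fraction of probe puts that succeed, and the probe table needs more regions than there are regionservers so every server hosts at least one probe region:

```java
public class CanaryWriteSniffDemo {
    // Write availability = successful probe puts / attempted probe puts.
    static double writeAvailability(int attempted, int failed) {
        if (attempted == 0) return 1.0; // nothing probed yet: report healthy
        return (double) (attempted - failed) / attempted;
    }

    // The description requires the region count to be larger than the
    // regionserver count, so that probe regions can cover every server.
    static boolean enoughRegions(int regions, int regionServers) {
        return regions > regionServers;
    }

    public static void main(String[] args) {
        System.out.println(writeAvailability(100, 3)); // prints 0.97
        System.out.println(enoughRegions(30, 20));     // prints true
    }
}
```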
[jira] [Commented] (HBASE-14277) TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
[ https://issues.apache.org/jira/browse/HBASE-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706440#comment-14706440 ]

Liu Shaohui commented on HBASE-14277:
-------------------------------------
[~tedyu] It seems that the CI build was not triggered automatically. Is something wrong? Or how can I start it manually? Thanks~

> TestRegionServerHostname.testRegionServerHostname may fail at host with a case sensitive name
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14277
>                 URL: https://issues.apache.org/jira/browse/HBASE-14277
>             Project: HBase
>          Issue Type: Test
>          Components: test
>    Affects Versions: 2.0.0
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14277-v001.diff
>
> After HBASE-13995, the hostname is converted to lower case in ServerName. This may cause TestRegionServerHostname.testRegionServerHostname to fail on a host whose name contains upper-case characters. Just fix it in the test.
[jira] [Updated] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liu Shaohui updated HBASE-14247:
--------------------------------
    Attachment: HBASE-14247-v002.diff

Fix some failed tests.

> Separate the old WALs into different regionserver directories
> -------------------------------------------------------------
>
>                 Key: HBASE-14247
>                 URL: https://issues.apache.org/jira/browse/HBASE-14247
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff
>
> Currently all old WALs of regionservers are archived into the single oldWALs directory. In big clusters, because of a long WAL TTL or disabled replication, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will crash the HBase cluster.
> {quote}
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576
> {quote}
> A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcome~ Thanks
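The proposed layout can be sketched as a pure path computation. The helper below assumes archived WAL file names start with the (encoded) server name followed by a '.' and a timestamp suffix; both the class name and that naming assumption are illustrative, not the actual HBase implementation:

```java
public class OldWalLayoutDemo {
    // Derive a per-regionserver archive directory from a WAL file name of
    // the assumed form "<encoded-server-name>.<timestamp>".
    static String archivedWalPath(String oldWalsRoot, String walFileName) {
        int lastDot = walFileName.lastIndexOf('.');
        String serverName = (lastDot < 0) ? walFileName
                                          : walFileName.substring(0, lastDot);
        // One subdirectory per server keeps each directory's item count far
        // below HDFS's dfs.namenode.fs-limits.max-directory-items limit.
        return oldWalsRoot + "/" + serverName + "/" + walFileName;
    }

    public static void main(String[] args) {
        System.out.println(archivedWalPath(
            "/hbase/.oldlogs",
            "host1.example.org%2C16020%2C1439000000000.1439000012345"));
        // prints /hbase/.oldlogs/host1.example.org%2C16020%2C1439000000000/
        //        host1.example.org%2C16020%2C1439000000000.1439000012345
    }
}
```

With one directory per server name, each directory only has to hold that server's WALs, so the 1048576-item limit from the stack trace applies per regionserver instead of cluster-wide.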
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706449#comment-14706449 ]

Liu Shaohui commented on HBASE-14247:
-------------------------------------
[~stack] [~apurtell] What are your suggestions about this change?

> Separate the old WALs into different regionserver directories
> -------------------------------------------------------------
>
>                 Key: HBASE-14247
>                 URL: https://issues.apache.org/jira/browse/HBASE-14247
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>