[jira] [Created] (HBASE-21012) Revert the use of proto.TimeRangeTracker when writing hfile
Chia-Ping Tsai created HBASE-21012:
--------------------------------------

             Summary: Revert the use of proto.TimeRangeTracker when writing hfile
                 Key: HBASE-21012
                 URL: https://issues.apache.org/jira/browse/HBASE-21012
             Project: HBase
          Issue Type: Sub-task
            Reporter: Chia-Ping Tsai

HBASE-18754 changed the serialization of TimeRangeTracker from the "manual way" to protobuf. However, that change breaks the backward compatibility of hfiles. We should revert the change ASAP.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
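The compatibility break described above can be illustrated with a small, self-contained sketch. This is not HBase code: the class and method names are hypothetical, the "manual" layout is assumed to be two fixed-width longs, and the second writer is a hand-rolled stand-in (a 4-byte magic prefix followed by tagged varints) for a protobuf-style encoding. It only shows that a reader expecting the fixed-width layout either decodes meaningless values or runs out of bytes when handed the other encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical sketch, not HBase code.
class TimeRangeFormatSketch {

  // Assumed "manual" layout: two big-endian 8-byte longs, min then max.
  static byte[] writeFixedWidth(long min, long max) {
    return ByteBuffer.allocate(16).putLong(min).putLong(max).array();
  }

  // Stand-in for a protobuf-style layout: magic prefix, then tagged varints.
  static byte[] writeTaggedVarints(long min, long max) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write('P'); out.write('B'); out.write('U'); out.write('F');
    out.write(0x08);          // field 1, varint wire type
    writeVarint(out, min);
    out.write(0x10);          // field 2, varint wire type
    writeVarint(out, max);
    return out.toByteArray();
  }

  // Base-128 varint: 7 payload bits per byte, high bit marks "more follows".
  static void writeVarint(ByteArrayOutputStream out, long value) {
    while ((value & ~0x7FL) != 0) {
      out.write((int) ((value & 0x7F) | 0x80));
      value >>>= 7;
    }
    out.write((int) value);
  }

  // A reader that only understands the fixed-width layout.
  static long[] readFixedWidth(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      return new long[] { in.readLong(), in.readLong() };
    } catch (IOException e) {
      // readLong throws EOFException when fewer than 8 bytes remain.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    long min = 1532630557542L;
    long max = 1533000000000L;

    long[] ok = readFixedWidth(writeFixedWidth(min, max));
    System.out.println("same format:   " + ok[0] + ", " + ok[1]);

    // 18 bytes of a different encoding: two longs are read, but they are garbage.
    long[] bad = readFixedWidth(writeTaggedVarints(min, max));
    System.out.println("mixed format:  " + bad[0] + ", " + bad[1]);

    try {
      // Small values encode to only 8 bytes, so the second readLong fails.
      readFixedWidth(writeTaggedVarints(1L, 2L));
    } catch (UncheckedIOException e) {
      System.out.println("short payload: " + e.getCause());
    }
  }
}
```

Under these assumptions, both failure shapes appear: a decoded range with nonsensical (possibly negative) bounds, and an `EOFException` raised from `readLong` when the payload is shorter than 16 bytes.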
[jira] [Updated] (HBASE-21012) Revert the use of proto.TimeRangeTracker when writing hfile
[ https://issues.apache.org/jira/browse/HBASE-21012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chia-Ping Tsai updated HBASE-21012:
-----------------------------------
    Fix Version/s: 3.0.0

> Revert the use of proto.TimeRangeTracker when writing hfile
> -----------------------------------------------------------
>
>                 Key: HBASE-21012
>                 URL: https://issues.apache.org/jira/browse/HBASE-21012
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Chia-Ping Tsai
>            Priority: Critical
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
> HBASE-18754 changed the serialization of TimeRangeTracker from the "manual way" to
> protobuf. However, that change breaks the backward compatibility of hfiles. We
> should revert the change ASAP.
[jira] [Updated] (HBASE-21012) Revert the use of proto.TimeRangeTracker when writing hfile
[ https://issues.apache.org/jira/browse/HBASE-21012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chia-Ping Tsai updated HBASE-21012:
-----------------------------------
    Fix Version/s: 2.1.1
                   2.0.2

> Revert the use of proto.TimeRangeTracker when writing hfile
> -----------------------------------------------------------
>
>                 Key: HBASE-21012
>                 URL: https://issues.apache.org/jira/browse/HBASE-21012
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Chia-Ping Tsai
>            Priority: Critical
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
> HBASE-18754 changed the serialization of TimeRangeTracker from the "manual way" to
> protobuf. However, that change breaks the backward compatibility of hfiles. We
> should revert the change ASAP.
[jira] [Assigned] (HBASE-21008) HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
[ https://issues.apache.org/jira/browse/HBASE-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chia-Ping Tsai reassigned HBASE-21008:
--------------------------------------
    Assignee: Chia-Ping Tsai

> HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
> ------------------------------------------------------------
>
>                 Key: HBASE-21008
>                 URL: https://issues.apache.org/jira/browse/HBASE-21008
>             Project: HBase
>          Issue Type: Bug
>          Components: compatibility, HFile
>    Affects Versions: 2.1.0, 1.4.6
>            Reporter: Jerry He
>            Assignee: Chia-Ping Tsai
>            Priority: Critical
>
> It looks like HBase 1.x still cannot open hfiles written by HBase2.
> I tested the latest HBase 1.4.6 and 2.1.0: 1.4.6 tried to read and open regions written by 2.1.0.
> {code}
> 2018-07-30 16:01:31,274 ERROR [StoreFileOpenerThread-info-1] regionserver.StoreFile: Error reading timestamp range data from meta -- proceeding without
> java.lang.IllegalArgumentException: Timestamp cannot be negative. minStamp:5783278630776778969, maxStamp:-4698050386518222402
>     at org.apache.hadoop.hbase.io.TimeRange.check(TimeRange.java:112)
>     at org.apache.hadoop.hbase.io.TimeRange.<init>(TimeRange.java:100)
>     at org.apache.hadoop.hbase.regionserver.TimeRangeTracker.toTimeRange(TimeRangeTracker.java:214)
>     at org.apache.hadoop.hbase.regionserver.TimeRangeTracker.getTimeRange(TimeRangeTracker.java:198)
>     at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:507)
>     at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:531)
>     at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:521)
>     at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:679)
>     at org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:122)
>     at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:538)
>     at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:535)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> {code}
> Or:
> {code}
> 2018-07-30 16:01:31,305 ERROR [RS_OPEN_REGION-throb1:34004-0] handler.OpenRegionHandler: Failed open of region=janusgraph,,1532630557542.b0fa15cb0bf1b0bf740997b7056c., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.io.EOFException
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1033)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:908)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:876)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6995)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6956)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6927)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6883)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6834)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:364)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:131)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.io.EOFException
>     at org.apache.hadoop.hbase.regionserver.HStore.openStoreFiles(HStore.java:564)
>     at org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:518)
>     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:281)
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5378)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1007)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1004)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     ... 3 more
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:197)
>     at java.io.DataInputStream.readLong(DataInputStream.java:416)
[jira] [Commented] (HBASE-21008) HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
[ https://issues.apache.org/jira/browse/HBASE-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569741#comment-16569741 ]

Jerry He commented on HBASE-21008:
----------------------------------

I had the same question for you :)
But go ahead. You will be faster than me.
Thanks for the quick response on this issue!

> HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
> ------------------------------------------------------------
>
>                 Key: HBASE-21008
>                 URL: https://issues.apache.org/jira/browse/HBASE-21008
>             Project: HBase
>          Issue Type: Bug
>          Components: compatibility, HFile
>    Affects Versions: 2.1.0, 1.4.6
>            Reporter: Jerry He
>            Priority: Critical
[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569728#comment-16569728 ]

Kuan-Po Tseng commented on HBASE-18201:
---------------------------------------

[~reidchan] Could you take a look?

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch, HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, HBASE-18201.master.005.patch
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We should
> make it user-friendly if any use case exists. Otherwise, we should just get rid of
> it, because DataBlockEncodingTool presumes that the cell implementation
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the
> cleanup of KeyValue references in the read/write path of the code base.
[jira] [Commented] (HBASE-20985) add two attributes when we do normalization
[ https://issues.apache.org/jira/browse/HBASE-20985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569697#comment-16569697 ]

Jingyun Tian commented on HBASE-20985:
--------------------------------------

ping [~zghaobac]

> add two attributes when we do normalization
> -------------------------------------------
>
>                 Key: HBASE-20985
>                 URL: https://issues.apache.org/jira/browse/HBASE-20985
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.1.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Major
>             Fix For: 2.1.0
>
>         Attachments: HBASE-20985.master.001.patch, HBASE-20985.master.002.patch, HBASE-20985.master.003.patch, HBASE-20985.master.004.patch
>
> Currently, when the normalization switch is turned on, the normalizer balances the
> whole table based on total region size / total region count. I added two
> attributes so that we can set the total region count or the average region size we
> want to achieve when normalization is done.
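The arithmetic described in the issue above can be sketched as follows. This is only one plausible reading of the description, not the patch itself: the class, method, and parameter names are hypothetical, a value of zero or less is assumed to mean "attribute not set", and the precedence (explicit average size first, then explicit region count, then the current total size / region count) is an assumption.

```java
// Hypothetical sketch of choosing the average region size a normalizer aims for.
class NormalizerTargetSketch {

  /**
   * @param regionSizesMb      current size of each region of the table, in MB
   * @param targetRegionSizeMb desired average region size; <= 0 means "not set" (assumption)
   * @param targetRegionCount  desired number of regions; <= 0 means "not set" (assumption)
   * @return the average region size, in MB, to normalize towards
   */
  static double targetAverageSizeMb(long[] regionSizesMb,
                                    long targetRegionSizeMb,
                                    int targetRegionCount) {
    long total = 0;
    for (long size : regionSizesMb) {
      total += size;
    }
    if (targetRegionSizeMb > 0) {
      return targetRegionSizeMb;                 // explicit average size wins
    }
    if (targetRegionCount > 0) {
      return (double) total / targetRegionCount; // derive the size from the desired count
    }
    // Default behaviour from the description: total region size / total region count.
    return regionSizesMb.length == 0 ? 0.0 : (double) total / regionSizesMb.length;
  }

  public static void main(String[] args) {
    long[] sizes = {10, 20, 30};
    System.out.println(targetAverageSizeMb(sizes, 0, 0));   // no attribute set
    System.out.println(targetAverageSizeMb(sizes, 0, 2));   // target region count = 2
    System.out.println(targetAverageSizeMb(sizes, 15, 0));  // target region size = 15 MB
  }
}
```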
[jira] [Commented] (HBASE-20986) Separate the config of block size when we do log splitting and write Hlog
[ https://issues.apache.org/jira/browse/HBASE-20986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569696#comment-16569696 ]

Jingyun Tian commented on HBASE-20986:
--------------------------------------

Thanks for your review. Could you help me commit this patch?

> Separate the config of block size when we do log splitting and write Hlog
> -------------------------------------------------------------------------
>
>                 Key: HBASE-20986
>                 URL: https://issues.apache.org/jira/browse/HBASE-20986
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.1.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Major
>             Fix For: 2.1.0
>
>         Attachments: HBASE-20986.master.001.patch, HBASE-20986.master.002.patch, HBASE-20986.master.003.patch, HBASE-20986.master.004.patch
>
> Recovered edits and the hlog currently share the same block size, so if
> we set a large block size, the name node may not be able to assign enough
> space when we do log splitting. But setting a large hlog block size
> helps reduce how often a region server asks for a new block. Thus I think
> separating the two block size configs is necessary.
[jira] [Commented] (HBASE-19008) Add missing equals or hashCode method(s) to stock Filter implementations
[ https://issues.apache.org/jira/browse/HBASE-19008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569678#comment-16569678 ]

liubangchen commented on HBASE-19008:
-------------------------------------

Fix issues for Hadoop QA.

> Add missing equals or hashCode method(s) to stock Filter implementations
> ------------------------------------------------------------------------
>
>                 Key: HBASE-19008
>                 URL: https://issues.apache.org/jira/browse/HBASE-19008
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: liubangchen
>            Priority: Major
>              Labels: filter
>         Attachments: Filters.png, HBASE-19008-1.patch, HBASE-19008-2.patch, HBASE-19008-3.patch, HBASE-19008-4.patch, HBASE-19008.patch
>
> In HBASE-15410, [~mdrob] reminded me that Filter implementations may not
> write {{equals}} or {{hashCode}} method(s).
> This issue is to add missing {{equals}} or {{hashCode}} method(s) to stock
> Filter implementations such as KeyOnlyFilter.
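As a rough illustration of what the issue above asks for, the sketch below gives a filter-like class a consistent {{equals}}/{{hashCode}} pair. It is not the attached patch and not the real KeyOnlyFilter: the class name is hypothetical, and the single boolean field is an assumption standing in for whatever configuration a stock filter carries.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a stock filter with one configuration flag.
final class KeyOnlyFilterSketch {
  private final boolean lenAsVal;

  KeyOnlyFilterSketch(boolean lenAsVal) {
    this.lenAsVal = lenAsVal;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof KeyOnlyFilterSketch)) {
      return false;
    }
    // Two filters are equal when they are configured identically.
    return lenAsVal == ((KeyOnlyFilterSketch) obj).lenAsVal;
  }

  @Override
  public int hashCode() {
    // Must be derived from the same fields that equals compares.
    return Objects.hash(lenAsVal);
  }

  public static void main(String[] args) {
    Set<KeyOnlyFilterSketch> filters = new HashSet<>();
    filters.add(new KeyOnlyFilterSketch(true));
    filters.add(new KeyOnlyFilterSketch(true));   // equal to the first, not added again
    filters.add(new KeyOnlyFilterSketch(false));
    System.out.println(filters.size());
  }
}
```

Without the overrides, the two identically configured instances would be treated as distinct by hash-based collections, because the default implementations in `Object` compare identity.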
[jira] [Updated] (HBASE-19008) Add missing equals or hashCode method(s) to stock Filter implementations
[ https://issues.apache.org/jira/browse/HBASE-19008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liubangchen updated HBASE-19008:
--------------------------------
    Attachment: HBASE-19008-4.patch

> Add missing equals or hashCode method(s) to stock Filter implementations
> ------------------------------------------------------------------------
>
>                 Key: HBASE-19008
>                 URL: https://issues.apache.org/jira/browse/HBASE-19008
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: liubangchen
>            Priority: Major
>              Labels: filter
>         Attachments: Filters.png, HBASE-19008-1.patch, HBASE-19008-2.patch, HBASE-19008-3.patch, HBASE-19008-4.patch, HBASE-19008.patch
>
> In HBASE-15410, [~mdrob] reminded me that Filter implementations may not
> write {{equals}} or {{hashCode}} method(s).
> This issue is to add missing {{equals}} or {{hashCode}} method(s) to stock
> Filter implementations such as KeyOnlyFilter.
[jira] [Commented] (HBASE-21008) HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
[ https://issues.apache.org/jira/browse/HBASE-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569636#comment-16569636 ]

Chia-Ping Tsai commented on HBASE-21008:
----------------------------------------

[~jinghe] Are you preparing the patch? Or shall I fix it?

> HBase 1.x can not read HBase2 hfiles due to TimeRangeTracker
> ------------------------------------------------------------
>
>                 Key: HBASE-21008
>                 URL: https://issues.apache.org/jira/browse/HBASE-21008
>             Project: HBase
>          Issue Type: Bug
>          Components: compatibility, HFile
>    Affects Versions: 2.1.0, 1.4.6
>            Reporter: Jerry He
>            Priority: Critical
[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters
[ https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569573#comment-16569573 ]

Hudson commented on HBASE-18477:
--------------------------------

Results for branch HBASE-18477
    [build #286 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/286/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/286//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/286//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/286//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/286//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)

> Umbrella JIRA for HBase Read Replica clusters
> ---------------------------------------------
>
>                 Key: HBASE-18477
>                 URL: https://issues.apache.org/jira/browse/HBASE-18477
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Zach York
>            Assignee: Zach York
>            Priority: Major
>         Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
> Recently, changes (such as HBASE-17437) have unblocked HBase to run with a
> root directory external to the cluster (such as in Amazon S3). This means
> that the data is stored outside of the cluster and can be accessible after
> the cluster has been terminated. One use case that is often asked about is
> pointing multiple clusters to one root directory (sharing the data) to have
> read resiliency in the case of a cluster failure.
>
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a
> read-replica HBase cluster that is pointed at the same root directory.
>
> This requires making the Read-Replica cluster Read-Only (no metadata
> operation or data operations).
> Separating the hbase:meta table for each cluster (Otherwise HBase gets
> confused with multiple clusters trying to update the meta table with their ip
> addresses)
> Adding refresh functionality for the meta table to ensure new metadata is
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data
> is picked up on the read replica cluster.
>
> This can be used with any existing cluster that is backed by an external
> filesystem.
>
> Please note that this feature is still quite manual (with the potential for
> automation later).
>
> More information on this particular feature can be found here:
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
[jira] [Commented] (HBASE-20978) [amv2] Worker terminating UNNATURALLY during MoveRegionProcedure
[ https://issues.apache.org/jira/browse/HBASE-20978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569530#comment-16569530 ]

stack commented on HBASE-20978:
-------------------------------

branch-2.0 has HBASE-20846 now. My long-run test failed with another instance of this issue. Notes below but just repeat of above. Still need to dig. Making this critical.

The region was being moved because of a balance. The unassign had completed. Here are the log lines:

{code}
2018-08-05 01:19:58,555 INFO [Thread-19] procedure.MasterProcedureScheduler: pid=4965, state=WAITING:MOVE_REGION_ASSIGN, hasLock=false; MoveRegionProcedure hri=e7c120b0eb913346e4bead908ebed468, source=ve0536.halxg.cloudera.com,16020,1533402328997, destination=ve0534.halxg.cloudera.com,16020,1533457059157 checking lock on e7c120b0eb913346e4bead908ebed468
...
2018-08-05 01:19:58,555 INFO [Thread-19] procedure.MasterProcedureScheduler: pid=4967, ppid=4965, state=SUCCESS, hasLock=false; UnassignProcedure table=IntegrationTestBigLinkedList, region=e7c120b0eb913346e4bead908ebed468, server=ve0536.halxg.cloudera.com,16020,1533402328997 checking lock on e7c120b0eb913346e4bead908ebed468
...
CRASH!
...
2018-08-05 01:20:03,698 INFO [Thread-19] assignment.RegionStateStore: Load hbase:meta entry region=e7c120b0eb913346e4bead908ebed468, regionState=CLOSED, lastHost=ve0536.halxg.cloudera.com,16020,1533402328997, regionLocation=ve0536.halxg.cloudera.com,16020,1533402328997, openSeqNum=2267217
...
2018-08-05 01:20:03,968 INFO [Thread-19] master.HMaster: Master has completed initialization 10.371sec
2018-08-05 01:20:03,969 WARN [PEWorker-8] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
java.lang.IllegalArgumentException: NOT RUNNABLE! pid=4965, state=WAITING:MOVE_REGION_ASSIGN, hasLock=true; MoveRegionProcedure hri=e7c120b0eb913346e4bead908ebed468, source=ve0536.halxg.cloudera.com,16020,1533402328997, destination=ve0534.halxg.cloudera.com,16020,1533457059157
    at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1502)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1298)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1805)
2018-08-05 01:20:03,972 INFO [Thread-19] quotas.MasterQuotaManager: Quota support disabled
2018-08-05 01:20:03,972 INFO [Thread-19] zookeeper.ZKWatcher: not a secure deployment, proceeding
{code}

> [amv2] Worker terminating UNNATURALLY during MoveRegionProcedure
> ----------------------------------------------------------------
>
>                 Key: HBASE-20978
>                 URL: https://issues.apache.org/jira/browse/HBASE-20978
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2
>    Affects Versions: 2.0.1
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.2
>
> Testing tip of branch-2.0, ran into this:
> {code}
> 2018-07-29 01:45:33,002 INFO [master/ve0524:16000] master.HMaster: Master has completed initialization 13.854sec
> 2018-07-29 01:45:33,003 INFO [PEWorker-4] procedure.MasterProcedureScheduler: pid=1820, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=533fb79ba23b27e9e0715b51daeb30c1, source=ve0538.halxg.cloudera.com,16020,1532847421672, destination=ve0540.halxg.cloudera.com,16020,1532853151031 checking lock on 533fb79ba23b27e9e0715b51daeb30c1
> 2018-07-29 01:45:33,003 WARN [PEWorker-4] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
> java.lang.IllegalArgumentException: pid=1820, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure
[jira] [Updated] (HBASE-20978) [amv2] Worker terminating UNNATURALLY during MoveRegionProcedure
[ https://issues.apache.org/jira/browse/HBASE-20978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20978:
--------------------------
    Priority: Critical  (was: Major)

> [amv2] Worker terminating UNNATURALLY during MoveRegionProcedure
> ----------------------------------------------------------------
>
>                 Key: HBASE-20978
>                 URL: https://issues.apache.org/jira/browse/HBASE-20978
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2
>    Affects Versions: 2.0.1
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.2
>
> Testing tip of branch-2.0, ran into this:
> {code}
> 2018-07-29 01:45:33,002 INFO [master/ve0524:16000] master.HMaster: Master has completed initialization 13.854sec
> 2018-07-29 01:45:33,003 INFO [PEWorker-4] procedure.MasterProcedureScheduler: pid=1820, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=533fb79ba23b27e9e0715b51daeb30c1, source=ve0538.halxg.cloudera.com,16020,1532847421672, destination=ve0540.halxg.cloudera.com,16020,1532853151031 checking lock on 533fb79ba23b27e9e0715b51daeb30c1
> 2018-07-29 01:45:33,003 WARN [PEWorker-4] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
> java.lang.IllegalArgumentException: pid=1820, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=533fb79ba23b27e9e0715b51daeb30c1, source=ve0538.halxg.cloudera.com,16020,1532847421672, destination=ve0540.halxg.cloudera.com,16020,1532853151031
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1249)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1763)
> {code}
> It then shows as the below in the UI:
> {code}
> Id    Parent    State    Owner    Type    Start Time    Last Update    Errors    Parameters
> 1820    WAITING    stack    MoveRegionProcedure hri=533fb79ba23b27e9e0715b51daeb30c1, source=ve0538.halxg.cloudera.com,16020,1532847421672, destination=ve0540.halxg.cloudera.com,16020,1532853151031    Sun Jul 29 01:33:37 PDT 2018    Sun Jul 29 01:33:38 PDT 2018    [ { state => [ '1', '2' ] }, { regionId => '1532851768240', tableName => { namespace => 'ZGVmYXVsdA==', qualifier => 'SW50ZWdyYXRpb25UZXN0QmlnTGlua2VkTGlzdA==' }, startKey => 'VttDLvXHdcmzwqNdrNoUFg==', endKey => 'WGFV8k+hFqhcIJGiKZ8L4Q==', offline => 'false', split => 'false', replicaId => '0' }, { sourceServer => { hostName => 've0538.halxg.cloudera.com', port => '16020', startCode => '1532847421672' }, destinationServer => { hostName => 've0540.halxg.cloudera.com', port => '16020', startCode => '1532853151031' } } ]
> {code}
> This is what we'd just read from hbase:meta:
> {code}
> 2018-07-29 01:45:32,802 INFO [master/ve0524:16000] assignment.RegionStateStore: Load hbase:meta entry region=533fb79ba23b27e9e0715b51daeb30c1, regionState=CLOSED, lastHost=ve0538.halxg.cloudera.com,16020,1532847421672, regionLocation=ve0538.halxg.cloudera.com,16020,1532847421672, openSeqNum=1544600
> {code}
> Before this, we'd just logged this:
> 2018-07-29 01:33:39,786 INFO [PEWorker-14] assignment.RegionStateStore: pid=1823 updating hbase:meta row=533fb79ba23b27e9e0715b51daeb30c1, regionState=CLOSED
> Going back in history, we do the above each time the Master gets restarted, so
> the region is offlined and never brought back online.
> It is failing here:
> {code}
> private void execProcedure(final RootProcedureState procStack,
>     final Procedure procedure) {
>   Preconditions.checkArgument(procedure.getState() == ProcedureState.RUNNABLE,
>       procedure.toString());
> {code}
> It's the parent move region that is trying to run and failing. It is not
> RUNNABLE? Because the subprocedure was 'done' but not fully?
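The failing check quoted above can be reproduced in isolation with a small sketch. Nothing here is HBase or Guava code: the `State` enum, the `Proc` class, and the local `checkArgument` helper are hypothetical stand-ins (the helper mirrors the documented behaviour of Guava's `Preconditions.checkArgument(boolean, Object)`, which throws `IllegalArgumentException` with the stringified message). It shows why a procedure that is still WAITING when a worker picks it up produces an exception whose message is the procedure's `toString()`.

```java
// Hypothetical sketch of a "must be RUNNABLE" precondition; not HBase code.
class RunnableCheckSketch {

  enum State { RUNNABLE, WAITING }

  static final class Proc {
    final long pid;
    final State state;

    Proc(long pid, State state) {
      this.pid = pid;
      this.state = state;
    }

    @Override
    public String toString() {
      return "pid=" + pid + ", state=" + state;
    }
  }

  // Local stand-in for a checkArgument(boolean, Object) style precondition.
  static void checkArgument(boolean expression, Object errorMessage) {
    if (!expression) {
      throw new IllegalArgumentException(String.valueOf(errorMessage));
    }
  }

  static void execProcedure(Proc procedure) {
    // Unchecked exception: it propagates to the caller unless caught there.
    checkArgument(procedure.state == State.RUNNABLE, procedure.toString());
    // ... the procedure body would run here ...
  }

  public static void main(String[] args) {
    execProcedure(new Proc(1, State.RUNNABLE));        // passes the check
    try {
      execProcedure(new Proc(1820, State.WAITING));    // fails the check
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```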
[jira] [Commented] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569525#comment-16569525 ] Hadoop QA commented on HBASE-20881: ---
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 16 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 41s | master passed |
| +1 | compile | 4m 10s | master passed |
| +1 | checkstyle | 2m 25s | master passed |
| +1 | shadedjars | 4m 31s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 5m 58s | master passed |
| +1 | javadoc | 1m 30s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 48s | the patch passed |
| +1 | compile | 4m 12s | the patch passed |
| +1 | cc | 4m 12s | the patch passed |
| +1 | javac | 4m 12s | the patch passed |
| +1 | checkstyle | 0m 9s | The patch hbase-protocol-shaded passed checkstyle |
| +1 | checkstyle | 0m 32s | hbase-client: The patch generated 0 new + 2 unchanged - 84 fixed = 2 total (was 86) |
| +1 | checkstyle | 0m 14s | hbase-procedure: The patch generated 0 new + 20 unchanged - 1 fixed = 20 total (was 21) |
| -1 | checkstyle | 1m 14s | hbase-server: The patch generated 6 new + 252 unchanged - 49 fixed = 258 total (was 301) |
| +1 | checkstyle | 0m 12s | The patch hbase-rsgroup passed checkstyle |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 31s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 0s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | hbaseprotoc | 1m 57s | the patch passed |
| -1 | findbugs | 2m 6s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | javadoc | 0m 30s | hbase-server generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
|| || || || Other Tests ||
| +1 | unit | 0m 32s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m
[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+
[ https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569524#comment-16569524 ] Hudson commented on HBASE-20749: Results for branch HBASE-20749 [build #14 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/14/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/14//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/14//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/14//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Upgrade our use of checkstyle to 8.6+ > - > > Key: HBASE-20749 > URL: https://issues.apache.org/jira/browse/HBASE-20749 > Project: HBase > Issue Type: Improvement > Components: build, community >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Minor > Attachments: HBASE-20749.master.001.patch, > HBASE-20749.master.002.patch, HBASE-20749.master.003.patch > > > We should upgrade our checkstyle version to 8.6 or later so we can use the > "match violation message to this regex" feature for suppression. That will > allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in > HBASE-20332). > We're currently blocked on upgrading to 8.3+ by [checkstyle > #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression > that flags our use of both the "separate import groups" and "put static > imports over here" configs as an error. 
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569515#comment-16569515 ] stack commented on HBASE-20846: --- bq. Duo Zhang, is there any problem for this fix before it can be back-ported branch-2.0? What [~allan163] says... Yeah, what test is failing? I can take a look. > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, > HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, > HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, > HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while > a MoveRegionProcedure was going on after a master restart. > That issue can be solved by HBASE-20752, but I discovered something > else. > Before a MoveRegionProcedure can execute, it takes the table's shared > lock. So when an UnassignProcedure is spawned, it does not check the > table's shared lock, since it assumes that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But this is not the case when the Master is restarted. The child > procedure (UnassignProcedure) is executed first after the restart. Although it > has a parent (MoveRegionProcedure), the parent apparently does not hold the > table's lock. > So the child begins to execute without holding the table's shared lock, and a > ModifyTableProcedure can acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a procedure to get stuck before HBASE-20752. Since HBASE-20752 has been fixed, > I wrote a simple UT to reproduce this case. > I think we don't have to check the parent for the table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it.
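The race described in HBASE-20846 can be sketched with a toy shared/exclusive lock. Everything here is hypothetical and greatly simplified: `TableLockSketch` is not the MasterProcedureScheduler, and the restart is modeled simply as a fresh lock object that nobody holds. The sketch only shows the consequence of the `hasParent()` shortcut: if the child skips the shared lock and the parent no longer holds anything, an exclusive lock can be granted while the child is running.

```java
public class TableLockSketch {
  private int sharedHolders = 0;
  private boolean exclusiveHeld = false;

  // Shared lock: allowed unless someone holds the exclusive lock.
  boolean tryShared() {
    if (exclusiveHeld) {
      return false;
    }
    sharedHolders++;
    return true;
  }

  // Exclusive lock: allowed only when nobody holds any lock.
  boolean tryExclusive() {
    if (exclusiveHeld || sharedHolders > 0) {
      return false;
    }
    exclusiveHeld = true;
    return true;
  }

  // Returns true if a ModifyTable-style exclusive lock can be taken while
  // the child procedure is running after a (simulated) master restart.
  static boolean exclusiveAllowedWhileChildRuns(boolean childSkipsLockBecauseOfParent) {
    TableLockSketch lock = new TableLockSketch(); // after restart: no locks restored
    if (!childSkipsLockBecauseOfParent) {
      lock.tryShared(); // child takes the shared lock itself
    }
    return lock.tryExclusive();
  }

  public static void main(String[] args) {
    System.out.println("child skips lock -> exclusive granted: "
        + exclusiveAllowedWhileChildRuns(true));
    System.out.println("child takes lock -> exclusive granted: "
        + exclusiveAllowedWhileChildRuns(false));
  }
}
```

Taking the shared lock in the child every time, as the reporter suggests, closes the window in this toy model; restoring the parent's lock on restart, as the issue title proposes, would be the other way to do it.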
[jira] [Commented] (HBASE-20997) rebuildUserRegions() does not build ReplicaMapping during master switchover
[ https://issues.apache.org/jira/browse/HBASE-20997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569512#comment-16569512 ] Ted Yu commented on HBASE-20997: Can you refactor the test to fit the master branch? It would be good to prevent regression in the future. Thanks > rebuildUserRegions() does not build ReplicaMapping during master switchover > --- > > Key: HBASE-20997 > URL: https://issues.apache.org/jira/browse/HBASE-20997 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.6, 1.3.2, 1.5.0, 1.4.6 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Major > Attachments: HBASE-20997-branch-1-v1.patch, > HBASE-20997-branch-1-v2.patch, HBASE-20997-branch-1-v4.patch, > HBASE-20997-branch-1-v5.patch, HBASE-20997-branch-1-v6.patch > > > During master switchover, rebuildUserRegions() does not rebuild the master's > in-memory defaultReplicaToOtherReplicas map. This puts the cluster in an > inconsistent state. In the read replica case, it causes the replica parent region > to stay online without being unassigned.
[jira] [Commented] (HBASE-20997) rebuildUserRegions() does not build ReplicaMapping during master switchover
[ https://issues.apache.org/jira/browse/HBASE-20997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569492#comment-16569492 ] huaxiang sun commented on HBASE-20997: -- Hi [~yuzhih...@gmail.com], I checked the branch-2 code, and the AMv2 handling of this logic is different. It does not maintain the in-memory map or rely on it to do unassignment. I think it is a non-issue for branch-2+. Thanks. > rebuildUserRegions() does not build ReplicaMapping during master switchover > --- > > Key: HBASE-20997 > URL: https://issues.apache.org/jira/browse/HBASE-20997 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.6, 1.3.2, 1.5.0, 1.4.6 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Major > Attachments: HBASE-20997-branch-1-v1.patch, > HBASE-20997-branch-1-v2.patch, HBASE-20997-branch-1-v4.patch, > HBASE-20997-branch-1-v5.patch, HBASE-20997-branch-1-v6.patch > > > During master switchover, rebuildUserRegions() does not rebuild the master's > in-memory defaultReplicaToOtherReplicas map. This puts the cluster in an > inconsistent state. In the read replica case, it causes the replica parent region > to stay online without being unassigned.
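As a rough illustration of what HBASE-20997 says is missing on branch-1, the sketch below rebuilds a default-replica-to-other-replicas map from a list of region replicas. It is an assumption-laden toy, not the patch: the `Replica` class, the string region key, and the `rebuild` method are invented for illustration, and only the idea (replica id 0 is the default replica, every other id is recorded under it) comes from the issue text.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReplicaMappingSketch {
  // Hypothetical, minimal stand-in for a region replica read from hbase:meta.
  static final class Replica {
    final String regionKey; // identifies the region shared by all its replicas
    final int replicaId;    // 0 is the default (primary) replica

    Replica(String regionKey, int replicaId) {
      this.regionKey = regionKey;
      this.replicaId = replicaId;
    }
  }

  // Rebuilds the in-memory mapping: default replica -> ids of its other replicas.
  static Map<String, Set<Integer>> rebuild(List<Replica> replicas) {
    Map<String, Set<Integer>> defaultToOthers = new HashMap<>();
    for (Replica r : replicas) {
      Set<Integer> others =
          defaultToOthers.computeIfAbsent(r.regionKey, k -> new TreeSet<>());
      if (r.replicaId != 0) {
        others.add(r.replicaId);
      }
    }
    return defaultToOthers;
  }

  public static void main(String[] args) {
    List<Replica> fromMeta = Arrays.asList(
        new Replica("regionA", 0),
        new Replica("regionA", 1),
        new Replica("regionA", 2),
        new Replica("regionB", 0));
    System.out.println(rebuild(fromMeta));
  }
}
```

If a newly active master skips this step, a later lookup of the other replicas for a default replica finds nothing, which matches the inconsistency the issue describes.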
[jira] [Updated] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20881: -- Component/s: proc-v2 amv2 > Introduce a region transition procedure to handle all the state transition > for a region > --- > > Key: HBASE-20881 > URL: https://issues.apache.org/jira/browse/HBASE-20881 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-20881-v1.patch, HBASE-20881.patch > > > We now have an AssignProcedure, an UnassignProcedure, and also a > MoveRegionProcedure which schedules an AssignProcedure and an > UnassignProcedure to move a region. This makes the logic a bit complicated, as > MRP is not a RIT, so SCP cannot interrupt it directly...
[jira] [Updated] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20881: -- Attachment: HBASE-20881-v1.patch > Introduce a region transition procedure to handle all the state transition > for a region > --- > > Key: HBASE-20881 > URL: https://issues.apache.org/jira/browse/HBASE-20881 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-20881-v1.patch, HBASE-20881.patch > > > We now have an AssignProcedure, an UnassignProcedure, and also a > MoveRegionProcedure which schedules an AssignProcedure and an > UnassignProcedure to move a region. This makes the logic a bit complicated, as > MRP is not a RIT, so SCP cannot interrupt it directly...
[jira] [Commented] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569443#comment-16569443 ] Hadoop QA commented on HBASE-20881: ---
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 15 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 41s | master passed |
| +1 | compile | 4m 12s | master passed |
| +1 | checkstyle | 2m 25s | master passed |
| +1 | shadedjars | 4m 33s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 5m 59s | master passed |
| +1 | javadoc | 1m 29s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 45s | the patch passed |
| +1 | compile | 4m 11s | the patch passed |
| +1 | cc | 4m 11s | the patch passed |
| -1 | javac | 1m 44s | hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) |
| +1 | checkstyle | 0m 10s | The patch hbase-protocol-shaded passed checkstyle |
| +1 | checkstyle | 0m 31s | hbase-client: The patch generated 0 new + 2 unchanged - 84 fixed = 2 total (was 86) |
| +1 | checkstyle | 0m 14s | hbase-procedure: The patch generated 0 new + 20 unchanged - 1 fixed = 20 total (was 21) |
| -1 | checkstyle | 1m 16s | hbase-server: The patch generated 14 new + 264 unchanged - 34 fixed = 278 total (was 298) |
| +1 | checkstyle | 0m 12s | The patch hbase-rsgroup passed checkstyle |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 31s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 2s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | hbaseprotoc | 1m 58s | the patch passed |
| -1 | findbugs | 2m 14s | hbase-server generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| -1 | javadoc | 0m 30s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
|| || || || Other Tests ||
| +1 | unit | 0m 31s | hbase-protocol-shaded in the patch passed. |
[jira] [Commented] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569430#comment-16569430 ] Duo Zhang commented on HBASE-20881: --- A big patch... I haven't finished yet, especially the UTs. I've also added some TODOs in the code; we can open follow-on issues to address them, as the patch here is already big enough... > Introduce a region transition procedure to handle all the state transition > for a region > --- > > Key: HBASE-20881 > URL: https://issues.apache.org/jira/browse/HBASE-20881 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-20881.patch > > > We now have an AssignProcedure, an UnassignProcedure, and also a > MoveRegionProcedure which schedules an AssignProcedure and an > UnassignProcedure to move a region. This makes the logic a bit complicated, as > MRP is not a RIT, so SCP cannot interrupt it directly...
[jira] [Updated] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20881: -- Fix Version/s: 2.2.0 3.0.0 Status: Patch Available (was: Open) > Introduce a region transition procedure to handle all the state transition > for a region > --- > > Key: HBASE-20881 > URL: https://issues.apache.org/jira/browse/HBASE-20881 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-20881.patch > > > We now have an AssignProcedure, an UnassignProcedure, and also a > MoveRegionProcedure which schedules an AssignProcedure and an > UnassignProcedure to move a region. This makes the logic a bit complicated, as > MRP is not a RIT, so SCP cannot interrupt it directly...
[jira] [Updated] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region
[ https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20881: -- Attachment: HBASE-20881.patch > Introduce a region transition procedure to handle all the state transition > for a region > --- > > Key: HBASE-20881 > URL: https://issues.apache.org/jira/browse/HBASE-20881 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20881.patch > > > We now have an AssignProcedure, an UnassignProcedure, and also a > MoveRegionProcedure which schedules an AssignProcedure and an > UnassignProcedure to move a region. This makes the logic a bit complicated, as > MRP is not a RIT, so SCP cannot interrupt it directly...
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569422#comment-16569422 ] Hudson commented on HBASE-20846: Results for branch branch-2.0 [build #634 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, > HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, > HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, > HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while > a MoveRegionProcedure was going on after a master restart. > That issue can be solved by HBASE-20752, but I discovered something > else. > Before a MoveRegionProcedure can execute, it takes the table's shared > lock. So when an UnassignProcedure is spawned, it does not check the > table's shared lock, since it assumes that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But this is not the case when the Master is restarted. The child > procedure (UnassignProcedure) is executed first after the restart. Although it > has a parent (MoveRegionProcedure), the parent apparently does not hold the > table's lock. > So the child begins to execute without holding the table's shared lock, and a > ModifyTableProcedure can acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a procedure to get stuck before HBASE-20752. Since HBASE-20752 has been fixed, > I wrote a simple UT to reproduce this case. > I think we don't have to check the parent for the table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it.
[jira] [Commented] (HBASE-20924) Backport "HBASE-20846 Restore procedure locks when master restarts"
[ https://issues.apache.org/jira/browse/HBASE-20924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569421#comment-16569421 ] Hudson commented on HBASE-20924: Results for branch branch-2.0 [build #634 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Backport "HBASE-20846 Restore procedure locks when master restarts" > --- > > Key: HBASE-20924 > URL: https://issues.apache.org/jira/browse/HBASE-20924 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.1 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.2 > > Attachments: HBASE-20924.branch-2.0.001.patch, > HBASE-20924.branch-2.0.002.patch, HBASE-20924.branch-2.0.003.patch > > > Backport "HBASE-20846 Restore procedure locks when master restarts" to > branch-2.0 but only after testing. The fix is a significant change to master > startup but should eliminate a whole class of possible problem types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled
[ https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569384#comment-16569384 ] Tak Lon (Stephen) Wu commented on HBASE-21011: -- I appreciate your quick response. I understand that an operator can use {{Admin#setCleanerChoreRunning(true)}} to get the cleaner running when the cleaner chore is disabled. But again, and sorry for repeating myself and possibly wasting your time, {{cleaner_chore_run}}, introduced in HBASE-17280, is meant for the case when the cleaner chore is disabled. I updated [PR #89|https://github.com/apache/hbase/pull/89] to only allow {{cleaner_chore_run}} to run when the cleaner chore is disabled. To move forward, the question should be: should we remove the {{cleaner_chore_run}} command, given that the CLI {{cleaner_chore_switch true}} and {{Admin#setCleanerChoreRunning(true)}} in fact provide the same function of cleaning the HFiles and oldwals? Also, [~busbey] and [~zyork], it looks like the PR is not picked up by the QA bot (I attached only a link to the PR, not a patch); I need some advice from you before I attach the patch. Thanks again. > Provide CLI option to run oldwals and hfiles cleaner separately when cleaner > chore is disabled > -- > > Key: HBASE-21011 > URL: https://issues.apache.org/jira/browse/HBASE-21011 > Project: HBase > Issue Type: Improvement > Components: Admin, Client >Affects Versions: 3.0.0, 1.4.6, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Minor > Fix For: 3.0.0 > > > There is a corner case: when the cleaner chore for HFiles and oldwals is disabled, > the admin/user needs to manually execute the admin command {{cleaner_chore_run}} to > clean the old HFiles and oldwals. The existing logic of {{cleaner_chore_run}} is > to [first trigger the HFiles cleaner and then the oldwals > cleaner|https://github.com/taklwu/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java#L1414-L1420], > and to report success only if both complete. > But when running this {{cleaner_chore_run}} command, there is a potential use > case in which the admin would like to trigger the cleaner for only oldwals or only HFiles while > still keeping the automatic cleaner chore disabled. So this change aims to > support this corner case, and to give users who keep the > cleaner chore disabled by default the flexibility to use the admin CLI to run the oldwals > and HFiles cleaning procedures individually. > NOTE that {{cleaner_chore_run}} was introduced in HBASE-17280; this patch > adds the options 'hfiles' and 'oldwals' to it. It also changes the default behavior so that > {{cleaner_chore_run}} only runs when the cleaner chore is disabled. > The proposed admin CLI options are: > {noformat} > hbase> cleaner_chore_run # this was introduced in HBASE-17280, > but the behavior is changed to only run when the cleaner chore is disabled > hbase> cleaner_chore_run 'hfiles' # added, runs when the cleaner chore is > disabled > hbase> cleaner_chore_run 'oldwals' # added, runs when the cleaner chore is > disabled > {noformat}
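The proposed option handling for {{cleaner_chore_run}} can be sketched as a small selection function. This is only an illustration of the behavior described in HBASE-21011, not code from PR #89: the class and method names are hypothetical, and the cleaners are represented by plain strings rather than real chore objects. The assumptions taken from the description are that a manual run is allowed only while the automatic chore is disabled, that no argument means both cleaners with HFiles first, and that 'hfiles' or 'oldwals' selects a single cleaner.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanerChoreRunSketch {
  // Returns the cleaners a manual run would trigger, in execution order.
  // 'target' is null for the no-argument form of the command.
  static List<String> cleanersToRun(boolean choreEnabled, String target) {
    List<String> selected = new ArrayList<>();
    if (choreEnabled) {
      // Proposal: manual runs are only allowed while the chore is disabled.
      return selected;
    }
    if (target != null && !target.equals("hfiles") && !target.equals("oldwals")) {
      throw new IllegalArgumentException("unknown cleaner: " + target);
    }
    if (target == null || target.equals("hfiles")) {
      selected.add("hfiles"); // HFiles cleaner runs first
    }
    if (target == null || target.equals("oldwals")) {
      selected.add("oldwals");
    }
    return selected;
  }

  public static void main(String[] args) {
    System.out.println(cleanersToRun(false, null));
    System.out.println(cleanersToRun(false, "oldwals"));
    System.out.println(cleanersToRun(true, "hfiles"));
  }
}
```

Rejecting an unknown target with an exception is a choice made for this sketch; the issue text does not say how the real command treats unexpected arguments.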