[jira] [Created] (HBASE-21217) Revisit the executeProcedure method for open/close region
Duo Zhang created HBASE-21217:

Summary: Revisit the executeProcedure method for open/close region
Key: HBASE-21217
URL: https://issues.apache.org/jira/browse/HBASE-21217
Project: HBase
Issue Type: Sub-task
Reporter: Duo Zhang
Fix For: 3.0.0, 2.2.0

Currently we just call openRegion and closeRegion directly, which is a bit buggy. For example, in order to not fail all the open region requests when there is only one failure, we catch the exception and set a flag in the return value. But for the executeProcedures call, the return value is ignored, and we expect that the openRegion method will always call reportRegionStateTransition to report the failure, but in fact it does not...

And after HBASE-20881, we can confirm that the race could happen, where we send a close request to a region which is opening (HBASE-21199), and vice versa.

So I think we need to revisit the implementation of executeProcedures to make it more stable.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
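To make the failure mode described above concrete, here is a minimal sketch. It is not HBase code: `OpenRegionSketch`, `OpenState`, `doOpen` and both `execute...` methods are hypothetical names made up for illustration. It only shows the shape of the problem, where a per-region failure is recorded in a return value that one caller then drops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for batch open handling; not taken from HBase. */
class OpenRegionSketch {
  /** Per-region outcome, mirroring "set a flag in the return value". */
  enum OpenState { OPENED, FAILED_OPENING }

  /** Assumption for this sketch only: regions named "bad..." fail to open. */
  private static void doOpen(String region) {
    if (region.startsWith("bad")) {
      throw new IllegalStateException("cannot open " + region);
    }
  }

  /** One failure does not fail the batch; it is only visible in the result. */
  static List<OpenState> openRegions(List<String> regions) {
    List<OpenState> result = new ArrayList<>();
    for (String region : regions) {
      try {
        doOpen(region);
        result.add(OpenState.OPENED);
      } catch (RuntimeException e) {
        result.add(OpenState.FAILED_OPENING);
      }
    }
    return result;
  }

  /** Caller that drops the return value: the failure is silently lost. */
  static void executeIgnoringResult(List<String> regions) {
    openRegions(regions);
  }

  /** Caller that turns every failure flag into an explicit report. */
  static void executeReportingFailures(List<String> regions, Consumer<String> reportFailure) {
    List<OpenState> states = openRegions(regions);
    for (int i = 0; i < states.size(); i++) {
      if (states.get(i) == OpenState.FAILED_OPENING) {
        reportFailure.accept(regions.get(i));
      }
    }
  }
}
```

With the first caller nobody ever learns that a region failed to open; the second caller is one possible way to make sure each failure reaches whoever is waiting on it.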
[jira] [Created] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
Ted Yu created HBASE-21216:

Summary: TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
Key: HBASE-21216
URL: https://issues.apache.org/jira/browse/HBASE-21216
Project: HBase
Issue Type: Test
Reporter: Ted Yu

From https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/ :
{code}
java.lang.AssertionError: Archived hfiles [] and table hfiles [9ca09392705f425f9c916beedc10d63c] is missing snapshot file:6739a09747e54189a4112a6d8f37e894
  at org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370)
{code}
The file appeared in archive dir before hfile cleaners were run:
{code}
2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/
2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): |data/
2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): |---default/
2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): |--test/
2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): |-1237d57b63a7bdf067a930441a02514a/
2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): |recovered.edits/
2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): |---4.seqid
2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): |-29e1700e09b51223ad2f5811105a4d51/
2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): |fam/
2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): |---2c66a18f6c1a4074b84ffbb3245268c4
2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---45bb396c6a5e49629e45a4d56f1e9b14
2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---6739a09747e54189a4112a6d8f37e894
{code}
However, the archive dir became empty after hfile cleaners were run:
{code}
2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/
2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): |-corrupt/
{code}
Leading to the assertion failure.
[jira] [Resolved] (HBASE-10342) RowKey Prefix Bloom Filter
[ https://issues.apache.org/jira/browse/HBASE-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-10342.
Resolution: Duplicate

Duped by HBASE-20636

> RowKey Prefix Bloom Filter
>
> Key: HBASE-10342
> URL: https://issues.apache.org/jira/browse/HBASE-10342
> Project: HBase
> Issue Type: New Feature
> Reporter: Liyin Tang
> Priority: Major
>
> When designing an HBase schema for some use cases, it is quite common to combine
> multiple pieces of information within the RowKey. For instance, assume that the rowkey is
> constructed as md5(id1) + id1 + id2, and the user wants to scan all the rowkeys
> which start with id1. In such a case, the rowkey bloom filter is able to cut
> more unnecessary seeks during the scan.
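As an illustration of the key layout quoted above, here is a small sketch. It assumes, purely for the example, that id1 and id2 are 8-byte longs; the class and method names are hypothetical and nothing here uses the HBase API. It shows that every row for a given id1 shares the fixed-length prefix md5(id1) + id1, which is the part a prefix-based filter would be keyed on.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical helper for the md5(id1) + id1 + id2 layout; not HBase code. */
class PrefixRowKeySketch {
  static final int MD5_LEN = 16;                       // MD5 digests are 16 bytes
  static final int PREFIX_LEN = MD5_LEN + Long.BYTES;  // md5(id1) + id1

  /** MD5 over the 8-byte big-endian encoding of id1 (an assumption of this sketch). */
  static byte[] md5(long id1) {
    try {
      byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(id1).array();
      return MessageDigest.getInstance("MD5").digest(raw);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // MD5 is a required algorithm on the Java platform
    }
  }

  /** Full row key: md5(id1) + id1 + id2. */
  static byte[] rowKey(long id1, long id2) {
    return ByteBuffer.allocate(PREFIX_LEN + Long.BYTES)
        .put(md5(id1)).putLong(id1).putLong(id2).array();
  }

  /** Prefix shared by all rows of one id1: md5(id1) + id1. */
  static byte[] prefix(long id1) {
    return ByteBuffer.allocate(PREFIX_LEN).put(md5(id1)).putLong(id1).array();
  }

  /** True if rowKey begins with the given prefix bytes. */
  static boolean hasPrefix(byte[] rowKey, byte[] prefix) {
    return rowKey.length >= prefix.length
        && Arrays.equals(Arrays.copyOf(rowKey, prefix.length), prefix);
  }
}
```

A scan for one id1 only needs store files that may contain `prefix(id1)`, so a filter built over the first `PREFIX_LEN` bytes of each key could rule out the rest.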
[jira] [Resolved] (HBASE-20995) Clean Up Manual Array Copies, trivial
[ https://issues.apache.org/jira/browse/HBASE-20995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Leach resolved HBASE-20995.
Resolution: Won't Fix

> Clean Up Manual Array Copies, trivial
>
> Key: HBASE-20995
> URL: https://issues.apache.org/jira/browse/HBASE-20995
> Project: HBase
> Issue Type: Improvement
> Reporter: John Leach
> Assignee: John Leach
> Priority: Trivial
> Attachments: HBASE-20995.patch
>
> Clean up manual array copies in code.
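For readers unfamiliar with the kind of change this issue title refers to, the following is a generic sketch, not taken from the attached patch. The class and method names are made up. It puts a hand-written copy loop next to the two standard-library equivalents.

```java
import java.util.Arrays;

/** Generic illustration of manual vs. library array copies; not from the HBase patch. */
class ArrayCopySketch {
  /** Manual element-by-element copy, the pattern such a cleanup would target. */
  static byte[] manualCopy(byte[] src, int offset, int length) {
    byte[] dst = new byte[length];
    for (int i = 0; i < length; i++) {
      dst[i] = src[offset + i];
    }
    return dst;
  }

  /** Same result using System.arraycopy into a pre-allocated destination. */
  static byte[] withArraycopy(byte[] src, int offset, int length) {
    byte[] dst = new byte[length];
    System.arraycopy(src, offset, dst, 0, length);
    return dst;
  }

  /** Same result using Arrays.copyOfRange, which also allocates the destination. */
  static byte[] withCopyOfRange(byte[] src, int offset, int length) {
    return Arrays.copyOfRange(src, offset, offset + length);
  }
}
```

All three return the same bytes for a valid range. One difference to keep in mind is that `Arrays.copyOfRange` zero-pads when the end index runs past the source, while the other two throw.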
[jira] [Created] (HBASE-21215) Figure how to invoke hbck2; make it easy to find
stack created HBASE-21215:

Summary: Figure how to invoke hbck2; make it easy to find
Key: HBASE-21215
URL: https://issues.apache.org/jira/browse/HBASE-21215
Project: HBase
Issue Type: Sub-task
Components: amv2, hbck2
Reporter: stack
Fix For: 2.1.1

In https://docs.google.com/document/d/1Oun4G3M5fyrM0OxXcCKYF8td0KD7gJQjnU9Ad-2t-uk/edit#, the doc on hbck2 'form', one item to figure out is how to invoke hbck2. Related: how do we make it easy to find? [~busbey] has some ideas (posted in the doc). This issue is for the implementation.
[jira] [Created] (HBASE-21214) [hbck2] setTableState just sets hbase:meta state, not in-memory state
stack created HBASE-21214:

Summary: [hbck2] setTableState just sets hbase:meta state, not in-memory state
Key: HBASE-21214
URL: https://issues.apache.org/jira/browse/HBASE-21214
Project: HBase
Issue Type: Sub-task
Components: amv2, hbck2
Reporter: stack
Fix For: 2.1.1

This means that we have to go get another Master to see the table state change, because the in-memory state is still pegged at the old value.

TODO: Check the is_enabled/is_disabled shell commands to make sure they are reading from the right place.
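The symptom above can be pictured with a toy model. This is only a sketch under the assumption that table state lives both in a persistent store and in an in-memory copy; `TableStateSketch` and all of its methods are hypothetical and are not the Master's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a persisted state plus an in-memory copy; not HBase code. */
class TableStateSketch {
  private final Map<String, String> meta = new HashMap<>();     // stand-in for hbase:meta
  private final Map<String, String> inMemory = new HashMap<>(); // stand-in for cached state

  /** Initial load: both copies agree. */
  void load(String table, String state) {
    meta.put(table, state);
    inMemory.put(table, state);
  }

  /** The behaviour described in the issue: only the persisted copy changes. */
  void setStateMetaOnly(String table, String state) {
    meta.put(table, state);
  }

  /** One possible fix: update both copies together. */
  void setStateEverywhere(String table, String state) {
    meta.put(table, state);
    inMemory.put(table, state);
  }

  String readMeta(String table) { return meta.get(table); }

  String readInMemory(String table) { return inMemory.get(table); }
}
```

After `setStateMetaOnly`, a reader of the in-memory copy keeps seeing the old value until something reloads it from the persisted copy, which in this toy model corresponds to a fresh `load`.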
[jira] [Created] (HBASE-21213) [hbck2] Need more cleanup needed on bypass; old Procedure left in RegionStateNodes
stack created HBASE-21213:

Summary: [hbck2] Need more cleanup needed on bypass; old Procedure left in RegionStateNodes
Key: HBASE-21213
URL: https://issues.apache.org/jira/browse/HBASE-21213
Project: HBase
Issue Type: Bug
Components: amv2, hbck2
Reporter: stack
Assignee: stack
Fix For: 2.1.1

This is a follow-on from HBASE-21083, which added the 'bypass' functionality. On bypass, there is more state to be cleared if we are to allow new Procedures to be scheduled.

For example, here is a bypass:
{code}
2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null to finish it
2018-09-20 05:45:44,022 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
{code}
... but then when I try to assign the bypassed region later, I get this:
{code}
2018-09-20 05:46:31,435 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, location=ve1233.halxg.cloudera.com,22101,1537397961664
2018-09-20 05:46:31,510 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 exec-time=473msec
{code}
... which is a long-winded way of saying the Unassign Procedure still exists in RegionStateNodes.
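The logs above boil down to a stale owner reference. Here is a minimal sketch of that pattern, with hypothetical names (`RegionNodeSketch`, `attach`, `detach`, `finishBypassed`) that are not the actual RegionStateNode API; the `clearOwner` flag is an assumption used only to contrast the two behaviours.

```java
/** Toy model of a region node that remembers its owning procedure; not HBase code. */
class RegionNodeSketch {
  private Long ownerPid; // pid of the procedure currently attached, or null

  /** A new procedure may attach only if no other procedure owns the node. */
  boolean attach(long pid) {
    if (ownerPid != null) {
      return false; // "There is already another procedure running on this region"
    }
    ownerPid = pid;
    return true;
  }

  /** Detach only if the given procedure is the current owner. */
  void detach(long pid) {
    if (ownerPid != null && ownerPid == pid) {
      ownerPid = null;
    }
  }

  /**
   * Finishes a bypassed procedure. With clearOwner=false the procedure ends
   * but its reference stays on the node, as in the logs above.
   */
  void finishBypassed(long pid, boolean clearOwner) {
    if (clearOwner) {
      detach(pid);
    }
  }
}
```

With `clearOwner=false`, a later `attach(100450)` is rejected even though pid 100449 has already finished, which matches the rollback shown in the second log excerpt.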
[jira] [Created] (HBASE-21212) Wrong flush time when update flush metric
Allan Yang created HBASE-21212:

Summary: Wrong flush time when update flush metric
Key: HBASE-21212
URL: https://issues.apache.org/jira/browse/HBASE-21212
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.2, 2.1.0, 3.0.0
Reporter: Allan Yang
Assignee: Allan Yang