[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626786#comment-16626786 ]

Xu Cang commented on HBASE-20690:
---------------------------------

Thanks for your great explanation. [~andrewcheng]
non-binding +1 to your patch!

> Moving table to target rsgroup needs to handle TableStateNotFoundException
> --------------------------------------------------------------------------
>
>                 Key: HBASE-20690
>                 URL: https://issues.apache.org/jira/browse/HBASE-20690
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Guangxu Cheng
>            Priority: Major
>         Attachments: HBASE-20690.master.001.patch, HBASE-20690.master.002.patch
>
>
> This is related code:
> {code}
>       if (targetGroup != null) {
>         for (TableName table: tables) {
>           if (master.getAssignmentManager().isTableDisabled(table)) {
>             LOG.debug("Skipping move regions because the table" + table + " is disabled.");
>             continue;
>           }
> {code}
> In a stack trace [~rmani] showed me:
> {code}
> 2018-06-06 07:10:44,893 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.TableStateManager: Unable to get table demo:tbl1 state
> org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: demo:tbl1
>         at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193)
>         at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346)
>         at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407)
>         at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447)
>         at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470)
>         at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334)
>         at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331)
>         at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>         at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>         at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331)
>         at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768)
>         at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>         at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
>         at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logic should take potential TableStateNotFoundException into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
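
For readers following HBASE-20690: the issue asks that the move loop tolerate a table whose state is not recorded yet. The sketch below is only an illustration of that idea and is not taken from the attached patches. Every type and method in it (TableStateSketch, its nested TableStateNotFoundException, tablesToMove) is a hypothetical stand-in rather than the real HBase API, and it assumes that a table without a recorded state has no regions to move.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; none of these are the real HBase classes.
final class TableStateSketch {
  /** Stand-in for the exception thrown when no state is recorded for a table. */
  static final class TableStateNotFoundException extends Exception {
    TableStateNotFoundException(String table) {
      super(table);
    }
  }

  enum State { ENABLED, DISABLED }

  private final Map<String, State> states = new HashMap<>();

  void setState(String table, State state) {
    states.put(table, state);
  }

  /** Throws when the table has no recorded state, like the failure in the stack trace. */
  boolean isTableDisabled(String table) throws TableStateNotFoundException {
    State state = states.get(table);
    if (state == null) {
      throw new TableStateNotFoundException(table);
    }
    return state == State.DISABLED;
  }

  /** Returns the tables whose regions would be moved, skipping disabled and stateless tables. */
  List<String> tablesToMove(List<String> tables) {
    List<String> result = new ArrayList<>();
    for (String table : tables) {
      try {
        if (isTableDisabled(table)) {
          continue; // disabled tables have no regions to move
        }
      } catch (TableStateNotFoundException e) {
        // Assumption: the table is still being created, so there is nothing to move yet.
        continue;
      }
      result.add(table);
    }
    return result;
  }

  public static void main(String[] args) {
    TableStateSketch sketch = new TableStateSketch();
    sketch.setState("t1", State.ENABLED);
    sketch.setState("t2", State.DISABLED);
    // "demo:tbl1" has no recorded state and is skipped instead of failing the whole call.
    System.out.println(sketch.tablesToMove(Arrays.asList("t1", "t2", "demo:tbl1")));
  }
}
```

Whether to skip, log, or fail on a missing state is a design choice; the sketch skips so that one stateless table does not abort the loop for the others.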
[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
    [ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626773#comment-16626773 ]

stack commented on HBASE-21213:
-------------------------------

.006 addresses [~allan163]'s review: it changes the attribute name 'force' to 'override' instead.

> [hbck2] bypass leaves behind state in RegionStates when assign/unassign
> -----------------------------------------------------------------------
>
>                 Key: HBASE-21213
>                 URL: https://issues.apache.org/jira/browse/HBASE-21213
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2, hbck2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.1.1
>
>         Attachments: HBASE-21213.branch-2.1.001.patch, HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, HBASE-21213.branch-2.1.006.patch
>
>
> This is a follow-on from HBASE-21083 which added the 'bypass' functionality. On bypass, there is more state to be cleared if we are to allow new Procedures to be scheduled.
> For example, here is a bypass:
> {code}
> 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null to finish it
> 2018-09-20 05:45:44,022 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
> {code}
> ... but then when I try to assign the bypassed region later, I get this:
> {code}
> 2018-09-20 05:46:31,435 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, location=ve1233.halxg.cloudera.com,22101,1537397961664
> 2018-09-20 05:46:31,510 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 exec-time=473msec
> {code}
> ... which is a long-winded way of saying the Unassign Procedure still exists in RegionStateNodes.

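
The symptom in HBASE-21213 (a finished, bypassed procedure that is still recorded as the owner of a region) can be pictured with a toy model. This is a rough sketch under stated assumptions, not the HBase code: BypassSketch, tryAttach, bypassWithoutCleanup, and bypassWithCleanup are hypothetical names, and the region's owner is modeled as a plain map entry keyed by region name.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; the class and method names are hypothetical, not HBase APIs.
final class BypassSketch {
  // region name -> pid of the procedure currently recorded as owning the region
  private final Map<String, Long> ownerByRegion = new HashMap<>();

  /** Attaches a procedure to a region; refuses if another procedure is still recorded. */
  boolean tryAttach(String region, long pid) {
    return ownerByRegion.putIfAbsent(region, pid) == null;
  }

  /** Bypass that only finishes the procedure and leaves the owner record behind. */
  void bypassWithoutCleanup(String region, long pid) {
    // Intentionally clears nothing: this models the stale state described in the issue.
  }

  /** Bypass that also clears the owner record, but only if pid is still the owner. */
  void bypassWithCleanup(String region, long pid) {
    ownerByRegion.remove(region, pid);
  }

  public static void main(String[] args) {
    BypassSketch states = new BypassSketch();
    String region = "37cc206fe9c4bc1c0a46a34c5f523d16";
    states.tryAttach(region, 100449L);            // the unassign procedure takes the region
    states.bypassWithoutCleanup(region, 100449L); // bypassed, but the record stays
    System.out.println("assign after plain bypass accepted: " + states.tryAttach(region, 100450L));
    states.bypassWithCleanup(region, 100449L);    // bypass that also clears the record
    System.out.println("assign after cleanup accepted: " + states.tryAttach(region, 100450L));
  }
}
```

Removing the record only when the bypassed pid is still the owner avoids wiping out a newer procedure that has legitimately taken the region in the meantime.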
[jira] [Updated] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
    [ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-21213:
--------------------------
    Attachment: HBASE-21213.branch-2.1.006.patch

[jira] [Updated] (HBASE-20636) Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
    [ https://issues.apache.org/jira/browse/HBASE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-20636:
----------------------------------
    Release Note:
Add two bloom filter types: ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
1. ROWPREFIX_FIXED_LENGTH: specify the length of the prefix
2. ROWPREFIX_DELIMITED: specify the delimiter of the prefix
Parameters must be specified for these two bloom filter types; otherwise table creation will fail.
Example:
create 't1', {NAME => 'f1', BLOOMFILTER => 'ROWPREFIX_FIXED_LENGTH', CONFIGURATION => {'RowPrefixBloomFilter.prefix_length' => '10'}}
create 't1', {NAME => 'f1', BLOOMFILTER => 'ROWPREFIX_DELIMITED', CONFIGURATION => {'RowPrefixDelimitedBloomFilter.delimiter' => '#'}}
         Summary: Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED  (was: Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED)

Added the release note. Thanks [~anoop.hbase].
Thanks [~apurtell] for the commit. Thanks for all the reviews.

> Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-20636
>                 URL: https://issues.apache.org/jira/browse/HBASE-20636
>             Project: HBase
>          Issue Type: New Feature
>          Components: HFile, regionserver, scan
>            Reporter: Guangxu Cheng
>            Assignee: Guangxu Cheng
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-20636.master.001.patch, HBASE-20636.master.002.patch, HBASE-20636.master.003.patch, HBASE-20636.master.004.patch, HBASE-20636.master.005.patch
>
>
> As we all know, HBase uses Bloom filters (ROW and ROWCOL) to skip unnecessary files and improve read performance. But they only support Get and do not support Scan.
> In our company (Tencent), many users, such as Tencent Game, need to scan all rows with the same prefix. Each game user's operational records are written into HBase, and each game user has many records. The rowkey is constructed as userid+'#'+timestamp, so we can scan all records for a given user for a specified period.
> For this scenario, we designed the prefix Bloom filter. If the startRow and stopRow of the Scan have a valid common prefix, the scan is allowed to use the Bloom filter to filter files, which improves the performance of the scan.
> This feature has been running on our cluster for over a year, and scan performance for this scenario has improved by more than 1x compared with before.

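
The "valid common prefix" condition from the HBASE-20636 description can be made concrete with a small, self-contained sketch. It is not the HBase implementation: PrefixScanSketch and its methods are hypothetical helpers, and the rule used for the delimited case (both row keys must agree up to and including the first delimiter) is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helpers for illustration; this is not the HBase implementation.
final class PrefixScanSketch {
  /** Length of the longest common prefix of two row keys. */
  static int commonPrefixLength(byte[] a, byte[] b) {
    int limit = Math.min(a.length, b.length);
    int i = 0;
    while (i < limit && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  /** Fixed-length variant: the shared prefix of prefixLength bytes, or null if the keys diverge earlier. */
  static byte[] fixedLengthPrefix(byte[] startRow, byte[] stopRow, int prefixLength) {
    if (commonPrefixLength(startRow, stopRow) < prefixLength) {
      return null;
    }
    return Arrays.copyOf(startRow, prefixLength);
  }

  /** Delimited variant: the bytes before the first delimiter, or null if the keys do not share them. */
  static byte[] delimitedPrefix(byte[] startRow, byte[] stopRow, byte delimiter) {
    int index = -1;
    for (int i = 0; i < startRow.length; i++) {
      if (startRow[i] == delimiter) {
        index = i;
        break;
      }
    }
    if (index <= 0) {
      return null; // no delimiter, or an empty prefix
    }
    // Assumption: both keys must agree up to and including the delimiter.
    if (commonPrefixLength(startRow, stopRow) < index + 1) {
      return null;
    }
    return Arrays.copyOf(startRow, index);
  }

  public static void main(String[] args) {
    // Row keys shaped like userid + '#' + timestamp, as in the description.
    byte[] start = "user42#1000".getBytes(StandardCharsets.UTF_8);
    byte[] stop = "user42#2000".getBytes(StandardCharsets.UTF_8);
    byte[] prefix = delimitedPrefix(start, stop, (byte) '#');
    System.out.println(prefix == null ? "no usable prefix" : new String(prefix, StandardCharsets.UTF_8));
  }
}
```

When either helper returns null, a scan in this model would simply fall back to reading every candidate file, so the check can only reduce work and never changes results.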
[jira] [Comment Edited] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work
    [ https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626698#comment-16626698 ]

Reid Chan edited comment on HBASE-20993 at 9/25/18 3:41 AM:
------------------------------------------------------------

It is a pretty useful test, SGTM.

May I ask one question about the implementation: how does it run an older-version client against a patch-applied (may change wire protocol) pseudo-distributed cluster? It seems `hbase_version` can be used to control the client version, but it comes from `bin/hbase version`; will it be the same version as the cluster?


was (Author: reidchan):
It is a pretty useful test, SGTM.

May I ask one question about the implementation: how does it run an older-version client against a patch-applied (may change wire protocol) pseudo-distributed cluster? It seems `hbase_version` can be used to control the client version, but it comes from `hbase version`; will it be the same version as the cluster?

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --------------------------------------------------------------
>
>                 Key: HBASE-20993
>                 URL: https://issues.apache.org/jira/browse/HBASE-20993
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, IPC/RPC, security
>    Affects Versions: 1.2.6, 1.3.2, 1.2.7, 1.4.7
>            Reporter: Reid Chan
>            Assignee: Jack Bearden
>            Priority: Critical
>             Fix For: 1.5.0, 1.4.8
>
>         Attachments: HBASE-20993.001.patch, HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml: hadoop.security.authentication:kerberos, hbase.security.authentication:kerberos, hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal are set correctly.
> A simple-auth HBase cluster, a kerberized HBase client application.
> An application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>         at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>         at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>         at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>         at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>         at

[jira] [Updated] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
    [ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-21221:
---------------------------
    Attachment: 21221.v12.txt

> Ineffective assertion in TestFromClientSide3#testMultiRowMutations
> ------------------------------------------------------------------
>
>                 Key: HBASE-21221
>                 URL: https://issues.apache.org/jira/browse/HBASE-21221
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Minor
>         Attachments: 21221.v10.txt, 21221.v11.txt, 21221.v12.txt, 21221.v7.txt, 21221.v8.txt, 21221.v9.txt
>
>
> Observed the following in org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt :
> {code}
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Timed out waiting for lock for row: ROW-1 in region 089bdfa75f44d88e596479038a6da18b
>         at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816)
>         at org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982)
>         at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424)
>         at org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116)
>         at org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266)
>         at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463)
> ...
> Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp should fail because the target lock is blocked by previous put
>         at org.junit.Assert.fail(Assert.java:88)
>         at org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> Here is related code:
> {code}
>     cpService.execute(() -> {
> ...
>         if (!threw) {
>           // Can't call fail() earlier because the catch would eat it.
>           fail("This cp should fail because the target lock is blocked by previous put");
>         }
> {code}
> Since the fail() call is executed by the cpService, the assertion had no bearing on the outcome of the test.

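
The point made in HBASE-21221, that an assertion failing on a pool thread never reaches the test thread, can be reproduced with the JDK alone. The sketch below shows one general way to surface such a failure: submit the task and call Future.get() from the calling thread. It is not taken from the attached patches, and the class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only; ExecutorAssertionSketch and failureReachesCaller are made-up names.
final class ExecutorAssertionSketch {
  /** Runs a failing task on a pool thread and reports whether the caller saw the AssertionError. */
  static boolean failureReachesCaller() throws InterruptedException {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Runnable task = () -> {
        // Stands in for a JUnit fail() call made on the pool thread.
        throw new AssertionError("This cp should fail because the target lock is blocked by previous put");
      };
      // With execute(task) the error would only be reported by the worker thread and the caller
      // would never see it. submit(task) stores the error in the returned Future instead.
      Future<?> result = pool.submit(task);
      try {
        result.get(); // rethrows the stored failure wrapped in an ExecutionException
        return false;
      } catch (ExecutionException e) {
        return e.getCause() instanceof AssertionError;
      }
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("failure visible to caller: " + failureReachesCaller());
  }
}
```

In a real test, the calling thread would rethrow the cause (or record it in a shared variable and assert on it after the pool finishes) so that the test framework marks the test as failed.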
[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626704#comment-16626704 ]

Guangxu Cheng commented on HBASE-20690:
---------------------------------------

Attach 002 patch to move operations in {{preCreateTable}} to {{preCreateTableAction}}.

[jira] [Updated] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-20690:
----------------------------------
    Assignee: Guangxu Cheng
      Status: Patch Available  (was: Open)

[jira] [Updated] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-20690:
----------------------------------
    Attachment: HBASE-20690.master.002.patch

[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626699#comment-16626699 ]

Guangxu Cheng commented on HBASE-20690:
---------------------------------------

Thanks for your review, [~xucang].
# Let's look at the difference between {{groupAdminServer.moveTables}} and {{groupInfoManager.moveTables}} first.
#* {{groupAdminServer.moveTables}} updates the group information and moves the regions of the table to the specified group.
#* {{groupInfoManager.moveTables}} only updates the group information.

The {{preCreateTable}} hook is executed before the table has been created. At this point, the region information has not been generated and the regions have not been assigned, so there is no need to move regions. Because the table has not been created yet, a TableStateNotFoundException is thrown. Once we add the table to the specified group in advance, {{CREATE_TABLE_ASSIGN_REGIONS}} assigns the regions to the specified regionservers according to the group information during table creation.
# In the process of creating a table, a rollback is possible only if an exception occurs during {{CREATE_TABLE_PRE_OPERATION}}. {{CREATE_TABLE_PRE_OPERATION}} determines whether the table exists. If the table already exists, the group the table belongs to may already have been changed, and that change really does need to be rolled back. We only need to move the operations in {{preCreateTable}} to {{preCreateTableAction}}.

> Moving table to target rsgroup needs to handle TableStateNotFoundException > -- > > Key: HBASE-20690 > URL: https://issues.apache.org/jira/browse/HBASE-20690 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Priority: Major > Attachments: HBASE-20690.master.001.patch > > > This is related code: > {code} > if (targetGroup != null) { > for (TableName table: tables) { > if (master.getAssignmentManager().isTableDisabled(table)) { > LOG.debug("Skipping move regions because the table" + table + " > is disabled."); > continue; > } > {code} > In a stack trace [~rmani] showed me: > {code} > 2018-06-06 07:10:44,893 ERROR > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] > master.TableStateManager: Unable to get table demo:tbl1 state > org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: > demo:tbl1 > at > org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193) > at > org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470) > at > org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334) > at > org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331) > at > org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540) > at > org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614) > at > org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331) > 
at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768) > at > org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131) > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750) > at > org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > {code} > The logic should take potential TableStateNotFoundException into account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
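The issue description asks that the disabled-table check tolerate a table whose state has not been recorded yet. A minimal sketch of that defensive check follows. It assumes hypothetical stand-in classes ({{TableStateRegistry}}, {{RegionMoveGuard}}, and a local {{TableStateNotFoundException}}), none of which are the real HBase classes, and it is not the approach proposed in the comment above, which instead moves the group assignment earlier in table creation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the exception seen in the stack trace.
class TableStateNotFoundException extends Exception {
  TableStateNotFoundException(String table) {
    super(table);
  }
}

// Hypothetical in-memory registry of table states (not the HBase TableStateManager).
class TableStateRegistry {
  enum State { ENABLED, DISABLED }

  private final Map<String, State> states = new ConcurrentHashMap<>();

  void setState(String table, State state) {
    states.put(table, state);
  }

  State getState(String table) throws TableStateNotFoundException {
    State state = states.get(table);
    if (state == null) {
      // Mirrors the situation in the report: the table is still being created.
      throw new TableStateNotFoundException(table);
    }
    return state;
  }
}

// Hypothetical helper deciding whether the region move should be skipped.
class RegionMoveGuard {
  static boolean shouldSkipRegionMove(TableStateRegistry registry, String table) {
    try {
      return registry.getState(table) == TableStateRegistry.State.DISABLED;
    } catch (TableStateNotFoundException e) {
      // No state yet means no regions have been assigned, so there is nothing to move.
      return true;
    }
  }
}
```

With this shape the caller never sees the exception: a table that is disabled or has no recorded state is skipped, and only an enabled table proceeds to the region move.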
[jira] [Commented] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work
[ https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626698#comment-16626698 ] Reid Chan commented on HBASE-20993: --- It is a pretty useful test, SGTM. May I ask one question about the implementation: how does it run an older-version client against a patch-applied (which may change the wire protocol) pseudo-distributed cluster? It seems `hbase_version` can be used to control the client version, but it comes from `hbase version`; will it be the same version as the cluster? > [Auth] IPC client fallback to simple auth allowed doesn't work > -- > > Key: HBASE-20993 > URL: https://issues.apache.org/jira/browse/HBASE-20993 > Project: HBase > Issue Type: Bug > Components: Client, IPC/RPC, security >Affects Versions: 1.2.6, 1.3.2, 1.2.7, 1.4.7 >Reporter: Reid Chan >Assignee: Jack Bearden >Priority: Critical > Fix For: 1.5.0, 1.4.8 > > Attachments: HBASE-20993.001.patch, > HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, > HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, > HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, > HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, > HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, > HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, > HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt > > > It is easily reproducible. > client's hbase-site.xml: hadoop.security.authentication:kerberos, > hbase.security.authentication:kerberos, > hbase.ipc.client.fallback-to-simple-auth-allowed:true, keytab and principal > are set correctly > A simple auth hbase cluster, a kerberized hbase client application. 
> application trying to r/w/c/d table will have following exception: > {code} > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > at > org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336) > at > org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592) > at > 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738) > at > org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134) > at > org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297) > at > org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289) > at >
[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626686#comment-16626686 ] Hadoop QA commented on HBASE-21221: ---
| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 14s{color} | {color:green} hbase-server: The patch generated 0 new + 20 unchanged - 1 fixed = 20 total (was 21) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 20s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}125m 28s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 53s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21221 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941123/21221.v11.txt |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2dd1e3b82aa3 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7ab77518a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14487/testReport/ |
|
[jira] [Updated] (HBASE-21208) Bytes#toShort doesn't work without unsafe
[ https://issues.apache.org/jira/browse/HBASE-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-21208: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Bytes#toShort doesn't work without unsafe > - > > Key: HBASE-21208 > URL: https://issues.apache.org/jira/browse/HBASE-21208 > Project: HBase > Issue Type: Bug >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21208.v0.patch, HBASE-21208.v1.patch, > HBASE-21208.v2.patch > > > seems we put the brackets in the wrong place. > {code} > short n = 0; > n = (short) ((n ^ bytes[offset]) & 0xFF); > n = (short) (n << 8); > n = (short) ((n ^ bytes[offset+1]) & 0xFF); // this one > return n; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
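To make the misplaced brackets concrete, here is a small self-contained sketch comparing the snippet quoted in the issue with one plausible correction. The class and method names ({{ShortDecoding}}, {{toShortBuggy}}, {{toShortFixed}}) are hypothetical, and the corrected line is an assumption about the intent (big-endian decoding of two bytes), not a copy of the committed patch.

```java
// Hypothetical demo class; not part of HBase.
class ShortDecoding {
  // The variant quoted in the issue: the final mask is applied after the XOR,
  // so the high byte shifted in earlier is discarded.
  static short toShortBuggy(byte[] bytes, int offset) {
    short n = 0;
    n = (short) ((n ^ bytes[offset]) & 0xFF);
    n = (short) (n << 8);
    n = (short) ((n ^ bytes[offset + 1]) & 0xFF);
    return n;
  }

  // Assumed correction: mask only the low byte (to undo sign extension),
  // then combine it with the already shifted high byte.
  static short toShortFixed(byte[] bytes, int offset) {
    short n = 0;
    n = (short) ((n ^ bytes[offset]) & 0xFF);
    n = (short) (n << 8);
    n = (short) (n ^ (bytes[offset + 1] & 0xFF));
    return n;
  }
}
```

For the input bytes 0x12, 0x34 the quoted variant yields 0x0034, because the mask wipes the high byte, while the corrected variant yields 0x1234.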
[jira] [Commented] (HBASE-21219) Hbase incremental backup fails with null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626653#comment-16626653 ] Hadoop QA commented on HBASE-21219: ---
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 16s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 36s{color} | {color:green} hbase-backup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 19s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21219 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941132/HBASE-21219-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 77fb748a30ce 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 7ab77518a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/14488/artifact/patchprocess/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14488/testReport/ |
|
[jira] [Commented] (HBASE-21219) Hbase incremental backup fails with null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626634#comment-16626634 ] Ted Yu commented on HBASE-21219: {code} 307 LOG.warn("Known hosts (from newestTimestamps):"); 308 for (String s: newestTimestamps.keySet()) { 309 LOG.warn(s); {code} The above seems to be for debugging purposes. Use LOG.debug instead? > Hbase incremental backup fails with null pointer exception > -- > > Key: HBASE-21219 > URL: https://issues.apache.org/jira/browse/HBASE-21219 > Project: HBase > Issue Type: Bug > Components: backuprestore >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-21219-v1.patch > > > hbase backup create incremental hdfs:///bkpHbase_Test/bkpHbase_Test2 -t > bkpHbase_Test2 > 2018-09-21 15:35:31,421 INFO [main] impl.TableBackupClient: Backup > backup_1537524313995 started at 1537524331419. 2018-09-21 15:35:31,454 INFO > [main] impl.IncrementalBackupManager: Execute roll log procedure for > incremental backup ... 
2018-09-21 15:35:32,985 ERROR [main] > impl.TableBackupClient: Unexpected Exception : java.lang.NullPointerException > java.lang.NullPointerException at > org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getLogFilesForNewBackup(IncrementalBackupManager.java:309) > at > org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getIncrBackupLogFileMap(IncrementalBackupManager.java:103) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:276) > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:601) > at > org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:347) > at > org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:138) > at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:171) > at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:204) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at > org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:179) > 2018-09-21 15:35:32,989 ERROR [main] impl.TableBackupClient: > BackupId=backup_1537524313995,startts=1537524331419,failedts=1537524332989,failedphase=PREPARE_INCREMENTAL,failedmessage=null > 2018-09-21 15:35:57,167 ERROR [main] impl.TableBackupClient: Backup > backup_1537524313995 failed. > Backup session finished. 
Status: FAILURE 2018-09-21 15:35:57,175 ERROR [main] > backup.BackupDriver: Error running > command-line tool java.io.IOException: java.lang.NullPointerException at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:281) > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:601) > at > org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:347) > at > org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:138) > at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:171) > at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:204) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at > org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:179) > Caused by: java.lang.NullPointerException at > org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getLogFilesForNewBackup(IncrementalBackupManager.java:309) > at > org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getIncrBackupLogFileMap(IncrementalBackupManager.java:103) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:276) > ... 7 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21219) Hbase incremental backup fails with null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-21219: -- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-21219) Hbase incremental backup fails with null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-21219: -- Attachment: HBASE-21219-v1.patch
[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626610#comment-16626610 ] Mingliang Liu commented on HBASE-21221: --- # {code:java} if (!exceptionDuringMutateRows.get()) { fail("This cp should fail because the target lock is blocked by previous put"); } {code} Can be {{assertTrue()}}. # {code} WaitingForMultiMutationsObserver observer = find(tableName,... {code} The change can be reverted. # {code} LOG.debug("encountered " + ex); {code} Can be {code} LOG.error("encountered unexpected exception", ex); {code} Otherwise +1 (non-binding) > Ineffective assertion in TestFromClientSide3#testMultiRowMutations > -- > > Key: HBASE-21221 > URL: https://issues.apache.org/jira/browse/HBASE-21221 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Attachments: 21221.v10.txt, 21221.v11.txt, 21221.v7.txt, > 21221.v8.txt, 21221.v9.txt > > > Observed the following in > org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt : > {code} > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): > java.io.IOException: Timed out waiting for lock for row: ROW-1 in region > 089bdfa75f44d88e596479038a6da18b > at > org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816) > at > org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982) > at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424) > at > org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116) > at > org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266) > at > 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463) > ... > Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp > should fail because the target lock is blocked by previous put > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > {code} > Here is related code: > {code} > cpService.execute(() -> { > ... > if (!threw) { > // Can't call fail() earlier because the catch would eat it. > fail("This cp should fail because the target lock is blocked by > previous put"); > } > {code} > Since the fail() call is executed by the cpService, the assertion had no > bearing on the outcome of the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
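The point about {{fail()}} running on the executor thread can be illustrated with a small sketch: the worker only records what happened in a shared flag, and the calling (test) thread makes the assertion after the worker has finished. The names {{WorkerAssertionDemo}} and {{taskThrew}} are hypothetical, and the 30-second wait is an arbitrary assumption; the flag mirrors the {{exceptionDuringMutateRows}} flag mentioned in the review comment, but this is not the patch itself.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not part of the HBase test suite.
class WorkerAssertionDemo {
  // Runs the task on a worker thread and reports, on the caller's thread,
  // whether the task threw an exception.
  static boolean taskThrew(Runnable task) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    AtomicBoolean threw = new AtomicBoolean(false);
    try {
      Future<?> done = pool.submit(() -> {
        try {
          task.run();
        } catch (Exception e) {
          // Record the outcome instead of asserting here: an AssertionError
          // thrown on this thread would not fail the test.
          threw.set(true);
        }
      });
      // Wait for the worker so the flag is final before the caller reads it.
      done.get(30, TimeUnit.SECONDS);
      return threw.get();
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      throw new IllegalStateException("worker did not finish cleanly", e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A test would then call something like {{assertTrue(WorkerAssertionDemo.taskThrew(task))}} on its own thread, so a missing exception actually fails the test.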
[jira] [Updated] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21221: --- Attachment: 21221.v11.txt > Ineffective assertion in TestFromClientSide3#testMultiRowMutations > -- > > Key: HBASE-21221 > URL: https://issues.apache.org/jira/browse/HBASE-21221 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Attachments: 21221.v10.txt, 21221.v11.txt, 21221.v7.txt, > 21221.v8.txt, 21221.v9.txt > > > Observed the following in > org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt : > {code} > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): > java.io.IOException: Timed out waiting for lock for row: ROW-1 in region > 089bdfa75f44d88e596479038a6da18b > at > org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816) > at > org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982) > at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424) > at > org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116) > at > org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266) > at > org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463) > ... 
> Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp > should fail because the target lock is blocked by previous put > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > {code} > Here is related code: > {code} > cpService.execute(() -> { > ... > if (!threw) { > // Can't call fail() earlier because the catch would eat it. > fail("This cp should fail because the target lock is blocked by > previous put"); > } > {code} > Since the fail() call is executed by the cpService, the assertion had no > bearing on the outcome of the test.
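The situation described in this issue is general to executor-based tests: an AssertionError thrown inside a task started with execute() only terminates the worker thread (hence the "Exception in thread pool-678-thread-1" output) and never reaches the thread running the JUnit test. A minimal sketch of one way to bring the failure back to the calling thread, assuming only java.util.concurrent. The names WorkerAssertionSketch and runAndCollect are hypothetical, and this is not the patch attached to the issue.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not taken from the HBase test code.
class WorkerAssertionSketch {

    // Runs the task on a worker thread and returns whatever it threw,
    // or null if it completed normally.
    static Throwable runAndCollect(Runnable task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // submit() stores a thrown Throwable in the Future instead of
            // passing it to the worker's uncaught-exception handler.
            Future<?> future = pool.submit(task);
            future.get();
            return null;
        } catch (ExecutionException e) {
            // The original failure from the worker thread.
            return e.getCause();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Throwable failure = runAndCollect(() -> {
            throw new AssertionError("This cp should fail");
        });
        // The calling (test) thread can now rethrow or assert on the failure.
        System.out.println("worker failure seen by caller: " + failure);
    }
}
```

In a real test the caller would rethrow the returned Throwable (or call fail()) on the test thread, so that the outcome of the test actually depends on it.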
[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626570#comment-16626570 ] Hadoop QA commented on HBASE-21221: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for instructions. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s{color} | {color:red} HBASE-21221 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21221 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941120/21221.v10.txt | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14486/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. 
[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626563#comment-16626563 ] Ted Yu commented on HBASE-21221: bq. the ex in RecordingEndpoint should be safely published The custom endpoint is removed in patch v10. So the above is not needed.
[jira] [Updated] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21221: --- Attachment: 21221.v10.txt
[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626498#comment-16626498 ] Mingliang Liu commented on HBASE-21221: --- [~yuzhih...@gmail.com], the root cause analysis makes sense to me. For the patch, I think the {{ex}} in {{RecordingEndpoint}} should be safely published across multiple threads, e.g. by declaring it volatile. I wonder whether the test would be simpler if it checked the controller exception in the {{callable}} we create and pass to {{table.coprocessorService}}. {code:java} CoprocessorRpcUtils.BlockingRpcCallback rpcCallback = new CoprocessorRpcUtils.BlockingRpcCallback<>(); exe.mutateRows(controller, request, rpcCallback); + if (controller.failedOnException()) { +exceptionOnMutateRows.set(true); // exceptionOnMutateRows is AtomicBoolean declared in the test method; will assert this value. + } return rpcCallback.get(); }); {code}
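The AtomicBoolean idea from the comment above can be sketched in isolation. This is a simplified model that assumes only java.util.concurrent: FlagFromWorkerSketch and workerSawException are hypothetical names, and a plain Callable stands in for the coprocessor call, so nothing here is HBase API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of "record the failure in the worker, assert in the test thread".
class FlagFromWorkerSketch {

    // Runs the call on a worker thread and reports whether it threw an exception.
    static boolean workerSawException(Callable<?> call) {
        // AtomicBoolean gives safe publication between the worker and the caller.
        AtomicBoolean exceptionSeen = new AtomicBoolean(false);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.execute(() -> {
            try {
                call.call();
            } catch (Exception e) {
                // Only record the outcome here; do not call fail() on this thread.
                exceptionSeen.set(true);
            }
        });
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Read on the calling thread, where an assertion affects the test result.
        return exceptionSeen.get();
    }

    public static void main(String[] args) {
        boolean seen = workerSawException(() -> {
            throw new java.io.IOException("Timed out waiting for lock");
        });
        System.out.println("exception seen by caller: " + seen);
    }
}
```

The point of the design is that the worker thread only records what happened, while the decision to pass or fail stays on the thread that JUnit is watching.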
[jira] [Commented] (HBASE-20940) HStore.cansplit should not allow split to happen if it has references
[ https://issues.apache.org/jira/browse/HBASE-20940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626333#comment-16626333 ] Hudson commented on HBASE-20940: FAILURE: Integrated in Jenkins build PreCommit-PHOENIX-Build #2059 (See [https://builds.apache.org/job/PreCommit-PHOENIX-Build/2059/]) After HBASE-20940 any local index query will open all HFiles of every (larsh: rev df998e6d7840db4669a395fa6460c42c434af633) * (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/RegionScannerFactory.java > HStore.cansplit should not allow split to happen if it has references > - > > Key: HBASE-20940 > URL: https://issues.apache.org/jira/browse/HBASE-20940 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vishal Khandelwal >Assignee: Vishal Khandelwal >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 1.4.7, 2.0.3 > > Attachments: HBASE-20940-branch-1-addendum.patch, > HBASE-20940.branch-1.3.v1.patch, HBASE-20940.branch-1.3.v2.patch, > HBASE-20940.branch-1.v1.patch, HBASE-20940.branch-1.v2.patch, > HBASE-20940.branch-1.v3.patch, HBASE-20940.branch-1.v5.patch, > HBASE-20940.v1.patch, HBASE-20940.v2.patch, HBASE-20940.v3.patch, > HBASE-20940.v4.patch, result_HBASE-20940.branch-1.v2.log > > > When a split happens and another split immediately follows, it may result in > a split of a region that still has references to its parent. More details > about the scenario can be found in HBASE-20933. > HStore.hasReferences should check from fs.storefile rather than in-memory > objects.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626122#comment-16626122 ] Hadoop QA commented on HBASE-21217: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 16s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 19s{color} | {color:green} hbase-server: The patch generated 0 new + 315 unchanged - 6 fixed = 315 total (was 321) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 15s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}120m 28s{color} | {color:green} hbase-server in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}162m 7s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21217 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941052/HBASE-21217-v2.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 85dae2e34597 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 7ab77518a2 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14485/testReport/ | | Max. process+thread count | 5184 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14485/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message
[jira] [Commented] (HBASE-21208) Bytes#toShort doesn't work without unsafe
[ https://issues.apache.org/jira/browse/HBASE-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626115#comment-16626115 ] Chia-Ping Tsai commented on HBASE-21208: v2 will be merged tomorrow if no objections. > Bytes#toShort doesn't work without unsafe > - > > Key: HBASE-21208 > URL: https://issues.apache.org/jira/browse/HBASE-21208 > Project: HBase > Issue Type: Bug >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21208.v0.patch, HBASE-21208.v1.patch, > HBASE-21208.v2.patch > > > seems we put the brackets in the wrong place. > {code} > short n = 0; > n = (short) ((n ^ bytes[offset]) & 0xFF); > n = (short) (n << 8); > n = (short) ((n ^ bytes[offset+1]) & 0xFF); // this one > return n; > {code}
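To make the effect of the misplaced brackets concrete: in the last assignment the `& 0xFF` mask is applied to the whole `n ^ bytes[offset+1]` result, which clears the high byte that was just shifted in. The sketch below compares the quoted logic with one possible correction that masks the byte before the XOR. The class and method names are hypothetical, and the fix in the attached patch may differ.

```java
// Hypothetical stand-alone comparison; not the HBase Bytes class.
class ToShortSketch {

    // The logic as quoted in the issue description.
    static short toShortAsQuoted(byte[] bytes, int offset) {
        short n = 0;
        n = (short) ((n ^ bytes[offset]) & 0xFF);
        n = (short) (n << 8);
        // The mask covers the whole XOR result, so the high byte is lost.
        n = (short) ((n ^ bytes[offset + 1]) & 0xFF);
        return n;
    }

    // One possible correction: mask only the low byte, then XOR it in.
    static short toShortMaskedFirst(byte[] bytes, int offset) {
        short n = 0;
        n = (short) ((n ^ bytes[offset]) & 0xFF);
        n = (short) (n << 8);
        n = (short) (n ^ (bytes[offset + 1] & 0xFF));
        return n;
    }

    public static void main(String[] args) {
        byte[] b = {0x12, 0x34};
        System.out.println(toShortAsQuoted(b, 0));    // 52 (0x0034)
        System.out.println(toShortMaskedFirst(b, 0)); // 4660 (0x1234)
    }
}
```

Masking the byte first also prevents sign extension of a negative low byte from spilling into the high byte.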
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626047#comment-16626047 ] stack commented on HBASE-21217: --- Ugh. Just saw this. Thanks lads. > Revisit the executeProcedure method for open/close region > - > > Key: HBASE-21217 > URL: https://issues.apache.org/jira/browse/HBASE-21217 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21217-v1.patch, HBASE-21217-v2.patch, > HBASE-21217.patch > > > Currently we just call openRegion and closeRegion directly, which is a bit > buggy. For example, in order not to fail all the open region requests when > only one of them fails, we catch the exception and set a flag in the > return value. But for the executeProcedures call, the return value is > ignored, and we expect that the openRegion method will always call > reportRegionStateTransition to report the failure, but in fact it does not... > And after HBASE-20881, we can confirm that the race could happen, where we > send a close request to a region which is opening (HBASE-21199), and vice > versa. So I think here we need to revisit the implementation of > executeProcedures to make it more stable.
[jira] [Resolved] (HBASE-16627) AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table exists
[ https://issues.apache.org/jira/browse/HBASE-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-16627. Resolution: Later > AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table > exists > > > Key: HBASE-16627 > URL: https://issues.apache.org/jira/browse/HBASE-16627 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Stephen Yuan Jiang >Priority: Minor > > [~stack] first reported this issue when he played with backup feature. > The following exception can be observed in backup unit tests: > {code} > 2016-09-13 16:21:57,661 ERROR [ProcedureExecutor-3] > master.TableStateManager(134): Unable to get table hbase:backup state > org.apache.hadoop.hbase.TableNotFoundException: hbase:backup > at > org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:174) > at > org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:131) > at > org.apache.hadoop.hbase.master.AssignmentManager.isDisabledorDisablingRegionInRIT(AssignmentManager.java:1221) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:739) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1567) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1546) > at > org.apache.hadoop.hbase.util.ModifyRegionUtils.assignRegions(ModifyRegionUtils.java:254) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.assignRegions(CreateTableProcedure.java:430) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:127) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:57) > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119) > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:452) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1066) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:855) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:808) > {code} > AssignmentManager#isDisabledorDisablingRegionInRIT should take table > existence into account.
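As an illustration of what "taking table existence into account" could look like, here is a self-contained sketch. It is a toy model with hypothetical names (TableStateGuardSketch, a String table name, a small State enum), not the AssignmentManager or TableStateManager code, and treating a missing table as "not disabled" is an assumption made for the example, not a decision recorded in this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model; none of these types are HBase classes.
class TableStateGuardSketch {

    enum State { ENABLED, DISABLED, DISABLING }

    private final Map<String, State> states = new ConcurrentHashMap<>();

    void setState(String table, State state) {
        states.put(table, state);
    }

    // Stand-in for a lookup that throws when the table is unknown,
    // similar in spirit to the exception in the stack trace above.
    State getTableState(String table) {
        State state = states.get(table);
        if (state == null) {
            throw new IllegalStateException("table not found: " + table);
        }
        return state;
    }

    // Guarded check: look the table up first, and (by assumption) treat a
    // table that does not exist yet as neither disabled nor disabling.
    boolean isDisabledOrDisabling(String table) {
        State state = states.get(table);
        if (state == null) {
            return false;
        }
        return state == State.DISABLED || state == State.DISABLING;
    }

    public static void main(String[] args) {
        TableStateGuardSketch sketch = new TableStateGuardSketch();
        sketch.setState("demo:tbl1", State.DISABLED);
        System.out.println(sketch.isDisabledOrDisabling("demo:tbl1"));
        System.out.println(sketch.isDisabledOrDisabling("hbase:backup"));
    }
}
```

Whether a missing table should map to false, to an explicit error, or to a retry is a policy choice that the issue leaves open.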
[jira] [Updated] (HBASE-18580) Allow gdb to attach to selected process in docker
[ https://issues.apache.org/jira/browse/HBASE-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-18580: --- Labels: debugger (was: ) > Allow gdb to attach to selected process in docker > - > > Key: HBASE-18580 > URL: https://issues.apache.org/jira/browse/HBASE-18580 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Priority: Major > Labels: debugger > > In the current docker image, if you want to attach gdb to a process, you > would see: > bq. ptrace: Operation not permitted > We should provide better support for gdb in docker. > This article is a start: > https://thirld.com/blog/2016/08/15/war-stories-debugging-julia-gdb-docker/
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625924#comment-16625924 ] Mike Drob commented on HBASE-18451: --- {noformat} + LOG.info(getName() + " requesting flush of " + + r.getRegionInfo().getRegionNameAsString() + " because " + + whyFlush.toString() + + " after random delay " + randomDelay + "ms"); {noformat} nit: can we switch this to parameterized logging? {noformat} @Override public boolean requestDelayedFlush(HRegion r, long delay, boolean forceFlushAllStores) { r.incrementFlushesQueuedCount(); synchronized (regionsInQueue) { if (!regionsInQueue.containsKey(r)) { // This entry has some delay FlushRegionEntry fqe = new FlushRegionEntry(r, forceFlushAllStores, FlushLifeCycleTracker.DUMMY); fqe.requeue(delay); this.regionsInQueue.put(r, fqe); this.flushQueue.add(fqe); return true; } return false; } } {noformat} Is that call to {{incrementFlushesQueuedCount}} correct? We attempt to queue one, but don't always add one to the queue, so the metric is going to be inflated. The same thing happens in {{requestFlush}}. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want to trigger a periodic flush after one hour. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush; that way we spread them out. > RANGE_OF_DELAY is 5 minutes. 
So we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds later, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes within the next 5 minutes, > you end up getting them all way more quickly. Like within the first minute. > This not only feeds the queue with too many flush requests, but also defeats the > purpose of the randomDelay. > {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old
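The metric concern raised in the review comment on this issue (counting a request even when nothing is added to the queue) can be shown with a small model. This is only a sketch under simplifying assumptions: regions are plain strings, the queue is a map from region to its first delay, and FlushQueueSketch and its methods are hypothetical names rather than the MemStoreFlusher API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a delayed-flush queue; not HBase code.
class FlushQueueSketch {

    private final Map<String, Long> regionsInQueue = new HashMap<>();
    private long flushesQueued = 0;

    // Queues a delayed flush unless the region is already queued.
    // The counter is bumped only when an entry is really added.
    synchronized boolean requestDelayedFlush(String region, long delayMs) {
        if (regionsInQueue.containsKey(region)) {
            // Already queued: keep the first delay and count nothing.
            return false;
        }
        regionsInQueue.put(region, delayMs);
        flushesQueued++;
        return true;
    }

    synchronized long getFlushesQueued() {
        return flushesQueued;
    }

    // Returns the delay that is queued for the region, or null if none.
    synchronized Long getQueuedDelay(String region) {
        return regionsInQueue.get(region);
    }

    public static void main(String[] args) {
        FlushQueueSketch queue = new FlushQueueSketch();
        // Delays taken from the first two log lines quoted above.
        queue.requestDelayedFlush("testflush", 270785L);
        queue.requestDelayedFlush("testflush", 200143L);
        System.out.println("queued count: " + queue.getFlushesQueued());
        System.out.println("delay kept: " + queue.getQueuedDelay("testflush"));
    }
}
```

Incrementing inside the "not yet queued" branch keeps the counter equal to the number of entries that were really added, at the cost of no longer counting attempts.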
[jira] [Updated] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21217: -- Attachment: HBASE-21217-v2.patch
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625867#comment-16625867 ] Allan Yang commented on HBASE-18451: +1 for the patch. The compaction queue has the same issue. [~xucang] maybe you can take a look. We may queue the same Store in the compaction queue to compact over and over again, making the compaction queue very big. But it may not cause big trouble like this one here, so it is not an urgent thing to do. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want, after one hour, to trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, so that we spread them out. > RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check whether there is already a request in the queue, > 10 seconds later we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point you will end > up with a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes over the next 5 minutes, > you end up getting them all much more quickly, such as within the first minute. > This not only floods the queue with too many flush requests, but also defeats the > purpose of the randomDelay. 
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: >
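The queue check that HBASE-18451 asks for can be sketched roughly as below. This is only an illustration and not the attached patch: the class `DelayedFlushSketch`, its map-based queue, and the helper `earliestWithoutCheck` are hypothetical stand-ins, and the sketch assumes that without the check every new request is honored independently, so the earliest due time wins. The delays are the ones shown in the log excerpt above, with the chore assumed to fire every 10 seconds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a delayed flush queue; not the HBase implementation. */
public class DelayedFlushSketch {
    // Region name -> absolute time (ms) at which its flush becomes due.
    private final Map<String, Long> pending = new HashMap<>();

    /**
     * Queues a delayed flush only if the region has no pending request.
     * Returns true if a request was queued, false if one already existed.
     */
    public boolean requestDelayedFlush(String region, long nowMs, long delayMs) {
        if (pending.containsKey(region)) {
            return false; // keep the delay chosen by the first request
        }
        pending.put(region, nowMs + delayMs);
        return true;
    }

    /** Due time of the pending request; the region must already be queued. */
    public long dueTime(String region) {
        return pending.get(region);
    }

    public int size() {
        return pending.size();
    }

    /**
     * Models the behavior without the check (an assumption of this sketch):
     * a fresh delay is drawn every intervalMs and the earliest due time wins.
     */
    public static long earliestWithoutCheck(long[] delaysMs, long intervalMs) {
        long earliest = Long.MAX_VALUE;
        for (int i = 0; i < delaysMs.length; i++) {
            earliest = Math.min(earliest, i * intervalMs + delaysMs[i]);
        }
        return earliest;
    }

    public static void main(String[] args) {
        long[] delays = {270785L, 200143L, 191082L, 92532L, 238780L};
        DelayedFlushSketch queue = new DelayedFlushSketch();
        long now = 0L;
        for (long delay : delays) {
            queue.requestDelayedFlush("testflush", now, delay);
            now += 10_000L;
        }
        System.out.println(queue.dueTime("testflush"));            // prints 270785
        System.out.println(earliestWithoutCheck(delays, 10_000L)); // prints 122532
    }
}
```

With the check in place the first delay (about 4.5 minutes) is kept; without it the fourth draw would pull the flush forward to roughly 2 minutes, which is the drift the issue describes.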
[jira] [Created] (HBASE-21223) [amv2] Remove abort_procedure from shell
stack created HBASE-21223: - Summary: [amv2] Remove abort_procedure from shell Key: HBASE-21223 URL: https://issues.apache.org/jira/browse/HBASE-21223 Project: HBase Issue Type: Bug Components: amv2, hbck2 Reporter: stack Assignee: stack Remove this command. It will cause more damage than it could ever solve. If it should exist, it should be out in hbck2, not here in user-space. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21222) [amv2] Closing region on a non-existent server creates STUCK regions
[ https://issues.apache.org/jira/browse/HBASE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625816#comment-16625816 ] stack commented on HBASE-21222: --- Yes. A workaround is clearing out the old location. That seems to work. I'll write it up. > [amv2] Closing region on a non-existent server creates STUCK regions > > > Key: HBASE-21222 > URL: https://issues.apache.org/jira/browse/HBASE-21222 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: stack >Assignee: stack >Priority: Major > > Ran into this one where a Region had been on a server but after a bunch of > crashing and meddling in Master Proc WALs, any attempt at unassign has the > procedure fail (see below) and then report the region as STUCK. > I broke the lock w/ new hbck2 tooling and then tried to offline again but > same thing happened. Bug. Fix. > {code} > 2018-09-22 18:36:41,900 INFO > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch > pid=138650, ppid=121871, state=RUNNABLE:REGION_TRANSITION_DISPATCH, > locked=true; UnassignProcedure > table=IntegrationTestBigLinkedList_20180614072614, > region=51cdade76ca7217ec191f39e5f56c61c, > server=vd0637.halxg.cloudera.com,22101,1537397969558; rit=CLOSING, > location=vd0637.halxg.cloudera.com,22101,1537397969558 > 2018-09-22 18:36:41,899 INFO > org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler: > pid=138646, ppid=121871, state=RUNNABLE:REGION_TRANSITION_DISPATCH; > UnassignProcedure table=IntegrationTestBigLinkedList_20180614072614, > region=0780467efe4c5901887fb12bfa406fa7, > server=vc1228.halxg.cloudera.com,22101,1537578279837 checking lock on > 0780467efe4c5901887fb12bfa406fa7 > 2018-09-22 18:36:41,900 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote > call failed vd0637.halxg.cloudera.com,22101,1537397969558; pid=138650, > ppid=121871, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; > UnassignProcedure 
table=IntegrationTestBigLinkedList_20180614072614, > region=51cdade76ca7217ec191f39e5f56c61c, > server=vd0637.halxg.cloudera.com,22101,1537397969558; rit=CLOSING, > location=vd0637.halxg.cloudera.com,22101,1537397969558; > exception=NoServerDispatchException > org.apache.hadoop.hbase.procedure2.NoServerDispatchException: > vd0637.halxg.cloudera.com,22101,1537397969558; pid=138650, ppid=121871, > state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; UnassignProcedure > table=IntegrationTestBigLinkedList_20180614072614, > region=51cdade76ca7217ec191f39e5f56c61c, > server=vd0637.halxg.cloudera.com,22101,1537397969558 > at > org.apache.hadoop.hbase.procedure2.RemoteProcedureDispatcher.addOperationToNode(RemoteProcedureDispatcher.java:177) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.addToRemoteDispatcher(RegionTransitionProcedure.java:277) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:202) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:370) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97) > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:924) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1684) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1471) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:77) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1983) > 2018-09-22 18:36:41,903 WARN > org.apache.hadoop.hbase.master.assignment.UnassignProcedure: Expiring > vd0637.halxg.cloudera.com,22101,1537397969558, pid=138650, ppid=121871, > state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; UnassignProcedure > 
table=IntegrationTestBigLinkedList_20180614072614, > region=51cdade76ca7217ec191f39e5f56c61c, > server=vd0637.halxg.cloudera.com,22101,1537397969558 rit=CLOSING, > location=vd0637.halxg.cloudera.com,22101,1537397969558; > exception=NoServerDispatchException > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21222) [amv2] Closing region on a non-existent server creates STUCK regions
[ https://issues.apache.org/jira/browse/HBASE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625812#comment-16625812 ] Duo Zhang commented on HBASE-21222: --- Got it. So we need a tool in HBCK2 to handle this case. > [amv2] Closing region on a non-existent server creates STUCK regions > > > Key: HBASE-21222 > URL: https://issues.apache.org/jira/browse/HBASE-21222 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: stack >Assignee: stack >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21222) [amv2] Closing region on a non-existent server creates STUCK regions
[ https://issues.apache.org/jira/browse/HBASE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625801#comment-16625801 ] stack commented on HBASE-21222: --- In this case it is because WALs were deleted. Been thinking about this. It should not happen during usual operation but we should have some defense in place just in case it does manage to bubble-up. > [amv2] Closing region on a non-existent server creates STUCK regions > > > Key: HBASE-21222 > URL: https://issues.apache.org/jira/browse/HBASE-21222 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: stack >Assignee: stack >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead
[ https://issues.apache.org/jira/browse/HBASE-21193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-21193. --- Resolution: Cannot Reproduce > Retrying Callable doesn't take max retries from current context; uses > defaults instead > -- > > Key: HBASE-21193 > URL: https://issues.apache.org/jira/browse/HBASE-21193 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: Xu Cang >Priority: Major > > This makes it hard to change the retry count on a read of meta, for instance. > I noticed this when trying to change the defaults for a meta read. I made a > custom Connection inside the master with a new Configuration that had > rpc retries and timings upped radically. My reads nonetheless were finishing > at the usual retry point (31 tries after 60 seconds or so) because it looked > like the Retrying Callable that does the read was taking max retries from > defaults rather than reading the passed-in Configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
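As a rough illustration of what "taking max retries from the passed-in Configuration" means, here is a minimal sketch. It does not use the HBase client classes: a plain `Map` stands in for `Configuration`, the class `RetryConfigSketch` and the method `maxRetries` are hypothetical, and the fallback of 31 is only the figure quoted in the issue text, not a claim about the actual default in any HBase release.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolve the retry count from a caller-supplied configuration. */
public class RetryConfigSketch {
    // Key name used for illustration; a Map stands in for an HBase Configuration.
    public static final String RETRIES_KEY = "hbase.client.retries.number";
    // Illustrative fallback, taken from the "31 tries" mentioned in the issue.
    public static final int ILLUSTRATIVE_DEFAULT = 31;

    /** Prefers the value in the passed-in configuration over the fallback. */
    public static int maxRetries(Map<String, String> conf) {
        String value = conf.get(RETRIES_KEY);
        return value == null ? ILLUSTRATIVE_DEFAULT : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> custom = new HashMap<>();
        custom.put(RETRIES_KEY, "100");
        System.out.println(maxRetries(custom));                        // prints 100
        System.out.println(maxRetries(new HashMap<String, String>())); // prints 31
    }
}
```

A callable built this way would stop at 100 attempts for the custom connection; stopping at the fallback despite the override is the symptom the issue originally reported and could not later be reproduced.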
[jira] [Commented] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead
[ https://issues.apache.org/jira/browse/HBASE-21193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625771#comment-16625771 ] stack commented on HBASE-21193: --- Thank you [~xucang] for taking a look. I'm glad the maxattempts is working. I need to give a better explanation of the phenomenon I saw and why I thought it wasn't working. I've lost my context now though, and rather than leave this issue open, let me resolve it. I'll assign it to you since you did some nice work. Thank you. > Retrying Callable doesn't take max retries from current context; uses > defaults instead > -- > > Key: HBASE-21193 > URL: https://issues.apache.org/jira/browse/HBASE-21193 > Project: HBase > Issue Type: Bug >Reporter: stack >Priority: Major > > This makes it hard to change the retry count on a read of meta, for instance. > I noticed this when trying to change the defaults for a meta read. I made a > custom Connection inside the master with a new Configuration that had > rpc retries and timings upped radically. My reads nonetheless were finishing > at the usual retry point (31 tries after 60 seconds or so) because it looked > like the Retrying Callable that does the read was taking max retries from > defaults rather than reading the passed-in Configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead
[ https://issues.apache.org/jira/browse/HBASE-21193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reassigned HBASE-21193: - Assignee: Xu Cang > Retrying Callable doesn't take max retries from current context; uses > defaults instead > -- > > Key: HBASE-21193 > URL: https://issues.apache.org/jira/browse/HBASE-21193 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: Xu Cang >Priority: Major > > This makes it hard to change the retry count on a read of meta, for instance. > I noticed this when trying to change the defaults for a meta read. I made a > custom Connection inside the master with a new Configuration that had > rpc retries and timings upped radically. My reads nonetheless were finishing > at the usual retry point (31 tries after 60 seconds or so) because it looked > like the Retrying Callable that does the read was taking max retries from > defaults rather than reading the passed-in Configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21222) [amv2] Closing region on a non-existent server creates STUCK regions
[ https://issues.apache.org/jira/browse/HBASE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625769#comment-16625769 ] Duo Zhang commented on HBASE-21222: --- Is this because you delete all the master proc wals? Or it could happen after crashing and failover? If the latter I think there are critical bugs? > [amv2] Closing region on a non-existent server creates STUCK regions > > > Key: HBASE-21222 > URL: https://issues.apache.org/jira/browse/HBASE-21222 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: stack >Assignee: stack >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work
[ https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625584#comment-16625584 ] Sean Busbey commented on HBASE-20993: - bq. Good reminder that we lack a unit test for wire compatibility. I wonder how hard it would be to grab the 1.2 shaded client artifact and use it to talk with the server code at head of branch. We could add a nightly test that did this pretty easily. Essentially we could just add it as an additional step in [the test that starts up a 1-node cluster and runs an example program|https://github.com/apache/hbase/blob/master/dev-support/hbase_nightly_pseudo-distributed-test.sh]. > [Auth] IPC client fallback to simple auth allowed doesn't work > -- > > Key: HBASE-20993 > URL: https://issues.apache.org/jira/browse/HBASE-20993 > Project: HBase > Issue Type: Bug > Components: Client, IPC/RPC, security >Affects Versions: 1.2.6, 1.3.2, 1.2.7, 1.4.7 >Reporter: Reid Chan >Assignee: Jack Bearden >Priority: Critical > Fix For: 1.5.0, 1.4.8 > > Attachments: HBASE-20993.001.patch, > HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, > HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, > HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, > HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, > HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, > HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, > HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt > > > It is easily reproducible. > The client's hbase-site.xml has hadoop.security.authentication:kerberos, > hbase.security.authentication:kerberos, > hbase.ipc.client.fallback-to-simple-auth-allowed:true, and the keytab and principal > are set correctly. > Setup: a simple-auth hbase cluster and a kerberized hbase client application. 
> application trying to r/w/c/d table will have following exception: > {code} > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > at > org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873) > at > org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336) > at > org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592) > at > 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738) > at > org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134) > at >
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625557#comment-16625557 ]

Hadoop QA commented on HBASE-18451:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-1 Compile Tests ||
| +1 | mvninstall | 1m 46s | branch-1 passed |
| +1 | compile | 0m 38s | branch-1 passed with JDK v1.8.0_181 |
| +1 | compile | 0m 38s | branch-1 passed with JDK v1.7.0_191 |
| +1 | checkstyle | 1m 21s | branch-1 passed |
| +1 | shadedjars | 2m 34s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 28s | branch-1 passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 35s | branch-1 passed with JDK v1.7.0_191 |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 32s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK v1.8.0_181 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | compile | 0m 37s | the patch passed with JDK v1.7.0_191 |
| +1 | javac | 0m 37s | the patch passed |
| +1 | checkstyle | 1m 18s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 2m 34s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 1m 34s | Patch does not cause any errors with Hadoop 2.7.4. |
| +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 35s | the patch passed with JDK v1.7.0_191 |
|| || || || Other Tests ||
| -1 | unit | 118m 19s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 136m 52s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.util.TestHBaseFsck |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941012/HBASE-18451.branch-1.001.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 324b4fa3dcc5 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64
[jira] [Commented] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead
[ https://issues.apache.org/jira/browse/HBASE-21193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625534#comment-16625534 ]

Xu Cang commented on HBASE-21193:
-

[~stack] I tried this and I think it's working for me, unless I misunderstand what you mean.

I added a log line here for debugging:
[https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java#L79]

LOG.info("maxAttemps is : " + maxAttempts);

Then I modified hbase-site.xml as below:

<property>
  <name>hbase.client.retries.number</name>
  <value>12</value>
</property>

Got this in the log:

2018-09-24 01:39:56,012 INFO [master/192.168.0.9:16000:becomeActiveMaster] client.RpcRetryingCallerImpl: maxAttemps is : 37

Then I changed the config to this:

<property>
  <name>hbase.client.retries.number</name>
  <value>5</value>
</property>

maxAttemps changed as below:

2018-09-24 01:41:55,552 INFO [master/192.168.0.9:16000:becomeActiveMaster] client.RpcRetryingCallerImpl: maxAttemps is : 16

> Retrying Callable doesn't take max retries from current context; uses defaults instead
> --
>
> Key: HBASE-21193
> URL: https://issues.apache.org/jira/browse/HBASE-21193
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Major
>
> This makes it hard to change the retry count on a read of meta, for instance.
> I noticed this when trying to change the defaults for a meta read. I made a custom Connection inside the master with a new Configuration that had rpc retries and timings upped radically. My reads nonetheless were finishing at the usual retry point (31 tries after 60 seconds or so) because it looked like the Retrying Callable that does the read was taking max retries from defaults rather than reading the passed-in Configuration.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
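The two log lines in the HBASE-21193 comment above fit a simple pattern: a configured value of 12 yields 37 attempts and a value of 5 yields 16, which both match attempts = retries × 3 + 1. The sketch below only reproduces that arithmetic. The multiplier of 3 and the helper name `maxAttempts` are assumptions inferred from those two data points, not something confirmed in the thread or taken from the HBase source.

```java
public class RetryArithmetic {
    // ASSUMPTION: multiplier applied to the configured retry count inside the master.
    // Inferred only from the two log lines (12 -> 37, 5 -> 16).
    static final int ASSUMED_SERVER_SIDE_MULTIPLIER = 3;

    // Hypothetical helper: the scaled number of retries plus the one initial attempt.
    static int maxAttempts(int configuredRetries) {
        return configuredRetries * ASSUMED_SERVER_SIDE_MULTIPLIER + 1;
    }

    public static void main(String[] args) {
        System.out.println(maxAttempts(12)); // prints 37, matching the first log line
        System.out.println(maxAttempts(5));  // prints 16, matching the second log line
    }
}
```

If this reading is right, the configured value is being picked up in this code path, just not one-to-one, which would explain why the comment reports the setting as "working".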
[jira] [Commented] (HBASE-20940) HStore.cansplit should not allow split to happen if it has references
[ https://issues.apache.org/jira/browse/HBASE-20940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625533#comment-16625533 ]

Hudson commented on HBASE-20940:

SUCCESS: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #210 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/210/])
After HBASE-20940 any local index query will open all HFiles of every (larsh: rev ed4366063b983270767757d12daf3a8f4b126897)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/RegionScannerFactory.java

> HStore.cansplit should not allow split to happen if it has references
> -
>
> Key: HBASE-20940
> URL: https://issues.apache.org/jira/browse/HBASE-20940
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.2
> Reporter: Vishal Khandelwal
> Assignee: Vishal Khandelwal
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 1.4.7, 2.0.3
>
> Attachments: HBASE-20940-branch-1-addendum.patch, HBASE-20940.branch-1.3.v1.patch, HBASE-20940.branch-1.3.v2.patch, HBASE-20940.branch-1.v1.patch, HBASE-20940.branch-1.v2.patch, HBASE-20940.branch-1.v3.patch, HBASE-20940.branch-1.v5.patch, HBASE-20940.v1.patch, HBASE-20940.v2.patch, HBASE-20940.v3.patch, HBASE-20940.v4.patch, result_HBASE-20940.branch-1.v2.log
>
>
> When a split happens and another split follows immediately, it may result in a split of a region that still has references to its parent. More details about the scenario can be found in HBASE-20933.
> HStore.hasReferences should check fs.storefile rather than in-memory objects.
[jira] [Commented] (HBASE-21212) Wrong flush time when update flush metric
[ https://issues.apache.org/jira/browse/HBASE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625445#comment-16625445 ]

Xu Cang commented on HBASE-21212:
-

[~allan163] I checked the branch-1 code; your fix can be applied cleanly to branch-1 too. Thanks.

> Wrong flush time when update flush metric
> -
>
> Key: HBASE-21212
> URL: https://issues.apache.org/jira/browse/HBASE-21212
> Project: HBase
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.1.0, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Minor
> Attachments: HBASE-21212.branch-2.0.001.patch
[jira] [Assigned] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Cang reassigned HBASE-18451:
---

Assignee: Xu Cang

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0-alpha-1
> Reporter: Jean-Marc Spaggiari
> Assignee: Xu Cang
> Priority: Major
> Attachments: HBASE-18451.branch-1.001.patch, HBASE-18451.master.002.patch, HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), at the end all the regions might have some data to be flushed, and we want, after one hour, to trigger a periodic flush. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.
> However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, you end up getting them all much more quickly, for example within the first minute. That not only feeds the queue with too many flush requests, but also defeats the purpose of the randomDelay.
> {code}
>   @Override
>   protected void chore() {
>     final StringBuffer whyFlush = new StringBuffer();
>     for (Region r : this.server.onlineRegions.values()) {
>       if (r == null) continue;
>       if (((HRegion)r).shouldFlush(whyFlush)) {
>         FlushRequester requester = server.getFlushRequester();
>         if (requester != null) {
>           long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>           LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " +
>             whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>           //Throttle the flushes by putting a delay. If we don't throttle, and there
>           //is a balanced write-load on the regions in a table, we might end up
>           //overwhelming the filesystem with too many flushes at once.
>           requester.requestDelayedFlush(r, randomDelay, false);
>         }
>       }
>     }
>   }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
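The effect described in the HBASE-18451 issue text above can be illustrated with a small simulation. This is a sketch of the behaviour as the issue describes it, not the HBase code and not the attached patch: `FlushQueueSketch`, its `requestDelayedFlush` method, and the single-region setup are hypothetical stand-ins. The 10-second chore period and the 5-minute delay range come from the description, and `MIN_DELAY_TIME` is left out because its value is not given. The sketch compares a requester that accepts a fresh random delay on every chore tick with one that ignores a request when the region is already queued.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class FlushQueueSketch {
    static final long CHORE_PERIOD_MS = 10_000;   // chore tick, per the issue text
    static final int RANGE_OF_DELAY_MS = 300_000; // 5 minutes, per the issue text

    // Region name -> absolute time (ms) at which its flush fires.
    private final Map<String, Long> pending = new HashMap<>();
    private final boolean checkQueueFirst;

    FlushQueueSketch(boolean checkQueueFirst) {
        this.checkQueueFirst = checkQueueFirst;
    }

    /** Returns true if the request was queued, false if it was ignored. */
    boolean requestDelayedFlush(String region, long now, long delayMs) {
        if (checkQueueFirst && pending.containsKey(region)) {
            return false; // already queued: keep the original deadline
        }
        // Modelled behaviour without the check: every call adds another
        // request, so the earliest deadline is the one that fires.
        pending.merge(region, now + delayMs, Long::min);
        return true;
    }

    long flushTime(String region) {
        return pending.get(region);
    }

    /** Runs chore ticks until the flush fires; returns the firing time in ms. */
    static long simulate(boolean checkQueueFirst, long seed) {
        Random random = new Random(seed);
        FlushQueueSketch queue = new FlushQueueSketch(checkQueueFirst);
        String region = "testflush";
        long now = 0;
        queue.requestDelayedFlush(region, now, random.nextInt(RANGE_OF_DELAY_MS));
        while (now + CHORE_PERIOD_MS < queue.flushTime(region)) {
            now += CHORE_PERIOD_MS;
            queue.requestDelayedFlush(region, now, random.nextInt(RANGE_OF_DELAY_MS));
        }
        return queue.flushTime(region);
    }

    public static void main(String[] args) {
        int runs = 1000;
        long sumWithout = 0;
        long sumWith = 0;
        for (int seed = 0; seed < runs; seed++) {
            sumWithout += simulate(false, seed);
            sumWith += simulate(true, seed);
        }
        System.out.println("mean flush time without queue check: " + sumWithout / runs + " ms");
        System.out.println("mean flush time with queue check:    " + sumWith / runs + " ms");
    }
}
```

For a given seed both variants draw the same first delay, so the variant without the check can never fire later than the one with it. Every extra tick is another chance to draw an earlier deadline, which is the clustering the reporter describes.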
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Cang updated HBASE-18451:

Attachment: HBASE-18451.branch-1.001.patch
            HBASE-18451.master.002.patch
Status: Patch Available (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0-alpha-1
> Reporter: Jean-Marc Spaggiari
> Assignee: Xu Cang
> Priority: Major
> Attachments: HBASE-18451.branch-1.001.patch, HBASE-18451.master.002.patch, HBASE-18451.master.patch
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625428#comment-16625428 ]

Xu Cang commented on HBASE-18451:
-

rebased [~nihed] 's patch and uploaded for both master and branch-1.

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0-alpha-1
> Reporter: Jean-Marc Spaggiari
> Priority: Major
> Attachments: HBASE-18451.branch-1.001.patch, HBASE-18451.master.002.patch, HBASE-18451.master.patch