[jira] [Commented] (HBASE-19973) Implement a procedure to replay sync replication wal for standby cluster
[ https://issues.apache.org/jira/browse/HBASE-19973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361922#comment-16361922 ]

Guanghao Zhang commented on HBASE-19973:

Need to find a better name for RecoverStandbyProcedure...

> Implement a procedure to replay sync replication wal for standby cluster
>
> Key: HBASE-19973
> URL: https://issues.apache.org/jira/browse/HBASE-19973
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-19973.HBASE-19064.001.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19844) Shell should support to flush by regionserver
[ https://issues.apache.org/jira/browse/HBASE-19844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361921#comment-16361921 ]

Reid Chan commented on HBASE-19844:

Thank you for the reminder, [~yuzhih...@gmail.com]. Please take a look at the {{rubocop}} and {{ruby-lint}} warnings; I think they are unrelated, but I am not sure.

> Shell should support to flush by regionserver
>
> Key: HBASE-19844
> URL: https://issues.apache.org/jira/browse/HBASE-19844
> Project: HBase
> Issue Type: New Feature
> Components: shell
> Reporter: Chia-Ping Tsai
> Assignee: Reid Chan
> Priority: Minor
> Fix For: 2.0.0
> Attachments: HBASE-19844.master.001.patch, HBASE-19844.master.002.patch, HBASE-19844.master.003.patch, HBASE-19844.master.004.patch
>
> HBASE-4224 added a method to Admin that can flush by regionserver. As with other Admin methods, we should enable the shell to use this flush method.
[jira] [Commented] (HBASE-19973) Implement a procedure to replay sync replication wal for standby cluster
[ https://issues.apache.org/jira/browse/HBASE-19973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361918#comment-16361918 ]

Guanghao Zhang commented on HBASE-19973:

Added an initial version of the patch.
[jira] [Updated] (HBASE-19973) Implement a procedure to replay sync replication wal for standby cluster
[ https://issues.apache.org/jira/browse/HBASE-19973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-19973:

Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-19973) Implement a procedure to replay sync replication wal for standby cluster
[ https://issues.apache.org/jira/browse/HBASE-19973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-19973:

Attachment: HBASE-19973.HBASE-19064.001.patch
[jira] [Commented] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used
[ https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361916#comment-16361916 ]

Sergey Soldatov commented on HBASE-19863:

[~ram_krish] thank you, sir! I will prepare the final patch tomorrow (it will require some changes in the test harness to make the bloom filter configurable).

> java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used
>
> Key: HBASE-19863
> URL: https://issues.apache.org/jira/browse/HBASE-19863
> Project: HBase
> Issue Type: Bug
> Components: Filters
> Affects Versions: 1.4.1
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Priority: Major
> Attachments: HBASE-19863-branch1.patch, HBASE-19863-test.patch
>
> Under some circumstances a scan with SingleColumnValueFilter may fail with an exception:
> {noformat}
> java.lang.IllegalStateException: isDelete failed: deleteBuffer=C3, qualifier=C2, timestamp=1516433595543, comparison result: 1
> at org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:149)
> at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:386)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:545)
> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5814)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2552)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> Conditions:
> Table T has a single column family 0 that uses a ROWCOL bloom filter (important) and column qualifiers C1,C2,C3,C4,C5.
> When we fill the table, for every row we put a deleted cell for C3.
> The table has a single region with two HStores:
> A: start row: 0, stop row: 99
> B: start row: 10, stop row: 99
> B has newer versions of rows 10-99. Store files have several blocks each (important).
> Store A is the result of a major compaction, so it doesn't have any deleted cells (important).
> So, we are running a scan like:
> {noformat}
> scan 'T', { COLUMNS => ['0:C3','0:C5'], FILTER => "SingleColumnValueFilter ('0','C5',=,'binary:whatever')"}
> {noformat}
> How the scan performs:
> First, we iterate A for rows 0 and 1 without any problems.
> Next, we start to iterate A for row 10, so we read the first cell and set the hfs scanner to A : 10:0/C1/0/Put/x, but find that we have a newer version of the cell in B : 10:0/C1/1/Put/x, so we make B our current store scanner.
> Since we are looking for the particular columns C3 and C5, we perform the optimization StoreScanner.seekOrSkipToNextColumn, which runs reseek for all store scanners.
> For store A the following magic happens in requestSeek:
> 1. The bloom filter check passesGeneralBloomFilter sets haveToSeek to false because row 10 doesn't have the C3 qualifier in store A.
> 2. Since we don't have to seek, we just create a fake row 10:0/C3/OLDEST_TIMESTAMP/Maximum, an optimization that is quite important for us and is commented with:
> {noformat}
> // Multi-column Bloom filter optimization.
> // Create a fake key/value, so that this scanner only bubbles up to the top
> // of the KeyValueHeap in StoreScanner after we scanned this row/column in
> // all other store files. The query matcher will then just skip this fake
> // key/value and the store scanner will progress to the next column. This
> // is obviously not a "real real" seek, but unlike the fake KV earlier in
> // this method, we want this to be propagated to ScanQueryMatcher.
> {noformat}
>
> For store B we set it to the fake 10:0/C3/createFirstOnRowColTS()/Maximum to skip C3 entirely.
> After that we start searching for qualifier C5 using seekOrSkipToNextColumn, which first runs trySkipToNextColumn:
> {noformat}
> protected boolean trySkipToNextColumn(Cell cell) throws IOException {
> Cell nextCell = null;
> do {
> Cell
[jira] [Commented] (HBASE-19844) Shell should support to flush by regionserver
[ https://issues.apache.org/jira/browse/HBASE-19844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361908#comment-16361908 ]

Hadoop QA commented on HBASE-19844:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 28s{color} | {color:red} The patch generated 5 new + 719 unchanged - 5 fixed = 724 total (was 724) {color} |
| {color:red}-1{color} | {color:red} ruby-lint {color} | {color:red} 0m 19s{color} | {color:red} The patch generated 5 new + 1260 unchanged - 2 fixed = 1265 total (was 1262) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 25s{color} | {color:green} hbase-shell in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19844 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910330/HBASE-19844.master.004.patch |
| Optional Tests | asflicense javac javadoc unit rubocop ruby_lint |
| uname | Linux b6c65ba1effd 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / cf57ea15f1 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| rubocop | v0.52.1 |
| rubocop | https://builds.apache.org/job/PreCommit-HBASE-Build/11502/artifact/patchprocess/diff-patch-rubocop.txt |
| ruby-lint | v2.3.1 |
| ruby-lint | https://builds.apache.org/job/PreCommit-HBASE-Build/11502/artifact/patchprocess/diff-patch-ruby-lint.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11502/testReport/ |
| Max. process+thread count | 1851 (vs. ulimit of 1) |
| modules | C: hbase-shell U: hbase-shell |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11502/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used
[ https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361906#comment-16361906 ]

ramkrishna.s.vasudevan commented on HBASE-19863:

I think this patch is fine. We are seeing a scanner lag behind, and we are forcing a seek here. In my opinion we could also avoid the looping, as I said in an earlier comment, but if that seems risky or if we feel it could affect other cases, then this fix is fine. We are probably doing a seek within the same block here (correct me if I am wrong), but with the help of the column tracker. Not for this patch: I believe that internally, if we try to seek within the same block, we don't actually fetch the block again. So it should not be costly either.
[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361901#comment-16361901 ]

stack commented on HBASE-19965:

Pushed an addendum that gives TestAsyncTableAdminApi the same treatment, breaking it in two.

> Fix flaky TestAsyncRegionAdminApi
>
> Key: HBASE-19965
> URL: https://issues.apache.org/jira/browse/HBASE-19965
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0-beta-2
> Attachments: HBASE-19965.branch-2.001.patch
>
> See [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/]
>
> java.lang.AssertionError: expected:<2> but was:<3>
> at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359)
>
> Merging regions does not work. The table still has 3 regions after the MergeRegionsProcedure finished.
> The master started to balance region 9e2773ba1efba79a2defa276e9a26ed4. Because the MergeRegionsProcedure pid=138 started work first, the balance had to wait for the lock. But after the merge finished, the MoveRegionProcedure pid=139 started work and assigned 9e2773ba1efba79a2defa276e9a26ed4 to a new region server. This is not right. The MoveRegionProcedure should skip assigning a region that was marked as offline. Or we should clear the merged regions' procedures when the MergeRegionsProcedure finishes.
>
> Logs:
> 2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1] master.HMaster(1454): balance hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., source=cd4730e3eae2,39077,1518106776411, destination=cd4730e3eae2,40578,1518106776318
> 2018-02-08 16:24:44,608 DEBUG [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885] procedure2.ProcedureExecutor(868): Stored pid=138, state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false
> ..
> 2018-02-08 16:24:50,111 INFO [PEWorker-13] procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS; MergeTableRegionsProcedure table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false in 5.5710sec
> 2018-02-08 16:24:50,113 INFO [PEWorker-13] procedure.MasterProcedureScheduler(813): pid=139, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., source=cd4730e3eae2,39077,1518106776411, destination=cd4730e3eae2,40578,1518106776318 testMergeRegions testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.
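
Editor's sketch of the race described in the HBASE-19965 report above, and of the first suggested remedy (a queued move re-checks the region state once it gets the lock and skips a region that the merge has taken offline). This is a minimal, self-contained model and not HBase code: the class MergeMoveRaceSketch, its RegionState enum, and all of its methods are hypothetical names invented for illustration, and the real procedures do far more than flip a state in a map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the merge/move race; none of these types exist in HBase.
class MergeMoveRaceSketch {
    enum RegionState { OPEN, OFFLINE }

    // Region name -> state, standing in for the master's view of the table.
    private final Map<String, RegionState> states = new HashMap<>();

    void addOpenRegion(String name) {
        states.put(name, RegionState.OPEN);
    }

    // Models the merge finishing: both parents go offline, the merged region opens.
    void finishMerge(String parentA, String parentB, String merged) {
        states.put(parentA, RegionState.OFFLINE);
        states.put(parentB, RegionState.OFFLINE);
        states.put(merged, RegionState.OPEN);
    }

    // Models the reported behaviour: the queued move blindly reassigns the region.
    void runQueuedMoveUnguarded(String name) {
        states.put(name, RegionState.OPEN);
    }

    // Models the suggested guard: re-check the state after acquiring the lock
    // and skip the move when the region is no longer open.
    boolean runQueuedMove(String name) {
        if (states.get(name) != RegionState.OPEN) {
            return false; // region was merged away while the move waited
        }
        return true; // a real move would unassign and reassign here
    }

    long openRegionCount() {
        return states.values().stream().filter(s -> s == RegionState.OPEN).count();
    }

    public static void main(String[] args) {
        MergeMoveRaceSketch guarded = new MergeMoveRaceSketch();
        guarded.addOpenRegion("r1");
        guarded.addOpenRegion("r2");
        guarded.addOpenRegion("r3");
        guarded.finishMerge("r1", "r2", "merged");
        System.out.println("move executed: " + guarded.runQueuedMove("r1"));
        System.out.println("open regions: " + guarded.openRegionCount());
    }
}
```

In this model the unguarded path brings a merged-away parent back online, which is the kind of extra region the failing assertion counts; the guarded path leaves only the merged region and the untouched one open. Whether the actual fix belongs in the move procedure or in cleanup at the end of the merge is the open question in the report, and the sketch takes no position on it.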
[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361886#comment-16361886 ]

stack commented on HBASE-19965:

---
Test set: org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
---
Tests run: 34, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 572.494 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
org.apache.hadoop.hbase.client.TestAsyncTableAdminApi Time elapsed: 19.573 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
at org.apache.hadoop.hbase.client.TestAsyncTableAdminApi.testListTables(TestAsyncTableAdminApi.java:122)
org.apache.hadoop.hbase.client.TestAsyncTableAdminApi Time elapsed: 19.604 s <<< ERROR!
java.lang.Exception: Appears to be stuck in thread DataXceiver for client DFSClient_NONMAPREDUCE_-1429371430_23 at /127.0.0.1:60612 [Receiving block BP-217569409-172.17.0.2-1518490564363:blk_1073741829_1005]

.. has the same pattern. Parameterized. No particular test taking a while:

{code}
2018-02-13 02:56:30,183 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testCreateTableWithEmptyRowInTheSplitKeys[0] Thread=302, OpenFileDescriptor=1610, MaxFileDescriptor=1048576, SystemLoadAverage=1755, ProcessCount=17, AvailableMemoryMB=10328
2018-02-13 02:56:30,294 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testDeleteTable[0] Thread=304, OpenFileDescriptor=1612, MaxFileDescriptor=1048576, SystemLoadAverage=1755, ProcessCount=17, AvailableMemoryMB=10317
2018-02-13 02:56:41,992 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testDisableAndEnableTables[0] Thread=322, OpenFileDescriptor=1613, MaxFileDescriptor=1048576, SystemLoadAverage=1763, ProcessCount=17, AvailableMemoryMB=9131
2018-02-13 02:57:21,465 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testCreateTable[0] Thread=349, OpenFileDescriptor=1596, MaxFileDescriptor=1048576, SystemLoadAverage=1841, ProcessCount=17, AvailableMemoryMB=9816
2018-02-13 02:57:35,108 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testModifyColumnFamily[0] Thread=345, OpenFileDescriptor=1571, MaxFileDescriptor=1048576, SystemLoadAverage=1801, ProcessCount=17, AvailableMemoryMB=10954
2018-02-13 02:57:50,562 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testDisableCatalogTable[0] Thread=343, OpenFileDescriptor=1585, MaxFileDescriptor=1048576, SystemLoadAverage=1867, ProcessCount=17, AvailableMemoryMB=11009
2018-02-13 02:58:03,013 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testCreateTableWithRegions[0] Thread=341, OpenFileDescriptor=1574, MaxFileDescriptor=1048576, SystemLoadAverage=1876, ProcessCount=17, AvailableMemoryMB=9139
2018-02-13 02:58:41,708 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testIsTableEnabledAndDisabled[0] Thread=454, OpenFileDescriptor=1569, MaxFileDescriptor=1048576, SystemLoadAverage=1901, ProcessCount=20, AvailableMemoryMB=9379
2018-02-13 02:58:51,803 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testListTables[0] Thread=459, OpenFileDescriptor=1578, MaxFileDescriptor=1048576, SystemLoadAverage=1886, ProcessCount=17, AvailableMemoryMB=9836
2018-02-13 02:59:28,658 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testTruncateTablePreservingSplits[0] Thread=370, OpenFileDescriptor=1570, MaxFileDescriptor=1048576, SystemLoadAverage=1823, ProcessCount=17, AvailableMemoryMB=8980
2018-02-13 02:59:52,804 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testCompactionTimestamps[0] Thread=385, OpenFileDescriptor=1575, MaxFileDescriptor=1048576, SystemLoadAverage=1805, ProcessCount=17, AvailableMemoryMB=8672
2018-02-13 03:00:11,715 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testTableAvailableWithRandomSplitKeys[0] Thread=385, OpenFileDescriptor=1593, MaxFileDescriptor=1048576, SystemLoadAverage=1863, ProcessCount=17, AvailableMemoryMB=8835
2018-02-13 03:00:24,545 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncTableAdminApi#testAddSameColumnFamilyTwice[0] Thread=388, OpenFileDescriptor=1584, MaxFileDescriptor=1048576, SystemLoadAverage=2003, ProcessCount=17, AvailableMemoryMB=7541
2018-02-13 03:00:35,379 INFO [Time-limited test] hbase.ResourceChecker(148): before:
[jira] [Updated] (HBASE-19844) Shell should support to flush by regionserver
[ https://issues.apache.org/jira/browse/HBASE-19844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-19844:

Attachment: HBASE-19844.master.004.patch
[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361881#comment-16361881 ]

stack commented on HBASE-19965:

Pushed to master and branch-2. Leaving open for now to see if this fixes it.
[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361879#comment-16361879 ]

stack commented on HBASE-19965:

.001 just breaks up the test into two pieces. Another reason the test takes a while is that it is parameterized. Let me push this to see how it does overnight.
[jira] [Updated] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19965: -- Attachment: HBASE-19965.branch-2.001.patch > Fix flaky TestAsyncRegionAdminApi > - > > Key: HBASE-19965 > URL: https://issues.apache.org/jira/browse/HBASE-19965 > Project: HBase > Issue Type: Sub-task >Reporter: Guanghao Zhang >Assignee: stack >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19965.branch-2.001.patch > > > See > [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/] > > java.lang.AssertionError: expected:<2> but was:<3> at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359) > > Merge regions not work. The table still have 3 regions after the > MergeRegionsProcedure finished. > The master start balance region 9e2773ba1efba79a2defa276e9a26ed4. But because > the MergeRegionsProcedure pid=138 start work first, so the balance need wait > for the lock. But after merge regions finished, the MoveRegionProcedure > pid=139 start work and assign 9e2773ba1efba79a2defa276e9a26ed4 to a new > region server. This is not right. The MoveRegionProcedure should skip to > assign a region which was marked as offline. Or we should clear the merged > regions' procedure when MergeRegionsProcedure finished. 
> > Logs: > 2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1] > master.HMaster(1454): balance > hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., > source=cd4730e3eae2,39077,1518106776411, > destination=cd4730e3eae2,40578,1518106776318 > 2018-02-08 16:24:44,608 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885] > procedure2.ProcedureExecutor(868): Stored pid=138, > state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure > table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, > 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false > .. > 2018-02-08 16:24:50,111 INFO [PEWorker-13] > procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS; > MergeTableRegionsProcedure table=testMergeRegions, > regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], > forcibly=false in 5.5710sec > 2018-02-08 16:24:50,113 INFO [PEWorker-13] > procedure.MasterProcedureScheduler(813): pid=139, > state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure > hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., > source=cd4730e3eae2,39077,1518106776411, > destination=cd4730e3eae2,40578,1518106776318 testMergeRegions > testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19945) Separate tests of TestRSGroups into two classes
[ https://issues.apache.org/jira/browse/HBASE-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361868#comment-16361868 ] stack commented on HBASE-19945: --- Or, to be more precise, there is no timeout based off categories anymore, only the absolute ten minutes upper bound. HBASE-19960 is the doc of how test timeout and category has changed so it is strange that this is 'timing out' unless it is taking > ten minutes to run. > Separate tests of TestRSGroups into two classes > --- > > Key: HBASE-19945 > URL: https://issues.apache.org/jira/browse/HBASE-19945 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Attachments: 19945.v1.txt, 19945.v2.txt > > > TestRSGroups is annotated as MediumTests. It times out on Jenkins: > https://builds.apache.org/job/HBase-Trunk_matrix/4537/jdk=JDK%201.8%20(latest),label=(Hadoop%20&&%20!H5)/testReport/junit/org.apache.hadoop.hbase.rsgroup/TestRSGroups/org_apache_hadoop_hbase_rsgroup_TestRSGroups/ > {code} > org.junit.runners.model.TestTimedOutException: test timed out after 180 > seconds > {code} > The above is reproducible on Linux locally. > The tests from TestRSGroups should be separated into two classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19945) Separate tests of TestRSGroups into two classes
[ https://issues.apache.org/jira/browse/HBASE-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361867#comment-16361867 ] stack commented on HBASE-19945: --- Jenkins report is gone. Why does it not time out here? https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html > Separate tests of TestRSGroups into two classes > --- > > Key: HBASE-19945 > URL: https://issues.apache.org/jira/browse/HBASE-19945 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Attachments: 19945.v1.txt, 19945.v2.txt > > > TestRSGroups is annotated as MediumTests. It times out on Jenkins: > https://builds.apache.org/job/HBase-Trunk_matrix/4537/jdk=JDK%201.8%20(latest),label=(Hadoop%20&&%20!H5)/testReport/junit/org.apache.hadoop.hbase.rsgroup/TestRSGroups/org_apache_hadoop_hbase_rsgroup_TestRSGroups/ > {code} > org.junit.runners.model.TestTimedOutException: test timed out after 180 > seconds > {code} > The above is reproducible on Linux locally. > The tests from TestRSGroups should be separated into two classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361855#comment-16361855 ] Hudson commented on HBASE-19986: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4575 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4575/]) HBASE-19986 If HBaseTestClassRule timesout a test, thread dump (stack: rev c2ee82c9091a721e22a0eb69be17cd0217739099) * (edit) pom.xml * (edit) hbase-common/src/test/java/org/apache/hadoop/hbase/HBaseClassTestRule.java * (edit) hbase-common/src/test/java/org/apache/hadoop/hbase/TestTimeout.java * (add) hbase-common/src/test/java/org/apache/hadoop/hbase/TimedOutTestsListener.java * (delete) hbase-server/src/test/java/org/apache/hadoop/hbase/TimedOutTestsListener.java > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
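For readers unfamiliar with the idea in the description above, here is a minimal sketch of dumping every live thread rather than only the first RUNNABLE one. It is not the committed `TimedOutTestsListener`; `ThreadDumpSketch` and `dumpAllThreads` are hypothetical names, and the only API assumed is the JDK's `Thread.getAllStackTraces()`.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not the HBase TimedOutTestsListener. */
final class ThreadDumpSketch {
  /** Builds a text dump of every live thread, whatever its state. */
  static String dumpAllThreads() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
      Thread thread = entry.getKey();
      sb.append('"').append(thread.getName()).append("\" state=")
          .append(thread.getState()).append('\n');
      for (StackTraceElement frame : entry.getValue()) {
        sb.append("    at ").append(frame).append('\n');
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // A timeout handler could print this just before failing the test.
    System.err.print(dumpAllThreads());
  }
}
```

Printing all threads, including WAITING and BLOCKED ones, is the point: a test stuck on a lock or a latch will usually not show up as RUNNABLE at all.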
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361856#comment-16361856 ] Hudson commented on HBASE-19970: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4575 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4575/]) HBASE-19970 Remove unused functions from TableAuthManager. (appy: rev 7cc239fb5ac0ce3f22d93d1dbf7e80609427710a) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessControlLists.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestZKPermissionWatcher.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestTablePermissions.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
[ https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361854#comment-16361854 ] stack commented on HBASE-19965: --- It looks like the test can go on longer than our ten minute timeout. Here is complaint from last two failures in nightlies. --- Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi --- Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR! java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port 35917 --- Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi --- Tests run: 24, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 574.354 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 7.735 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testSplitSwitch(TestAsyncRegionAdminApi.java:285) org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 7.772 s <<< ERROR! java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port 43069 See how there are 20 odd tests run. See how last one was running for 14 seconds yet we timed out after 600 seconds. 
Here are start times for the latest run: {code} 1 2018-02-13 02:58:52,380 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testCompactRegionServer[0] Thread=302, OpenFileDescriptor=1612, MaxFileDescriptor=1048576, SystemLoadAverage=1886, ProcessCount=20, AvailableMemoryMB=9729 2 2018-02-13 02:59:21,149 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testGetOnlineRegions[0] Thread=324, OpenFileDescriptor=1625, MaxFileDescriptor=1048576, SystemLoadAverage=1887, ProcessCount=17, AvailableMemoryMB=9766 3 2018-02-13 02:59:32,669 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testCompactMob[0] Thread=367, OpenFileDescriptor=1616, MaxFileDescriptor=1048576, SystemLoadAverage=1814, ProcessCount=17, AvailableMemoryMB=8805 4 2018-02-13 02:59:50,492 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testSplitTable[0] Thread=373, OpenFileDescriptor=1630, MaxFileDescriptor=1048576, SystemLoadAverage=1763, ProcessCount=17, AvailableMemoryMB=8683 5 2018-02-13 03:01:18,069 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testAssignRegionAndUnassignRegion[0] Thread=380, OpenFileDescriptor=1581, MaxFileDescriptor=1048576, SystemLoadAverage=1761, ProcessCount=17, AvailableMemoryMB=9252 6 2018-02-13 03:01:28,706 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testCompact[0] Thread=386, OpenFileDescriptor=1586, MaxFileDescriptor=1048576, SystemLoadAverage=1804, ProcessCount=17, AvailableMemoryMB=10421 7 2018-02-13 03:02:36,046 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testGetRegionByStateOfTable[0] Thread=360, OpenFileDescriptor=1573, MaxFileDescriptor=1048576, SystemLoadAverage=1778, ProcessCount=17, AvailableMemoryMB=10053 8 2018-02-13 03:02:51,851 INFO [Time-limited test] 
hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testFlushTableAndRegion[0] Thread=387, OpenFileDescriptor=1581, MaxFileDescriptor=1048576, SystemLoadAverage=1773, ProcessCount=20, AvailableMemoryMB=9379 9 2018-02-13 03:03:09,421 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testSplitSwitch[0] Thread=405, OpenFileDescriptor=1590, MaxFileDescriptor=1048576, SystemLoadAverage=1809, ProcessCount=17, AvailableMemoryMB=8787 10 2018-02-13 03:03:33,784 INFO [Time-limited test] hbase.ResourceChecker(148): before: client.TestAsyncRegionAdminApi#testMergeRegions[0] Thread=413, OpenFileDescriptor=1595, MaxFileDescriptor=1048576, SystemLoadAverage=1795, ProcessCount=17, AvailableMemoryMB=9152 11 2018-02-13 03:04:01,365 INFO
[jira] [Commented] (HBASE-19952) Find tests which are declared with wrong category
[ https://issues.apache.org/jira/browse/HBASE-19952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361827#comment-16361827 ] Duo Zhang commented on HBASE-19952: --- It’s fine. This is not critical, should not block beta-2. > Find tests which are declared with wrong category > - > > Key: HBASE-19952 > URL: https://issues.apache.org/jira/browse/HBASE-19952 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-19948) Since HBASE-19873, HBaseClassTestRule, Small/Medium/Large has different semantic
[ https://issues.apache.org/jira/browse/HBASE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-19948. --- Resolution: Fixed Release Note: In a subtask, fixed the doc and annotations to be more explicit that test timings are for the whole Test Fixture/Test Class/Test Suite, NOT the test method only as we'd been measuring up to this (the other subtasks untethered Categorization and test timeout such that all categories now have a ten minute timeout -- no test can run longer than ten minutes or it gets killed/timed out). Resolving. All subtasks done. Let me know if I should keep it open lads ([~appy], [~Apache9]) > Since HBASE-19873, HBaseClassTestRule, Small/Medium/Large has different > semantic > > > Key: HBASE-19948 > URL: https://issues.apache.org/jira/browse/HBASE-19948 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19948.branch-2.001.patch > > > I was confused on how SmallTest/MediumTest/LargeTest were being interpreted > since HBASE-19873 where we added HBaseClassTestRule enforcing a ClassRule. > Small/Medium/Large are defined up in the refguide here: > [http://hbase.apache.org/book.html#hbase.unittests] > E.g: "Small test cases are executed in a shared JVM and individual test cases > should run in 15 seconds or less..." > I've always read the above as each method in a test suite/class should take 15 seconds (see below for finding by [~appy] [1]). > The old CategoryBasedTimeout annotation used to try and enforce a test method > taking only its designated category amount of time. 
> The JUnit Timeout Rule talks about enforcing the timeout per test method: > [https://junit.org/junit4/javadoc/4.12/org/junit/rules/Timeout.html] > The above meant that you could have as many tests as you wanted in a > class/suite and it could run as long as you liked as long as each > individual test stayed within its category-based elapsed amount of time (and > the whole suite completed inside the surefire fork timeout of 15mins). > Then came HBASE-19873 which addressed an awkward issue around accounting for > time spent in startup/shutdown – i.e. time taken outside of a test method run > – and trying to have a timeout that cuts in before the surefire fork one > does. It ended up adding a ClassRule that set a timeout on the whole test > *suite/class* – Good – but the timeout set varies dependent upon the test > category. A suite/class with 60 small tests that each take a second to > complete now times out if you add one more test to the suite (61 seconds > 60 seconds timeout – give or take vagaries of the platform you run the test on). > This latter change I have trouble with. It changes how small/medium/large > have classically been understood. I think it will confuse too as now devs > must do careful counting of test methods per class; one fat one (i.e. > 'large') is same as N small ones. Could we set a single timeout on the whole > test suite/class, one that was well less than the surefire fork kill timeout > of 900 seconds but keep the old timeout on each method as we used to have with > the category-based annotation? > (Am just looking for agreement that we have a problem here and that we want > categories to be per test method as it used to be; how to do it doesn't look > easy and is for later). > 1. 
@appy pointed out that the actual SmallTest annotation says something > other than what is in the refguide: "Tag a test as 'small', meaning that the > test class has the following characteristics: ideally, last less than 15 > seconds" > [https://github.com/apache/hbase/blob/master/hbase-annotations/src/test/java/org/apache/hadoop/hbase/testclassification/SmallTests.java#L22] > 2. Here is code to show how the timeout has changed now... previously the below > would have 'run' without timing out.
> {noformat}
> @Category({SmallTests.class})
> public class TestTimingOut {
>   @ClassRule
>   public static final HBaseClassTestRule CLASS_RULE =
>       HBaseClassTestRule.forClass(TestTimingOut.class);
>
>   @Test
>   public void oneTest() { Threads.sleep(14000); }
>
>   @Test
>   public void twoTest() { Threads.sleep(14000); }
>
>   @Test
>   public void threeTest() { Threads.sleep(14000); }
>
>   @Test
>   public void fourTest() { Threads.sleep(14000); }
>
>   @Test
>   public void fiveTest() { Threads.sleep(14000); }
> }
> {noformat}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
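The arithmetic behind the HBASE-19948 description above can be worked through with a toy model. It is only a sketch: `TimeoutModelSketch` and its methods are hypothetical, it ignores setup/teardown time, and the limits are assumptions read off the thread (15 seconds per small test method from the refguide quote, 60 seconds for a whole small test class from the "61 seconds > 60 seconds" remark, and 600 seconds for the ten-minute bound).

```java
import java.util.Arrays;

/** Toy model of the two timeout schemes; the limits are assumptions from the discussion. */
final class TimeoutModelSketch {
  /** Per-method scheme: every test method must individually finish within the limit. */
  static boolean passesPerMethod(int[] methodSeconds, int perMethodLimitSeconds) {
    return Arrays.stream(methodSeconds).allMatch(s -> s <= perMethodLimitSeconds);
  }

  /** Per-class scheme: the sum of all methods must stay within the limit. */
  static boolean passesPerClass(int[] methodSeconds, int perClassLimitSeconds) {
    return Arrays.stream(methodSeconds).sum() <= perClassLimitSeconds;
  }

  public static void main(String[] args) {
    // The five 14-second methods of the TestTimingOut example.
    int[] testTimingOut = {14, 14, 14, 14, 14};
    System.out.println("per-method, 15s limit: " + passesPerMethod(testTimingOut, 15));
    System.out.println("per-class, 60s limit: " + passesPerClass(testTimingOut, 60));
    System.out.println("per-class, 600s limit: " + passesPerClass(testTimingOut, 600));
  }
}
```

Five methods of 14 seconds each pass a 15-second per-method limit, total 70 seconds and so fail a 60-second per-class limit, and pass again once every class gets the same 600-second bound.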
[jira] [Commented] (HBASE-19952) Find tests which are declared with wrong category
[ https://issues.apache.org/jira/browse/HBASE-19952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361823#comment-16361823 ] stack commented on HBASE-19952: --- I made this into an issue and moved it to 2.0.0. It seems less important now we have unhitched category and timeout. Correct me if I am wrong [~Apache9] and pull it back in. Thanks. > Find tests which are declared with wrong category > - > > Key: HBASE-19952 > URL: https://issues.apache.org/jira/browse/HBASE-19952 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19952) Find tests which are declared with wrong category
[ https://issues.apache.org/jira/browse/HBASE-19952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19952: -- Issue Type: Bug (was: Sub-task) Parent: (was: HBASE-19948) > Find tests which are declared with wrong category > - > > Key: HBASE-19952 > URL: https://issues.apache.org/jira/browse/HBASE-19952 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-19952) Find tests which are declared with wrong category
[ https://issues.apache.org/jira/browse/HBASE-19952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reassigned HBASE-19952: - Assignee: Duo Zhang > Find tests which are declared with wrong category > - > > Key: HBASE-19952 > URL: https://issues.apache.org/jira/browse/HBASE-19952 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19952) Find tests which are declared with wrong category
[ https://issues.apache.org/jira/browse/HBASE-19952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19952: -- Fix Version/s: 2.0.0 > Find tests which are declared with wrong category > - > > Key: HBASE-19952 > URL: https://issues.apache.org/jira/browse/HBASE-19952 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-19960) Doc test timeouts and test categories in hbase2
[ https://issues.apache.org/jira/browse/HBASE-19960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-19960. --- Resolution: Fixed Hadoop Flags: Reviewed Purged a bunch of stale stuff in the doc. Added note on no test allowed take more than ten minutes. Added note on what ClassRule is. Clarified that Small/Medium/Large timings are for the whole Test Fixture, not just a test method (updated the actual doc in the Annotations themselves), etc. Pushed to branch-2 and master. > Doc test timeouts and test categories in hbase2 > --- > > Key: HBASE-19960 > URL: https://issues.apache.org/jira/browse/HBASE-19960 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19960.branch-2.001.patch > > > Write up that Categories are no longer acted upon, that we no longer timeout > test methods. Write up that if a test goes longer than ten minutes, it is > killed. Make passing reference to how it used to be but don't spend much time > on it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19960) Doc test timeouts and test categories in hbase2
[ https://issues.apache.org/jira/browse/HBASE-19960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361817#comment-16361817 ] stack commented on HBASE-19960: --- .001 Doc on timeouts and ClassRule as of hbase-2.0.0. Just going to push. > Doc test timeouts and test categories in hbase2 > --- > > Key: HBASE-19960 > URL: https://issues.apache.org/jira/browse/HBASE-19960 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19960.branch-2.001.patch > > > Write up that Categories are no longer acted upon, that we no longer timeout > test methods. Write up that if a test goes longer than ten minutes, it is > killed. Make passing reference to how it used to be but don't spend much time > on it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19960) Doc test timeouts and test categories in hbase2
[ https://issues.apache.org/jira/browse/HBASE-19960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19960: -- Attachment: HBASE-19960.branch-2.001.patch > Doc test timeouts and test categories in hbase2 > --- > > Key: HBASE-19960 > URL: https://issues.apache.org/jira/browse/HBASE-19960 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19960.branch-2.001.patch > > > Write up that Categories are no longer acted upon, that we no longer timeout > test methods. Write up that if a test goes longer than ten minutes, it is > killed. Make passing reference to how it used to be but don't spend much time > on it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
[ https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361795#comment-16361795 ] Hadoop QA commented on HBASE-19988: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 11s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 33s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 6m 32s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.5. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 8m 29s{color} | {color:red} The patch causes 10 errors with Hadoop v2.7.4. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 10m 38s{color} | {color:red} The patch causes 10 errors with Hadoop v3.0.0. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 39s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}125m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.replication.TestReplicationDroppedTables | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19988 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910308/hbase-19988.master.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 1f75f120fc96 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 7cc239fb5a | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | unit |
[jira] [Reopened] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-19970: --- Reopening to address the failing test. > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361771#comment-16361771 ] stack commented on HBASE-19970: --- [~appy] It looks like this is failing since this went in java.lang.AssertionError at org.apache.hadoop.hbase.security.access.TestZKPermissionWatcher.testPermissionsWatcher(TestZKPermissionWatcher.java:168) It is plain to see here: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361749#comment-16361749 ]

Hadoop QA commented on HBASE-19986:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 11s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 43s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 7m 49s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 15s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s{color} | {color:red} hbase-common: The patch generated 7 new + 5 unchanged - 2 fixed = 12 total (was 7) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 16s{color} | {color:red} root: The patch generated 7 new + 5 unchanged - 2 fixed = 12 total (was 7) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 14s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 15m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}167m 21s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}216m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19986 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910295/HBASE-19986.branch-2.003.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile xml |
| uname | Linux c5b6d0e4c6a2 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HBASE-19987) update error-prone to 2.2.0
[ https://issues.apache.org/jira/browse/HBASE-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361735#comment-16361735 ]

Hadoop QA commented on HBASE-19987:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 8m 24s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 51s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 19m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}133m 0s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 48s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAvoidCellReferencesIntoShippedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19987 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910294/HBASE-19987.patch |
| Optional Tests | asflicense javac javadoc unit shadedjars hadoopcheck xml compile findbugs hbaseanti checkstyle |
| uname | Linux 9fb938d6c45a 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 00f8877323 |
| maven | version: Apache Maven
[jira] [Commented] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
[ https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361727#comment-16361727 ] Ted Yu commented on HBASE-19989: Is it possible to add a test ? > READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly > -- > > Key: HBASE-19989 > URL: https://issues.apache.org/jira/browse/HBASE-19989 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.4.1 >Reporter: Ben Lau >Assignee: Ben Lau >Priority: Major > Attachments: HBASE-19989.patch > > > Region state transitions do not work correctly for READY_TO_MERGE/SPLIT. > [~thiruvel] and I noticed this is due to break statements being in the wrong > place in AssignmentManager. This allows a race condition for example in > which one of the regions being merged could be moved concurrently, resulting > in the merge transaction failing and then double assignment and/or dataloss. > This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not > branch-2 as the relevant code in AM has since been rewritten. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361726#comment-16361726 ]

stack commented on HBASE-19986:
---

Pushed an addendum to address whitespace, checkstyle, and some comments left by [~Apache9] up on rb. Test passes locally. Will keep an eye on it.

> If HBaseTestClassRule timesout a test, thread dump.
> ---
>
> Key: HBASE-19986
> URL: https://issues.apache.org/jira/browse/HBASE-19986
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19986.branch-2.001.patch, HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch
>
>
> We do look for a stuck thread in our timeout rule, but it is super conservative in what it prints: it looks for a RUNNABLE thread and prints ONLY the first one found. Pretty useless for us. If a test times out, often the printing to stderr/stdout has already stopped.
> I'm trying to debug TestAsyncRegionAdminApi. It says the test timed out after 10 minutes, but we've stopped printing to the logs, and here is what junit prints:
> ---
> Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi
> ---
> Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi
> org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
> at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363)
> org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR!
> java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port 35917
[jira] [Updated] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
[ https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Lau updated HBASE-19989: Attachment: HBASE-19989.patch > READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly > -- > > Key: HBASE-19989 > URL: https://issues.apache.org/jira/browse/HBASE-19989 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.4.1 >Reporter: Ben Lau >Assignee: Ben Lau >Priority: Major > Attachments: HBASE-19989.patch > > > Region state transitions do not work correctly for READY_TO_MERGE/SPLIT. > [~thiruvel] and I noticed this is due to break statements being in the wrong > place in AssignmentManager. This allows a race condition for example in > which one of the regions being merged could be moved concurrently, resulting > in the merge transaction failing and then double assignment and/or dataloss. > This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not > branch-2 as the relevant code in AM has since been rewritten. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
Ben Lau created HBASE-19989: --- Summary: READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly Key: HBASE-19989 URL: https://issues.apache.org/jira/browse/HBASE-19989 Project: HBase Issue Type: Bug Affects Versions: 1.4.1, 1.3.1 Reporter: Ben Lau Assignee: Ben Lau Region state transitions do not work correctly for READY_TO_MERGE/SPLIT. [~thiruvel] and I noticed this is due to break statements being in the wrong place in AssignmentManager. This allows a race condition for example in which one of the regions being merged could be moved concurrently, resulting in the merge transaction failing and then double assignment and/or dataloss. This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not branch-2 as the relevant code in AM has since been rewritten. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
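A misplaced break of the kind described above can be illustrated with a minimal, self-contained sketch. This is not the AssignmentManager code: the enum, the method names, and the "precheck"/"updateRegionState" action labels are all hypothetical stand-ins, chosen only to show how a `break` placed right after a READY_TO_MERGE check can skip the state update that the following cases share.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from AssignmentManager.
class BreakPlacementSketch {
    enum TransitionCode { READY_TO_MERGE, MERGE_PONR, MERGED }

    // Buggy shape: the break ends the READY_TO_MERGE case before it reaches
    // the shared state update, so the region state is never recorded.
    static List<String> handleBuggy(TransitionCode code) {
        List<String> actions = new ArrayList<>();
        switch (code) {
            case READY_TO_MERGE:
                actions.add("precheck");
                break; // misplaced: skips the update below
            case MERGE_PONR:
            case MERGED:
                actions.add("updateRegionState");
                break;
        }
        return actions;
    }

    // Fixed shape: READY_TO_MERGE runs its check and then falls through
    // to the same state update as the other merge transitions.
    static List<String> handleFixed(TransitionCode code) {
        List<String> actions = new ArrayList<>();
        switch (code) {
            case READY_TO_MERGE:
                actions.add("precheck");
                // intentional fall through
            case MERGE_PONR:
            case MERGED:
                actions.add("updateRegionState");
                break;
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + handleBuggy(TransitionCode.READY_TO_MERGE));
        System.out.println("fixed: " + handleFixed(TransitionCode.READY_TO_MERGE));
    }
}
```

In the buggy shape nothing records that the region is about to merge, which is the window in which a concurrent move could happen. The actual patch may restructure the cases differently.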
[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
[ https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361712#comment-16361712 ] stack commented on HBASE-19988: --- What was it logging? > HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while > waiting for a row lock > --- > > Key: HBASE-19988 > URL: https://issues.apache.org/jira/browse/HBASE-19988 > Project: HBase > Issue Type: Improvement > Components: amv2 >Affects Versions: 2.0.0-beta-1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0-beta-2 > > Attachments: hbase-19988.master.001.patch > > > See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361711#comment-16361711 ] Umesh Agashe commented on HBASE-19970: -- [~appy], I have created HBASE-19988 and submitted patch to reduce chattiness of HRegion#lockRowsAndBuildMiniBatch(). > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
[ https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361707#comment-16361707 ] Umesh Agashe commented on HBASE-19988: -- Handling InterruptedIOException same as Timeout. Re-throwing without logging. Message is logged once from getRowLockInternal(). > HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while > waiting for a row lock > --- > > Key: HBASE-19988 > URL: https://issues.apache.org/jira/browse/HBASE-19988 > Project: HBase > Issue Type: Improvement > Components: amv2 >Affects Versions: 2.0.0-beta-1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0-beta-2 > > Attachments: hbase-19988.master.001.patch > > > See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
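The approach in the comment above (treat an interrupt like a timeout, rethrow without logging, and rely on the single message already logged by the lock helper) might look roughly like the following sketch. It is an assumption-laden illustration, not the patch: `TimeoutIOException` here is a local stand-in class, `acquireRowLock` stands in for getRowLockInternal(), the in-memory `LOG` list replaces a real logger, and the "log and skip" branch for other IOExceptions is assumed only for contrast.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not HRegion code.
class QuietRethrowSketch {
    // In-memory stand-in for a logger so the number of messages can be counted.
    static final List<String> LOG = new ArrayList<>();

    // Local stand-in for a timeout exception type.
    static class TimeoutIOException extends IOException {
        TimeoutIOException(String msg) { super(msg); }
    }

    // Stand-in for getRowLockInternal(): logs once, then throws.
    static void acquireRowLock(int row, boolean interrupted) throws IOException {
        if (interrupted) {
            LOG.add("WARN interrupted while waiting for lock on row " + row);
            throw new InterruptedIOException("row " + row);
        }
    }

    // Stand-in for the batch-locking loop.
    static int lockRows(int rowCount, int interruptAtRow) throws IOException {
        int locked = 0;
        for (int row = 0; row < rowCount; row++) {
            try {
                acquireRowLock(row, row == interruptAtRow);
                locked++;
            } catch (TimeoutIOException | InterruptedIOException e) {
                // Already logged by acquireRowLock: rethrow without logging again.
                throw e;
            } catch (IOException e) {
                // Assumed for contrast: other failures are logged and the row is skipped.
                LOG.add("WARN failed getting lock, row=" + row);
            }
        }
        return locked;
    }

    // Runs one batch and reports how many log lines it produced.
    static int logLinesForBatch(int rowCount, int interruptAtRow) {
        LOG.clear();
        try {
            lockRows(rowCount, interruptAtRow);
        } catch (IOException e) {
            // Expected when the batch is interrupted.
        }
        return LOG.size();
    }

    public static void main(String[] args) {
        System.out.println("log lines when interrupted: " + logLinesForBatch(5, 2));
    }
}
```

The point of the shape is that an interrupted batch produces one log line instead of one line per remaining row.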
[jira] [Updated] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
[ https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-19988: - Status: Patch Available (was: Open) > HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while > waiting for a row lock > --- > > Key: HBASE-19988 > URL: https://issues.apache.org/jira/browse/HBASE-19988 > Project: HBase > Issue Type: Improvement > Components: amv2 >Affects Versions: 2.0.0-beta-1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0-beta-2 > > Attachments: hbase-19988.master.001.patch > > > See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
[ https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-19988: - Attachment: hbase-19988.master.001.patch > HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while > waiting for a row lock > --- > > Key: HBASE-19988 > URL: https://issues.apache.org/jira/browse/HBASE-19988 > Project: HBase > Issue Type: Improvement > Components: amv2 >Affects Versions: 2.0.0-beta-1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0-beta-2 > > Attachments: hbase-19988.master.001.patch > > > See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361705#comment-16361705 ]

Duo Zhang commented on HBASE-19976:
---

I think yielding is not an easy approach for developers, since the retry is in the HTable implementation... And as I said above, the ServerCrashProcedure for a server which carries meta should be high priority, and it is in the server queue; the RecoverMetaProcedure should also be high priority, but it is in the table queue...

> Dead lock if the worker threads in procedure executor are exhausted
> ---
>
> Key: HBASE-19976
> URL: https://issues.apache.org/jira/browse/HBASE-19976
> Project: HBase
> Issue Type: Bug
> Reporter: Duo Zhang
> Assignee: stack
> Priority: Critical
>
> See the comments in HBASE-19554. If all the worker threads are stuck in AssignProcedure because the meta region is offline, then the RecoverMetaProcedure cannot be executed, which causes a deadlock.
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361698#comment-16361698 ]

stack commented on HBASE-19976:
---

bq. This is a very typical dead lock problem in computer science.

Smile. We see it in many forms. The usual response is a special channel to handle the 'exception'. Then the number of exceptional behaviors builds up, and we either up the number of 'meta' handlers to avoid deadlock in the meta handlers or we add a meta-meta handler.

I was wondering if you had a thread dump that showed all handlers occupied.

I was thinking that all threads being blocked, occupied by procedures, so that the meta procedure is unable to run, is an ugly situation. They should yield.

We have dedicated queues (queues for server tasks, queues for table tasks), and within these there are notions of priority, such that high-priority tasks are scheduled more frequently than low-priority ones, and server tasks before table tasks. As long as Procedures yield, it should work out fine?

Do you think we need to add a new priority dimension to the mix, [~Apache9]? The RecoverMetaProcedure is made up of multiple steps (log splitting, assign) and subprocedures. Would all of them run in a single high-priority thread?

Thanks.

> Dead lock if the worker threads in procedure executor are exhausted
> ---
>
> Key: HBASE-19976
> URL: https://issues.apache.org/jira/browse/HBASE-19976
> Project: HBase
> Issue Type: Bug
> Reporter: Duo Zhang
> Assignee: stack
> Priority: Critical
>
> See the comments in HBASE-19554. If all the worker threads are stuck in AssignProcedure because the meta region is offline, then the RecoverMetaProcedure cannot be executed, which causes a deadlock.
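The worker-exhaustion problem discussed in this thread can be reproduced in miniature with a plain thread pool. The sketch below is not the procedure executor: the class and method names are hypothetical, a `CountDownLatch` stands in for "meta is online", the blocking tasks stand in for stuck assigns, and the waits use a timeout so the demo terminates rather than truly deadlocking. The optional separate executor models the "special channel" idea as one possible response, not as the fix chosen for this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of thread-pool starvation; not ProcedureExecutor code.
class WorkerExhaustionSketch {
    /**
     * Fills every worker with a task that waits for "meta online", then submits
     * the recovery task that would bring meta online. Returns true only if every
     * waiting task saw recovery finish before its wait timed out.
     */
    static boolean allAssignsComplete(int workers, boolean dedicatedRecoveryThread) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        ExecutorService recoveryPool =
            dedicatedRecoveryThread ? Executors.newSingleThreadExecutor() : pool;
        CountDownLatch metaOnline = new CountDownLatch(1);
        try {
            // Each "assign" blocks its worker until meta is online (bounded wait).
            Callable<Boolean> assign = () -> metaOnline.await(500, TimeUnit.MILLISECONDS);
            List<Future<Boolean>> assigns = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                assigns.add(pool.submit(assign));
            }
            // With a shared pool this is queued behind the blocked workers.
            Runnable recoverMeta = metaOnline::countDown;
            recoveryPool.submit(recoverMeta);

            boolean all = true;
            for (Future<Boolean> f : assigns) {
                all &= f.get();
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
            recoveryPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool:     " + allAssignsComplete(4, false));
        System.out.println("dedicated thread: " + allAssignsComplete(4, true));
    }
}
```

With the shared pool, recovery cannot start until at least one waiting task gives up, so the method reports failure. With a dedicated recovery thread the waiting tasks are released almost immediately. Whether a dedicated channel, a priority queue, or yielding is the better design is exactly what the thread is debating.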
[jira] [Commented] (HBASE-19945) Separate tests of TestRSGroups into two classes
[ https://issues.apache.org/jira/browse/HBASE-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361697#comment-16361697 ] Ted Yu commented on HBASE-19945: As far as I can tell, the test still times out on Jenkins: https://builds.apache.org/job/HBase-TRUNK_matrix/lastCompletedBuild/jdk=JDK%201.8%20(latest),label=(Hadoop%20&&%20!H5)/testReport/org.apache.hadoop.hbase.rsgroup/TestRSGroups/org_apache_hadoop_hbase_rsgroup_TestRSGroups/ > Separate tests of TestRSGroups into two classes > --- > > Key: HBASE-19945 > URL: https://issues.apache.org/jira/browse/HBASE-19945 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Attachments: 19945.v1.txt, 19945.v2.txt > > > TestRSGroups is annotated as MediumTests. It times out on Jenkins: > https://builds.apache.org/job/HBase-Trunk_matrix/4537/jdk=JDK%201.8%20(latest),label=(Hadoop%20&&%20!H5)/testReport/junit/org.apache.hadoop.hbase.rsgroup/TestRSGroups/org_apache_hadoop_hbase_rsgroup_TestRSGroups/ > {code} > org.junit.runners.model.TestTimedOutException: test timed out after 180 > seconds > {code} > The above is reproducible on Linux locally. > The tests from TestRSGroups should be separated into two classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361694#comment-16361694 ]

Hadoop QA commented on HBASE-19986:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 26s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 6s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 7m 1s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s{color} | {color:red} hbase-common: The patch generated 8 new + 5 unchanged - 2 fixed = 13 total (was 7) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 57s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 19s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}123m 4s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 28s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestBlockEvictionFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19986 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910289/HBASE-19986.branch-2.002.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 112b2ee93482 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 1ae64ccee0 |
| maven
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361685#comment-16361685 ]

Duo Zhang commented on HBASE-19976:
---

It seems I added the thread dump in the wrong place, so there is no thread dump on failure... Anyway, see here:

https://builds.apache.org/job/HBASE-Flaky-Tests/25832/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.TestDLSFSHLog-output.txt/*view*/

{noformat}
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-1(pid=139) run time 31.6840sec
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-2(pid=146) run time 29.5870sec
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-3(pid=150) run time 29.5880sec
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-4(pid=142) run time 31.6870sec
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-5(pid=138) run time 31.6830sec
2018-02-12 04:56:54,563 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-6(pid=140) run time 31.6840sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-7(pid=141) run time 31.6880sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-8(pid=143) run time 31.6890sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-9(pid=137) run time 31.6840sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-10(pid=136) run time 31.6840sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-11(pid=149) run time 29.5880sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-12(pid=148) run time 29.5880sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-13(pid=144) run time 29.5870sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-14(pid=145) run time 29.5870sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-15(pid=147) run time 29.5880sec
2018-02-12 04:56:54,564 WARN [ProcExecTimeout] procedure2.ProcedureExecutor$WorkerMonitor(1985): Worker stuck PEWorker-16(pid=151) run time 29.5890sec
{noformat}

All 16 workers are stuck. Now let's check which procedures they are running.

{noformat}
2018-02-12 04:56:22,879 INFO [PEWorker-1] procedure.MasterProcedureScheduler(883): pid=139, ppid=130, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=d36808157b0edc272844a07587e3630e testThreeRSAbort testThreeRSAbort,o@\x17\xAB\xCE,1518411364183.d36808157b0edc272844a07587e3630e.
2018-02-12 04:56:24,976 INFO [PEWorker-2] procedure.MasterProcedureScheduler(883): pid=146, ppid=131, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=7726f3d31204e2e60fc38582fefddfdb testThreeRSAbort testThreeRSAbort,f\xAA\x08Y),1518411364183.7726f3d31204e2e60fc38582fefddfdb.
2018-02-12 04:56:24,975 INFO [PEWorker-3] procedure.MasterProcedureScheduler(883): pid=150, ppid=131, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=693d6cddbd1f127dd087ee20def3f081 testThreeRSAbort testThreeRSAbort,s6\x94\xE5\xA4,1518411364183.693d6cddbd1f127dd087ee20def3f081.
2018-02-12 04:56:22,880 INFO [PEWorker-4] procedure.MasterProcedureScheduler(883): pid=142, ppid=130, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=0d0f98adccc5c3430f2981524a9cdd12 testThreeRSAbort testThreeRSAbort,u\xDA\xE8a\x88,1518411364183.0d0f98adccc5c3430f2981524a9cdd12.
2018-02-12 04:56:22,880 INFO [PEWorker-5] procedure.MasterProcedureScheduler(883): pid=138, ppid=130, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=15f6ffa81a9f6f469f917050199b8a8c testThreeRSAbort testThreeRSAbort,g\xFC2\x17\x1B,1518411364183.15f6ffa81a9f6f469f917050199b8a8c.
2018-02-12 04:56:22,880 INFO [PEWorker-6] procedure.MasterProcedureScheduler(883): pid=140, ppid=130, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=testThreeRSAbort, region=82b1ce3cc98b041a73162d359366df5d testThreeRSAbort testThreeRSAbort,p\x92Ai\xC0,1518411364183.82b1ce3cc98b041a73162d359366df5d.
[jira] [Commented] (HBASE-19972) Should rethrow the RetriesExhaustedWithDetailsException when failed to apply the batch in ReplicationSink
[ https://issues.apache.org/jira/browse/HBASE-19972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361684#comment-16361684 ] Hudson commented on HBASE-19972: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4574 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4574/]) HBASE-19972 Should rethrow the RetriesExhaustedWithDetailsException when (stack: rev 00f88773239b96e256c585fae98d846e2b65b4a4) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSink.java > Should rethrow the RetriesExhaustedWithDetailsException when failed to apply > the batch in ReplicationSink > -- > > Key: HBASE-19972 > URL: https://issues.apache.org/jira/browse/HBASE-19972 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Critical > Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2 > > Attachments: HBASE-19972-branch-1.4.patch, HBASE-19972.v1.patch, > HBASE-19972.v1.patch > > > As [~Apache9] said in HBASE-12091. > In ReplicationSink#batch, we swallow the RetriesExhaustedWithDetailsException > unless one of its causes is a > TableNotFoundException; we should actually rethrow the exception. > {code:java} > try { > Connection connection = getConnection(); > table = connection.getTable(tableName); > for (List<Row> rows : allRows) { > table.batch(rows); > } > } catch (RetriesExhaustedWithDetailsException rewde) { > for (Throwable ex : rewde.getCauses()) { > if (ex instanceof TableNotFoundException) { > throw new TableNotFoundException("'"+tableName+"'"); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
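For readers following HBASE-19972, the behaviour the issue asks for can be sketched roughly as below. This is a minimal, self-contained illustration and not the committed patch: TableMissingException and RetriesExhaustedException are hypothetical stand-ins for HBase's TableNotFoundException and RetriesExhaustedWithDetailsException (which are checked IOExceptions in HBase; unchecked here to keep the sketch short), and applyBatch is an invented name. The only point is the final rethrow, which the snippet quoted in the issue lacks.

```java
import java.util.List;

/** Hypothetical sketch of "rethrow instead of swallow"; not HBase code. */
final class ReplicationSinkRethrowSketch {
  /** Stand-in for HBase's TableNotFoundException (assumption for this sketch). */
  static class TableMissingException extends RuntimeException {
    TableMissingException(String msg) { super(msg); }
  }

  /** Stand-in for RetriesExhaustedWithDetailsException: carries the per-row causes. */
  static class RetriesExhaustedException extends RuntimeException {
    private final List<Throwable> causes;
    RetriesExhaustedException(List<Throwable> causes) {
      super("retries exhausted with " + causes.size() + " failure(s)");
      this.causes = causes;
    }
    List<Throwable> getCauses() { return causes; }
  }

  /** Runs the batch; translates a missing-table cause, otherwise rethrows the failure. */
  static void applyBatch(String tableName, Runnable batch) {
    try {
      batch.run();
    } catch (RetriesExhaustedException rewde) {
      for (Throwable ex : rewde.getCauses()) {
        if (ex instanceof TableMissingException) {
          throw new TableMissingException("'" + tableName + "'");
        }
      }
      // The quoted snippet falls through here and silently drops the failure.
      // Rethrowing lets the caller see that the batch was not applied.
      throw rewde;
    }
  }
}
```

With the rethrow in place, a caller that wraps applyBatch in its own retry or error handling gets a chance to react to failed batches instead of treating them as applied.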
[jira] [Commented] (HBASE-19968) MapReduce test fails with NoClassDefFoundError against hadoop3
[ https://issues.apache.org/jira/browse/HBASE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361683#comment-16361683 ] Hudson commented on HBASE-19968: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4574 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4574/]) HBASE-19968 MapReduce test fails with NoClassDefFoundError against (tedyu: rev 1c67d8a46f644275484d0ae3554cb892e81882ba) * (edit) hbase-mapreduce/pom.xml > MapReduce test fails with NoClassDefFoundError against hadoop3 > -- > > Key: HBASE-19968 > URL: https://issues.apache.org/jira/browse/HBASE-19968 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: 19968.v1.txt > > > When running mapreduce tests against hadoop3, I observed the following: > {code} > [ERROR] > testWithMockedMapReduceSingleRegion(org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat) > Time elapsed: 0.024 s <<< ERROR! > java.lang.NoClassDefFoundError: > org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo > at > org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat.testWithMockedMapReduce(TestTableSnapshotInputFormat.java:178) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo > at > org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat.testWithMockedMapReduce(TestTableSnapshotInputFormat.java:178) > {code} > This was due to lack of dependency on hadoop-hdfs module in the hadoop-3.0 > profile of hbase-mapreduce/pom.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock
Umesh Agashe created HBASE-19988: Summary: HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock Key: HBASE-19988 URL: https://issues.apache.org/jira/browse/HBASE-19988 Project: HBase Issue Type: Improvement Components: amv2 Affects Versions: 2.0.0-beta-1 Reporter: Umesh Agashe Assignee: Umesh Agashe Fix For: 2.0.0-beta-2 See HBASE-19970: TestHRegionWithInMemoryFlush created a 4.2 GB log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361678#comment-16361678 ] Duo Zhang commented on HBASE-19976: --- The procedure executor is general-purpose and lives in another module, so it does not know much about the priorities we defined in MasterProcedureScheduler. I plan to abstract the priority like this: the ProcedureScheduler will return an int telling how many priority levels it has, and ProcedureExecutor will reserve one thread for each level except the lowest one. When polling, we pass a priority level to the ProcedureScheduler so that it only fetches procedures whose priority is at that level or higher. We can introduce 3 levels in MasterProcedureScheduler: one for meta, one for other system tables, and one for all other procedures. Note that a ServerCrashProcedure should get a different priority depending on whether it carries the meta region or other system regions. Thanks. > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
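The level-based polling proposed in the comment above could look roughly like the following. This is only a sketch under stated assumptions: LeveledSchedulerSketch, getPriorityLevels, poll(int) and maxLevelForWorker are hypothetical names, not the HBase ProcedureScheduler API, and procedures are plain strings. Level 0 is taken to be the highest priority (meta) and the last level the lowest.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a scheduler with priority levels; not the HBase API. */
final class LeveledSchedulerSketch {
  // One FIFO queue per level. Level 0 = highest priority (e.g. meta).
  private final List<Deque<String>> queues = new ArrayList<>();

  LeveledSchedulerSketch(int levels) {
    for (int i = 0; i < levels; i++) {
      queues.add(new ArrayDeque<>());
    }
  }

  /** Tells the executor how many priority levels exist. */
  int getPriorityLevels() {
    return queues.size();
  }

  void add(String procedure, int level) {
    queues.get(level).addLast(procedure);
  }

  /** Returns the next procedure at level 0..maxLevel, highest priority first, or null. */
  String poll(int maxLevel) {
    for (int level = 0; level <= maxLevel && level < queues.size(); level++) {
      String next = queues.get(level).pollFirst();
      if (next != null) {
        return next;
      }
    }
    return null;
  }

  /**
   * Worker i (for i below levels - 1) is reserved and may only take levels 0..i.
   * All remaining workers may take every level.
   */
  static int maxLevelForWorker(int workerIndex, int levels) {
    return Math.min(workerIndex, levels - 1);
  }
}
```

With three levels, worker 0 only ever takes meta procedures, worker 1 takes meta and system-table procedures, and every other worker takes anything. A flood of user-table assigns therefore can never occupy the two reserved workers.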
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361674#comment-16361674 ] Duo Zhang commented on HBASE-19976: --- This is a classic deadlock problem in computer science. All the resources are held by some processes, so we have no chance to schedule other processes, but the running processes need the result of another process to complete, hence the deadlock. Here the resource is the worker thread. One way to solve this is to reserve a thread that only executes high priority procedures. > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
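The thread-exhaustion deadlock described above, and the effect of reserving a thread, can be reproduced with plain JDK executors. The sketch below is an illustration only and makes no claim about how ProcedureExecutor is implemented: WorkerExhaustionSketch and run are hypothetical names, the blocked tasks stand in for AssignProcedure waiting on meta, and the single countDown task stands in for RecoverMetaProcedure.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of worker exhaustion with and without a reserved thread. */
final class WorkerExhaustionSketch {
  /**
   * Fills a pool of `workers` threads with tasks that wait for "meta" to come online,
   * then submits the recovery task. Returns true if the waiting tasks finished
   * within the timeout, false if they stayed stuck.
   */
  static boolean run(int workers, boolean reserveThread, long timeoutMillis) {
    ExecutorService general = Executors.newFixedThreadPool(workers);
    // Without a reserved thread, the recovery task queues behind the blocked workers.
    ExecutorService recovery = reserveThread ? Executors.newSingleThreadExecutor() : general;
    CountDownLatch metaOnline = new CountDownLatch(1);
    CountDownLatch finished = new CountDownLatch(workers);
    try {
      for (int i = 0; i < workers; i++) {
        general.execute(() -> {
          try {
            metaOnline.await(); // blocks a worker thread until recovery has run
            finished.countDown();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      recovery.execute(metaOnline::countDown);
      return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      // Interrupts any workers that are still blocked so the JVM can exit.
      general.shutdownNow();
      recovery.shutdownNow();
    }
  }
}
```

Calling run(4, false, 300) leaves all four workers blocked because the recovery task never gets a thread, while run(4, true, 5000) completes because the recovery task has a thread of its own.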
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361666#comment-16361666 ] stack commented on HBASE-19976: --- No, I'm wrong, RecoverMetaProcedure implements TableProcedureInterface. Do you have a thread dump of all the stuck Procedures waiting on the Master, [~Apache9]? Can I break up this step so RecoverMetaProcedure has a chance to run? > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361612#comment-16361612 ] Appy edited comment on HBASE-19970 at 2/13/18 12:34 AM: No, the failure doesn't seem related. Ran it locally too, passed. However, it was interesting to see 4.2 GB log from single test. [~uagashe] It's something around batchMutate and getRowLockInternal. Mind taking a look sir since you were around these parts recently (in separate jira). At the very least, we should nerf the logging...4.2 gb logs is crazy! :) Pushing to master and backporting --all the way back.-- stopped at branch-1. Didn't backport to maintenance release since it's not a bug. was (Author: appy): No, the failure doesn't seem related. Ran it locally too, passed. However, it was interesting to see 4.2 GB log from single test. [~uagashe] It's something around batchMutate and getRowLockInternal. Mind taking a look sir since you were around these parts recently (in separate jira). At the very least, we should nerf the logging...4.2 gb logs is crazy! :) Pushing to master and backporting all the way back. > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. 
Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361649#comment-16361649 ] Umesh Agashe commented on HBASE-19970: -- Sure, [~appy]! Let me see... > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361648#comment-16361648 ] Appy commented on HBASE-19970: -- Thanks for review stack. Closing. > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-19970: - Resolution: Fixed Status: Resolved (was: Patch Available) > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-19970: - Fix Version/s: 2.0.0-beta-2 1.5.0 > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 1.5.0, 2.0.0-beta-2 > > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361621#comment-16361621 ] stack commented on HBASE-19976: --- Ideally we'd schedule RecoverMetaProcedure at the front of the queue. > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361619#comment-16361619 ] stack commented on HBASE-19976: --- [~Apache9] RecoverMetaProcedure is relatively new. It is neither a TableProcedure nor a ServerProcedure (oversight?). Could this be the problem? If it were a ServerProcedure, would it be run ahead of everyone? > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361617#comment-16361617 ] stack commented on HBASE-19976: --- [~Apache9] HBASE-18109 is about the prioritization we currently have: server procedures > table procedures, and meta > system > user-space tables. Are you seeing that RecoverMetaProcedure is not being scheduled even though, in its guts, it is about assigning meta? > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361612#comment-16361612 ] Appy commented on HBASE-19970: -- No, the failure doesn't seem related. Ran it locally too, passed. However, it was interesting to see 4.2 GB log from single test. [~uagashe] It's something around batchMutate and getRowLockInternal. Mind taking a look sir since you were around these parts recently (in separate jira). At the very least, we should nerf the logging...4.2 gb logs is crazy! :) Pushing to master and backporting all the way back. > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361608#comment-16361608 ] stack commented on HBASE-19976: --- Issue on procedure priority; mostly about how system tables get assigned ahead of user-space tables. Talks about how high priority procedures are scheduled more frequently than lower priority procedures and that high priority procedures get scheduled at the front of the queues. > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19976) Dead lock if the worker threads in procedure executor are exhausted
[ https://issues.apache.org/jira/browse/HBASE-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361605#comment-16361605 ] Duo Zhang commented on HBASE-19976: --- I can give it a shot. I modified the executor and scheduler when implementing procedure-based replication, so I think I’m familiar enough. Do you have other ideas in mind, sir? If not, let me try the priority approach first. Thanks. > Dead lock if the worker threads in procedure executor are exhausted > --- > > Key: HBASE-19976 > URL: https://issues.apache.org/jira/browse/HBASE-19976 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > > See the comments in HBASE-19554. If all the worker threads are stuck in > AssignProcedure because the meta region is offline, then the RecoverMetaProcedure > cannot be executed, which causes a deadlock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19972) Should rethrow the RetriesExhaustedWithDetailsException when failed to apply the batch in ReplicationSink
[ https://issues.apache.org/jira/browse/HBASE-19972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361603#comment-16361603 ] Duo Zhang commented on HBASE-19972: --- Thanks sir. > Should rethrow the RetriesExhaustedWithDetailsException when failed to apply > the batch in ReplicationSink > -- > > Key: HBASE-19972 > URL: https://issues.apache.org/jira/browse/HBASE-19972 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Critical > Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2 > > Attachments: HBASE-19972-branch-1.4.patch, HBASE-19972.v1.patch, > HBASE-19972.v1.patch > > > As [~Apache9] said in HBASE-12091. > In ReplicationSink#batch, we swallow the RetriesExhaustedWithDetailsException > unless one of its causes is a > TableNotFoundException; we should actually rethrow the exception. > {code:java} > try { > Connection connection = getConnection(); > table = connection.getTable(tableName); > for (List<Row> rows : allRows) { > table.batch(rows); > } > } catch (RetriesExhaustedWithDetailsException rewde) { > for (Throwable ex : rewde.getCauses()) { > if (ex instanceof TableNotFoundException) { > throw new TableNotFoundException("'"+tableName+"'"); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager
[ https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361599#comment-16361599 ] stack commented on HBASE-19970: --- +1 Failure related? > Remove unused functions from TableAuthManager > - > > Key: HBASE-19970 > URL: https://issues.apache.org/jira/browse/HBASE-19970 > Project: HBase > Issue Type: Task >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-19970.master.001.patch > > > Functions deleted in TableAuthManager: > - setTableUserPermissions > - setTableGroupPermissions > - setNamespaceUserPermissions > - setNamespaceGroupPermissions > - writeTableToZooKeeper > - writeNamespaceToZooKeeper > To make sure it was not a bug, and that relevant functionality moved to some > alternate code path, tried to find out why and when these functions went out > of use. But just couldn't figure out...until i reached the patch which added > them. Looks like they were dead functions to start with :) > Jira which added them: HBASE-8409. Commit id: > ac10b3c13d6b66e12d0c9601204b01dfa525ed19 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.0-beta-2 Status: Resolved (was: Patch Available) Pushed. Thanks for the pointer and review [~appy] > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361593#comment-16361593 ] Appy commented on HBASE-19986: -- Looks nice. +1 > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19920) TokenUtil.obtainToken unnecessarily creates a local directory
[ https://issues.apache.org/jira/browse/HBASE-19920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361587#comment-16361587 ] Mike Drob commented on HBASE-19920: --- Looking for a review please > TokenUtil.obtainToken unnecessarily creates a local directory > - > > Key: HBASE-19920 > URL: https://issues.apache.org/jira/browse/HBASE-19920 > Project: HBase > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Mike Drob >Priority: Major > Fix For: 2.0 > > Attachments: HBASE-19920.patch, HBASE-19920.v2.patch, > HBASE-19920.v3.patch, HBASE-19920.v4.patch, HBASE-19920.v5.patch, > HBASE-19920.v6.patch, HBASE-19920.v7.patch, HBASE-19920.v8.patch > > > On client code, when one calls TokenUtil.obtainToken it loads ProtobufUtil > which in its static block initializes DynamicClassLoader and that creates the > directory ${hbase.local.dir}/jars/ and also instantiates a filesystem class > to access hbase.dynamic.jars.dir. > https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/DynamicClassLoader.java#L109-L127 > Since this is region server specific code, not expecting this to happen when > one accesses hbase as a client. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361585#comment-16361585 ] stack commented on HBASE-19986: --- .003 does the @appy suggestion
Here is what it looks like when it trips:
{code}
2018-02-12 15:08:31,261 INFO [Time-limited test] hbase.ResourceChecker(148): before: TestTimeout#run1 Thread=8, OpenFileDescriptor=89, MaxFileDescriptor=10240, SystemLoadAverage=478, ProcessCount=372, AvailableMemoryMB=327
2018-02-12 15:08:31,444 INFO [Time-limited test] hbase.ResourceChecker(172): after: TestTimeout#run1 Thread=8 (was 8), OpenFileDescriptor=89 (was 89), MaxFileDescriptor=10240 (was 10240), SystemLoadAverage=478 (was 478), ProcessCount=372 (was 372), AvailableMemoryMB=327 (was 327)
2018-02-12 15:08:31,523 INFO [Time-limited test] hbase.ResourceChecker(148): before: TestTimeout#infiniteLoop Thread=8, OpenFileDescriptor=89, MaxFileDescriptor=10240, SystemLoadAverage=478, ProcessCount=372, AvailableMemoryMB=325
> TEST TIMED OUT. PRINTING THREAD DUMP. <
Timestamp: 2018-02-12 03:08:41,125
"surefire-forkedjvm-command-thread" daemon prio=5 tid=10 runnable
java.lang.Thread.State: RUNNABLE
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:255)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at org.apache.maven.surefire.booter.MasterProcessCommand.decode(MasterProcessCommand.java:115)
        at org.apache.maven.surefire.booter.CommandReader$CommandRunnable.run(CommandReader.java:391)
        at java.lang.Thread.run(Thread.java:745)
"Reference Handler" daemon prio=10 tid=2 in Object.wait()
java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" prio=5 tid=1 runnable
java.lang.Thread.State: RUNNABLE
{code}
> If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19987) update error-prone to 2.2.0
[ https://issues.apache.org/jira/browse/HBASE-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-19987: -- Attachment: HBASE-19987.patch > update error-prone to 2.2.0 > --- > > Key: HBASE-19987 > URL: https://issues.apache.org/jira/browse/HBASE-19987 > Project: HBase > Issue Type: Bug >Reporter: Mike Drob >Priority: Major > Attachments: HBASE-19987.patch > > > keep ourselves healthy and up to date -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Attachment: HBASE-19986.branch-2.003.patch > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch, HBASE-19986.branch-2.003.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19987) update error-prone to 2.2.0
[ https://issues.apache.org/jira/browse/HBASE-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-19987: -- Status: Patch Available (was: Open) > update error-prone to 2.2.0 > --- > > Key: HBASE-19987 > URL: https://issues.apache.org/jira/browse/HBASE-19987 > Project: HBase > Issue Type: Bug >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Attachments: HBASE-19987.patch > > > keep ourselves healthy and up to date -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-19987) update error-prone to 2.2.0
[ https://issues.apache.org/jira/browse/HBASE-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob reassigned HBASE-19987: - Assignee: Mike Drob > update error-prone to 2.2.0 > --- > > Key: HBASE-19987 > URL: https://issues.apache.org/jira/browse/HBASE-19987 > Project: HBase > Issue Type: Bug >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Attachments: HBASE-19987.patch > > > keep ourselves healthy and up to date -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19987) update error-prone to 2.2.0
Mike Drob created HBASE-19987: - Summary: update error-prone to 2.2.0 Key: HBASE-19987 URL: https://issues.apache.org/jira/browse/HBASE-19987 Project: HBase Issue Type: Bug Reporter: Mike Drob keep ourselves healthy and up to date -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361583#comment-16361583 ] stack commented on HBASE-19986: --- It looks like it works if I throw in a flush. What I had also thread-dumped, but it'd soon prove obnoxious: the dump showed up wherever we printed out the exception. Your pointer should be better, [~appy]. > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361561#comment-16361561 ] Appy edited comment on HBASE-19986 at 2/12/18 10:54 PM: -you need to add it to surefire configuration here [https://github.com/apache/hbase/blob/master/pom.xml#L677]- Edit: oh nvm, i see you extended ResourceCheckerJUnitListener directly. was (Author: appy): --you need to add it to surefire configuration here https://github.com/apache/hbase/blob/master/pom.xml#L677-- Edit: oh nvm, i see you extended ResourceCheckerJUnitListener directly. > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! 
> java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361561#comment-16361561 ] Appy edited comment on HBASE-19986 at 2/12/18 10:54 PM: --you need to add it to surefire configuration here https://github.com/apache/hbase/blob/master/pom.xml#L677-- Edit: oh nvm, i see you extended ResourceCheckerJUnitListener directly. was (Author: appy): you need to add it to surefire configuration here https://github.com/apache/hbase/blob/master/pom.xml#L677 > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! 
> java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361561#comment-16361561 ] Appy commented on HBASE-19986: -- you need to add it to surefire configuration here https://github.com/apache/hbase/blob/master/pom.xml#L677 > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361557#comment-16361557 ] stack commented on HBASE-19986: --- Yuck. We had TimedOutTestsListener in hbase-server already. Mostly unused. .002 tries to use it. It doesn't get triggered in my mickey mouse program. Let me try some more. Meantime, .001 is nice in that it thread dumps on timeout, but I think it will prove too obnoxious: it does a full thread dump in the exception toString, which shows up in a few places. Might be ok for a while though, till we figure out stuff in failed tests. > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Attachment: HBASE-19986.branch-2.002.patch > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch, > HBASE-19986.branch-2.002.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361556#comment-16361556 ] Hadoop QA commented on HBASE-19986: ---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 4m 43s | branch-2 passed |
| +1 | compile | 0m 25s | branch-2 passed |
| +1 | checkstyle | 0m 34s | branch-2 passed |
| +1 | shadedjars | 5m 3s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 19s | branch-2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 29s | the patch passed |
| +1 | compile | 0m 17s | the patch passed |
| +1 | javac | 0m 17s | the patch passed |
| -1 | checkstyle | 0m 25s | hbase-common: The patch generated 7 new + 5 unchanged - 2 fixed = 12 total (was 7) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 57s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 14m 34s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | javadoc | 0m 17s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 19s | hbase-common in the patch passed. |
| +1 | asflicense | 0m 9s | The patch does not generate ASF License warnings. |
| | | 32m 34s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19986 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910283/HBASE-19986.branch-2.001.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux b5afea9eee8f 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 1ae64ccee0 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11497/artifact/patchprocess/diff-checkstyle-hbase-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11497/testReport/ |
| Max. process+thread count | 330 (vs. ulimit of 1) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11497/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This
[jira] [Commented] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361531#comment-16361531 ] Appy commented on HBASE-19986: -- Let's use this instead - https://github.com/apache/hadoop/blob/d1c6accb6f87b08975175580e15f1ff1fe29ab04/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/TimedOutTestsListener.java ? It also seems to be doing some deadlock detection...looks nice! Let's extend it in our hbase-common instead of copying it :) > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
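For readers wondering what "deadlock detection" could look like in a timeout listener, here is a minimal sketch using only the JDK's `ThreadMXBean`. It is not the Hadoop `TimedOutTestsListener` code linked above; the class name `DeadlockCheckSketch` and the method `describeDeadlocks` are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of HBase or Hadoop.
public class DeadlockCheckSketch {

    /** Returns a description of any deadlocked threads the JVM reports. */
    public static String describeDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() returns null when no deadlock is detected.
        long[] ids = bean.findDeadlockedThreads();
        if (ids == null) {
            return "No deadlocked threads found.";
        }
        StringBuilder sb = new StringBuilder("Deadlocked threads:\n");
        // Ask for locked monitors and synchronizers so the lock owners show up.
        for (ThreadInfo info : bean.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.err.println(describeDeadlocks());
    }
}
```

A listener that fires on test timeout could append this text to its thread dump, so a deadlock is called out explicitly rather than left for the reader to spot in the stack traces.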
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Assignee: stack Status: Patch Available (was: Open) > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Attachment: HBASE-19986.branch-2.001.patch > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Priority: Major > Attachments: HBASE-19986.branch-2.001.patch > > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. > I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
[ https://issues.apache.org/jira/browse/HBASE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-19986: -- Description: We set look for stuck thread in our timeout rule but it is super conservative in what it prints.. it looks for a RUNNABLE thread and prints first found ONLY. Pretty useless for us. If a test timesout, often the printing has stopped in the stderr/stdout. I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 minutes but we've stopped printing to the logs and here is what junit prints: --- Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi --- Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 s <<< ERROR! java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port 35917 was:We set look for stuck thread in our timeout rule but it is super conservative in what it prints.. it looks for a RUNNABLE thread and prints first found ONLY. Pretty useless for us. If a test timesout, often the printing has stopped in the stderr/stdout. > If HBaseTestClassRule timesout a test, thread dump. > --- > > Key: HBASE-19986 > URL: https://issues.apache.org/jira/browse/HBASE-19986 > Project: HBase > Issue Type: Bug >Reporter: stack >Priority: Major > > We set look for stuck thread in our timeout rule but it is super conservative > in what it prints.. it looks for a RUNNABLE thread and prints first found > ONLY. Pretty useless for us. If a test timesout, often the printing has > stopped in the stderr/stdout. 
> I'm trying to debug TestAsyncRegionAdminApi. It says test timed out after 10 > minutes but we've stopped printing to the logs and here is what junit prints: > --- > Test set: org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > --- > Tests run: 25, Failures: 0, Errors: 2, Skipped: 2, Time elapsed: 572.508 s > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 600 > seconds > at > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:363) > org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi Time elapsed: 14.642 > s <<< ERROR! > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 35917 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19986) If HBaseTestClassRule timesout a test, thread dump.
stack created HBASE-19986: - Summary: If HBaseTestClassRule timesout a test, thread dump. Key: HBASE-19986 URL: https://issues.apache.org/jira/browse/HBASE-19986 Project: HBase Issue Type: Bug Reporter: stack We look for a stuck thread in our timeout rule, but it is super conservative in what it prints: it looks for a RUNNABLE thread and prints the first one found ONLY. Pretty useless for us. If a test times out, the printing to stderr/stdout has often already stopped. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
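To make the request concrete, the following is a rough sketch of dumping every live thread instead of only the first RUNNABLE one. It is not the attached patch: the class `ThreadDumpSketch` and the method `dumpAllThreads` are invented for illustration, and the output format only loosely imitates a JVM thread dump.

```java
import java.util.Map;

// Hypothetical helper, not the HBASE-19986 patch.
public class ThreadDumpSketch {

    /** Builds a dump of all live threads with their state and stack frames. */
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(t.isDaemon() ? " daemon" : "")
              .append(" prio=").append(t.getPriority())
              .append(" tid=").append(t.getId())
              .append('\n');
            sb.append("  java.lang.Thread.State: ").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.err.print(dumpAllThreads());
        // Flush explicitly so the dump is not lost if the JVM is killed right after.
        System.err.flush();
    }
}
```

The explicit flush reflects the concern in the description that output to stderr/stdout has often stopped by the time a test times out.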
[jira] [Commented] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361481#comment-16361481 ] Andrew Purtell commented on HBASE-19842: Improved the description a bit. > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > requires a rewrite of every affected cell. We could implement them in a space > efficient (and management efficient) way at the cost of some performance like > so: > First, allow storage of cell level ACLs in the ACL table. Rowkey would be a > generic identifier of some kind that can be distinguished from existing > rowkeys that associate the ACL with a cf, or table, or namespace. Existing > code for cf/table/namespace ACLs should ignore rows that do not conform to > today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. Allocate a new > cell tag ID to distinguish v2 ACL references from v1 embedded ACL > serializations. > The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
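The v1 versus v2 distinction described above can be sketched with a toy model. Everything here is an assumption made for illustration: the tag type values, the `Tag` class, the in-memory map standing in for the ACL table, and the rowkey format are hypothetical and are not HBase's actual tag IDs or APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposal; not HBase code.
public class CellAclV2Sketch {
    // Hypothetical tag type IDs; the real values would be allocated in HBase.
    public static final byte ACL_V1_EMBEDDED = 1;
    public static final byte ACL_V2_REFERENCE = 2;

    /** Minimal stand-in for a cell tag: a type byte plus an opaque value. */
    public static final class Tag {
        final byte type;
        final byte[] value;

        public Tag(byte type, String value) {
            this.type = type;
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Stand-in for the ACL table: rowkey -> serialized ACL.
    private final Map<String, String> aclTable = new HashMap<>();

    /** GRANT or REVOKE touches a single ACL table row, not the cells. */
    public void putAcl(String aclRowKey, String serializedAcl) {
        aclTable.put(aclRowKey, serializedAcl);
    }

    /** Resolves the effective ACL for a cell tag. */
    public String resolve(Tag tag) {
        String v = new String(tag.value, StandardCharsets.UTF_8);
        if (tag.type == ACL_V1_EMBEDDED) {
            return v; // v1: the tag carries the full serialization
        }
        if (tag.type == ACL_V2_REFERENCE) {
            return aclTable.get(v); // v2: extra lookup per cell, the stated cost
        }
        throw new IllegalArgumentException("not an ACL tag: " + tag.type);
    }
}
```

In this model, changing the row that a v2 tag points to changes the result for every cell carrying that reference, while a v1 tag keeps whatever was serialized into it until the cell is rewritten.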
[jira] [Updated] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-19842: --- Description: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient) way at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. Rowkey would be a generic identifier of some kind that can be distinguished from existing rowkeys that associate the ACL with a cf, or table, or namespace. Existing code for cf/table/namespace ACLs should ignore rows that do not conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. Allocate a new cell tag ID to distinguish v2 ACL references from v1 embedded ACL serializations. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. was: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient) way at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. 
Rowkey would be a generic identifier of some kind that can be distinguished from existing rowkeys that associate the ACL with a cf, or table, or namespace. Existing code for cf/table/namespace ACLs should ignore rows that do not conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > requires a rewrite of every affected cell. We could implement them in a space > efficient (and management efficient) way at the cost of some performance like > so: > First, allow storage of cell level ACLs in the ACL table. Rowkey would be a > generic identifier of some kind that can be distinguished from existing > rowkeys that associate the ACL with a cf, or table, or namespace. Existing > code for cf/table/namespace ACLs should ignore rows that do not conform to > today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. Allocate a new > cell tag ID to distinguish v2 ACL references from v1 embedded ACL > serializations. 
> The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-19842: --- Description: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient) way at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. Rowkey would be a generic identifier of some kind that can be distinguished from existing rowkeys that associate the ACL with a cf, or table, or namespace. Existing code for cf/table/namespace ACLs should ignore rows that do not conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. was: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient) way at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. Rowkey could be hash of serialized ACL format. Just have to avoid using rowkeys that associate the ACL with a cf, or table, or namespace... 
And handle entries in the ACL tables which don't conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > requires a rewrite of every affected cell. We could implement them in a space > efficient (and management efficient) way at the cost of some performance like > so: > First, allow storage of cell level ACLs in the ACL table. Rowkey would be a > generic identifier of some kind that can be distinguished from existing > rowkeys that associate the ACL with a cf, or table, or namespace. Existing > code for cf/table/namespace ACLs should ignore rows that do not conform to > today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. > The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. 
The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-19842: --- Description: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient) way at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. Rowkey could be hash of serialized ACL format. Just have to avoid using rowkeys that associate the ACL with a cf, or table, or namespace... And handle entries in the ACL tables which don't conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. was: Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL in a tag stored with each cell. This was done for performance. This has some drawbacks, most significantly unnecessary duplication and to grant or revoke this requires a rewrite of every affected cell. We could implement them in a space efficient (and management efficient way) at the cost of some performance like so: First, allow storage of cell level ACLs in the ACL table. Rowkey could be hash of serialized ACL format. Just have to avoid using rowkeys that associate the ACL with a cf, or table, or namespace... 
And handle entries in the ACL tables which don't conform to today's keying strategy. Then provide the option for storing the rowkey of an entry in the ACL table in the cell ACL tag instead of the complete serialization. The advantages would be reduction of unnecessary duplication, and, like ACLs at other granularities, a GRANT or REVOKE which updates the ACL table will update access control rules for all affected cells. The disadvantage would be in order to process the reference to the ACL for each cell with an ACL reference in a tag we will need to look up the ACL in the ACL table. > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > requires a rewrite of every affected cell. We could implement them in a space > efficient (and management efficient) way at the cost of some performance like > so: > First, allow storage of cell level ACLs in the ACL table. Rowkey could be > hash of serialized ACL format. Just have to avoid using rowkeys that > associate the ACL with a cf, or table, or namespace... And handle entries in > the ACL tables which don't conform to today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. > The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. 
The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell reassigned HBASE-19842: -- Assignee: Thomas D'Silva > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > this requires a rewrite of every affected cell. We could implement them in a > space efficient (and management efficient way) at the cost of some > performance like so: > First, allow storage of cell level ACLs in the ACL table. Rowkey could be > hash of serialized ACL format. Just have to avoid using rowkeys that > associate the ACL with a cf, or table, or namespace... And handle entries in > the ACL tables which don't conform to today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. > The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19842) Cell ACLs v2
[ https://issues.apache.org/jira/browse/HBASE-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361476#comment-16361476 ] Andrew Purtell commented on HBASE-19842: [~tdsilva] assigned to you! Thanks > Cell ACLs v2 > > > Key: HBASE-19842 > URL: https://issues.apache.org/jira/browse/HBASE-19842 > Project: HBase > Issue Type: New Feature > Components: security >Reporter: Andrew Purtell >Assignee: Thomas D'Silva >Priority: Major > > Per cell ACLs as currently implemented (HBASE-7662) embed the serialized ACL > in a tag stored with each cell. This was done for performance. This has some > drawbacks, most significantly unnecessary duplication and to grant or revoke > this requires a rewrite of every affected cell. We could implement them in a > space efficient (and management efficient way) at the cost of some > performance like so: > First, allow storage of cell level ACLs in the ACL table. Rowkey could be > hash of serialized ACL format. Just have to avoid using rowkeys that > associate the ACL with a cf, or table, or namespace... And handle entries in > the ACL tables which don't conform to today's keying strategy. > Then provide the option for storing the rowkey of an entry in the ACL table > in the cell ACL tag instead of the complete serialization. > The advantages would be reduction of unnecessary duplication, and, like ACLs > at other granularities, a GRANT or REVOKE which updates the ACL table will > update access control rules for all affected cells. The disadvantage would be > in order to process the reference to the ACL for each cell with an ACL > reference in a tag we will need to look up the ACL in the ACL table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19981) Boolean#getBoolean is used to parse value
[ https://issues.apache.org/jira/browse/HBASE-19981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-19981: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.2 Status: Resolved (was: Patch Available) Thanks for the patch, Janos. > Boolean#getBoolean is used to parse value > - > > Key: HBASE-19981 > URL: https://issues.apache.org/jira/browse/HBASE-19981 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Janos Gub >Priority: Major > Fix For: 1.4.2 > > Attachments: HBASE-19981.branch-1.001.patch > > > In HColumnDescriptor of branch-1: > {code} > value.set(Bytes.toBytes( > Boolean.getBoolean(Bytes.toString(value.get())) > {code} > According to > https://docs.oracle.com/javase/7/docs/api/java/lang/Boolean.html#getBoolean(java.lang.String): > {code} > Returns true if and only if the system property named by the argument exists > and is equal to the string "true" > {code} > This was not the intention of the quoted code. > This was discovered by Fortify. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
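To make the difference concrete, a small self-contained sketch comparing `Boolean.getBoolean` with `Boolean.parseBoolean`. The class name `BooleanParsingSketch` is hypothetical, and the attached patch is not shown in this message, so this is not a claim about what the patch changed; it only shows the two JDK methods side by side.

```java
// Hypothetical demo class; only the two java.lang.Boolean calls are real API.
final class BooleanParsingSketch {

    /** Looks up a SYSTEM PROPERTY named by the argument. */
    static boolean viaGetBoolean(String s) {
        return Boolean.getBoolean(s);
    }

    /** Parses the argument itself, ignoring case. */
    static boolean viaParseBoolean(String s) {
        return Boolean.parseBoolean(s);
    }

    public static void main(String[] args) {
        // Assuming no system property called "true" is set, the first call
        // is false even though the string itself says "true".
        System.out.println(viaGetBoolean("true"));
        System.out.println(viaParseBoolean("true"));
    }
}
```

When the goal is to interpret a stored string value, `Boolean.parseBoolean` is presumably the method that was meant, since it inspects the string rather than the JVM's system properties.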
[jira] [Created] (HBASE-19985) Redundant instanceof check in ProtobufUtil#getServiceException
Ted Yu created HBASE-19985: -- Summary: Redundant instanceof check in ProtobufUtil#getServiceException Key: HBASE-19985 URL: https://issues.apache.org/jira/browse/HBASE-19985 Project: HBase Issue Type: Bug Affects Versions: 1.4.1 Reporter: Ted Yu {code} public static IOException getServiceException(ServiceException e) { Throwable t = e; if (e instanceof ServiceException) { t = e.getCause(); {code} The instanceof check would always return true. This was reported by https://builds.apache.org/job/PreCommit-HBASE-Build/11495/artifact/patchprocess/branch-findbugs-hbase-client-warnings.html#Warnings_STYLE -- This message was sent by Atlassian JIRA (v7.6.3#76005)
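A minimal sketch of why the check is redundant, assuming a local stand-in class: the nested `ServiceException` below is hypothetical and only mimics the protobuf type, and `unwrapWithoutCheck` is an illustration, not the fix applied in HBase.

```java
// Hypothetical illustration; not the ProtobufUtil source.
final class InstanceofSketch {

    // Local stand-in for the protobuf ServiceException.
    static class ServiceException extends Exception {
        ServiceException(Throwable cause) {
            super(cause);
        }
    }

    /** Mirrors the quoted shape: the parameter type already guarantees the check. */
    static Throwable unwrapWithCheck(ServiceException e) {
        Throwable t = e;
        if (e instanceof ServiceException) { // true for every non-null e
            t = e.getCause();
        }
        return t;
    }

    /** Same behaviour with the type test reduced to a null check. */
    static Throwable unwrapWithoutCheck(ServiceException e) {
        return e == null ? null : e.getCause();
    }

    public static void main(String[] args) {
        ServiceException e = new ServiceException(new java.io.IOException("boom"));
        System.out.println(unwrapWithCheck(e) == unwrapWithoutCheck(e));
    }
}
```

For a parameter declared as `ServiceException`, the `instanceof` test can only fail when the argument is null, so a null check (or no check at all, if null is not expected) expresses the same thing.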
[jira] [Commented] (HBASE-19984) Add hadoop 2.8 and 2.9 to precommit
[ https://issues.apache.org/jira/browse/HBASE-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361436#comment-16361436 ] Appy commented on HBASE-19984: -- Basically means - we're not confident if it's fine to use it, so use at your own peril. > Add hadoop 2.8 and 2.9 to precommit > --- > > Key: HBASE-19984 > URL: https://issues.apache.org/jira/browse/HBASE-19984 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-16060) 1.x clients cannot access table state talking to 2.0 cluster
[ https://issues.apache.org/jira/browse/HBASE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361435#comment-16361435 ] Hudson commented on HBASE-16060: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4573 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4573/]) HBASE-16060 1.x clients cannot access table state talking to 2.0 cluster (stack: rev 67b69fb2c70d3a56ac45f59d57b7f2778094a566) * (delete) hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZNodePaths.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/util/ZKDataMigrator.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestTableStateManager.java * (edit) hbase-protocol-shaded/src/main/protobuf/ZooKeeper.proto * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java * (add) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ProcedureSyncWait.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java > 1.x clients cannot access table state talking to 2.0 cluster > > > Key: HBASE-16060 > URL: https://issues.apache.org/jira/browse/HBASE-16060 > Project: HBase > Issue Type: Bug 
>Reporter: Enis Soztutar >Assignee: stack >Priority: Blocker > Fix For: 2.0.0-beta-2 > > Attachments: > 0002-HBASE-16060-1.x-clients-cannot-access-table-state-ta.patch, > HBASE-16060.branch-2.001.patch, HBASE-16060.branch-2.002.patch, > HBASE-16060.branch-2.003.patch > > > Since table state is migrated to meta instead of zk in 2.0, 1.x clients > talking to 2.0 cluster cannot access the table state. This causes some weird > behavior since from a client perspective, {{Admin.isTableEnabled()}} and > {{Admin.isTableDisabled()}} both return false. > One option we can do is to add code in 1.x clients so that they can access > the table state in meta if needed. Otherwise, we can mirror the table state > in zk (while keeping meta as the source of truth) during 2.x lifecycle so > that any 1.x client can still work correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
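The mirroring option from the description can be sketched roughly as follows. This is a toy model with in-memory maps standing in for meta and ZooKeeper; the class `MirroringStateSketch` and its methods are hypothetical and are not the MirroringTableStateManager added by the commit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of "meta is the source of truth, zk is a mirror".
final class MirroringStateSketch {
    enum State { ENABLED, DISABLED }

    // Stand-in for table state kept in meta (authoritative).
    private final Map<String, State> meta = new HashMap<>();
    // Stand-in for the znodes that 1.x clients read.
    private final Map<String, State> zkMirror = new HashMap<>();

    /** Writes the authoritative copy first, then mirrors it for old clients. */
    void setTableState(String table, State state) {
        meta.put(table, state);
        zkMirror.put(table, state);
    }

    State readAs2xClient(String table) {
        return meta.get(table);
    }

    State readAs1xClient(String table) {
        return zkMirror.get(table);
    }

    public static void main(String[] args) {
        MirroringStateSketch m = new MirroringStateSketch();
        m.setTableState("t1", State.DISABLED);
        System.out.println(m.readAs1xClient("t1") == m.readAs2xClient("t1"));
    }
}
```

Without the mirror, the 1.x read path in this model would find no entry at all, which loosely corresponds to the case where both `Admin.isTableEnabled()` and `Admin.isTableDisabled()` return false.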
[jira] [Commented] (HBASE-19984) Add hadoop 2.8 and 2.9 to precommit
[ https://issues.apache.org/jira/browse/HBASE-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361432#comment-16361432 ] Mike Drob commented on HBASE-19984: --- 2.8.3 is marked stable (as was 2.8.2) 2.9.0 is labeled bq. Please note: Although this release has been tested on fairly large clusters, production users can wait for a subsequent point release which will contain fixes from further stabilization and downstream adoption. I'm actually not sure what this means. > Add hadoop 2.8 and 2.9 to precommit > --- > > Key: HBASE-19984 > URL: https://issues.apache.org/jira/browse/HBASE-19984 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19984) Add hadoop 2.8 and 2.9 to precommit
[ https://issues.apache.org/jira/browse/HBASE-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361423#comment-16361423 ] Appy commented on HBASE-19984: -- Or maybe it's this bq. Hadoop version 2.8.0 and 2.8.1 are not tested or supported as the Hadoop PMC has explicitly labeled that releases as not being stable. So are the new hadoop versions are marked stable? > Add hadoop 2.8 and 2.9 to precommit > --- > > Key: HBASE-19984 > URL: https://issues.apache.org/jira/browse/HBASE-19984 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-19984) Add hadoop 2.8 and 2.9 to precommit
[ https://issues.apache.org/jira/browse/HBASE-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361423#comment-16361423 ] Appy edited comment on HBASE-19984 at 2/12/18 8:47 PM: --- Or maybe it's this bq. Hadoop version 2.8.0 and 2.8.1 are not tested or supported as the Hadoop PMC has explicitly labeled that releases as not being stable. So are the new hadoop versions marked stable? was (Author: appy): Or maybe it's this bq. Hadoop version 2.8.0 and 2.8.1 are not tested or supported as the Hadoop PMC has explicitly labeled that releases as not being stable. So are the new hadoop versions are marked stable? > Add hadoop 2.8 and 2.9 to precommit > --- > > Key: HBASE-19984 > URL: https://issues.apache.org/jira/browse/HBASE-19984 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)