[jira] [Created] (HBASE-21511) Remove in progress snapshot check in SnapshotFileCache#getUnreferencedFiles
Ted Yu created HBASE-21511: -- Summary: Remove in progress snapshot check in SnapshotFileCache#getUnreferencedFiles Key: HBASE-21511 URL: https://issues.apache.org/jira/browse/HBASE-21511 Project: HBase Issue Type: Improvement Reporter: Ted Yu Attachments: 21511.v1.txt During review of HBASE-21387, [~Apache9] mentioned that the check for in-progress snapshots in SnapshotFileCache#getUnreferencedFiles is no longer needed now that the snapshot hfile cleaner and snapshot taking are mutually exclusive. This issue addresses that review comment by removing the check for in-progress snapshots. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
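The mutual exclusion that makes the in-progress check unnecessary can be sketched with a shared lock. This is a simplified, illustrative model under assumed names — not the actual HBase implementation:

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative model: snapshot taking and hfile cleaning acquire the same
// lock, so the cleaner can never observe a snapshot that is half taken.
// Class and method names are hypothetical.
class SnapshotMutexSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int completedSnapshots = 0;

    void takeSnapshot() {
        lock.lock();
        try {
            completedSnapshots++; // stand-in for writing snapshot metadata
        } finally {
            lock.unlock();
        }
    }

    int cleanerView() {
        lock.lock();
        try {
            return completedSnapshots; // cleaner only ever sees completed snapshots
        } finally {
            lock.unlock();
        }
    }
}
```

Because both paths serialize on one lock, a cleaner that wins the lock sees either the state before a snapshot started or after it fully completed — never a partial snapshot, so no separate in-progress check is needed.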
[jira] [Reopened] (HBASE-21387) Race condition surrounding in progress snapshot handling in snapshot cache leads to loss of snapshot files
[ https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21387: > Race condition surrounding in progress snapshot handling in snapshot cache > leads to loss of snapshot files > -- > > Key: HBASE-21387 > URL: https://issues.apache.org/jira/browse/HBASE-21387 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Labels: snapshot > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.0.3, 1.4.9, 2.1.2, 1.2.10 > > Attachments: 0001-UT.patch, 21387-suggest.txt, 21387.dbg.txt, > 21387.v10.txt, 21387.v11.txt, 21387.v12.txt, 21387.v2.txt, 21387.v3.txt, > 21387.v6.txt, 21387.v7.txt, 21387.v8.txt, 21387.v9.txt, > HBASE-21387.branch-1.2.patch, HBASE-21387.branch-1.3.patch, > HBASE-21387.branch-1.patch, HBASE-21387.v13.patch, HBASE-21387.v14.patch, > HBASE-21387.v15.patch, HBASE-21387.v16.patch, HBASE-21387.v17.patch, > two-pass-cleaner.v4.txt, two-pass-cleaner.v6.txt, two-pass-cleaner.v9.txt > > > A recent customer report showed ExportSnapshot failing: > {code} > 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] > snapshot.SnapshotReferenceUtil: Can't find hfile: > 44f6c3c646e84de6a63fe30da4fcb3aa in the real > (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa) > or archive > (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa) > directory for the primary table. > {code} > We found the following in the log: > {code} > 2018-10-09 18:54:23,675 DEBUG > [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] > cleaner.HFileCleaner: Removing: > hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa > from archive > {code} > The root cause is a race condition surrounding in-progress snapshot handling > between refreshCache() and getUnreferencedFiles(). > There are two callers of refreshCache: one from RefreshCacheTask#run and the > other from SnapshotHFileCleaner. 
> Let's look at the code of refreshCache: > {code} > if (!name.equals(SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME)) { > {code} > whose intention is to exclude in-progress snapshot(s). > Suppose that when the RefreshCacheTask runs refreshCache, there is some in-progress > snapshot (about to finish). > When SnapshotHFileCleaner calls getUnreferencedFiles(), it sees that > lastModifiedTime is up to date, so the cleaner proceeds to check in-progress > snapshot(s). However, the snapshot has completed by that time, resulting in > some file(s) being deemed unreferenced. > Here is the timeline given by Josh illustrating the scenario: > At time T0, we are checking if F1 is referenced. At time T1, there is a > snapshot S1 in progress that is referencing a file F1. refreshCache() is > called, but no completed snapshot references F1. At T2, the snapshot S1, > which references F1, completes. At T3, we check in-progress snapshots and S1 > is not included. Thus, F1 is marked as unreferenced even though S1 references > it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21482) TestHRegion fails due to 'Too many open files'
Ted Yu created HBASE-21482: -- Summary: TestHRegion fails due to 'Too many open files' Key: HBASE-21482 URL: https://issues.apache.org/jira/browse/HBASE-21482 Project: HBase Issue Type: Bug Reporter: Ted Yu TestHRegion fails due to 'Too many open files' in master branch. Here is one failed subtest : {code} testCheckAndDelete_ThatDeleteWasWritten(org.apache.hadoop.hbase.regionserver.TestHRegion) Time elapsed: 2.373 sec <<< ERROR! java.lang.IllegalStateException: failed to create a child event loop at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835) at org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034) Caused by: org.apache.hbase.thirdparty.io.netty.channel.ChannelException: failed to open a new selector at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835) at org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034) Caused by: java.io.IOException: Too many open files at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844) at org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835) at org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21479) TestHRegionReplayEvents#testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent fails with IndexOutOfBoundsException
Ted Yu created HBASE-21479: -- Summary: TestHRegionReplayEvents#testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent fails with IndexOutOfBoundsException Key: HBASE-21479 URL: https://issues.apache.org/jira/browse/HBASE-21479 Project: HBase Issue Type: Bug Reporter: Ted Yu The test fails in both master branch and branch-2 : {code} testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent(org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents) Time elapsed: 3.74 sec <<< ERROR! java.lang.IndexOutOfBoundsException: Index: 2, Size: 1 at org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents.testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent(TestHRegionReplayEvents.java:1042) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [ANNOUNCE] New HBase committer Jingyun Tian
Congratulations, Jingyun!

Original message
From: Srinivas Reddy
Date: 11/13/18 12:46 AM (GMT-08:00)
To: dev@hbase.apache.org
Cc: Hbase-User
Subject: Re: [ANNOUNCE] New HBase committer Jingyun Tian

Congratulations Jingyun

-Srinivas-
Typed on tiny keys. pls ignore typos. {mobile app}

On Tue 13 Nov, 2018, 15:54 张铎(Duo Zhang) wrote:
> On behalf of the Apache HBase PMC, I am pleased to announce that Jingyun
> Tian has accepted the PMC's invitation to become a committer on the
> project. We appreciate all of Jingyun's generous contributions thus far and
> look forward to his continued involvement.
>
> Congratulations and welcome, Jingyun!
[jira] [Created] (HBASE-21466) WALProcedureStore uses wrong FileSystem if wal.dir is on different FileSystem as rootdir
Ted Yu created HBASE-21466: -- Summary: WALProcedureStore uses wrong FileSystem if wal.dir is on different FileSystem as rootdir Key: HBASE-21466 URL: https://issues.apache.org/jira/browse/HBASE-21466 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu In the WALProcedureStore constructor, the fs field is initialized this way: {code} this.fs = walDir.getFileSystem(conf); {code} However, when wal.dir is on a different FileSystem than rootdir, the above returns the wrong FileSystem. In the modified TestMasterProcedureEvents, without the fix, the master wouldn't initialize. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21457) BackupUtils#getWALFilesOlderThan refers to wrong FileSystem
Ted Yu created HBASE-21457: -- Summary: BackupUtils#getWALFilesOlderThan refers to wrong FileSystem Key: HBASE-21457 URL: https://issues.apache.org/jira/browse/HBASE-21457 Project: HBase Issue Type: Bug Reporter: Janos Gub Janos reported a backup test failure when using local HDFS for WALs while using WASB/ADLS only for store files. Janos spotted the code in BackupUtils#getWALFilesOlderThan, which uses the HBase root dir for retrieving WAL files. We should use the helper methods from CommonFSUtils instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
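The failure mode in both HBASE-21457 and HBASE-21466 is resolving a path against the wrong FileSystem. A plain-JDK illustration of why this matters: which FileSystem serves a Hadoop Path is determined by the path's URI scheme and authority, so WAL paths on local HDFS cannot be listed through a WASB/ADLS-backed root FileSystem. The helper below is a hypothetical sketch, not HBase or Hadoop code:

```java
import java.net.URI;

// Illustrative check: two paths belong to the same FileSystem only when
// their URI scheme and authority match — the same key Hadoop uses to pick
// a FileSystem implementation for a Path. JDK-only sketch.
class FsSchemeSketch {
    static boolean sameFileSystem(String pathA, String pathB) {
        URI a = URI.create(pathA);
        URI b = URI.create(pathB);
        return String.valueOf(a.getScheme()).equals(String.valueOf(b.getScheme()))
            && String.valueOf(a.getAuthority()).equals(String.valueOf(b.getAuthority()));
    }
}
```

Under this view, listing `hdfs://...` WAL paths through the FileSystem derived from a `wasb://...` root dir fails precisely because the scheme/authority pair differs — which is what resolving WAL paths via the WAL-dir-specific helpers avoids.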
[jira] [Reopened] (HBASE-21247) Custom Meta WAL Provider doesn't default to custom WAL Provider whose configuration value is outside the enums in Providers
[ https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21247: > Custom Meta WAL Provider doesn't default to custom WAL Provider whose > configuration value is outside the enums in Providers > --- > > Key: HBASE-21247 > URL: https://issues.apache.org/jira/browse/HBASE-21247 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2 >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0 > > Attachments: 21247.branch-2.patch, 21247.v1.txt, 21247.v10.txt, > 21247.v11.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, > 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt > > > Currently all the WAL Providers acceptable to hbase are specified in > Providers enum of WALFactory. > This restricts the ability for custom Meta WAL Provider to default to the > custom WAL Provider which is supplied by class name. > This issue fixes the bug by allowing the specification of new WAL Provider > class name using the config "hbase.wal.provider". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21438) TestAdmin2#testGetProcedures fails due to FailedProcedure inaccessible
Ted Yu created HBASE-21438: -- Summary: TestAdmin2#testGetProcedures fails due to FailedProcedure inaccessible Key: HBASE-21438 URL: https://issues.apache.org/jira/browse/HBASE-21438 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu From https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1863/testReport/org.apache.hadoop.hbase.client/TestAdmin2/testGetProcedures/ : {code} Mon Nov 05 04:52:13 UTC 2018, RpcRetryingCaller{globalStartTime=1541393533029, pause=250, maxAttempts=7}, org.apache.hadoop.hbase.procedure2.BadProcedureException: org.apache.hadoop.hbase.procedure2.BadProcedureException: The procedure class org.apache.hadoop.hbase.procedure2.FailedProcedure must be accessible and have an empty constructor at org.apache.hadoop.hbase.procedure2.ProcedureUtil.validateClass(ProcedureUtil.java:82) at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProtoProcedure(ProcedureUtil.java:162) at org.apache.hadoop.hbase.master.MasterRpcServices.getProcedures(MasterRpcServices.java:1249) at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21416) Intermittent TestRegionInfoDisplay failure due to shift in relTime of RegionState#toDescriptiveString
Ted Yu created HBASE-21416: -- Summary: Intermittent TestRegionInfoDisplay failure due to shift in relTime of RegionState#toDescriptiveString Key: HBASE-21416 URL: https://issues.apache.org/jira/browse/HBASE-21416 Project: HBase Issue Type: Test Reporter: Ted Yu Over https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2.1/1799/testReport/junit/org.apache.hadoop.hbase.client/TestRegionInfoDisplay/testRegionDetailsForDisplay/ : {code} org.junit.ComparisonFailure: expected:<...:30 UTC 2018 (PT0.00[6]S ago), server=null> but was:<...:30 UTC 2018 (PT0.00[7]S ago), server=null> at org.apache.hadoop.hbase.client.TestRegionInfoDisplay.testRegionDetailsForDisplay(TestRegionInfoDisplay.java:78) {code} Here is how toDescriptiveString composes relTime: {code} long relTime = System.currentTimeMillis() - stamp; {code} In the test, state.toDescriptiveString() is called twice for the assertion; different return values from System.currentTimeMillis() across the two calls caused the assertion to fail in the run above. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
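One way to make such an assertion deterministic is to capture the clock once and feed the same value to both sides of the comparison. The sketch below is illustrative (names are hypothetical, not the actual RegionState code):

```java
// Sketch: a describe() that takes "now" as a parameter is deterministic,
// whereas one that reads System.currentTimeMillis() internally can return
// different strings on two consecutive calls.
class RelTimeSketch {
    static String describe(long stamp, long now) {
        long relTime = now - stamp; // same subtraction as in toDescriptiveString
        return "(" + relTime + " ms ago)";
    }
}
```

With the clock captured once, calling describe twice with the same `now` yields identical strings, so the comparison can no longer race the clock.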
[jira] [Resolved] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module
[ https://issues.apache.org/jira/browse/HBASE-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-21180. Resolution: Cannot Reproduce > findbugs incurs DataflowAnalysisException for hbase-server module > - > > Key: HBASE-21180 > URL: https://issues.apache.org/jira/browse/HBASE-21180 > Project: HBase > Issue Type: Task > Reporter: Ted Yu >Priority: Minor > > Running findbugs, I noticed the following in hbase-server module: > {code} > [INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server > --- > [INFO] Fork Value is true > [java] The following errors occurred during analysis: > [java] Error generating derefs for > org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V > [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get > position -1 of stack > [java] At > edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250) > [java] At > edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743) > [java] At > edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141) > [java] At > edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50) > [java] At > edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31) > [java] At > edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369) > [java] At > edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322) > [java] At > edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005) > [java] At > edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325) > [java] At > edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510) > [java] At > 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361) > [java] At > edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266) > [java] At > edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164) > [java] At > edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278) > [java] At > edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209) > [java] At > edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76) > [java] At > edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089) > [java] At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283) > [java] At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393) > [java] At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200) > [java] The following classes needed for analysis were missing: > [java] accept > [java] apply > [java] run > [java] test > [java] call > [java] exec > [java] getAsInt > [java] applyAsLong > [java] storeFile > [java] get > [java] visit > [java] compare > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files
Ted Yu created HBASE-21387: -- Summary: Race condition in snapshot cache refreshing leads to loss of snapshot files Key: HBASE-21387 URL: https://issues.apache.org/jira/browse/HBASE-21387 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu

A recent customer report showed ExportSnapshot failing:
{code}
2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] snapshot.SnapshotReferenceUtil: Can't find hfile: 44f6c3c646e84de6a63fe30da4fcb3aa in the real (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa) or archive (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa) directory for the primary table.
{code}
We found the following in the log:
{code}
2018-10-09 18:54:23,675 DEBUG [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] cleaner.HFileCleaner: Removing: hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa from archive
{code}
The root cause is a race condition surrounding SnapshotFileCache#refreshCache(). There are two callers of refreshCache: one from RefreshCacheTask#run and the other from SnapshotHFileCleaner.

Let's look at the code of refreshCache:
{code}
// if the snapshot directory wasn't modified since we last check, we are done
if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
// 1. update the modified time
this.lastModifiedTime = dirStatus.getModificationTime();
// 2.clear the cache
this.cache.clear();
{code}
Suppose the RefreshCacheTask runs past the if check and sets this.lastModifiedTime. The cleaner then executes refreshCache and returns immediately since this.lastModifiedTime matches the modification time of the directory. Now RefreshCacheTask clears the cache. By the time the cleaner performs the cache lookup, the cache is empty, so the cleaner puts the file into unReferencedFiles - leading to data loss.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
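The bug boils down to publishing lastModifiedTime before the cache is rebuilt. A minimal, self-contained model of a race-free version (illustrative names, not the actual SnapshotFileCache code): hold a lock across the whole refresh and bump the timestamp only after the cache holds the new contents.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative model: the timestamp check, the cache rebuild, and the
// timestamp update happen atomically, and the timestamp is published only
// AFTER the cache is repopulated, so a concurrent caller can never see a
// fresh timestamp paired with an empty cache.
class SnapshotFileCacheSketch {
    private long lastModifiedTime = -1;
    private final Set<String> cache = new HashSet<>();

    synchronized void refreshCache(long dirModTime, Set<String> referencedFiles) {
        if (dirModTime <= lastModifiedTime) {
            return; // directory unchanged since the last refresh
        }
        cache.clear();
        cache.addAll(referencedFiles); // rebuild first...
        lastModifiedTime = dirModTime; // ...then publish the new timestamp
    }

    synchronized boolean isReferenced(String file) {
        return cache.contains(file);
    }
}
```

In the buggy ordering, a cleaner arriving between the timestamp update and the rebuild sees an up-to-date timestamp, skips its own refresh, and then looks up files in an empty cache — exactly the sequence described above.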
[jira] [Reopened] (HBASE-21318) Make RefreshHFilesClient runnable
[ https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21318: > Make RefreshHFilesClient runnable > - > > Key: HBASE-21318 > URL: https://issues.apache.org/jira/browse/HBASE-21318 > Project: HBase > Issue Type: Improvement > Components: HFile >Affects Versions: 3.0.0, 1.5.0, 2.1.2 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-21318.master.001.patch, > HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, > HBASE-21318.master.004.patch > > > Other than when user enables hbase.coprocessor.region.classes with > RefreshHFilesEndPoint, user can also run this client as tool runner class/CLI > and calls refresh HFiles directly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure
[ https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-21149. Resolution: Duplicate Fix Version/s: (was: 3.0.0) > TestIncrementalBackupWithBulkLoad may fail due to file copy failure > --- > > Key: HBASE-21149 > URL: https://issues.apache.org/jira/browse/HBASE-21149 > Project: HBase > Issue Type: Test > Components: backuprestore > Reporter: Ted Yu >Assignee: Ted Yu >Priority: Critical > Attachments: 21149.v2.txt, HBASE-21149-v1.patch, > testIncrementalBackupWithBulkLoad-output.txt > > > From > https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/ > : > {code} > 2018-09-03 11:54:30,526 ERROR [Time-limited test] > impl.TableBackupClient(235): Unexpected Exception : Failed copy from > hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ > to hdfs://localhost:53075/backupUT/backup_1535975655488 > java.io.IOException: Failed copy from > hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ > to hdfs://localhost:53075/backupUT/backup_1535975655488 > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351) > at > 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320) > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605) > at > org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > {code} > However, some part of the test output was lost: > {code} > 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions > ...[truncated 398396 chars]... > 8) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21381) Document the hadoop versions using which backup and restore feature works
Ted Yu created HBASE-21381: -- Summary: Document the hadoop versions using which backup and restore feature works Key: HBASE-21381 URL: https://issues.apache.org/jira/browse/HBASE-21381 Project: HBase Issue Type: Task Reporter: Ted Yu HADOOP-15850 fixes a bug where CopyCommitter#concatFileChunks unconditionally tried to concatenate the files being DistCp'ed to the target cluster (even though the files are independent). The following is a log snippet of the failed concatenation attempt: {code} 2018-10-13 14:09:25,351 WARN [Thread-936] mapred.LocalJobRunner$Job(590): job_local1795473782_0004 java.io.IOException: Inconsistent sequence file: current chunk file org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/ 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_ length = 5100 aclEntries = null, xAttrs = null} doesnt match prior entry org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e- 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_ length = 5142 aclEntries = null, xAttrs = null} at org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276) at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567) {code} Backup and Restore uses DistCp to transfer files between clusters. Without the fix from HADOOP-15850, the transfer would fail. This issue is to document the hadoop versions which contain HADOOP-15850 so that users of the Backup and Restore feature know which hadoop versions they can use. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21353) TestHBCKCommandLineParsing#testCommandWithOptions hangs on call to HBCK2#checkHBCKSupport
Ted Yu created HBASE-21353: -- Summary: TestHBCKCommandLineParsing#testCommandWithOptions hangs on call to HBCK2#checkHBCKSupport Key: HBASE-21353 URL: https://issues.apache.org/jira/browse/HBASE-21353 Project: HBase Issue Type: Test Reporter: Ted Yu I noticed the following when running TestHBCKCommandLineParsing#testCommandWithOptions : {code} "main" #1 prio=5 os_prio=31 tid=0x7f851c80 nid=0x1703 waiting on condition [0x70216000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00076d3055d8> (a java.util.concurrent.CompletableFuture$Signaller) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693) at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:564) at org.apache.hadoop.hbase.client.ConnectionImplementation.(ConnectionImplementation.java:297) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:229) at org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$11/502838712.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:347) at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:227) at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:127) at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:93) at org.apache.hbase.HBCK2.run(HBCK2.java:352) at org.apache.hbase.TestHBCKCommandLineParsing.testCommandWithOptions(TestHBCKCommandLineParsing.java:62) {code} The test doesn't spin up an hbase cluster, so the call to check HBCK support hangs. In HBCK2#run, we can refactor the code so that argument parsing is done prior to calling HBCK2#checkHBCKSupport. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
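The proposed refactor — parse and validate the arguments first, connect to the cluster only afterwards — can be sketched as follows. This is an illustrative stand-in: the class, method, and return codes are hypothetical, not the real HBCK2 code.

```java
// Sketch: help/invalid input is decided purely locally, so a command-line
// parsing test never reaches the (potentially hanging) cluster connection.
class Hbck2Sketch {
    static final int EXIT_USAGE = 1;
    static final int EXIT_OK = 0;

    static int run(String[] args) {
        // Step 1: argument parsing, no network involved.
        if (args.length == 0 || "-h".equals(args[0]) || "--help".equals(args[0])) {
            return EXIT_USAGE; // resolved without any cluster connection
        }
        // Step 2: only a valid command would proceed here to open a
        // connection and call something like checkHBCKSupport().
        return EXIT_OK;
    }
}
```

Ordering the steps this way means TestHBCKCommandLineParsing-style tests, which exercise only step 1, can run without a cluster.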
[jira] [Reopened] (HBASE-21281) Update bouncycastle dependency.
[ https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21281: > Update bouncycastle dependency. > --- > > Key: HBASE-21281 > URL: https://issues.apache.org/jira/browse/HBASE-21281 > Project: HBase > Issue Type: Task > Components: dependencies, test >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21281.addendum.patch, 21281.addendum2.patch, > HBASE-21281.001.branch-2.0.patch > > > Looks like we still depend on bcprov-jdk16 for some x509 certificate > generation in our tests. Bouncycastle has moved beyond this in 1.47, changing > the artifact names. > [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later] > There are some API changes too, but it looks like we don't use any of these. > It seems like we also have vestiges in the POMs from when we were depending > on a specific BC version that came in from Hadoop. We now have a > KeyStoreTestUtil class in HBase, which makes me think we can also clean up > some dependencies. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21341) DeadServer shouldn't import unshaded Preconditions
Ted Yu created HBASE-21341: -- Summary: DeadServer shouldn't import unshaded Preconditions Key: HBASE-21341 URL: https://issues.apache.org/jira/browse/HBASE-21341 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu DeadServer currently imports the unshaded Preconditions: {code} import com.google.common.base.Preconditions; {code} We should import the shaded version of Preconditions. This is the only place where an unshaded class from com.google.common is imported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [ANNOUNCE] Please welcome Zach York to the HBase PMC
Congratulations, Zach ! On Thu, Oct 11, 2018 at 1:01 PM Sean Busbey wrote: > On behalf of the Apache HBase PMC I am pleased to announce that Zach > York has accepted our invitation to become a PMC member on the Apache > HBase project. We appreciate Zach stepping up to take more > responsibility in the HBase project. > > Please join me in welcoming Zach to the HBase PMC! > > As a reminder, if anyone would like to nominate another person as a > committer or PMC member, even if you are not currently a committer or > PMC member, you can always drop a note to priv...@hbase.apache.org to > let us know. >
Re: Does Hbase backup process support encryption while transporting the data from one cluster to other cluster
bq. Does copyTable support hashing of data while copying? No. bq. Same for distcp utility ? The above would get a better answer on the hadoop mailing list. Thanks On Tue, Oct 9, 2018 at 5:28 AM neo0731 wrote: > > Question arises when migrating the data from one hbase table to another. > > Input > > To sync the production cluster data with dev cluster. Additionaly, while > copying we need to re-hash the following fields: hashed_email, lexer_id, > foo_imsi, foo_msn, signal_uid, bar_imsi. > > Question is : Does copyTable support hashing of data while copying? Same > for > distcp utility ? Is it possible to supply some example code in scala as > well > > Any help on it would be much appreciated? > > > > -- > Sent from: > http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html >
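Since copyTable has no built-in hashing, a custom copy job would have to re-hash the fields itself. A minimal JDK-only sketch of such a re-hash step (the field names from the question are just example inputs; this is not copyTable or DistCp code):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hash one field value with SHA-256 and hex-encode it; a custom copy job
// would apply this per cell for the fields that must be re-hashed.
class RehashSketch {
    static String sha256Hex(String value) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

The same transform would be plugged into whatever copy mechanism is used (a MapReduce mapper, a scan-and-put loop, etc.) for fields like hashed_email or signal_uid from the question.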
[jira] [Created] (HBASE-21279) Split TestAdminShell into several tests
Ted Yu created HBASE-21279: -- Summary: Split TestAdminShell into several tests Key: HBASE-21279 URL: https://issues.apache.org/jira/browse/HBASE-21279 Project: HBase Issue Type: Test Reporter: Ted Yu In the flaky test board, TestAdminShell often timed out (https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/branch-2/lastSuccessfulBuild/artifact/dashboard.html). I ran the test on Linux with SSD and reproduced the timeout (see attached test output). {code} 2018-10-08 02:36:09,146 DEBUG [main] hbase.HBaseTestingUtility(351): Setting hbase.rootdir to /mnt/disk2/a/2-hbase/hbase-shell/target/test-data/a103d8e4-695c-a5a9-6690-1ef2580050f9 ... 2018-10-08 02:49:09,093 DEBUG [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=7] master.MasterRpcServices(1171): Checking to see if procedure is done pid=871 Took 0.7262 seconds2018-10-08 02:49:09,324 DEBUG [PEWorker-1] util.FSTableDescriptors(684): Wrote into hdfs://localhost:43859/user/hbase/test-data/cefc73d9-cc37-d2a6-b92b- d935316c9241/.tmp/data/default/hbase_shell_tests_table/.tabledesc/.tableinfo.01 2018-10-08 02:49:09,328 INFO [RegionOpenAndInitThread-hbase_shell_tests_table-1] regionserver.HRegion(7004): creating HRegion hbase_shell_tests_table HTD == 'hbase_shell_tests_table', {NAME => 'x', VERSIONS => '5', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}, {NAME => 'y', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', 
CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'} RootDir = hdfs://localhost:43859/ user/hbase/test-data/cefc73d9-cc37-d2a6-b92b-d935316c9241/.tmp Table name == hbase_shell_tests_table E === Error: test_Get_simple_status(Hbase::StatusTest): Java::JavaIo::InterruptedIOException: Interrupt while waiting on Operation: CREATE, Table Name: default:hbase_shell_tests_table, procId: 871 2018-10-08 02:49:09,361 INFO [Block report processor] blockmanagement.BlockManager(2645): BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:41338 is added to blk_1073742193_1369{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-ecc89143-e0a5-4a1c-b552-120be2561334:NORMAL:127.0.0.1: 41338|RBW]]} size 58 > TEST TIMED OUT. PRINTING THREAD DUMP. < {code} We can see that procedure #871 wasn't stuck - the timeout kicked in and stopped the test. We should split the current test into two (or more) test files (with corresponding .rb files) so that the execution time consistently stays within the limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21272) Re-add assertions for RS Group admin tests
Ted Yu created HBASE-21272: -- Summary: Re-add assertions for RS Group admin tests Key: HBASE-21272 URL: https://issues.apache.org/jira/browse/HBASE-21272 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Fix For: 1.5.0 The checked-in version of HBASE-21258 for branch-1 didn't include assertions for the adding / removing RS group coprocessor hook calls. This issue adds the assertions to the corresponding tests in TestRSGroupsAdmin1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] The first HBase 1.4.8 release candidate (RC0) is available
+1 - verified checksums and signatures: good - basic checking on Web UI : good - ran test suite with : good Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) Maven home: /apache-maven-3.5.4 Java version: 1.8.0_161, vendor: Oracle Corporation, runtime: /mnt/disk2/a/jdk1.8.0_161/jre Default locale: en_US, platform encoding: UTF-8 OS name: "linux", version: "3.10.0-327.28.3.el7.x86_64", arch: "amd64", family: "unix" - Ran LTT with 1M rows : good On Wed, Oct 3, 2018 at 9:34 AM Andrew Purtell wrote: > RC errata: > > TestRSGroups will pass in isolation but may fail when run as part of the > suite. There have been several JIRAs filed against this unit like > HBASE-19444, HBASE-19461, HBASE-20137, and mentioned on HBASE-21187. Its > running time is far too long. I have filed HBASE-21265 to split it up. Test > stabilization work would be a part of that. I don't think this rises to > the level of failing the RC vote because TestRSGroups will pass > consistently, at least for me, when run in isolation. I do agree that > without work to improve the test it doesn't offer the kind of functional > assurance we'd like to derive from a unit test. > > > On Tue, Oct 2, 2018 at 5:57 PM Andrew Purtell wrote: > > > The first HBase 1.4.8 release candidate (RC0) is available for download > at > > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.8RC0/ and Maven > > artifacts are available in the temporary repository > > https://repository.apache.org/content/repositories/orgapachehbase-1233/ > > > > The git tag corresponding to the candidate is '1.4.8RC0' (91118ce5f1). > > > > A detailed source and binary compatibility report for this release is > > available for your review at > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.8RC0/compat-check-report.html > > . There are no reported compatibility issues. > > > > A list of the 33 issues resolved in this release can be found at > > https://s.apache.org/xpxo . 
> > > > Please try out the candidate and vote +1/0/-1. > > > > The vote will be open for at least 72 hours. Unless objection I will try > > to close it Monday October 8, 2018 if we have sufficient votes. > > > > Prior to making this announcement I made the following preflight checks: > > > > RAT check passes (7u80) > > Unit test suite passes (7u80, 8u181) > > Opened the UI in a browser, poked around > > LTT load 1M rows with 100% verification and 20% updates (8u181) > > ITBLL 500M rows with serverKilling monkey (8u181) > > > > > > -- > > Best regards, > > Andrew > > > > Words like orphans lost among the crosstalk, meaning torn from truth's > > decrepit hands > >- A23, Crosstalk > > > > > -- > Best regards, > Andrew > > Words like orphans lost among the crosstalk, meaning torn from truth's > decrepit hands >- A23, Crosstalk >
[jira] [Reopened] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
[ https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21221: > Ineffective assertion in TestFromClientSide3#testMultiRowMutations > -- > > Key: HBASE-21221 > URL: https://issues.apache.org/jira/browse/HBASE-21221 > Project: HBase > Issue Type: Test > Reporter: Ted Yu > Assignee: Ted Yu >Priority: Minor > Fix For: 3.0.0 > > Attachments: 21221.addendum.txt, 21221.v10.txt, 21221.v11.txt, > 21221.v12.txt, 21221.v7.txt, 21221.v8.txt, 21221.v9.txt > > > Observed the following in > org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt : > {code} > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): > java.io.IOException: Timed out waiting for lock for row: ROW-1 in region > 089bdfa75f44d88e596479038a6da18b > at > org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816) > at > org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982) > at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424) > at > org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116) > at > org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266) > at > org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463) > ... 
> Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp > should fail because the target lock is blocked by previous put > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > {code} > Here is related code: > {code} > cpService.execute(() -> { > ... > if (!threw) { > // Can't call fail() earlier because the catch would eat it. > fail("This cp should fail because the target lock is blocked by > previous put"); > } > {code} > Since the fail() call is executed by the cpService, the assertion had no > bearing on the outcome of the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests
Ted Yu created HBASE-21261: -- Summary: Add log4j.properties for hbase-rsgroup tests Key: HBASE-21261 URL: https://issues.apache.org/jira/browse/HBASE-21261 Project: HBase Issue Type: Test Reporter: Ted Yu When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log. It turns out that under hbase-rsgroup/src/test/resources there is no log4j.properties file. This issue adds log4j.properties for hbase-rsgroup tests. This would be useful when finding the root cause of hbase-rsgroup test failure(s). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
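For illustration, a minimal log4j.properties of the kind such a change would add under hbase-rsgroup/src/test/resources might look like the following. The levels and pattern are illustrative, not necessarily what the patch contains:

```properties
# Route everything at DEBUG and above to the console, where surefire captures it.
log4j.rootLogger=DEBUG,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n
```

Without such a file on the test classpath, log4j falls back to its defaults and DEBUG output from the module's tests is silently dropped.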
[jira] [Reopened] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21207: > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8 > > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 21207.branch-1.addendum.patch, 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, > HBASE-21207-branch-1.patch, HBASE-21207-branch-1.v1.patch, > HBASE-21207-branch-2.v1.patch, HBASE-21207.patch, HBASE-21207.patch, > HBASE-21207.v1.patch, edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In Master UI, we can see region server details like requests per seconds and > number of regions etc. Similarly, for tables also we can see online regions , > offline regions. > It will help ops people in determining hot spotting if we can provide sort > functionality in the UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups
Ted Yu created HBASE-21258: -- Summary: Add resetting of flags for RS Group pre/post hooks in TestRSGroups Key: HBASE-21258 URL: https://issues.apache.org/jira/browse/HBASE-21258 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Over HBASE-20627, [~xucang] reminded me that the resetting of flags for RS Group pre/post hooks in TestRSGroups was absent. This issue is to add the resetting of these flags before each subtest starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: QA run unable to process patches
From https://builds.apache.org/job/PreCommit-HBASE-Build/14515/console : *22:32:24* [Thu Sep 27 22:32:24 UTC 2018 DEBUG]: jira_http_fetch: https://issues.apache.org/jira/browse/HBASE-21247 returned 4xx status code. Maybe incorrect username/password?*22:32:24* [Thu Sep 27 22:32:24 UTC 2018 DEBUG]: jira_locate_patch: not a JIRA. FYI On Thu, Sep 27, 2018 at 3:32 PM Sean Busbey wrote: > Try rebuilding with the debug flag on. > > On Thu, Sep 27, 2018, 14:17 Ted Yu wrote: > > > Hi, > > Starting this morning, some QA bot runs ended with something similar to > the > > following ( > > https://builds.apache.org/job/PreCommit-HBASE-Build/14508/console > > ): > > > > *05:00:34* ERROR: Unsure how to process HBASE-21242. > > > > > > I wonder if someone has an idea where I should look in order to determine > > the root cause. > > > > > > Thanks > > >
QA run unable to process patches
Hi, Starting this morning, some QA bot runs ended with something similar to the following (https://builds.apache.org/job/PreCommit-HBASE-Build/14508/console ): *05:00:34* ERROR: Unsure how to process HBASE-21242. I wonder if someone has an idea where I should look in order to determine the root cause. Thanks
[jira] [Created] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers
Ted Yu created HBASE-21247: -- Summary: Allow WAL Provider to be specified by configuration without explicit enum in Providers Key: HBASE-21247 URL: https://issues.apache.org/jira/browse/HBASE-21247 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 21247.v1.txt Currently all the WAL Providers acceptable to hbase are specified in the Providers enum of WALFactory. This restricts the ability to supply additional WAL Providers by class name. This issue introduces an additional config which allows a new WAL Provider to be specified through its class name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
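The idea above can be sketched in a few lines: if the configured value is not a known enum constant, treat it as a fully qualified class name and instantiate it reflectively. This is an illustrative standalone sketch, not HBase's actual WALFactory code; the `Provider` interface and the "wal.provider" key are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of resolving a provider implementation by class name from
// configuration (assumed names; not the HBASE-21247 patch itself).
public class ProviderLookup {

  // Hypothetical provider contract standing in for WALProvider.
  public interface Provider {
    String name();
  }

  public static class DefaultProvider implements Provider {
    @Override public String name() { return "default"; }
  }

  public static Provider createProvider(Map<String, String> conf) {
    String value =
        conf.getOrDefault("wal.provider", DefaultProvider.class.getName());
    try {
      // Treat the configured value as a fully qualified class name: any
      // class on the classpath implementing Provider can be plugged in.
      Class<?> clazz = Class.forName(value);
      return (Provider) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot instantiate provider " + value, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(createProvider(new HashMap<>()).name()); // prints "default"
  }
}
```

The enum lookup can stay as a fast path for the built-in names, with the reflective path as a fallback for everything else.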
[jira] [Created] (HBASE-21246) Introduce WALIdentity interface
Ted Yu created HBASE-21246: -- Summary: Introduce WALIdentity interface Key: HBASE-21246 URL: https://issues.apache.org/jira/browse/HBASE-21246 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu We are introducing the WALIdentity interface so that the WAL representation can be decoupled from the distributed filesystem. The interface provides a getName method whose return value can represent a filename in a distributed filesystem environment, or the name of the stream when the WAL is backed by a log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
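A sketch of the kind of abstraction the description suggests; the real WALIdentity interface in the patch may differ. The point is that callers only see getName(), regardless of whether the WAL is a file on a distributed filesystem or a log stream:

```java
// Illustrative sketch of a WAL identity abstraction (assumed shape,
// not the actual HBASE-21246 interface).
public class WalIdentityDemo {

  public interface WALIdentity {
    /** Filename for filesystem-backed WALs, or the stream name for stream-backed WALs. */
    String getName();
  }

  // Filesystem-backed identity: wraps a path string.
  public static class FSWALIdentity implements WALIdentity {
    private final String path;
    public FSWALIdentity(String path) { this.path = path; }
    @Override public String getName() { return path; }
  }

  public static void main(String[] args) {
    WALIdentity id = new FSWALIdentity("/hbase/WALs/rs1/wal.1537000000000");
    System.out.println(id.getName());
  }
}
```

A stream-backed implementation would wrap a stream name instead of a path, with no change visible to callers.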
[jira] [Created] (HBASE-21238) MapReduceHFileSplitterJob#run shouldn't call System.exit
Ted Yu created HBASE-21238: -- Summary: MapReduceHFileSplitterJob#run shouldn't call System.exit Key: HBASE-21238 URL: https://issues.apache.org/jira/browse/HBASE-21238 Project: HBase Issue Type: Bug Reporter: Ted Yu {code} if (args.length < 2) { usage("Wrong number of arguments: " + args.length); System.exit(-1); {code} The correct way of handling this error condition is through the return value of the run method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
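A minimal sketch of the suggested shape (not the actual patch): the argument check returns an error code instead of killing the JVM, so an embedding caller such as ToolRunner or a unit test can observe the failure. System.exit inside run() would terminate the caller's JVM before it ever sees the return value.

```java
// Sketch of returning an error code from a run()-style method instead of
// calling System.exit. "usage" is modeled by a simple stderr print here.
public class ArgCheck {

  public static int run(String[] args) {
    if (args.length < 2) {
      System.err.println("Wrong number of arguments: " + args.length);
      return -1;   // let the driver decide how to react
    }
    return 0;      // success
  }

  public static void main(String[] args) {
    // Exit only at the outermost entry point, never inside run().
    System.exit(run(args));
  }
}
```

This mirrors the Hadoop Tool/ToolRunner convention, where run()'s int result becomes the process exit code only at the very top of the call chain.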
[jira] [Created] (HBASE-21230) BackupUtils#checkTargetDir doesn't compose error message correctly
Ted Yu created HBASE-21230: -- Summary: BackupUtils#checkTargetDir doesn't compose error message correctly Key: HBASE-21230 URL: https://issues.apache.org/jira/browse/HBASE-21230 Project: HBase Issue Type: Bug Components: backuprestore Reporter: Ted Yu Here is related code: {code} String expMsg = e.getMessage(); String newMsg = null; if (expMsg.contains("No FileSystem for scheme")) { newMsg = "Unsupported filesystem scheme found in the backup target url. Error Message: " + newMsg; {code} I think the intention was to concatenate expMsg at the end of newMsg. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
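The bug in HBASE-21230 above is that the concatenation appends newMsg, which is still null at that point, producing a message ending in "null". A minimal standalone reproduction of the intended logic (simplified; not the actual patch):

```java
// The original code did: newMsg = "..." + newMsg;  (newMsg was null)
// The fix appends the original exception message instead.
public class BackupMessage {

  public static String compose(String expMsg) {
    if (expMsg != null && expMsg.contains("No FileSystem for scheme")) {
      return "Unsupported filesystem scheme found in the backup target url."
          + " Error Message: " + expMsg;   // append expMsg, not newMsg
    }
    return expMsg;
  }
}
```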
[jira] [Resolved] (HBASE-16627) AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table exists
[ https://issues.apache.org/jira/browse/HBASE-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-16627. Resolution: Later > AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table > exists > > > Key: HBASE-16627 > URL: https://issues.apache.org/jira/browse/HBASE-16627 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu >Assignee: Stephen Yuan Jiang >Priority: Minor > > [~stack] first reported this issue when he played with backup feature. > The following exception can be observed in backup unit tests: > {code} > 2016-09-13 16:21:57,661 ERROR [ProcedureExecutor-3] > master.TableStateManager(134): Unable to get table hbase:backup state > org.apache.hadoop.hbase.TableNotFoundException: hbase:backup > at > org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:174) > at > org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:131) > at > org.apache.hadoop.hbase.master.AssignmentManager.isDisabledorDisablingRegionInRIT(AssignmentManager.java:1221) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:739) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1567) > at > org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1546) > at > org.apache.hadoop.hbase.util.ModifyRegionUtils.assignRegions(ModifyRegionUtils.java:254) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.assignRegions(CreateTableProcedure.java:430) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:127) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:57) > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119) > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:452) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1066) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:855) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:808) > {code} > AssignmentManager#isDisabledorDisablingRegionInRIT should take table > existence into account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations
Ted Yu created HBASE-21221: -- Summary: Ineffective assertion in TestFromClientSide3#testMultiRowMutations Key: HBASE-21221 URL: https://issues.apache.org/jira/browse/HBASE-21221 Project: HBase Issue Type: Test Reporter: Ted Yu Observed the following in org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt : {code} Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Timed out waiting for lock for row: ROW-1 in region 089bdfa75f44d88e596479038a6da18b at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816) at org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008) at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982) at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424) at org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116) at org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266) at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463) ... 
Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp should fail because the target lock is blocked by previous put at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) {code} Here is related code: {code} cpService.execute(() -> { ... if (!threw) { // Can't call fail() earlier because the catch would eat it. fail("This cp should fail because the target lock is blocked by previous put"); } {code} Since the fail() call is executed by the cpService, the assertion had no bearing on the outcome of the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
Ted Yu created HBASE-21216: -- Summary: TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky Key: HBASE-21216 URL: https://issues.apache.org/jira/browse/HBASE-21216 Project: HBase Issue Type: Test Reporter: Ted Yu From https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/ : {code} java.lang.AssertionError: Archived hfiles [] and table hfiles [9ca09392705f425f9c916beedc10d63c] is missing snapshot file:6739a09747e54189a4112a6d8f37e894 at org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370) {code} The file appeared in the archive dir before hfile cleaners were run: {code} 2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/ 2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): |data/ 2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): |---default/ 2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): |--test/ 2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): |-1237d57b63a7bdf067a930441a02514a/ 2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): |recovered.edits/ 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): |---4.seqid 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): |-29e1700e09b51223ad2f5811105a4d51/ 2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): |fam/ 2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): |---2c66a18f6c1a4074b84ffbb3245268c4 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---45bb396c6a5e49629e45a4d56f1e9b14 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---6739a09747e54189a4112a6d8f37e894 {code} However, the archive dir became empty after hfile cleaners were 
run: {code} 2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/ 2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): |-corrupt/ {code} Leading to the assertion failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21198) Exclude dependency on net.minidev:json-smart
Ted Yu created HBASE-21198: -- Summary: Exclude dependency on net.minidev:json-smart Key: HBASE-21198 URL: https://issues.apache.org/jira/browse/HBASE-21198 Project: HBase Issue Type: Task Reporter: Ted Yu From https://builds.apache.org/job/PreCommit-HBASE-Build/14414/artifact/patchprocess/patch-javac-3.0.0.txt : {code} [ERROR] Failed to execute goal on project hbase-common: Could not resolve dependencies for project org.apache.hbase:hbase-common:jar:3.0.0-SNAPSHOT: Failed to collect dependencies at org.apache.hadoop:hadoop-common:jar:3.0.0 -> org.apache.hadoop:hadoop-auth:jar:3.0.0 -> com.nimbusds:nimbus-jose-jwt:jar:4.41.1 -> net.minidev:json-smart:jar:2.3-SNAPSHOT: Failed to read artifact descriptor for net.minidev:json-smart:jar:2.3-SNAPSHOT: Could not transfer artifact net.minidev:json-smart:pom:2.3-SNAPSHOT from/to dynamodb-local-oregon (https://s3-us-west-2.amazonaws.com/dynamodb-local/release): Access denied to: https://s3-us-west-2.amazonaws.com/dynamodb-local/release/net/minidev/json-smart/2.3-SNAPSHOT/json-smart-2.3-SNAPSHOT.pom , ReasonPhrase:Forbidden. -> [Help 1] {code} We should exclude the dependency on net.minidev:json-smart. hbase-common/bin/pom.xml has done so; the other pom.xml files should do the same. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
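Concretely, the kind of exclusion being described looks like this in a pom.xml. This is a sketch only: the dependency that actually pulls in json-smart transitively (per the error above, via hadoop-common -> hadoop-auth -> nimbus-jose-jwt) and the version handling may differ per module:

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>net.minidev</groupId>
      <artifactId>json-smart</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

The exclusion stops Maven from trying to resolve the broken 2.3-SNAPSHOT transitive artifact at all.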
[jira] [Created] (HBASE-21194) Add TestCopyTable which exercises MOB feature
Ted Yu created HBASE-21194: -- Summary: Add TestCopyTable which exercises MOB feature Key: HBASE-21194 URL: https://issues.apache.org/jira/browse/HBASE-21194 Project: HBase Issue Type: Test Reporter: Ted Yu Currently TestCopyTable doesn't cover table(s) with the MOB feature enabled. We should add a variant that enables MOB on the table being copied and verify that the MOB content is copied correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module
Ted Yu created HBASE-21180: -- Summary: findbugs incurs DataflowAnalysisException for hbase-server module Key: HBASE-21180 URL: https://issues.apache.org/jira/browse/HBASE-21180 Project: HBase Issue Type: Task Reporter: Ted Yu Running findbugs, I noticed the following in hbase-server module: {code} [INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server --- [INFO] Fork Value is true [java] The following errors occurred during analysis: [java] Error generating derefs for org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get position -1 of stack [java] At edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250) [java] At edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743) [java] At edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141) [java] At edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50) [java] At edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31) [java] At edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369) [java] At edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322) [java] At edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005) [java] At edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325) [java] At edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510) [java] At edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361) [java] At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266) [java] At edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164) [java] At edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278) [java] At edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209) [java] At edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76) [java] At edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089) [java] At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283) [java] At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393) [java] At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200) [java] The following classes needed for analysis were missing: [java] accept [java] apply [java] run [java] test [java] call [java] exec [java] getAsInt [java] applyAsLong [java] storeFile [java] get [java] visit [java] compare {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving
Ted Yu created HBASE-21175: -- Summary: Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving Key: HBASE-21175 URL: https://issues.apache.org/jira/browse/HBASE-21175 Project: HBase Issue Type: Test Reporter: Ted Yu TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the test. When SnapshotHFileCleaner.init() is called, there is no master parameter passed in {{params}}. When the chore runs the cleaner during the test, an NPE comes out of this line in getDeletableFiles(): {code} return cache.getUnreferencedFiles(files, master.getSnapshotManager()); {code} since master is null. We should either check for the null master or pass the master instance properly when constructing the cleaner instance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
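A sketch of the null-check option, simplified to stand alone (the names mirror the snippet above but this is not the actual cleaner code). Being conservative when the cleaner is only partially initialized means reporting nothing as deletable, so no file is removed by mistake:

```java
import java.util.Collections;
import java.util.List;

// Guard against a partially initialized cleaner: with no master instance,
// report no deletable files rather than dereference null.
public class CleanerGuard {

  public static List<String> getDeletableFiles(List<String> files, Object master) {
    if (master == null) {
      // Partially initialized (e.g. constructed directly in a test):
      // be conservative and delete nothing.
      return Collections.emptyList();
    }
    // The real code would consult master.getSnapshotManager() here.
    return files;
  }
}
```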
[jira] [Resolved] (HBASE-21129) Clean up duplicate codes in #equals and #hashCode methods of Filter
[ https://issues.apache.org/jira/browse/HBASE-21129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-21129. Resolution: Fixed > Clean up duplicate codes in #equals and #hashCode methods of Filter > --- > > Key: HBASE-21129 > URL: https://issues.apache.org/jira/browse/HBASE-21129 > Project: HBase > Issue Type: Improvement > Components: Filters >Affects Versions: 3.0.0, 2.2.0 >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 3.0.0, 2.2.0 > > Attachments: 21129.addendum, HBASE-21129.master.001.patch, > HBASE-21129.master.002.patch, HBASE-21129.master.003.patch, > HBASE-21129.master.004.patch, HBASE-21129.master.005.patch, > HBASE-21129.master.006.patch, HBASE-21129.master.007.patch, > HBASE-21129.master.008.patch > > > It is a follow-up of HBASE-19008, aiming to clean up duplicate codes in > #equals and #hashCode methods. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored
Ted Yu created HBASE-21160: -- Summary: Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored Key: HBASE-21160 URL: https://issues.apache.org/jira/browse/HBASE-21160 Project: HBase Issue Type: Test Reporter: Ted Yu From https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt (HBASE-21138 QA run): {code} [WARNING] /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25] [AssertionFailureIgnored] This assertion throws an AssertionError if it fails, which will be caught by an enclosing try block. {code} Here is the related code: {code} PrivilegedExceptionAction scanAction = new PrivilegedExceptionAction() { @Override public Void run() throws Exception { try (Connection connection = ConnectionFactory.createConnection(conf); ... assertEquals(1, next.length); } catch (Throwable t) { throw new IOException(t); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
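The warning means the catch (Throwable t) wraps the AssertionError from assertEquals in an IOException, so JUnit reports a generic error (or the failure is masked entirely) instead of the assertion. One common fix is to rethrow AssertionError before the generic catch. A simplified sketch of the pattern, not the actual patch:

```java
import java.io.IOException;

// Keep assertion failures visible: rethrow AssertionError untouched,
// wrap only non-assertion errors.
public class RethrowAssertion {

  public static void checkedRun(Runnable body) throws IOException {
    try {
      body.run();
    } catch (AssertionError e) {
      throw e;                   // let the test framework see the real failure
    } catch (Throwable t) {
      throw new IOException(t);  // wrap everything else, as before
    }
  }

  /** Demo helper: reports which kind of error escapes checkedRun. */
  public static String propagatedType(Runnable body) {
    try {
      checkedRun(body);
      return "none";
    } catch (AssertionError e) {
      return "assertion";
    } catch (IOException e) {
      return "io";
    }
  }

  public static void main(String[] args) {
    System.out.println(propagatedType(() -> { throw new AssertionError(); }));
  }
}
```

The alternative is to move the assertion outside the try block entirely, so no enclosing catch can swallow it.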
[jira] [Reopened] (HBASE-21150) Avoid delay in first flushes due to overheads in table metrics registration
[ https://issues.apache.org/jira/browse/HBASE-21150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-21150: I didn't open this issue for backporting. HBASE-15728 is still in master and the delay in first flushes is still there. > Avoid delay in first flushes due to overheads in table metrics registration > --- > > Key: HBASE-21150 > URL: https://issues.apache.org/jira/browse/HBASE-21150 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu > Assignee: Ted Yu >Priority: Major > Attachments: 21150.v1.txt, 21150.v2.txt, 21150.v3.txt > > > After HBASE-15728 is integrated, the lazy table metrics registration results > in penalty for the first flushes. > Excerpt from log shows delay (note the same timestamp 08:18:23,234) : > {code:java} > 2018-09-02 08:18:23,232 DEBUG > [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] > regionserver.MetricsTableSourceImpl(124): Creating new > MetricsTableSourceImpl for table 'testtb-1535901500805' > 2018-09-02 08:18:23,233 DEBUG > [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] > regionserver.MetricsTableSourceImpl(137): registering metrics for testtb- > 1535901500805 > 2018-09-02 08:18:23,234 INFO > [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] > regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343, > heapSize ~5.16 KB/5280, currentSize=0 B/0 for > fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction > requested=false > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] > regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup] > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] > regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to 
register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup] > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] > regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup] > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] > regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup] > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] > regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup] > 2018-09-02 08:18:23,234 DEBUG > [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] > regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register > testtb-1535901500805 > Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup] > {code} > This is a regression. > When first region of the table is opened on region server, we can proactively > register table metrics. > This would avoid the penalty on first flushes for the table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21150) Avoid delay in first flushes due to contention in table metrics registration
Ted Yu created HBASE-21150: -- Summary: Avoid delay in first flushes due to contention in table metrics registration Key: HBASE-21150 URL: https://issues.apache.org/jira/browse/HBASE-21150 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu After HBASE-15728 is integrated, the lazy table metrics registration results in penalty for the first flushes. Excerpt from log shows delay (note the same timestamp 08:18:23,234) : {code} 2018-09-02 08:18:23,232 DEBUG [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] regionserver.MetricsTableSourceImpl(124): Creating new MetricsTableSourceImpl for table 'testtb-1535901500805' 2018-09-02 08:18:23,233 DEBUG [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] regionserver.MetricsTableSourceImpl(137): registering metrics for testtb- 1535901500805 2018-09-02 08:18:23,234 INFO [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343, heapSize ~5.16 KB/5280, currentSize=0 B/0 for fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction requested=false 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register testtb-1535901500805 Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup] 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register testtb-1535901500805 Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup] 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup] 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register testtb-1535901500805 Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup] 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register testtb-1535901500805 Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup] 2018-09-02 08:18:23,234 DEBUG [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register testtb-1535901500805 Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup] {code} This is a regression. When first region of the table is opened on region server, we can proactively register table metrics. This would avoid the penalty on first flushes for the table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
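The proactive-registration idea above can be sketched in a self-contained way. `EagerMetricsSketch`, `onRegionOpen`, and `metricsReady` are hypothetical stand-ins for the region-open and flush paths, not actual HBase APIs; this is only an illustration of moving the registration cost off the first flush, under those naming assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposal: pay the metrics-registration cost
// when the first region of a table opens, so the first flush finds the
// source already registered and is not delayed.
class EagerMetricsSketch {
    private final Map<String, Object> tableMetrics = new ConcurrentHashMap<>();

    // Called from the region-open path: registration happens here,
    // off the flush path.
    void onRegionOpen(String table) {
        tableMetrics.computeIfAbsent(table, t -> new Object() /* register the source */);
    }

    // Called from the flush path: now a cheap lookup, not a registration.
    boolean metricsReady(String table) {
        return tableMetrics.containsKey(table);
    }
}
```

With this shape, the several-millisecond registration seen at 08:18:23,234 would happen once at open time instead of inside the flush.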
[jira] [Created] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure
Ted Yu created HBASE-21149: -- Summary: TestIncrementalBackupWithBulkLoad may fail due to file copy failure Key: HBASE-21149 URL: https://issues.apache.org/jira/browse/HBASE-21149 Project: HBase Issue Type: Test Components: backuprestore Reporter: Ted Yu >From >https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/ > : {code} 2018-09-03 11:54:30,526 ERROR [Time-limited test] impl.TableBackupClient(235): Unexpected Exception : Failed copy from hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ to hdfs://localhost:53075/backupUT/backup_1535975655488 java.io.IOException: Failed copy from hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ to hdfs://localhost:53075/backupUT/backup_1535975655488 at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351) at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219) at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198) at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320) at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605) at org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) {code} However, some part of the test output was lost: {code} 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions ...[truncated 398396 chars]... 8) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21141) Enable MOB in backup / restore test involving incremental backup
Ted Yu created HBASE-21141: -- Summary: Enable MOB in backup / restore test involving incremental backup Key: HBASE-21141 URL: https://issues.apache.org/jira/browse/HBASE-21141 Project: HBase Issue Type: Test Components: backuprestore Reporter: Ted Yu Currently we only have one test (TestRemoteBackup) where the MOB feature is enabled, and that test only performs a full backup. This issue is to enable MOB in backup / restore test(s) involving incremental backup. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21139) Concurrent invocations of MetricsTableAggregateSourceImpl.getOrCreateTableSource may return unregistered MetricsTableSource
Ted Yu created HBASE-21139: -- Summary: Concurrent invocations of MetricsTableAggregateSourceImpl.getOrCreateTableSource may return unregistered MetricsTableSource Key: HBASE-21139 URL: https://issues.apache.org/jira/browse/HBASE-21139 Project: HBase Issue Type: Bug Reporter: Ted Yu >From test output of TestRestoreFlushSnapshotFromClient : {code} 2018-09-01 21:09:38,174 WARN [member: 'hw13463.attlocal.net,49623,1535861370108' subprocedure-pool6-thread-1] snapshot. RegionServerSnapshotManager$SnapshotSubprocedurePool(348): Got Exception in SnapshotSubprocedurePool java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:324) at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:173) at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:193) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:189) at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:53) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.MetricsTableSourceImpl.updateFlushTime(MetricsTableSourceImpl.java:375) at org.apache.hadoop.hbase.regionserver.MetricsTable.updateFlushTime(MetricsTable.java:56) at org.apache.hadoop.hbase.regionserver.MetricsRegionServer.updateFlush(MetricsRegionServer.java:210) at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2826) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2444) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2416) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2306) at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2209) at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:115) at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:77) {code} In MetricsTableAggregateSourceImpl.getOrCreateTableSource : {code} MetricsTableSource prev = tableSources.putIfAbsent(table, source); if (prev != null) { return prev; } else { // register the new metrics now register(source); {code} Suppose threads t1 and t2 execute the above code concurrently. t1 calls putIfAbsent first and proceeds to running {{register(source)}}. Context switches, t2 gets to putIfAbsent and retrieves the instance stored by t1 which is not registered yet. We would end up with what the stack trace showed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
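The check-then-act gap described above can be closed by performing registration inside the map's atomic mapping function. Below is a minimal, self-contained sketch of that approach; `TableSource` and `TableSourceCache` are hypothetical stand-ins, not the real HBase types, and this is one possible fix rather than the patch actually applied.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for MetricsTableSource; registration must be
// complete before any other thread can see the instance.
class TableSource {
    private volatile boolean registered = false;
    void register() { registered = true; }
    boolean isRegistered() { return registered; }
}

class TableSourceCache {
    private final Map<String, TableSource> tableSources = new ConcurrentHashMap<>();

    // computeIfAbsent runs the mapping function atomically per key: a second
    // thread asking for the same table blocks until the first thread's
    // function (including register()) has finished, so an unregistered
    // instance is never published, unlike the putIfAbsent-then-register
    // sequence quoted above.
    TableSource getOrCreateTableSource(String table) {
        return tableSources.computeIfAbsent(table, t -> {
            TableSource source = new TableSource();
            source.register();
            return source;
        });
    }
}
```

The key property is that the value only becomes visible in the map after `register()` returns, so the t2 thread in the scenario above can never observe an unregistered source.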
[jira] [Created] (HBASE-21138) Close HRegion instance at the end of every test in TestHRegion
Ted Yu created HBASE-21138: -- Summary: Close HRegion instance at the end of every test in TestHRegion Key: HBASE-21138 URL: https://issues.apache.org/jira/browse/HBASE-21138 Project: HBase Issue Type: Test Reporter: Ted Yu TestHRegion has over 100 tests. The following is from one subtest: {code} public void testCompactionAffectedByScanners() throws Exception { byte[] family = Bytes.toBytes("family"); this.region = initHRegion(tableName, method, CONF, family); {code} this.region is not closed at the end of the subtest. testToShowNPEOnRegionScannerReseek is another example. Every subtest should use the following construct toward the end: {code} } finally { HBaseTestingUtility.closeRegionAndWAL(this.region); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
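The suggested construct can be sketched generically. `Region` here is a hypothetical stand-in for the HRegion returned by `initHRegion`, and its `close()` stands in for `HBaseTestingUtility.closeRegionAndWAL(this.region)`; the point is only the try/finally shape.

```java
// Generic sketch of the suggested construct: the region opened at the top
// of the subtest is always closed at the end, even if an assertion in the
// body fails.
class Region implements AutoCloseable {
    boolean closed = false;
    @Override
    public void close() { closed = true; }
}

class SubtestSketch {
    // Returns the region purely so this sketch can observe that it was closed.
    static Region testCompactionAffectedByScanners() {
        Region region = new Region(); // stands in for this.region = initHRegion(...)
        try {
            // ... exercise compactions and scanners, make assertions ...
            return region;
        } finally {
            region.close(); // stands in for HBaseTestingUtility.closeRegionAndWAL(this.region)
        }
    }
}
```

The finally block runs after the return value is evaluated but before control leaves the method, so the region can never leak past the subtest.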
[jira] [Resolved] (HBASE-14783) Proc-V2: Master aborts when downgrading from 1.3 to 1.1
[ https://issues.apache.org/jira/browse/HBASE-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-14783. Resolution: Later > Proc-V2: Master aborts when downgrading from 1.3 to 1.1 > --- > > Key: HBASE-14783 > URL: https://issues.apache.org/jira/browse/HBASE-14783 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu >Assignee: Stephen Yuan Jiang >Priority: Major > > I was running ITBLL with 1.3 deployed on a 6 node cluster. > Then I stopped the cluster, deployed 1.1 release and tried to start cluster. > However, master failed to start due to: > {code} > 2015-11-06 00:58:40,351 FATAL [eval-test-2:2.activeMasterManager] > master.HMaster: Failed to become active master > java.io.IOException: The procedure class > org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure must be > accessible and have an empty constructor > at > org.apache.hadoop.hbase.procedure2.Procedure.newInstance(Procedure.java:548) > at org.apache.hadoop.hbase.procedure2.Procedure.convert(Procedure.java:640) > at > org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader.read(ProcedureWALFormatReader.java:105) > at > org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.load(ProcedureWALFormat.java:82) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.load(WALProcedureStore.java:298) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:275) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:434) > at > org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1208) > at > org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1107) > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:694) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:186) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1713) > at 
java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:425) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:191) > at > org.apache.hadoop.hbase.procedure2.Procedure.newInstance(Procedure.java:536) > ... 12 more > {code} > The cause was that ServerCrashProcedure, written in some WAL file under > MasterProcWALs from first run, was absent in 1.1 release. > After a brief discussion with Stephen, I am logging this JIRA to solicit > discussion on how customer experience can be improved if downgrade of hbase > is performed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-14716) Detection of orphaned table znode should cover table in Enabled state
[ https://issues.apache.org/jira/browse/HBASE-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-14716. Resolution: Later > Detection of orphaned table znode should cover table in Enabled state > - > > Key: HBASE-14716 > URL: https://issues.apache.org/jira/browse/HBASE-14716 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu > Assignee: Ted Yu >Priority: Major > Labels: hbck > Attachments: 14716-branch-1-v1.txt, 14716.branch-1.v4.txt > > > HBASE-12070 introduced fix for orphaned table znode where table doesn't have > entry in hbase:meta > When Stephen and I investigated rolling upgrade failure, > {code} > 2015-10-27 18:21:10,668 WARN [ProcedureExecutorThread-3] > procedure.CreateTableProcedure: The table smoketest does not exist in meta > but has a znode. run hbck to fix inconsistencies. > {code} > we found that the orphaned table znode corresponded to table in Enabled state. > Therefore running hbck didn't report the inconsistency. > Detection for orphaned table znode should cover this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
Ted Yu created HBASE-21097: -- Summary: Flush pressure assertion may fail in testFlushThroughputTuning Key: HBASE-21097 URL: https://issues.apache.org/jira/browse/HBASE-21097 Project: HBase Issue Type: Test Reporter: Ted Yu From https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt : {code} [ERROR] testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) Time elapsed: 17.446 s <<< FAILURE! java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> at org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) {code} Here is the related assertion: {code} assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); {code} where EPSILON = 1E-6. In the above case, due to a margin of 2.9E-7, the assertion didn't pass. The epsilon could be adjusted to accommodate different workload / hardware combinations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
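JUnit's three-argument assertEquals reduces to the absolute-difference check sketched below. The wider tolerance of 1E-5 used in the comments is only an assumed illustration of loosening the epsilon, not a value taken from the issue.

```java
// JUnit's assertEquals(expected, actual, epsilon) passes exactly when
// |expected - actual| <= epsilon. With EPSILON = 1E-6 the observed flush
// pressure of ~1.29E-6 fails by ~2.9E-7; an assumed wider tolerance such
// as 1E-5 would absorb that kind of scheduling jitter.
class FlushPressureCheck {
    static boolean withinEpsilon(double expected, double actual, double epsilon) {
        return Math.abs(expected - actual) <= epsilon;
    }
}
```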
Re: [DISCUSS] Minimum Maven Version
I would choose #1 bq. if we ever try to backport the hbase-spark module. I doubt this would ever happen for the 2.x releases. Cheers On Wed, Aug 22, 2018 at 9:26 AM Mike Drob wrote: > Hi Devs, > > Our current minimum maven version is 3.0.4, this is both enforced by the > enforcer plugin and documented in the ref guide. Over on HBASE-20175, our > Artem is suggesting to use a newer version of the scala-maven-plugin but > the latest version requires Maven 3.5.3 > > It looks like we have a couple of options that I want to get feedback on: > > 1) Bump our minimum maven version for master branch. We can leave it at > 3.0.4 for branch-1 and branch-2, but this would come up again if we ever > try to backport the hbase-spark module. > > 2) Engage with the scala plugin community to try and get the plugin to work > with older maven versions. I haven't done any feasibility study on this > yet, and am not even sure which community we would be talking to. > > 3) See if the specific issues we are running into are solved by older > versions of the plugin that are compatible with older versions of maven. > > 4) Do some transitive dependency exclusion magic instead of actually > harmonizing the versions of things that we use. > > I'm leaning towards 1) or 4), but would be interested to hear thoughts from > other parties. > > Mike >
[jira] [Created] (HBASE-21088) HStoreFile should be closed in HStore#hasReferences
Ted Yu created HBASE-21088: -- Summary: HStoreFile should be closed in HStore#hasReferences Key: HBASE-21088 URL: https://issues.apache.org/jira/browse/HBASE-21088 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu {code} reloadedStoreFiles = loadStoreFiles(); return StoreUtils.hasReferences(reloadedStoreFiles); {code} The intention of obtaining the HStoreFiles is to check for references. The loaded HStoreFiles should be closed prior to return to prevent a leak. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
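A hedged sketch of that fix, using hypothetical stand-in types (`StoreFile` here is not the real HStoreFile): compute the reference check first, then close every reloaded file in a finally block so neither the normal nor the exceptional path leaks readers.

```java
import java.util.List;

// Hypothetical sketch of the fix; StoreFile stands in for HStoreFile.
// The answer is computed from the freshly loaded files, then each file is
// closed in the finally block before hasReferences returns.
class StoreFile implements AutoCloseable {
    final boolean reference;
    boolean closed = false;
    StoreFile(boolean reference) { this.reference = reference; }
    @Override
    public void close() { closed = true; }
}

class StoreSketch {
    static boolean hasReferences(List<StoreFile> reloadedStoreFiles) {
        try {
            // stands in for StoreUtils.hasReferences(reloadedStoreFiles)
            return reloadedStoreFiles.stream().anyMatch(f -> f.reference);
        } finally {
            for (StoreFile f : reloadedStoreFiles) {
                f.close(); // close each loaded file prior to return
            }
        }
    }
}
```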
[jira] [Created] (HBASE-21076) TestTableResource fails with NPE
Ted Yu created HBASE-21076: -- Summary: TestTableResource fails with NPE Key: HBASE-21076 URL: https://issues.apache.org/jira/browse/HBASE-21076 Project: HBase Issue Type: Test Reporter: Ted Yu The following can be observed in master branch: {code} java.lang.NullPointerException at org.apache.hadoop.hbase.rest.TestTableResource.setUpBeforeClass(TestTableResource.java:134) {code} The NPE comes from the following in TestEndToEndSplitTransaction : {code} compactAndBlockUntilDone(TEST_UTIL.getAdmin(), TEST_UTIL.getMiniHBaseCluster().getRegionServer(0), daughterA.getRegionName()); {code} Initial check of the code shows that TestEndToEndSplitTransaction uses a TEST_UTIL instance created within TestEndToEndSplitTransaction, while TestTableResource creates its own instance of HBaseTestingUtility. This means TEST_UTIL.getMiniHBaseCluster() returns null, since the instance created by TestEndToEndSplitTransaction has hbaseCluster as null. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21042) processor.getRowsToLock() always assumes there is some row being locked in HRegion#processRowsWithLocks
Ted Yu created HBASE-21042: -- Summary: processor.getRowsToLock() always assumes there is some row being locked in HRegion#processRowsWithLocks Key: HBASE-21042 URL: https://issues.apache.org/jira/browse/HBASE-21042 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu [~tdsilva] reported at the tail of HBASE-18998 that the fix for HBASE-18998 missed the finally block of HRegion#processRowsWithLocks. This issue is to fix that remaining call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
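One hedged rendering of the needed guard, under assumed names (`CleanupSketch` and `releaseLocks` are hypothetical, not the actual HRegion code): cleanup in the finally block must tolerate a processor that locked no rows at all.

```java
import java.util.Collection;

// Hypothetical sketch of the remaining fix: the finally block of
// processRowsWithLocks must not assume processor.getRowsToLock()
// returned anything.
class CleanupSketch {
    static int releaseLocks(Collection<String> rowsToLock) {
        int released = 0;
        // Guard before touching the collection: empty (or absent) simply
        // means there is nothing to release.
        if (rowsToLock != null && !rowsToLock.isEmpty()) {
            for (String row : rowsToLock) {
                released++; // stands in for releasing the lock held on `row`
            }
        }
        return released;
    }
}
```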
[jira] [Created] (HBASE-21040) printStackTrace() is used in RestoreDriver in case Exception is caught
Ted Yu created HBASE-21040: -- Summary: printStackTrace() is used in RestoreDriver in case Exception is caught Key: HBASE-21040 URL: https://issues.apache.org/jira/browse/HBASE-21040 Project: HBase Issue Type: Bug Reporter: Ted Yu Here is the related code: {code} } catch (Exception e) { e.printStackTrace(); {code} The correct way of logging a stack trace is to use the Logger instance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
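A sketch of the suggested change, using the JDK's built-in logger only to stay self-contained (RestoreDriver itself would use the project's configured Logger): passing the exception object to the logging call routes the stack trace through the configured log destinations instead of raw stderr.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the fix: the caught exception is handed to the logger so its
// stack trace is captured by the logging framework, with a log level and
// message, rather than printed directly by printStackTrace().
class RestoreSketch {
    static final Logger LOG = Logger.getLogger(RestoreSketch.class.getName());

    static void run(Runnable body) {
        try {
            body.run();
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "Error while restoring", e); // instead of e.printStackTrace()
        }
    }
}
```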
Re: Hbase mutate is hogging my CPU
There have been several releases for Hbase 1.2 Which release are you using ? The images you sent didn't go through the mailing list. Please consider using third party site for delivery. Have you taken a look at what the server hosting hbase:meta was doing during this period of time ? Thanks On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger wrote: > Kafka Dev, > > > > I’d love some help investigating a slow Hbase mutator. > > > > The cluster is Hbase 1.2 and cluster has 22 region servers. The region > servers are pretty big: 24 cores, 126 GB RAM. > > > > The cluster has 2 tables, each only have 1 column family. Both tables have > the same pre splits. > > > > Each table is pre split into 400 regions. The split keys are all 2 bytes > and evenly divide the key space. > > > > The keys are 13 bytes. The key is formed by concatenating: > > 1 byte kafka partition > > 8 byte random int > > 4 byte timestamp (second level granularity) > > The workload is 100% write for now. There are about 1M writes per second > with a total data volume of .6GB per second. > > > > I find that my application is spending the majority of its CPU time > (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate > (), which is in turn spending most of its time calling > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion > (). > > > > Attached are two images showing the performance of my application. The > first is an overview showing that my application is spending a lot of in > mutate. The next is a deep dive into the functions that mutate is calling > internally. > > I am very surprised to see this function taking so long. My intuition is > that all this needs to do is: > 1) Determine which region the Mutation belongs in > 2) Append the Mutation to a queue for async write to HBase. > > > Any thoughts, comments of suggestions from the community would be much > appreciated! 
I’m really hoping to improve the performance profile here so > that my CPU can be freed up. > > > > Thanks, > > > > Mike Freyberger >
Re: [ANNOUNCE] New Committer: Toshihiro Suzuki
Congratulations, Toshihiro ! On Wed, Aug 1, 2018 at 7:47 AM Josh Elser wrote: > On behalf of the HBase PMC, I'm pleased to announce that Toshihiro > Suzuki (aka Toshi, brfn169) has accepted our invitation to become an > HBase committer. This was extended to Toshi as a result of his > consistent, high-quality contributions to HBase. Thanks for all of your > hard work, and we look forward to working with you even more! > > Please join me in extending a hearty "congrats" to Toshi! > > - Josh >
Re: May I take this issue --hbase-spark
bq. ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' The above implies dependency on some class from Hive. Which Hive release would you use if you choose the above route ? Looking forward to your demo. On Tue, Jul 31, 2018 at 9:09 AM bill.yunfu wrote: > hi Ted >Thank you for replying. > The sql support means user can directly use spark sql to create table and > query data from HBase. we found two sql support on HBase > SHC use following command to create table in spark sql: > CREATE TABLE spark_hbase USING > org.apache.spark.sql.execution.datasources.hbase > OPTIONS ('catalog'= > '{"table":{"namespace":"default", "name":"test", > "tableCoder":"PrimitiveType"},"rowkey":"key", > "columns":{ > "col0":{"cf":"rowkey", "col":"key", "type":"string"}, > "col1":{"cf":"cf", "col":"a", "type":"string"}}}' > ) > (SHC is a project can get details from: > https://github.com/hortonworks-spark/shc) > In spark sql also can use hive command to create table: > create table spark_hbase (col0 string, col1 string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' with > SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a") > STORED AS > INPUTFORMAT > 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat' > tblproperties ("hbase.table.name" = "test"); > > So we want make a similar DDL to create the table for hbase-spark model and > query with the spark sql. > > And for the Spark release, we suggestion first target at spark 2.y, for > example the spark 2.2.2 which is stability now. > > We will create a demo base on hbase-spark model with sql support in local, > then share here to discuss. > > Regards > Bill > > > Ted Yu-3 wrote > > For SQL support, can you be more specific on how the SQL support would be > > added ? > > > > Maybe you can illustrate some examples showing the enhanced SQL syntax. > > > > Also, which Spark release(s) would be targeted? 
> > > > Thanks > > > > On Mon, Jul 30, 2018 at 10:57 AM bill.yunfu > > > guangcheng.zgc@ > > > > > wrote: > > > > > > -- > Sent from: > http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html >
Re: Question regarding hbase-shell JRuby testing workflow
The flakiness of list_procedures_test.rb is probably related to the load on the node running the test, or other tests in hbase-shell module. I ran list_procedures_test.rb alone a few times which passed. Jack: You can include some other shell test(s) along with this test. You can also retrieve test output following the test runs performed here: https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html FYI On Tue, Jul 31, 2018 at 7:41 AM Josh Elser wrote: > I haven't ever tried to de-couple from Maven. The 'lowest' I ever got > was something like the following: > > 1. mvn clean install -DskipTests > 2. cd hbase-shell > 2. mvn package -Dtest=TestShell -Dshell.test.include=my_test_class.rb -o > > Hope this helps, Jack. I know it's not ideal -- if you do come up with > something that works at a lower level, I think we'd be very supportive > to get it doc'ed and keep it working :) > > On 7/30/18 11:16 PM, Jack Bearden wrote: > > Hey all! I was hacking hbase-shell and JRuby over the weekend and wanted > to > > get some feedback on workflow. My objective was to execute a single Ruby > > unit test in isolation from the TestShell.java class via the jruby > binary. > > I was able to accomplish this by doing the following steps: > > > > 1. Pulled down branch-2 > > 2. Installed and cleaned via maven at the base directory (mvn > > -Dmaven.javadoc.skip -DskipTests install) > > 3. Changed to the hbase-shell directory and exported the classpath (mvn > > dependency:build-classpath -Dmdep.outputFile=/path/to/cpath.txt) > > 4. Exported the path to that file to shell env (export > > TEST_PATH="/path/to/cpath.txt") > > 5. Hacked tests_runner.rb to just load("path/to/test") for the test I > > wanted to run > > 6. 
From the hbase-shell project directory ran the following: > > > > jruby \ > > -J-cp `cat $TEST_PATH` \ > > -d -w \ > > -I src/test/ruby \ > > -I src/main/ruby \ > > src/test/ruby/tests_runner.rb > > > > The problem is, is this only worked on *most* of the hbase-shell Ruby > > tests. The only way to get, for example, list_procedures_test.rb to work > > completely, was to run it from the TestShell.java file. When ran from the > > jruby binary, I get a "class not found" when > > org.apache.hadoop.hbase.client.procedure.ShellTestProcedure.new was being > > referenced. I can't figure out how to load this class adhoc and not > through > > what appears to be Maven magic. > > > > Any suggestions or better ideas on how to do this? > > >
[jira] [Created] (HBASE-20988) TestShell shouldn't be skipped for hbase-shell module test
Ted Yu created HBASE-20988: -- Summary: TestShell shouldn't be skipped for hbase-shell module test Key: HBASE-20988 URL: https://issues.apache.org/jira/browse/HBASE-20988 Project: HBase Issue Type: Test Reporter: Ted Yu Here is snippet for QA run 13862 for HBASE-20985 : {code} 13:42:50 cd /testptch/hbase/hbase-shell 13:42:50 /usr/share/maven/bin/mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hbase-master-patch-1 -DHBasePatchProcess -PrunAllTests -Dtest.exclude.pattern=**/master.normalizer. TestSimpleRegionNormalizerOnCluster.java,**/replication.regionserver.TestSerialReplicationEndpoint.java,**/master.procedure.TestServerCrashProcedure.java,**/master.procedure.TestCreateTableProcedure. java,**/TestClientOperationTimeout.java,**/client.TestSnapshotFromClientWithRegionReplicas.java,**/master.TestAssignmentManagerMetrics.java,**/client.TestShell.java,**/client. TestCloneSnapshotFromClientWithRegionReplicas.java,**/master.TestDLSFSHLog.java,**/replication.TestReplicationSmallTestsSync.java,**/master.procedure.TestModifyTableProcedure.java,**/regionserver. TestCompactionInDeadRegionServer.java,**/client.TestFromClientSide3.java,**/master.procedure.TestRestoreSnapshotProcedure.java,**/client.TestRestoreSnapshotFromClient.java,**/security.access. TestCoprocessorWhitelistMasterObserver.java,**/replication.regionserver.TestDrainReplicationQueuesForStandBy.java,**/master.procedure.TestProcedurePriority.java,**/master.locking.TestLockProcedure. java,**/master.cleaner.TestSnapshotFromMaster.java,**/master.assignment.TestSplitTableRegionProcedure.java,**/client.TestMobRestoreSnapshotFromClient.java,**/replication.TestReplicationKillSlaveRS. 
java,**/regionserver.TestHRegion.java,**/security.access.TestAccessController.java,**/master.procedure.TestTruncateTableProcedure.java,**/client.TestAsyncReplicationAdminApiWithClusters.java,**/ coprocessor.TestMetaTableMetrics.java,**/client.TestMobSnapshotCloneIndependence.java,**/namespace.TestNamespaceAuditor.java,**/master.TestMasterAbortAndRSGotKilled.java,**/client.TestAsyncTable.java,**/master.TestMasterOperationsForRegionReplicas.java,**/util.TestFromClientSide3WoUnsafe.java,**/client.TestSnapshotCloneIndependence.java,**/client.TestAsyncDecommissionAdminApi.java,**/client. TestRestoreSnapshotFromClientWithRegionReplicas.java,**/master.assignment.TestMasterAbortWhileMergingTable.java,**/client.TestFromClientSide.java,**/client.TestAdmin1.java,**/client. TestFromClientSideWithCoprocessor.java,**/replication.TestReplicationKillSlaveRSWithSeparateOldWALs.java,**/master.procedure.TestMasterFailoverWithProcedures.java,**/regionserver. TestSplitTransactionOnCluster.java clean test -fae > /testptch/patchprocess/patch-unit-hbase-shell.txt 2>&1 {code} In this case, there was modification to shell script, leading to running shell tests. However, TestShell was excluded in the QA run, defeating the purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Question regarding hbase-shell JRuby testing workflow
Have you tried sidelining other .rb files under hbase-shell//src/test/ruby/shell/ (keeping only hbase-shell//src/test/ruby/shell/list_procedures_test.rb) ? Cheers On Mon, Jul 30, 2018 at 8:29 PM Jack Bearden wrote: > Hey all! I was hacking hbase-shell and JRuby over the weekend and wanted to > get some feedback on workflow. My objective was to execute a single Ruby > unit test in isolation from the TestShell.java class via the jruby binary. > I was able to accomplish this by doing the following steps: > > 1. Pulled down branch-2 > 2. Installed and cleaned via maven at the base directory (mvn > -Dmaven.javadoc.skip -DskipTests install) > 3. Changed to the hbase-shell directory and exported the classpath (mvn > dependency:build-classpath -Dmdep.outputFile=/path/to/cpath.txt) > 4. Exported the path to that file to shell env (export > TEST_PATH="/path/to/cpath.txt") > 5. Hacked tests_runner.rb to just load("path/to/test") for the test I > wanted to run > 6. From the hbase-shell project directory ran the following: > > jruby \ > -J-cp `cat $TEST_PATH` \ > -d -w \ > -I src/test/ruby \ > -I src/main/ruby \ > src/test/ruby/tests_runner.rb > > The problem is, is this only worked on *most* of the hbase-shell Ruby > tests. The only way to get, for example, list_procedures_test.rb to work > completely, was to run it from the TestShell.java file. When ran from the > jruby binary, I get a "class not found" when > org.apache.hadoop.hbase.client.procedure.ShellTestProcedure.new was being > referenced. I can't figure out how to load this class adhoc and not through > what appears to be Maven magic. > > Any suggestions or better ideas on how to do this? >
Re: May I take this issue --hbase-spark
For SQL support, can you be more specific on how the SQL support would be added? Maybe you can illustrate some examples showing the enhanced SQL syntax. Also, which Spark release(s) would be targeted? Thanks On Mon, Jul 30, 2018 at 10:57 AM bill.yunfu wrote: > May I take this issue --hbase-spark > > Hi community, > I work on an HBase team that serves hundreds of customers. As the amount of > data in HBase grows, many customers want to analyze their data on HBase. For > example, they want to use Spark to run analyses that may query a lot of data > from HBase and may also join with other tables, which may be in HBase or in > Spark. > HBase alone does not support this scenario very well, so we plan to use Spark > to support it. > We found that Apache HBase already has a module called hbase-spark, but it > has not been updated recently and has never been formally released. There are > also other projects that support SQL on HBase, for example Hive on HBase, > which offers good SQL syntax support. > Although there are many Spark-on-HBase projects, none of them is widely known > among users. Because our customers have more and more demand for Spark on > HBase, we want to take up this issue. The initial goal is to build a > standard, well-known Spark on HBase in the Apache HBase community. > Our initial ideas are: > SQL support: the hbase-spark module currently cannot create tables via the > spark-sql command; we want it to support SQL commands, with syntax similar to > that of Hive on HBase or SHC. > Performance: this part is not very clear yet; the goal is for Spark SQL > queries over HBase data to perform well. > > We would like to get some suggestions from the community. Then I will raise a > JIRA to track it and post a design document. > > Best Regards > Bill > > > > > -- > Sent from: > http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html >
Re: I am a subscribe please add me thanks
Can you take a look at http://hbase.apache.org/mail-lists.html ? The first column gives you the email address for subscription. Cheers On Mon, Jul 30, 2018 at 7:51 AM 周广成(云覆) wrote: > hi > I sent this mail yesterday but still cannot post the topic to the > hbase dev mailing list; can you please help check that? Thank you very much. > > Regards > Bill > > > -- > From: 周广成(云覆) > Sent: Sunday, July 29, 2018 15:50 > To: hbase-dev > Subject: I am a subscribe please add me thanks > > I am a subscribe please add me thanks
[jira] [Created] (HBASE-20968) list_procedures_test fails due to no matching regex
Ted Yu created HBASE-20968: -- Summary: list_procedures_test fails due to no matching regex Key: HBASE-20968 URL: https://issues.apache.org/jira/browse/HBASE-20968 Project: HBase Issue Type: Test Reporter: Ted Yu From test output against hadoop3: {code} 2018-07-28 12:04:24,838 DEBUG [Time-limited test] procedure2.ProcedureExecutor(948): Stored pid=12, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.client.procedure.ShellTestProcedure 2018-07-28 12:04:24,864 INFO [RS-EventLoopGroup-1-3] ipc.ServerRpcConnection(556): Connection from 172.18.128.12:46918, version=3.0.0-SNAPSHOT, sasl=false, ugi=hbase (auth: SIMPLE), service=MasterService 2018-07-28 12:04:24,900 DEBUG [Thread-114] master.MasterRpcServices(1157): Checking to see if procedure is done pid=11 F === Failure: test_list_procedures(Hbase::ListProceduresTest) src/test/ruby/shell/list_procedures_test.rb:65:in `block in test_list_procedures' 62: end 63: end 64: => 65: assert_equal(1, matching_lines) 66: end 67: end 68: end <1> expected but was <0> === ... 2018-07-28 12:04:25,374 INFO [PEWorker-9] procedure2.ProcedureExecutor(1316): Finished pid=12, state=SUCCESS, hasLock=false; org.apache.hadoop.hbase.client.procedure.ShellTestProcedure in 336msec {code} The ShellTestProcedure completed only after the assertion had already been raised. {code} def create_procedure_regexp(table_name) regexp_string = '[0-9]+ .*ShellTestProcedure SUCCESS.*' \ {code} The regex used by the test isn't found in the test output either. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
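The failure above is a timing issue: the assertion ran before the ShellTestProcedure reached SUCCESS. A common remedy in such tests is to poll until the condition holds (or a deadline passes) before asserting. The helper below is a minimal, generic sketch of that pattern; the class and method names are illustrative, not HBase test API.

```java
import java.util.function.BooleanSupplier;

// Generic "wait until condition or timeout" helper, sketching how the test
// could wait for the procedure to finish before asserting on its output.
public class Waiter {
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }
}
```

The test would then call something like `waitFor(timeout, interval, () -> matchingLines(output) == 1)` before the `assert_equal`, instead of asserting immediately.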
[jira] [Created] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only
Ted Yu created HBASE-20966: -- Summary: RestoreTool#getTableInfoPath should look for completed snapshot only Key: HBASE-20966 URL: https://issues.apache.org/jira/browse/HBASE-20966 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu [~gubjanos] reported seeing the following error when running backup / restore test on Azure: {code} 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/. snapshotinfo 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328) 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237) 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351) 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186) {code} Here is related code in master branch: {code} Path getTableInfoPath(TableName tableName) throws IOException { Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId); Path tableInfoPath = null; // can't build the path directly as the timestamp values are different FileStatus[] snapshots = fs.listStatus(tableSnapShotPath); {code} In the above code, we don't exclude incomplete snapshot, leading 
to an exception later when the snapshot info is read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
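The stack trace above shows the failing read coming from the `.hbase-snapshot/.tmp` working directory, which holds in-progress snapshots. One possible shape of the fix is to filter that directory out when listing snapshot directories. In the real code this would be a `PathFilter` passed to `fs.listStatus(tableSnapShotPath, filter)`; the sketch below is a self-contained stand-in operating on directory names, with illustrative values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Stand-in for excluding the in-progress snapshot working directory (".tmp")
// when looking for completed snapshots. Names are illustrative only.
public class SnapshotDirFilter {
    static final String TMP_DIR_NAME = ".tmp";

    static List<String> completedSnapshots(List<String> listedDirNames) {
        return listedDirNames.stream()
            .filter(name -> !TMP_DIR_NAME.equals(name)) // skip in-progress snapshots
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listed = Arrays.asList("snapshot_a", ".tmp", "snapshot_b");
        System.out.println(completedSnapshots(listed));
    }
}
```

With such a filter in place, `getTableInfoPath` would only ever see snapshot directories whose snapshot info has been fully written.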
[jira] [Created] (HBASE-20917) MetaTableMetrics#stop references uninitialized requestsMap for non-meta region
Ted Yu created HBASE-20917: -- Summary: MetaTableMetrics#stop references uninitialized requestsMap for non-meta region Key: HBASE-20917 URL: https://issues.apache.org/jira/browse/HBASE-20917 Project: HBase Issue Type: Bug Reporter: Ted Yu I noticed the following in test output: {code} 2018-07-21 15:54:43,181 ERROR [RS_CLOSE_REGION-regionserver/172.17.5.4:0-1] executor.EventHandler(186): Caught throwable while processing event M_RS_CLOSE_REGION java.lang.NullPointerException at org.apache.hadoop.hbase.coprocessor.MetaTableMetrics.stop(MetaTableMetrics.java:329) at org.apache.hadoop.hbase.coprocessor.BaseEnvironment.shutdown(BaseEnvironment.java:91) at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionEnvironment.shutdown(RegionCoprocessorHost.java:165) at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.shutdown(CoprocessorHost.java:290) at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.postEnvCall(RegionCoprocessorHost.java:559) at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:622) at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postClose(RegionCoprocessorHost.java:551) at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1678) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1484) at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) {code} {{requestsMap}} is only initialized for the meta region. However, check for meta region is absent in the stop method: {code} public void stop(CoprocessorEnvironment e) throws IOException { // since meta region can move around, clear stale metrics when stop. for (String meterName : requestsMap.keySet()) { {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
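Since `requestsMap` is only created for the meta region, `stop` needs a guard before iterating it. The sketch below mirrors the names in the report but is a simplified stand-in, not the real coprocessor (in particular, the real `stop` takes a `CoprocessorEnvironment` and may throw `IOException`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in showing the missing null/meta-region guard in stop().
public class MetaMetricsSketch {
    private Map<String, Long> requestsMap; // initialized only for hbase:meta

    void startForMetaRegion() {
        requestsMap = new ConcurrentHashMap<>();
        requestsMap.put("get_request", 1L);
    }

    public void stop() {
        if (requestsMap == null) {
            return; // loaded on a non-meta region: nothing was registered
        }
        for (String meterName : requestsMap.keySet()) {
            // unregister the meter from the metrics registry here
        }
        requestsMap.clear();
    }
}
```

An equivalent alternative is to repeat the "is this the meta region" check from `start` at the top of `stop`; either way, the NPE on region close goes away.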
[jira] [Created] (HBASE-20892) [UI] Start / End keys are empty on table.jsp
Ted Yu created HBASE-20892: -- Summary: [UI] Start / End keys are empty on table.jsp Key: HBASE-20892 URL: https://issues.apache.org/jira/browse/HBASE-20892 Project: HBase Issue Type: Bug Affects Versions: 2.0.1 Reporter: Ted Yu When viewing table.jsp?name=TestTable , I found that the Start / End keys for all the regions were simply dashes without real value. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: issue while reading data from hbase
Putting dev@ to bcc. Which hbase-spark connector are you using ? What's the hbase release in your deployment ? bq. some of the columns in dataframe becomes null Is it possible to characterize what type of columns become null ? Earlier you said one column has xml data. Did you mean this column from some rows returned null ? Have you checked region server logs where the corresponding regions reside ? Thanks On Fri, Jul 13, 2018 at 4:08 AM hnk45 wrote: > I am reading data from hbase using spark sql. one column has xml data. when > xml size is small , I am able to read correct data. but as soon as size > increases too much, some of the columns in dataframe becomes null. xml is > still coming correctly. > while reading data from sql to hbase I have used this constraint: > hbase.client.keyvalue.maxsize=0 in my sqoop. > > > > -- > Sent from: > http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html >
[jira] [Created] (HBASE-20879) Compacting memstore config should handle lower case
Ted Yu created HBASE-20879: -- Summary: Compacting memstore config should handle lower case Key: HBASE-20879 URL: https://issues.apache.org/jira/browse/HBASE-20879 Project: HBase Issue Type: Bug Affects Versions: 2.0.1 Reporter: Tushar Sharma Assignee: Ted Yu Tushar reported seeing the following in region server log when entering 'basic' for compacting memstore type: {code} 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] handler.OpenRegionHandler: Failed open of region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., starting to roll back the global memstore size. java.io.IOException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.MemoryCompactionPolicy.basic at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035) at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.MemoryCompactionPolicy.basic 
at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26) at org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331) at org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:271) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531) at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999) at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) ... 3 more 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] handler.OpenRegionHandler: Failed open of region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to roll back the global memstore size. java.io.IOException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.MemoryCompactionPolicy.basic at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035) at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.MemoryCompactionPolicy.basic at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26
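The `IllegalArgumentException` above comes from `Enum.valueOf`, which is case-sensitive, so the configured value `basic` fails to match the constant `BASIC`. A minimal sketch of the proposed handling is to normalize the string before parsing; the enum here mirrors `org.apache.hadoop.hbase.MemoryCompactionPolicy` for illustration only.

```java
import java.util.Locale;

// Sketch: accept "basic", "Basic", "BASIC", etc. by upper-casing the
// configured value (with a fixed locale) before Enum.valueOf.
public class PolicyParse {
    enum MemoryCompactionPolicy { NONE, BASIC, EAGER, ADAPTIVE }

    static MemoryCompactionPolicy parse(String configured) {
        // valueOf is case-sensitive; normalize first
        return MemoryCompactionPolicy.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parse("basic")); // parses instead of throwing
    }
}
```

Using `Locale.ROOT` avoids locale-dependent surprises (e.g. the Turkish dotless-i) when upper-casing configuration values.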
Re: The flakey test dashboard is broken
Please log an INFRA ticket. On Mon, Jul 2, 2018 at 3:12 AM, 张铎(Duo Zhang) wrote: > The console output is > > > + docker run -v > > /home/jenkins/jenkins-slave/workspace/HBase-Find-Flaky-Tests:/hbase > > --workdir=/hbase hbase-dev-support python dev-support/report-flakies.py > > --mvn -v --urls=https://builds.apache.org/job/HBASE-Flaky-Tests/ > > --max-builds=30 --is-yetus=False --urls= > > https://builds.apache.org/job/HBase%20Nightly/job/master/ --max-builds=6 > > --is-yetus=True --urls= > > https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/ > > --max-builds=30 --is-yetus=False --urls= > > https://builds.apache.org/job/HBase%20Nightly/job/branch-2/ > > --max-builds=6 --is-yetus=True --urls= > > http://104.198.223.121:8080/job/HBASE-Flaky-Tests/ --max-builds=30 > > --is-yetus=False > > Traceback (most recent call last): > > File "dev-support/report-flakies.py", line 151, in > > expanded_urls = expand_multi_config_projects(args) > > File "dev-support/report-flakies.py", line 128, in > > expand_multi_config_projects > > response = requests.get(job_url + "/api/json").json() > > File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get > > return request('get', url, **kwargs) > > File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in > > request > > return session.request(method=method, url=url, **kwargs) > > File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, > in > > request > > resp = self.send(prep, **send_kwargs) > > File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, > in > > send > > r = adapter.send(request, **kwargs) > > File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378, > in > > send > > raise ConnectionError(e) > > requests.exceptions.ConnectionError: > > HTTPConnectionPool(host='104.198.223.121', port=8080): Max retries > exceeded > > with url: /job/HBASE-Flaky-Tests//api/json (Caused by > 'socket.error'>: [Errno 111] Connection refused) > > Build step 'Execute 
shell' marked build as failure > > > I think the problem is that, the jenkins instance is broken > > http://104.198.223.121:8080/job/HBASE-Flaky-Tests/ > > I temporarily removed this url from the build. Does any one know who is the > maintainer of this machine? > > Thanks. >
Re: [ANNOUNCE] New HBase committer Reid Chan
Congratulations, Reid ! On Mon, Jun 25, 2018 at 6:59 PM, Chia-Ping Tsai wrote: > On behalf of the Apache HBase PMC, I am pleased to announce that Reid Chan > has accepted the PMC's invitation to become a committer on the project. We > appreciate all of Reid’s generous contributions thus far and look forward > to his continued involvement. > > Congratulations and welcome, Reid! > > -- > Chia-Ping >
Re: Template problem native client c++ with new folly
Can you take a look at : HBASE-18901 [C++] Provide CMAKE infrastructure There hasn't been effort to support newer folly. FYI On Wed, Jun 20, 2018 at 1:42 PM, Andrzej wrote: > I have installed new (17 days ago) folly and wangle from sources. > I try compile sources of native client from HBASE-14850 branch. > These sources are old. > I have problem: > ``` > template > > typename R::Return then(F&& func) { > return this->template thenImplementation( > std::forward(func), typename R::Arg()); > } > ``` > from /usr/local/include/folly/futures/Future.h > > Is template called from > ``` > // mimic: std::invoke_result_t, C++17 > template > using invoke_result_t = typename invoke_result::type; > ``` > from /usr/local/include/folly/functional/Invoke.h > > but is called from > ``` > GetRegionLocations(actions, locate_timeout_ns) > .then([=](std::vector>> ) { > std::lock_guard lck(multi_mutex_); > ActionsByServer actions_by_server; > std::vector> locate_failed; > ``` > from > /home/andrzej/projects/simple-hbase2/src/hbase/client/async-batch-rpc-retrying-caller.cc > - my project > > I have turn on -std=gnu++17 > > There are error and notes: > > /home/andrzej/projects/simple-hbase2/src/hbase/client/async- > batch-rpc-retrying-caller.cc|259|error: no matching function for call to > ‘folly::Future > > > >::then(hbase::AsyncBatchRpcRetryingCaller RESP>::GroupAndSend(const std::vector >&, > int32_t) [with REQ = std::shared_ptr; RESP = > std::shared_ptr; int32_t = int]:: y::Try > >&)>)’| > > /usr/local/include/folly/futures/Future.h|737|note: candidate: > template typename R::Return folly::Future::then(F&&) > [with F = F; R = R; T = std::vector ared_ptr > >]| > > /usr/local/include/folly/futures/Future.h|737|note: substitution of > deduced template arguments resulted in errors seen above| > > /usr/local/include/folly/futures/Future.h|753|note: candidate: > template folly::Future folly::isFuture::Inner> folly::Future::then(R (Caller::*)(Args ...), > Caller*) [with R = R; Caller = 
Caller; Args = {Args ...}; T = > std::vector > >]| > > /usr/local/include/folly/futures/Future.h|753|note: template argument > deduction/substitution failed:| > > /home/andrzej/projects/simple-hbase2/src/hbase/client/async- > batch-rpc-retrying-caller.cc|259|note: mismatched types ‘R > (Caller::*)(Args ...)’ and ‘hbase::AsyncBatchRpcRetryingCaller RESP>::GroupAndSend(const std::vector >&, > int32_t) [with REQ = std::shared_ptr; RESP = > std::shared_ptr; int32_t = int]:: y::Try > >&)>’| > > /usr/local/include/folly/futures/Future.h|770|note: candidate: > template auto > folly::Future::then(Executor*, Arg&&, Args&& ...) [with Executor = > Executor; Arg = Arg; Args = {Args ...}; T = std::vector ared_ptr > >]| > > /usr/local/include/folly/futures/Future.h|770|note: template argument > deduction/substitution failed:| > > /home/andrzej/projects/simple-hbase2/src/hbase/client/async- > batch-rpc-retrying-caller.cc|259|note: mismatched types ‘Executor*’ and > ‘hbase::AsyncBatchRpcRetryingCaller::GroupAndSend(const > std::vector >&, int32_t) [with REQ = > std::shared_ptr; RESP = std::shared_ptr; > int32_t = > int]:: > > >&)>’| > > /usr/local/include/folly/futures/Future-inl.h|975|note: candidate: > folly::Future folly::Future::then() [with T = > std::vector > >]| > > /usr/local/include/folly/futures/Future-inl.h|975|note: candidate > expects 0 arguments, 1 provided| > > > How I can change this piece of sources to fit new folly? >
[jira] [Created] (HBASE-20744) Address FindBugs warnings in branch-1
Ted Yu created HBASE-20744: -- Summary: Address FindBugs warnings in branch-1 Key: HBASE-20744 URL: https://issues.apache.org/jira/browse/HBASE-20744 Project: HBase Issue Type: Bug Reporter: Ted Yu From https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350//JDK8_Nightly_Build_Report_(Hadoop2)/ : {code} FindBugs module:hbase-common Inconsistent synchronization of org.apache.hadoop.hbase.io.encoding.EncodedDataBlock$BufferGrabbingByteArrayOutputStream.ourBytes; locked 50% of time Unsynchronized access at EncodedDataBlock.java:50% of time Unsynchronized access at EncodedDataBlock.java:[line 258] {code} {code} FindBugs module:hbase-hadoop2-compat java.util.concurrent.ScheduledThreadPoolExecutor stored into non-transient field MetricsExecutorImpl$ExecutorSingleton.scheduler At MetricsExecutorImpl.java:MetricsExecutorImpl$ExecutorSingleton.scheduler At MetricsExecutorImpl.java:[line 51] {code} {code} FindBugs module:hbase-server instanceof will always return false in org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, int, int), since a org.apache.hadoop.hbase.quotas.RpcThrottlingException can't be a org.apache.hadoop.hbase.quotas.ThrottlingException At RegionServerQuotaManager.java:in org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, int, int), since a org.apache.hadoop.hbase.quotas.RpcThrottlingException can't be a org.apache.hadoop.hbase.quotas.ThrottlingException At RegionServerQuotaManager.java:[line 193] instanceof will always return true for all non-null values in org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, int, int), since all org.apache.hadoop.hbase.quotas.RpcThrottlingException are instances of org.apache.hadoop.hbase.quotas.RpcThrottlingException At RegionServerQuotaManager.java:for all non-null values in org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, int, int), since all org.apache.hadoop.hbase.quotas.RpcThrottlingException
are instances of org.apache.hadoop.hbase.quotas.RpcThrottlingException At RegionServerQuotaManager.java:[line 199] {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20743) ASF License warnings for branch-1
Ted Yu created HBASE-20743: -- Summary: ASF License warnings for branch-1 Key: HBASE-20743 URL: https://issues.apache.org/jira/browse/HBASE-20743 Project: HBase Issue Type: Bug Reporter: Ted Yu From https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt : {code} Lines that start with ? in the ASF License report indicate files that do not have an Apache license header: !? hbase-error-prone/target/checkstyle-result.xml !? hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst {code} These generated files should be excluded from the license check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir
Ted Yu created HBASE-20734: -- Summary: Colocate recovered edits directory with hbase.wal.dir Key: HBASE-20734 URL: https://issues.apache.org/jira/browse/HBASE-20734 Project: HBase Issue Type: Improvement Reporter: Ted Yu During investigation of HBASE-20723, I realized that we wouldn't get the best performance when hbase.wal.dir is configured on different (fast) media than the hbase rootdir, because the recovered edits directory currently lives under rootdir. Such a setup may not yield fast recovery on region server failover. This issue is to find a proper (hopefully backward-compatible) way of colocating the recovered edits directory with hbase.wal.dir . -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] Apache HBase 1.4.5 RC1
+1 Checked signatures Ran test suite with JDK 8 On Wed, Jun 13, 2018 at 12:53 PM, Josh Elser wrote: > Hi, > > Please vote to approve the following as Apache HBase 1.4.5. The only > change in RC1 over RC0 is the updated CHANGES.txt. > > https://dist.apache.org/repos/dist/dev/hbase/1.4.5RC1/ > > Per usual, there is a source release as well as a convenience binary > > This is built with JDK7 from the commit: https://git-wip-us.apache.org/ > repos/asf?p=hbase.git;a=commit;h=ca99a9466415dc4cfc095df33efb45cb82fe5480 > (there is a corresponding tag "1.4.5RC1" for convenience). Please ignore > the incorrect commit message (forgot to update that text after rc0). > > hbase-1.4.5-bin.tar.gz: 1C34A448 DF4E102E 7964C6D8 84C2B1E5 35DD0CAA > E67E6B15 > 92DB2B10 8A6D3A0F B2D841F7 677EB5C3 8EB6D78F > 342CB42F > 8CE90AA6 D62A5362 3C727023 4E8C95EE > hbase-1.4.5-src.tar.gz: 61B83966 952D334A FDAE7E3A 11FD8529 90583302 > 4AC186C4 > 2B7BDA11 5CB472A4 34C71466 4DCF90BB 8735F658 > 20975292 > 35319D5C 2287B0DE 64F484C7 42F635D6 > > There is also a Maven staging repository for this release: > https://repository.apache.org/content/repositories/orgapachehbase-1221 > > This vote will be open for at least 72 hours (2018/06/16 2000 UTC). > > - Josh (on behalf of the HBase PMC) > >
[jira] [Reopened] (HBASE-20672) Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at every monitoring interval
[ https://issues.apache.org/jira/browse/HBASE-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-20672: > Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at > every monitoring interval > - > > Key: HBASE-20672 > URL: https://issues.apache.org/jira/browse/HBASE-20672 > Project: HBase > Issue Type: Improvement > Components: metrics > Reporter: Ankit Jain > Assignee: Ankit Jain > Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-20672.branch-1.001.patch, > HBASE-20672.master.001.patch, HBASE-20672.master.002.patch, > HBASE-20672.master.003.patch, hits1vs2.4.40.400.png > > > HBase currently provides request counters (ReadRequestCount, > WriteRequestCount). However, counters that reset only after a restart of the > service are not easy to use, so we would like to expose two new metrics in > HBase, ReadRequestRate and WriteRequestRate, at the region server level. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Failure: HBase Generate Website
Build #1372 passed. On Mon, Jun 11, 2018 at 7:51 AM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build status: Failure > > The HBase website has not been updated to incorporate HBase commit > ${CURRENT_HBASE_COMMIT}. > > See https://builds.apache.org/job/hbase_generate_website/1371/console
Re: [VOTE] Apache HBase 1.4.5 rc0
+1 Checked signatures Ran test suite (with Jdk 8) On Thu, Jun 7, 2018 at 4:35 PM, Josh Elser wrote: > Hi, > > Please vote to approve the following as Apache HBase 1.4.5 > > https://dist.apache.org/repos/dist/dev/hbase/1.4.5rc0/ > > Per usual, there is a source release as well as a convenience binary > > This is built with JDK7 from the commit: https://git-wip-us.apache.org/ > repos/asf?p=hbase.git;a=commit;h=74596816c85f1256ec8a302efecc0144f2ea76fa > (there is a corresponding tag "1.4.5rc0" for convenience) > > hbase-1.4.5-bin.tar.gz: 7C8EFD79 CD5EAEFF 92F2E093 8AC8448C ED5717BD > 4C8D2C43 > B95F804B 003E2126 9235EFE0 ABE61302 B81B30B1 > F9F4A785 > 17191950 2F436F64 19F50E53 999B5272 > hbase-1.4.5-src.tar.gz: FED89273 FFA746DA D868DF79 7E46DB75 D0908419 > F3D418FF > 73068583 A6F1DCB2 61BD2389 12DCE920 F8800CAE > 23631343 > DB7601F4 F43331A4 678135E5 E5C566C4 > > There is also a Maven staging repository for this release: > https://repository.apache.org/content/repositories/orgapachehbase-1219 > > This vote will be open for at least 72 hours (2018/06/11 UTC). > > - Josh (on behalf of the HBase PMC) > > >
[jira] [Resolved] (HBASE-20577) Make Log Level page design consistent with the design of other pages in UI
[ https://issues.apache.org/jira/browse/HBASE-20577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-20577. Resolution: Fixed Thanks for the addendum > Make Log Level page design consistent with the design of other pages in UI > -- > > Key: HBASE-20577 > URL: https://issues.apache.org/jira/browse/HBASE-20577 > Project: HBase > Issue Type: Improvement > Components: UI, Usability >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20577.master.001.patch, > HBASE-20577.master.002.patch, HBASE-20577.master.ADDENDUM.patch, > after_patch_LogLevel_CLI.png, after_patch_get_log_level.png, > after_patch_require_field_validation.png, after_patch_set_log_level_bad.png, > after_patch_set_log_level_success.png, > before_patch_no_validation_required_field.png, rest_after_addendum_patch.png > > > The Log Level page in web UI seems out of the place. I think we should make > it look consistent with design of other pages in HBase web UI. > Also, validation of required fields should be done, otherwise user should not > be allowed to click submit button. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException
Ted Yu created HBASE-20690: -- Summary: Moving table to target rsgroup needs to handle TableStateNotFoundException Key: HBASE-20690 URL: https://issues.apache.org/jira/browse/HBASE-20690 Project: HBase Issue Type: Bug Reporter: Ted Yu This is related code: {code} if (targetGroup != null) { for (TableName table: tables) { if (master.getAssignmentManager().isTableDisabled(table)) { LOG.debug("Skipping move regions because the table" + table + " is disabled."); continue; } {code} In a stack trace [~rmani] showed me: {code} 2018-06-06 07:10:44,893 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.TableStateManager: Unable to get table demo:tbl1 state org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: demo:tbl1 at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193) at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346) at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407) at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447) at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470) at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334) at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331) at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540) at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614) at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331) at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768) at 
org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750) at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593) at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) {code} The logic should take potential TableStateNotFoundException into account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
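A minimal, self-contained sketch of one way the move logic could tolerate the exception (stand-in types only, not HBase's actual classes; skipping on a not-found state is one candidate policy, not necessarily the committed fix):

```java
import java.util.Set;

// Stand-in for the real exception; in HBase it is an inner class of TableStateManager.
class TableStateNotFoundException extends Exception {
    TableStateNotFoundException(String table) { super(table); }
}

class MoveTablesSketch {
    // Decide whether a table should be skipped during a group move. A table
    // whose state is not yet published (e.g. it is still mid-creation, as in
    // the postCreateTable path in the stack trace) is skipped instead of
    // letting the exception abort the whole move.
    static boolean shouldSkip(String table, Set<String> disabledTables,
                              Set<String> knownTables) {
        try {
            if (!knownTables.contains(table)) {
                throw new TableStateNotFoundException(table);
            }
            return disabledTables.contains(table);
        } catch (TableStateNotFoundException e) {
            return true; // state unknown: treat like disabled and skip for now
        }
    }
}
```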
[jira] [Created] (HBASE-20680) Master hung during initialization waiting on hbase:meta to be assigned which never does
Ted Yu created HBASE-20680: -- Summary: Master hung during initialization waiting on hbase:meta to be assigned which never does Key: HBASE-20680 URL: https://issues.apache.org/jira/browse/HBASE-20680 Project: HBase Issue Type: Bug Reporter: Josh Elser When running IntegrationTestRSGroups, the test became hung waiting on the master to be initialized. The hbase cluster was launched without RSGroup config. The test script adds required RSGroup configs to hbase-site.xml and restarts the cluster. It seems that, at one point while the master was trying to assign meta, the destination regionserver was in the middle of going down. This has now left HBase in a state where it starts the regionserver recovery procedures, but never actually gets hbase:meta assigned. {code} 2018-06-01 10:47:50,024 INFO [PEWorker-5] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740}] 2018-06-01 10:47:50,026 DEBUG [WALProcedureStoreSyncThread] wal.WALProcedureStore: hsync completed for hdfs://ctr-e138-1518143905142-340983-03-14.hwx.site:8020/apps/hbase/data/ MasterProcWALs/pv2-0002.log 2018-06-01 10:47:50,026 INFO [PEWorker-3] procedure.MasterProcedureScheduler: pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740 checking lock on 1588230740 2018-06-01 10:47:50,026 DEBUG [PEWorker-3] assignment.RegionStates: setting location=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190 for rit=OFFLINE, location=ctr- e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190, table=hbase:meta, region=1588230740 last loc=null 2018-06-01 10:47:50,026 INFO [PEWorker-3] assignment.AssignProcedure: Starting pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,region=1588230740; rit=OFFLINE, location=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190; forceNewPlan=false, 
retain=true target svr=null {code} At Fri Jun 1 10:48:04, master was restarted. The new master picked up pid=41: {code} 2018-06-01 10:48:47,971 INFO [PEWorker-1] assignment.AssignProcedure: Starting pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,region=1588230740; rit=OFFLINE, location=null; forceNewPlan=false, retain=false target svr=null {code} There was no further log for pid=41 after above. Later when master initiated another meta recovery procedure (pid=42), the second procedure seems to be locked out by the former: {code} 2018-06-01 10:49:34,292 INFO [PEWorker-2] procedure.MasterProcedureScheduler: pid=43, ppid=42, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190 checking lock on 1588230740 2018-06-01 10:49:34,293 DEBUG [PEWorker-2] assignment.RegionTransitionProcedure: LOCK_EVENT_WAIT pid=43 serverLocks={}, namespaceLocks={}, tableLocks={}, regionLocks={{1588230740=exclusiveLockOwner=41, sharedLockCount=0, waitingProcCount=1}}, peerLocks={} {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [ANNOUNCE] New HBase committer Guangxu Cheng
Congratulations, Guangxu! Original message From: "张铎(Duo Zhang)" Date: 6/4/18 12:00 AM (GMT-08:00) To: HBase Dev List , hbase-user Subject: [ANNOUNCE] New HBase committer Guangxu Cheng On behalf of the Apache HBase PMC, I am pleased to announce that Guangxu Cheng has accepted the PMC's invitation to become a committer on the project. We appreciate all of Guangxu's generous contributions thus far and look forward to his continued involvement. Congratulations and welcome, Guangxu!
[jira] [Created] (HBASE-20677) Backport HBASE-20566 'Creating a system table after enabling rsgroup feature puts region into RIT ' to branch-2
Ted Yu created HBASE-20677: -- Summary: Backport HBASE-20566 'Creating a system table after enabling rsgroup feature puts region into RIT ' to branch-2 Key: HBASE-20677 URL: https://issues.apache.org/jira/browse/HBASE-20677 Project: HBase Issue Type: Task Reporter: Ted Yu After HBASE-20566 was integrated into master, HBASE-20595 removed the concept of 'special tables' from rsgroups. This task is to backport the fix to branch-2. TestRSGroups#testRSGroupsWithHBaseQuota would be added. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] Apache HBase 1.3.2.1RC0
+1 Built from source Ran test suite using Jdk 8 which passed. On Sat, Jun 2, 2018 at 2:26 PM, Josh Elser wrote: > Hi, > > Please vote to approve the following as Apache HBase 1.3.2.1 > > https://dist.apache.org/repos/dist/dev/hbase/1.3.2.1RC0/ > > Per usual, there is a source release as well as a convenience binary > > This is built with JDK7 from the commit: https://git-wip-us.apache.org/ > repos/asf?p=hbase.git;a=commit;h=bf25c1cb7221178388baaa58f0b16a408e151a69 > (there is a corresponding tag "1.3.2.1RC0" for convenience) > > hbase-1.3.2.1-bin.tar.gz: 1D CB 27 E0 B0 56 28 B8 BE C7 41 03 2E B5 D3 31 > hbase-1.3.2.1-src.tar.gz: 47 99 46 3C 2B E2 59 9B 5B 8B 2F 16 81 53 6B FE > hbase-1.3.2.1-bin.tar.gz: 16EB62DA D4EA40F6 DD8747CF 6A49678E D1A4A53E > B3A9E67D > C53A89F1 471D1DC5 5147E5CA D1AED8B0 B22A01F5 > C1F6F6CA > 4B4E9562 61CDA9B6 91D94C16 26593AFB > hbase-1.3.2.1-src.tar.gz: 63C55C02 DB27461E 2C006758 329EC21E E14823E3 > 9080105B > 43FA6EF2 05BD81A3 D526E2AC 6EAE0FE9 1C3103F4 > 20B8457F > 3C94EF73 5B3CB18C 85B7E0AB 4311CAA4 > > This vote will be open for at least 72 hours (2018/06/05 2130 UTC). > > - Josh (on behalf of the HBase PMC) >
[jira] [Created] (HBASE-20676) Give .hbase-snapshot proper ownership upon directory creation
Ted Yu created HBASE-20676: -- Summary: Give .hbase-snapshot proper ownership upon directory creation Key: HBASE-20676 URL: https://issues.apache.org/jira/browse/HBASE-20676 Project: HBase Issue Type: Task Reporter: Ted Yu This is a continuation of the discussion over HBASE-20668. The .hbase-snapshot directory is not created at cluster startup. Normally it is created when a snapshot operation is initiated. However, if before any snapshot operation is performed, some non-super user from another cluster conducts ExportSnapshot to this cluster, the .hbase-snapshot directory would be created as that user. (This is just one scenario that can lead to wrong ownership.) This JIRA is to seek proper way(s) to ensure that the .hbase-snapshot directory would always carry proper ownership and permissions upon creation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
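One candidate approach, sketched below, is eager creation: have the master create the directory at startup (running as the HBase superuser) so a later ExportSnapshot from a non-super user finds it already present. This is a hedged illustration, not the committed fix, and java.nio stands in for the Hadoop FileSystem API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SnapshotDirSketch {
    // Eagerly create the snapshot directory under the HBase root so that
    // whichever user triggers the first snapshot-related operation does not
    // end up owning it. createDirectories is a no-op if it already exists.
    static Path ensureSnapshotDir(Path hbaseRoot) throws IOException {
        Path snapshotDir = hbaseRoot.resolve(".hbase-snapshot");
        Files.createDirectories(snapshotDir);
        return snapshotDir;
    }
}
```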
[jira] [Created] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call
Ted Yu created HBASE-20668: -- Summary: Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call Key: HBASE-20668 URL: https://issues.apache.org/jira/browse/HBASE-20668 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu I was debugging the following error [~romil.choksi] saw during testing ExportSnapshot : {code} 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR [main] util.AbstractHBaseTool: Error running command-line tool 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: Directory/File does not exist /apps/ hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirectory. checkOwner(FSDirectory.java:1777) 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp. setOwner(FSDirAttrOp.java:82) {code} Here is corresponding code (with extra log added): {code} try { LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + initialOutputSnapshotDir); boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, initialOutputSnapshotDir, false, false, conf); LOG.info("return val = " + ret); } catch (IOException e) { LOG.warn("Failed to copy the snapshot directory: from=" + snapshotDir + " to=" + initialOutputSnapshotDir, e); throw new ExportSnapshotException("Failed to copy the snapshot directory: from=" + snapshotDir + " to=" + initialOutputSnapshotDir, e); } finally { if (filesUser != null || filesGroup != null) { LOG.warn((filesUser == null ? "" : "Change the owner of " + needSetOwnerDir + " to " + filesUser) + (filesGroup == null ? 
"" : ", Change the group of " + needSetOwnerDir + " to " + filesGroup)); setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true); } {code} "return val = " was not seen in rerun of the test. This is what the additional log revealed: {code} 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. check(FSPermissionChecker.java:399) 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. checkPermission(FSPermissionChecker.java:255) {code} It turned out that the exception from {{setOwner}} call in the finally block eclipsed the real exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
retrieving leaderAndIsr
Hi, For the following code, which works against 0.10: def isPropagated = server.apis.metadataCache.getPartitionInfo(topic, partition) match { case Some(partitionState) => val leaderAndInSyncReplicas = partitionState.leaderIsrAndControllerEpoch.leaderAndIsr I got this error compiling against 2.0: value leaderIsrAndControllerEpoch is not a member of org.apache.kafka.common.requests.UpdateMetadataRequest.PartitionState [ERROR] val leaderAndInSyncReplicas = partitionState.leaderIsrAndControllerEpoch.leaderAndIsr Please comment on the replacement API. Thanks
[jira] [Reopened] (HBASE-20639) Implement permission checking through AccessController instead of RSGroupAdminEndpoint
[ https://issues.apache.org/jira/browse/HBASE-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-20639: > Implement permission checking through AccessController instead of > RSGroupAdminEndpoint > -- > > Key: HBASE-20639 > URL: https://issues.apache.org/jira/browse/HBASE-20639 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu >Assignee: Nihal Jain >Priority: Major > Attachments: HBASE-20639.master.001.patch, > HBASE-20639.master.002.patch, HBASE-20639.master.002.patch > > > Currently permission checking for various RS group operations is done via > RSGroupAdminEndpoint. > e.g. in RSGroupAdminServiceImpl#moveServers() : > {code} > checkPermission("moveServers"); > groupAdminServer.moveServers(hostPorts, request.getTargetGroup()); > {code} > The practice in remaining parts of hbase is to perform permission checking > within AccessController. > Now that observer hooks for RS group operations are in right place, we should > follow best practice and move permission checking to AccessController. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20654) Expose regions in transition thru JMX
Ted Yu created HBASE-20654: -- Summary: Expose regions in transition thru JMX Key: HBASE-20654 URL: https://issues.apache.org/jira/browse/HBASE-20654 Project: HBase Issue Type: Improvement Reporter: Ted Yu Currently only the count of regions in transition is exposed thru JMX. Here is a sample snippet of the /jmx output: {code} { "beans" : [ { ... }, { "name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManager", "modelerType" : "Master,sub=AssignmentManager", "tag.Context" : "master", ... "ritCount" : 3 {code} It would be desirable to expose the region name and state for the regions in transition as well. We can place a configurable upper bound on the number of entries returned in case there are many regions in transition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
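The configurable upper bound could be as simple as truncating the list handed to the metrics bean. A hedged sketch (the property name and helper are hypothetical, not HBase's actual metrics code):

```java
import java.util.List;

class RitMetricsSketch {
    // Hypothetical configuration key for capping how many region-in-transition
    // entries are published through JMX.
    static final String MAX_RIT_KEY = "hbase.master.metrics.rit.max.entries";

    // Returns at most maxEntries regions in transition for the metrics bean,
    // so a mass-RIT event cannot bloat the /jmx payload.
    static <T> List<T> boundedView(List<T> regionsInTransition, int maxEntries) {
        int n = Math.min(regionsInTransition.size(), maxEntries);
        return regionsInTransition.subList(0, n);
    }
}
```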
[jira] [Created] (HBASE-20653) Add missing observer hooks for region server group to MasterObserver
Ted Yu created HBASE-20653: -- Summary: Add missing observer hooks for region server group to MasterObserver Key: HBASE-20653 URL: https://issues.apache.org/jira/browse/HBASE-20653 Project: HBase Issue Type: Bug Reporter: Ted Yu Currently the following region server group operations don't have a corresponding hook in MasterObserver: * getRSGroupInfo * getRSGroupInfoOfServer * getRSGroupInfoOfTable * listRSGroup This JIRA is to: * add them to MasterObserver * add corresponding permission checks in AccessController * move the {{checkPermission}} out of RSGroupAdminEndpoint * add corresponding tests to TestRSGroupsWithACL -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-20079) Report all the new test classes missing HBaseClassTestRule in one patch
[ https://issues.apache.org/jira/browse/HBASE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-20079. Resolution: Later > Report all the new test classes missing HBaseClassTestRule in one patch > --- > > Key: HBASE-20079 > URL: https://issues.apache.org/jira/browse/HBASE-20079 > Project: HBase > Issue Type: Test > Reporter: Ted Yu >Priority: Trivial > > Currently if there are both new small and large tests without > HBaseClassTestRule in a single patch, the QA bot would report the small test > class as missing HBaseClassTestRule but not the large test. > All new test classes missing HBaseClassTestRule should be reported in the > same QA run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-20081) TestDisableTableProcedure sometimes hung in MiniHBaseCluster#waitUntilShutDown
[ https://issues.apache.org/jira/browse/HBASE-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-20081. Resolution: Cannot Reproduce > TestDisableTableProcedure sometimes hung in MiniHBaseCluster#waitUntilShutDown > -- > > Key: HBASE-20081 > URL: https://issues.apache.org/jira/browse/HBASE-20081 > Project: HBase > Issue Type: Test > Reporter: Ted Yu >Priority: Major > > https://builds.apache.org/job/HBase-2.0-hadoop3-tests/lastCompletedBuild/org.apache.hbase$hbase-server/testReport/org.apache.hadoop.hbase.master.procedure/TestDisableTableProcedure/org_apache_hadoop_hbase_master_procedure_TestDisableTableProcedure/ > was one recent occurrence. > I noticed two things in test output: > {code} > 2018-02-25 18:12:45,053 WARN [Time-limited test-EventThread] > master.RegionServerTracker(136): asf912.gq1.ygridcore.net,45649,1519582305777 > is not online or isn't known to the master.The latter could be caused by a > DNS misconfiguration. > {code} > Since DNS misconfiguration was very unlikely on Apache Jenkins nodes, the > above should not have been logged. 
> {code} > 2018-02-25 18:16:51,531 WARN [master/asf912:0.Chore.1] > master.CatalogJanitor(127): Failed scan of catalog table > java.io.IOException: connection is closed > at > org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:263) > at > org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:761) > at > org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:680) > at > org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:675) > at > org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:188) > at > org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:140) > at > org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:246) > at > org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:119) > at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186) > {code} > The above was possibly related to the lost region server. > I searched test output of successful run where none of the above two can be > seen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20644) Master shutdown due to service ClusterSchemaServiceImpl failing to start
Ted Yu created HBASE-20644: -- Summary: Master shutdown due to service ClusterSchemaServiceImpl failing to start Key: HBASE-20644 URL: https://issues.apache.org/jira/browse/HBASE-20644 Project: HBase Issue Type: Bug Reporter: Romil Choksi From hbase-hbase-master-ctr-e138-1518143905142-329221-01-03.hwx.site.log: {code} 2018-05-23 22:14:29,750 ERROR [master/ctr-e138-1518143905142-329221-01-03:2] master.HMaster: Failed to become active master java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345) at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291) at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1054) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:918) at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2023) {code} Earlier in the log, the namespace region was deemed OPEN on 01-07.hwx.site,16020,1527112194788, which was declared not online: {code} 2018-05-23 21:54:34,786 INFO [master/ctr-e138-1518143905142-329221-01-03:2] assignment.RegionStateStore: Load hbase:meta entry region=01a7f9ba9fffd691f261d3fbc620da06, regionState=OPEN, lastHost=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112194788, regionLocation=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112194788, seqnum=43 2018-05-23 21:54:34,787 INFO [master/ctr-e138-1518143905142-329221-01-03:2] assignment.AssignmentManager: Number of RegionServers=1 2018-05-23 21:54:34,788 INFO [master/ctr-e138-1518143905142-329221-01-03:2] assignment.AssignmentManager: KILL RegionServer=ctr-e138-1518143905142-329221-01-07. hwx.site,16020,1527112194788 hosting regions but not online. 
{code} Later, even though a different instance on 007 registered with master: {code} 2018-05-23 21:55:13,541 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.ServerManager: Registering regionserver=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112506002 ... 2018-05-23 21:55:43,881 INFO [master/ctr-e138-1518143905142-329221-01-03:2] client.RpcRetryingCallerImpl: Call exception, tries=12, retries=12, started=69001 ms ago,cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:namespace,,1527099443383.01a7f9ba9fffd691f261d3fbc620da06. is not online on ctr-e138-1518143905142-329221- 01-07.hwx.site,16020,1527112506002 at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3273) at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3250) at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414) at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2446) at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) {code} There was no OPEN request sent to that instance. From hbase-hbase-regionserver-ctr-e138-1518143905142-329221-01-07.hwx.site.log: {code} 2018-05-23 21:52:27,414 INFO [RS_CLOSE_REGION-regionserver/ctr-e138-1518143905142-329221-01-07:16020-1] regionserver.HRegion: Closed hbase:namespace,,1527099443383. 01a7f9ba9fffd691f261d3fbc620da06. {code} Then region server 007 restarted: {code} Wed May 23 21:55:03 UTC 2018 Starting regionserver on ctr-e138-1518143905142-329221-01-07.hwx.site {code} After that, the region 01a7f9ba9fffd691f261d3fbc620da06 never showed up again in the 007 log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20639) Implement permission checking through AccessController instead of RSGroupAdminEndpoint
Ted Yu created HBASE-20639: -- Summary: Implement permission checking through AccessController instead of RSGroupAdminEndpoint Key: HBASE-20639 URL: https://issues.apache.org/jira/browse/HBASE-20639 Project: HBase Issue Type: Bug Reporter: Ted Yu Currently permission checking for various RS group operations is done via RSGroupAdminEndpoint. e.g. in RSGroupAdminServiceImpl#moveServers(): {code} checkPermission("moveServers"); groupAdminServer.moveServers(hostPorts, request.getTargetGroup()); {code} The practice in the remaining parts of HBase is to perform permission checking within AccessController. Now that the observer hooks for RS group operations are in the right place, we should follow best practice and move permission checking to AccessController. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
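The pattern being proposed can be sketched with stand-in types (illustrative only, not HBase's actual interfaces): the permission check lives in the observer (AccessController's pre hook), so the endpoint simply delegates and every caller path gets the check:

```java
import java.util.Set;

// Hypothetical exception type standing in for HBase's AccessDeniedException.
class AccessDeniedException extends RuntimeException {
    AccessDeniedException(String op) { super("insufficient permissions for " + op); }
}

// Stand-in for the MasterObserver hook surface.
interface MasterObserverSketch {
    default void preMoveServers(String requestUser) {}
}

// Stand-in AccessController: the permission check happens inside the
// observer hook, not inside the RS group endpoint.
class AccessControllerSketch implements MasterObserverSketch {
    private final Set<String> admins;
    AccessControllerSketch(Set<String> admins) { this.admins = admins; }

    @Override
    public void preMoveServers(String requestUser) {
        if (!admins.contains(requestUser)) {
            throw new AccessDeniedException("moveServers");
        }
    }
}
```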
[jira] [Reopened] (HBASE-20627) Relocate RS Group pre/post hooks from RSGroupAdminServer to RSGroupAdminEndpoint
[ https://issues.apache.org/jira/browse/HBASE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-20627: > Relocate RS Group pre/post hooks from RSGroupAdminServer to > RSGroupAdminEndpoint > > > Key: HBASE-20627 > URL: https://issues.apache.org/jira/browse/HBASE-20627 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 2.1.0 > > Attachments: 20627.branch-1.txt, 20627.v1.txt, 20627.v2.txt, > 20627.v3.txt > > > Currently RS Group pre/post hooks are called from RSGroupAdminServer. > e.g. RSGroupAdminServer#removeRSGroup : > {code} > if (master.getMasterCoprocessorHost() != null) { > master.getMasterCoprocessorHost().preRemoveRSGroup(name); > } > {code} > RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint : > {code} > checkPermission("removeRSGroup"); > groupAdminServer.removeRSGroup(request.getRSGroupName()); > {code} > If permission check fails, the pre hook wouldn't be called. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20627) Relocate RS Group pre/post hooks from RSGroupAdminServer to RSGroupAdminEndpoint
Ted Yu created HBASE-20627: -- Summary: Relocate RS Group pre/post hooks from RSGroupAdminServer to RSGroupAdminEndpoint Key: HBASE-20627 URL: https://issues.apache.org/jira/browse/HBASE-20627 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Attachments: 20627.v1.txt Currently RS Group pre/post hooks are called from RSGroupAdminServer. e.g. RSGroupAdminServer#removeRSGroup : {code} if (master.getMasterCoprocessorHost() != null) { master.getMasterCoprocessorHost().preRemoveRSGroup(name); } {code} RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint : {code} checkPermission("removeRSGroup"); groupAdminServer.removeRSGroup(request.getRSGroupName()); {code} If permission check fails, the pre hook wouldn't be called. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
RS group related hooks
Hi, I was looking at RSGroupAdminServer#removeRSGroup: if (master.getMasterCoprocessorHost() != null) { master.getMasterCoprocessorHost().preRemoveRSGroup(name); } However, RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint: checkPermission("removeRSGroup"); groupAdminServer.removeRSGroup(request.getRSGroupName()); Meaning, if the permission check fails, the pre hook wouldn't be called. I wonder if the call to preRemoveRSGroup() should be lifted to before calling checkPermission(). Cheers
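The ordering concern in this thread can be modeled with a toy call trace (not HBase code): if the pre hook is hoisted ahead of checkPermission, observers always see the request, while the actual removal still happens only when the caller is permitted.

```java
import java.util.ArrayList;
import java.util.List;

class EndpointOrderingSketch {
    // Records the call order for a removeRSGroup request under the proposed
    // ordering: pre hook first, then the permission check, then the real work.
    static List<String> removeRSGroup(boolean permitted) {
        List<String> calls = new ArrayList<>();
        calls.add("preRemoveRSGroup");            // hook fires unconditionally
        if (!permitted) {
            calls.add("checkPermission:denied");  // denial stops here
            return calls;
        }
        calls.add("checkPermission:ok");
        calls.add("removeRSGroup");               // actual group removal
        return calls;
    }
}
```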
[jira] [Created] (HBASE-20609) SnapshotHFileCleaner#init should check that params is not null
Ted Yu created HBASE-20609: -- Summary: SnapshotHFileCleaner#init should check that params is not null Key: HBASE-20609 URL: https://issues.apache.org/jira/browse/HBASE-20609 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Noticed the following in the test output of TestHFileArchiving: {code} SnapshotHFileCleaner.init(Map<String,Object>) line: 79 HFileCleaner(CleanerChore).newFileCleaner(String, Configuration) line: 260 HFileCleaner(CleanerChore).initCleanerChain(String) line: 232 HFileCleaner(CleanerChore).<init>(String, int, Stoppable, Configuration, FileSystem, Path, String, Map<String,Object>) line: 182 HFileCleaner.<init>(int, Stoppable, Configuration, FileSystem, Path, Map<String,Object>) line: 104 HFileCleaner.<init>(int, Stoppable, Configuration, FileSystem, Path) line: 51 TestHFileArchiving.testCleaningRace() line: 377 {code} This was due to SnapshotHFileCleaner#init not checking the parameter {{params}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
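The guard itself is small; a self-contained sketch (the "master" key and helper are illustrative, not SnapshotHFileCleaner's actual code): init reads values out of the params map, so a constructor path that passes no params must not trigger a NullPointerException.

```java
import java.util.Map;

class SnapshotCleanerInitSketch {
    // Null-safe lookup standing in for the reads SnapshotHFileCleaner#init
    // performs on its params map. A null map simply yields null instead of
    // throwing, matching the HFileCleaner constructor chain in the trace
    // above that supplies no params.
    static Object lookup(Map<String, Object> params, String key) {
        if (params == null) {
            return null; // cleaner constructed without params: nothing to read
        }
        return params.get(key);
    }
}
```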