[jira] [Created] (HBASE-21511) Remove in progress snapshot check in SnapshotFileCache#getUnreferencedFiles

2018-11-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21511:
--

 Summary: Remove in progress snapshot check in 
SnapshotFileCache#getUnreferencedFiles
 Key: HBASE-21511
 URL: https://issues.apache.org/jira/browse/HBASE-21511
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
 Attachments: 21511.v1.txt

During review of HBASE-21387, [~Apache9] mentioned that the check for 
in-progress snapshots in SnapshotFileCache#getUnreferencedFiles is no longer 
needed now that the snapshot hfile cleaner and snapshot taking are mutually 
exclusive.

This issue addresses the review comment by removing the check for in-progress 
snapshots.
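The mutual-exclusion argument can be sketched in plain Java (illustrative class and method names, not HBase's actual API): once snapshot completion and the cleaner's scan run under the same lock, the cleaner can never observe a snapshot mid-flight, so a separate in-progress check buys nothing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: shows why a shared lock makes the
// in-progress check in getUnreferencedFiles redundant.
class SnapshotMutex {
    private final ReentrantLock takingSnapshotLock = new ReentrantLock();
    private final Set<String> filesReferencedBySnapshots = new HashSet<>();

    /** Completing a snapshot publishes its file references atomically. */
    public void completeSnapshot(Set<String> referencedFiles) {
        takingSnapshotLock.lock();   // the cleaner is blocked out while we run
        try {
            filesReferencedBySnapshots.addAll(referencedFiles);
        } finally {
            takingSnapshotLock.unlock();
        }
    }

    /** Under the same lock no snapshot can be mid-flight: no in-progress check. */
    public Set<String> getUnreferencedFiles(Set<String> candidates) {
        takingSnapshotLock.lock();
        try {
            Set<String> unreferenced = new HashSet<>(candidates);
            unreferenced.removeAll(filesReferencedBySnapshots);
            return unreferenced;
        } finally {
            takingSnapshotLock.unlock();
        }
    }

    public static void main(String[] args) {
        SnapshotMutex cache = new SnapshotMutex();
        cache.completeSnapshot(new HashSet<>(Arrays.asList("hfile-a")));
        System.out.println(cache.getUnreferencedFiles(
            new HashSet<>(Arrays.asList("hfile-a", "hfile-b")))); // [hfile-b]
    }
}
```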



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21387) Race condition surrounding in progress snapshot handling in snapshot cache leads to loss of snapshot files

2018-11-23 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21387:


> Race condition surrounding in progress snapshot handling in snapshot cache 
> leads to loss of snapshot files
> --
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
>  Labels: snapshot
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.0.3, 1.4.9, 2.1.2, 1.2.10
>
> Attachments: 0001-UT.patch, 21387-suggest.txt, 21387.dbg.txt, 
> 21387.v10.txt, 21387.v11.txt, 21387.v12.txt, 21387.v2.txt, 21387.v3.txt, 
> 21387.v6.txt, 21387.v7.txt, 21387.v8.txt, 21387.v9.txt, 
> HBASE-21387.branch-1.2.patch, HBASE-21387.branch-1.3.patch, 
> HBASE-21387.branch-1.patch, HBASE-21387.v13.patch, HBASE-21387.v14.patch, 
> HBASE-21387.v15.patch, HBASE-21387.v16.patch, HBASE-21387.v17.patch, 
> two-pass-cleaner.v4.txt, two-pass-cleaner.v6.txt, two-pass-cleaner.v9.txt
>
>
> A recent customer report showed ExportSnapshot failing:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition in the handling of in-progress 
> snapshot(s) between refreshCache() and getUnreferencedFiles().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
>   if (!name.equals(SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME)) {
> {code}
> whose intention is to exclude in-progress snapshot(s).
> Suppose that when the RefreshCacheTask runs refreshCache, some snapshot is 
> in progress and about to finish.
> When SnapshotHFileCleaner calls getUnreferencedFiles(), it sees that 
> lastModifiedTime is up to date, so the cleaner proceeds to check in-progress 
> snapshot(s). However, the snapshot has completed by that time, resulting in 
> some file(s) being deemed unreferenced.
> Here is timeline given by Josh illustrating the scenario:
> At time T0, we are checking if F1 is referenced. At time T1, there is a 
> snapshot S1 in progress that is referencing a file F1. refreshCache() is 
> called, but no completed snapshot references F1. At T2, the snapshot S1, 
> which references F1, completes. At T3, we check in-progress snapshots and S1 
> is not included. Thus, F1 is marked as unreferenced even though S1 references 
> it. 
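The T0–T3 timeline above can be replayed deterministically with a stripped-down model (illustrative names only, not SnapshotFileCache's real fields): the cache is refreshed while S1 is still in progress, S1 then completes, and the later in-progress check no longer sees it.

```java
import java.util.HashSet;
import java.util.Set;

// Deterministic replay of the T0-T3 race described above (sketch).
class SnapshotRaceTimeline {
    static Set<String> completedSnapshotRefs = new HashSet<>(); // files held by completed snapshots
    static Set<String> inProgressRefs = new HashSet<>();        // files held by in-progress snapshots
    static Set<String> cache = new HashSet<>();                 // SnapshotFileCache contents

    static void refreshCache() { cache = new HashSet<>(completedSnapshotRefs); }

    static boolean isUnreferenced(String file) {
        // cleaner logic: consult the cache, then fall back to in-progress snapshots
        return !cache.contains(file) && !inProgressRefs.contains(file);
    }

    public static void main(String[] args) {
        inProgressRefs.add("F1");        // T1: snapshot S1 in progress references F1
        refreshCache();                  // T1: refresh sees no completed snapshot for F1
        completedSnapshotRefs.add("F1"); // T2: S1 completes...
        inProgressRefs.remove("F1");     // ...and is no longer "in progress"
        // T3: F1 is wrongly deemed unreferenced, even though S1 references it
        System.out.println(isUnreferenced("F1")); // prints true -- the bug
    }
}
```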





[jira] [Created] (HBASE-21482) TestHRegion fails due to 'Too many open files'

2018-11-15 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21482:
--

 Summary: TestHRegion fails due to 'Too many open files'
 Key: HBASE-21482
 URL: https://issues.apache.org/jira/browse/HBASE-21482
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


TestHRegion fails due to 'Too many open files' in the master branch.
Here is one failed subtest:
{code}
testCheckAndDelete_ThatDeleteWasWritten(org.apache.hadoop.hbase.regionserver.TestHRegion)
  Time elapsed: 2.373 sec  <<< ERROR!
java.lang.IllegalStateException: failed to create a child event loop
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034)
Caused by: org.apache.hbase.thirdparty.io.netty.channel.ChannelException: 
failed to open a new selector
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034)
Caused by: java.io.IOException: Too many open files
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4853)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4844)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.initHRegion(TestHRegion.java:4835)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testCheckAndDelete_ThatDeleteWasWritten(TestHRegion.java:2034)
{code}





[jira] [Created] (HBASE-21479) TestHRegionReplayEvents#testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent fails with IndexOutOfBoundsException

2018-11-14 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21479:
--

 Summary: 
TestHRegionReplayEvents#testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent 
fails with IndexOutOfBoundsException
 Key: HBASE-21479
 URL: https://issues.apache.org/jira/browse/HBASE-21479
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


The test fails in both the master branch and branch-2:
{code}
testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent(org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents)
  Time elapsed: 3.74 sec  <<< ERROR!
java.lang.IndexOutOfBoundsException: Index: 2, Size: 1
at 
org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents.testSkippingEditsWithSmallerSeqIdAfterRegionOpenEvent(TestHRegionReplayEvents.java:1042)
{code}





Re: [ANNOUNCE] New HBase committer Jingyun Tian

2018-11-13 Thread Ted Yu
Congratulations, Jingyun!

 Original message 
From: Srinivas Reddy
Date: 11/13/18 12:46 AM (GMT-08:00)
To: dev@hbase.apache.org
Cc: Hbase-User
Subject: Re: [ANNOUNCE] New HBase committer Jingyun Tian

Congratulations Jingyun

-Srinivas
Typed on tiny keys. pls ignore typos. {mobile app}

On Tue 13 Nov, 2018, 15:54 张铎(Duo Zhang) wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that Jingyun
> Tian has accepted the PMC's invitation to become a committer on the
> project. We appreciate all of Jingyun's generous contributions thus far and
> look forward to his continued involvement.
>
> Congratulations and welcome, Jingyun!

[jira] [Created] (HBASE-21466) WALProcedureStore uses wrong FileSystem if wal.dir is on different FileSystem as rootdir

2018-11-11 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21466:
--

 Summary: WALProcedureStore uses wrong FileSystem if wal.dir is on 
different FileSystem as rootdir
 Key: HBASE-21466
 URL: https://issues.apache.org/jira/browse/HBASE-21466
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


In the WALProcedureStore constructor, the fs field is initialized this way:
{code}
this.fs = walDir.getFileSystem(conf);
{code}
However, when wal.dir is on a different FileSystem than rootdir, the above 
would return the wrong FileSystem.
In the modified TestMasterProcedureEvents, without the fix, the master 
wouldn't initialize.
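The underlying hazard is that wal.dir and rootdir may resolve to different filesystems, so a FileSystem handle must always be derived from the path it will actually operate on. A minimal sketch of that rule, using plain URIs as stand-ins for Hadoop's Path/FileSystem (the scheme selects the filesystem):

```java
import java.net.URI;

// Sketch: resolve a "filesystem" from each path's own URI instead of
// reusing a handle obtained from some other directory.
class FsResolution {
    // stand-in for Path#getFileSystem(conf)
    static String fileSystemFor(URI path) {
        String scheme = path.getScheme();
        return scheme == null ? "default" : scheme;
    }

    public static void main(String[] args) {
        URI rootDir = URI.create("wasb://container@account/hbase");
        URI walDir = URI.create("hdfs://namenode:8020/hbase/wals");
        // wrong: reuse the rootdir handle for wal operations
        String rootFs = fileSystemFor(rootDir);
        // right: resolve the handle from the wal dir itself
        String walFs = fileSystemFor(walDir);
        System.out.println(rootFs.equals(walFs)); // false: they differ
    }
}
```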





[jira] [Created] (HBASE-21457) BackupUtils#getWALFilesOlderThan refers to wrong FileSystem

2018-11-08 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21457:
--

 Summary: BackupUtils#getWALFilesOlderThan refers to wrong 
FileSystem
 Key: HBASE-21457
 URL: https://issues.apache.org/jira/browse/HBASE-21457
 Project: HBase
  Issue Type: Bug
Reporter: Janos Gub


Janos reported seeing a backup test failure when testing with a local HDFS 
for WALs while using WASB/ADLS only for store files.

Janos spotted the code in BackupUtils#getWALFilesOlderThan, which uses the 
HBase root dir for retrieving WAL files.

We should use the helper methods from CommonFSUtils.
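A hedged sketch of the intended logic (illustrative names, not BackupUtils' actual code): enumerate files under the WAL directory itself, not the HBase root dir, and keep those whose modification time precedes the cutoff.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: list "WAL files" older than a cutoff from the WAL directory.
class WalFileScan {
    static List<Path> walFilesOlderThan(Path walDir, long cutoffMillis) throws IOException {
        try (Stream<Path> files = Files.list(walDir)) {   // scan walDir, not rootDir
            return files.filter(p -> {
                try {
                    return Files.getLastModifiedTime(p).toMillis() < cutoffMillis;
                } catch (IOException e) {
                    return false; // skip files we cannot stat
                }
            }).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path walDir = Files.createTempDirectory("wals");
        Path old = Files.createFile(walDir.resolve("wal.old"));
        Files.setLastModifiedTime(old, FileTime.fromMillis(1000)); // very old
        Files.createFile(walDir.resolve("wal.new"));               // fresh
        List<Path> result = walFilesOlderThan(walDir, System.currentTimeMillis() - 60_000);
        System.out.println(result.size()); // 1: only wal.old qualifies
    }
}
```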





[jira] [Reopened] (HBASE-21247) Custom Meta WAL Provider doesn't default to custom WAL Provider whose configuration value is outside the enums in Providers

2018-11-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21247:


> Custom Meta WAL Provider doesn't default to custom WAL Provider whose 
> configuration value is outside the enums in Providers
> ---
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.branch-2.patch, 21247.v1.txt, 21247.v10.txt, 
> 21247.v11.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 
> 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability of a custom Meta WAL Provider to default to a 
> custom WAL Provider which is supplied by class name.
> This issue fixes the bug by allowing the specification of a new WAL Provider 
> class name using the config "hbase.wal.provider".
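The resolution order the fix implies can be sketched as follows (illustrative, not WALFactory's exact code; the enum constants and the "enum:"/"class:" tags are stand-ins): try the configured value as a known enum name first, and otherwise treat it as a fully-qualified provider class name.

```java
// Sketch of provider resolution: enum name first, class name fallback.
class WalProviderResolver {
    enum Providers { defaultProvider, filesystem, multiwal, asyncfs }

    static String resolveProvider(String configured, String fallback) {
        String value = configured != null ? configured : fallback;
        for (Providers p : Providers.values()) {
            if (p.name().equals(value)) {
                return "enum:" + value;   // one of the built-in providers
            }
        }
        // not an enum name: interpret it as a custom provider class name
        return "class:" + value;
    }

    public static void main(String[] args) {
        System.out.println(resolveProvider("asyncfs", "defaultProvider"));
        System.out.println(resolveProvider("com.example.MyWALProvider", "defaultProvider"));
        // meta provider unset: default to the custom WAL provider class name
        System.out.println(resolveProvider(null, "com.example.MyWALProvider"));
    }
}
```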





[jira] [Created] (HBASE-21438) TestAdmin2#testGetProcedures fails due to FailedProcedure inaccessible

2018-11-05 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21438:
--

 Summary: TestAdmin2#testGetProcedures fails due to FailedProcedure 
inaccessible
 Key: HBASE-21438
 URL: https://issues.apache.org/jira/browse/HBASE-21438
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


From 
https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1863/testReport/org.apache.hadoop.hbase.client/TestAdmin2/testGetProcedures/ :
{code}
Mon Nov 05 04:52:13 UTC 2018, RpcRetryingCaller{globalStartTime=1541393533029, 
pause=250, maxAttempts=7}, 
org.apache.hadoop.hbase.procedure2.BadProcedureException: 
org.apache.hadoop.hbase.procedure2.BadProcedureException: The procedure class 
org.apache.hadoop.hbase.procedure2.FailedProcedure must be accessible and have 
an empty constructor
 at 
org.apache.hadoop.hbase.procedure2.ProcedureUtil.validateClass(ProcedureUtil.java:82)
 at 
org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProtoProcedure(ProcedureUtil.java:162)
 at 
org.apache.hadoop.hbase.master.MasterRpcServices.getProcedures(MasterRpcServices.java:1249)
 at 
org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}
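The kind of check ProcedureUtil#validateClass performs can be sketched with plain reflection (simplified, not HBase's actual code): the class must be public and expose a no-arg constructor so it can be re-instantiated from its serialized form.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

// Sketch of an "accessible with an empty constructor" check.
class ProcedureClassCheck {
    static boolean isInstantiable(Class<?> clazz) {
        if (!Modifier.isPublic(clazz.getModifiers())) {
            return false; // inaccessible, e.g. a package-private class
        }
        try {
            Constructor<?> ctor = clazz.getDeclaredConstructor(); // no-arg ctor
            return Modifier.isPublic(ctor.getModifiers());
        } catch (NoSuchMethodException e) {
            return false; // no empty constructor at all
        }
    }

    public static class GoodProcedure { public GoodProcedure() {} }
    static class HiddenProcedure { HiddenProcedure() {} } // package-private: fails

    public static void main(String[] args) {
        System.out.println(isInstantiable(GoodProcedure.class));   // true
        System.out.println(isInstantiable(HiddenProcedure.class)); // false
    }
}
```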





[jira] [Created] (HBASE-21416) Intermittent TestRegionInfoDisplay failure due to shift in relTime of RegionState#toDescriptiveString

2018-10-31 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21416:
--

 Summary: Intermittent TestRegionInfoDisplay failure due to shift 
in relTime of RegionState#toDescriptiveString
 Key: HBASE-21416
 URL: https://issues.apache.org/jira/browse/HBASE-21416
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2.1/1799/testReport/junit/org.apache.hadoop.hbase.client/TestRegionInfoDisplay/testRegionDetailsForDisplay/ :
{code}
org.junit.ComparisonFailure: expected:<...:30 UTC 2018 (PT0.00[6]S ago), 
server=null> but was:<...:30 UTC 2018 (PT0.00[7]S ago), server=null>
at 
org.apache.hadoop.hbase.client.TestRegionInfoDisplay.testRegionDetailsForDisplay(TestRegionInfoDisplay.java:78)
{code}
Here is how toDescriptiveString composes relTime:
{code}
long relTime = System.currentTimeMillis() - stamp;
{code}
In the test, state.toDescriptiveString() is called twice for the assertion; 
different return values from System.currentTimeMillis() between the two calls 
caused the assertion to fail in the above occurrence.
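The flakiness and one way to make the comparison deterministic can be sketched as follows (illustrative; the formatting below is simplified, not RegionState's real output): derive both strings from a single captured clock reading instead of calling System.currentTimeMillis() twice.

```java
// Sketch: one clock read makes two relative-time strings always equal.
class RelTimeFlakiness {
    static String describe(long stamp, long now) {
        return "(PT0.00" + (now - stamp) + "S ago)"; // simplified formatting
    }

    public static void main(String[] args) {
        long stamp = System.currentTimeMillis();
        // flaky variant: two separate clock reads may differ by a millisecond
        // String a = describe(stamp, System.currentTimeMillis());
        // String b = describe(stamp, System.currentTimeMillis());
        long now = System.currentTimeMillis(); // capture the clock once
        String a = describe(stamp, now);
        String b = describe(stamp, now);
        System.out.println(a.equals(b)); // always true with one clock read
    }
}
```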





[jira] [Resolved] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-21180.

Resolution: Cannot Reproduce

> findbugs incurs DataflowAnalysisException for hbase-server module
> -
>
> Key: HBASE-21180
> URL: https://issues.apache.org/jira/browse/HBASE-21180
> Project: HBase
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Minor
>
> Running findbugs, I noticed the following in the hbase-server module:
> {code}
> [INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server 
> ---
> [INFO] Fork Value is true
>  [java] The following errors occurred during analysis:
>  [java]   Error generating derefs for 
> org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
>  [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get 
> position -1 of stack
>  [java]   At 
> edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250)
>  [java]   At 
> edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322)
>  [java]   At 
> edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005)
>  [java]   At 
> edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209)
>  [java]   At 
> edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76)
>  [java]   At 
> edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089)
>  [java]   At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283)
>  [java]   At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393)
>  [java]   At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200)
>  [java] The following classes needed for analysis were missing:
>  [java]   accept
>  [java]   apply
>  [java]   run
>  [java]   test
>  [java]   call
>  [java]   exec
>  [java]   getAsInt
>  [java]   applyAsLong
>  [java]   storeFile
>  [java]   get
>  [java]   visit
>  [java]   compare
> {code}





[jira] [Created] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21387:
--

 Summary: Race condition in snapshot cache refreshing leads to loss 
of snapshot files
 Key: HBASE-21387
 URL: https://issues.apache.org/jira/browse/HBASE-21387
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


A recent customer report showed ExportSnapshot failing:
{code}
2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
snapshot.SnapshotReferenceUtil: Can't find hfile: 
44f6c3c646e84de6a63fe30da4fcb3aa in the real 
(hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 or archive 
(hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 directory for the primary table. 
{code}
We found the following in log:
{code}
2018-10-09 18:54:23,675 DEBUG 
[00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
cleaner.HFileCleaner: Removing: 
hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
from archive
{code}
The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
There are two callers of refreshCache: one from RefreshCacheTask#run and the 
other from SnapshotHFileCleaner.
Let's look at the code of refreshCache:
{code}
// if the snapshot directory wasn't modified since we last check, we are 
done
if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;

// 1. update the modified time
this.lastModifiedTime = dirStatus.getModificationTime();

// 2.clear the cache
this.cache.clear();
{code}
Suppose the RefreshCacheTask runs past the if check and sets 
this.lastModifiedTime.
The cleaner then executes refreshCache and returns immediately, since 
this.lastModifiedTime matches the modification time of the directory.
Now RefreshCacheTask clears the cache. By the time the cleaner performs the 
cache lookup, the cache is empty.
Therefore the cleaner puts the file into unReferencedFiles, leading to data 
loss.
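One way to close this window can be sketched as follows (illustrative, not the patch actually committed): make the modification-time check, the cache clear, and the rebuild a single atomic step, so a concurrent caller can never observe a cleared-but-not-yet-rebuilt cache.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: the timestamp check, clear and rebuild happen under one lock.
class AtomicRefreshCache {
    private long lastModifiedTime = -1;
    private final Set<String> cache = new HashSet<>();

    synchronized void refreshCache(long dirModificationTime, Set<String> snapshotFiles) {
        if (dirModificationTime <= lastModifiedTime) {
            return; // nothing changed since the last refresh
        }
        lastModifiedTime = dirModificationTime;
        cache.clear();                 // no other thread can look up here
        cache.addAll(snapshotFiles);   // rebuilt before the lock is released
    }

    synchronized boolean isReferenced(String file) {
        return cache.contains(file);
    }

    public static void main(String[] args) {
        AtomicRefreshCache c = new AtomicRefreshCache();
        c.refreshCache(100L, new HashSet<>(java.util.Arrays.asList("hfile-1")));
        System.out.println(c.isReferenced("hfile-1")); // true
        // a later call with an unchanged mtime is a no-op; cache stays intact
        c.refreshCache(100L, new HashSet<>());
        System.out.println(c.isReferenced("hfile-1")); // still true
    }
}
```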





[jira] [Reopened] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21318:


> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, 
> HBASE-21318.master.004.patch
>
>
> In addition to enabling hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, a user can also run this client as a ToolRunner 
> class/CLI and call refresh HFiles directly.





[jira] [Resolved] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-21149.

   Resolution: Duplicate
Fix Version/s: (was: 3.0.0)

> TestIncrementalBackupWithBulkLoad may fail due to file copy failure
> ---
>
> Key: HBASE-21149
> URL: https://issues.apache.org/jira/browse/HBASE-21149
> Project: HBase
>  Issue Type: Test
>  Components: backuprestore
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 21149.v2.txt, HBASE-21149-v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
>  :
> {code}
> 2018-09-03 11:54:30,526 ERROR [Time-limited test] 
> impl.TableBackupClient(235): Unexpected Exception : Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
> java.io.IOException: Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
>   at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
>   at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> {code}
> However, some part of the test output was lost:
> {code}
> 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
> ...[truncated 398396 chars]...
> 8)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}





[jira] [Created] (HBASE-21381) Document the hadoop versions using which backup and restore feature works

2018-10-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21381:
--

 Summary: Document the hadoop versions using which backup and 
restore feature works
 Key: HBASE-21381
 URL: https://issues.apache.org/jira/browse/HBASE-21381
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


HADOOP-15850 fixes a bug where CopyCommitter#concatFileChunks unconditionally 
tried to concatenate the files being DistCp'ed to the target cluster (even 
though the files are independent).

Following is the log snippet of the failed concatenation attempt:
{code}
2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
job_local1795473782_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
   
160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
 length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
   
2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
 length = 5142 aclEntries = null, xAttrs = null}
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
{code}
Backup and Restore uses DistCp to transfer files between clusters.
Without the fix from HADOOP-15850, the transfer would fail.

This issue is to document the hadoop versions which contain HADOOP-15850 so 
that users of the Backup and Restore feature know which hadoop versions they 
can use.





[jira] [Created] (HBASE-21353) TestHBCKCommandLineParsing#testCommandWithOptions hangs on call to HBCK2#checkHBCKSupport

2018-10-20 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21353:
--

 Summary: TestHBCKCommandLineParsing#testCommandWithOptions hangs 
on call to HBCK2#checkHBCKSupport
 Key: HBASE-21353
 URL: https://issues.apache.org/jira/browse/HBASE-21353
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


I noticed the following when running 
TestHBCKCommandLineParsing#testCommandWithOptions :
{code}
"main" #1 prio=5 os_prio=31 tid=0x7f851c80 nid=0x1703 waiting on 
condition [0x70216000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00076d3055d8> (a 
java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
at 
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at 
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:564)
at 
org.apache.hadoop.hbase.client.ConnectionImplementation.(ConnectionImplementation.java:297)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:229)
at 
org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$11/502838712.run(Unknown
 Source)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:347)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:227)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:127)
at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:93)
at org.apache.hbase.HBCK2.run(HBCK2.java:352)
at 
org.apache.hbase.TestHBCKCommandLineParsing.testCommandWithOptions(TestHBCKCommandLineParsing.java:62)
{code}
The test doesn't spin up an hbase cluster.
Hence the call to check hbck support hangs.

In HBCK2#run, we can refactor the code so that argument parsing is done prior 
to calling HBCK2#checkHBCKSupport.
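The suggested refactor can be sketched as follows (illustrative, not HBCK2's actual code; the command names are placeholders): validate and parse the command line first, and only contact the cluster once the arguments are known to be well-formed.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: pure argument parsing happens before any cluster RPC.
class ArgsBeforeCluster {
    static final List<String> KNOWN = Arrays.asList("assigns", "unassigns", "-h");

    static String parse(String[] args) {
        if (args.length == 0 || !KNOWN.contains(args[0])) {
            return "usage"; // bad input handled without any RPC
        }
        return args[0];
    }

    static int run(String[] args) {
        String command = parse(args);   // step 1: pure parsing, cannot hang
        if (command.equals("usage")) {
            return 1;                   // exits before touching the cluster
        }
        // step 2: only now would checkHBCKSupport() connect to the cluster
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(run(new String[] {"-h"}));    // 0
        System.out.println(run(new String[] {"bogus"})); // 1, no connection made
    }
}
```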





[jira] [Reopened] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21281:


> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.





[jira] [Created] (HBASE-21341) DeadServer shouldn't import unshaded Preconditions

2018-10-18 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21341:
--

 Summary: DeadServer shouldn't import unshaded Preconditions
 Key: HBASE-21341
 URL: https://issues.apache.org/jira/browse/HBASE-21341
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


DeadServer currently imports the unshaded Preconditions:
{code}
import com.google.common.base.Preconditions;
{code}
We should import the shaded version of Preconditions.

This is the only place where an unshaded class from com.google.common is 
imported.
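HBase relocates Guava under the org.apache.hbase.thirdparty prefix, so the shaded import is org.apache.hbase.thirdparty.com.google.common.base.Preconditions (hedged: the exact relocation prefix depends on the hbase-thirdparty version in use). The behavior relied on is just an argument check, sketched here without any Guava dependency:

```java
// Sketch of what Preconditions.checkArgument provides; no Guava needed.
class PreconditionsSketch {
    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    public static void main(String[] args) {
        checkArgument(1 + 1 == 2, "arithmetic is broken"); // passes silently
        try {
            checkArgument(false, "numProcessing cannot go negative"); // placeholder message
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```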





Re: [ANNOUNCE] Please welcome Zach York to the HBase PMC

2018-10-11 Thread Ted Yu
Congratulations, Zach!

On Thu, Oct 11, 2018 at 1:01 PM Sean Busbey  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that Zach
> York has accepted our invitation to become a PMC member on the Apache
> HBase project. We appreciate Zach stepping up to take more
> responsibility in the HBase project.
>
> Please join me in welcoming Zach to the HBase PMC!
>
> As a reminder, if anyone would like to nominate another person as a
> committer or PMC member, even if you are not currently a committer or
> PMC member, you can always drop a note to priv...@hbase.apache.org to
> let us know.
>


Re: Does Hbase backup process support encryption while transporting the data from one cluster to other cluster

2018-10-09 Thread Ted Yu
bq. Does copyTable support hashing of data while copying?

No.

bq.  Same for distcp utility ?

The above would get better answer posting on hadoop mailing list.

Thanks

On Tue, Oct 9, 2018 at 5:28 AM neo0731  wrote:

>
> Question arises when migrating the data from one hbase table to another.
>
> Input
>
> To sync the production cluster data with the dev cluster. Additionally, while
> copying we need to re-hash the following fields: hashed_email, lexer_id,
> foo_imsi, foo_msn, signal_uid, bar_imsi.
>
> The question is: Does copyTable support hashing of data while copying? Same
> for the distcp utility? Is it possible to supply some example code in Scala
> as well?
>
> Any help on it would be much appreciated.
>
>
>
> --
> Sent from:
> http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html
>
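Since CopyTable itself doesn't hash values while copying (per the answer above), re-hashing those fields would need a custom copy job. A minimal sketch of just the per-field re-hash step - the SHA-256 choice and unsalted scheme are assumptions for illustration, not anything CopyTable or distcp provide:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RehashSketch {
    // Re-hash a single field value; in a real copy job this would be applied
    // to the listed fields (hashed_email, lexer_id, ...) in a custom mapper.
    static String rehash(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rehash("user@example.com"));
    }
}
```

The transform itself is the easy part; the missing piece in CopyTable is any hook to invoke it per cell during the copy.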


[jira] [Created] (HBASE-21279) Split TestAdminShell into several tests

2018-10-08 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21279:
--

 Summary: Split TestAdminShell into several tests
 Key: HBASE-21279
 URL: https://issues.apache.org/jira/browse/HBASE-21279
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


In the flaky test board, TestAdminShell often timed out 
(https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/branch-2/lastSuccessfulBuild/artifact/dashboard.html).

I ran the test on Linux with SSD and reproduced the timeout (see attached test 
output).
{code}
2018-10-08 02:36:09,146 DEBUG [main] hbase.HBaseTestingUtility(351): Setting 
hbase.rootdir to 
/mnt/disk2/a/2-hbase/hbase-shell/target/test-data/a103d8e4-695c-a5a9-6690-1ef2580050f9
...
2018-10-08 02:49:09,093 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=7] 
master.MasterRpcServices(1171): Checking to see if procedure is done pid=871
Took 0.7262 seconds2018-10-08 02:49:09,324 DEBUG [PEWorker-1] 
util.FSTableDescriptors(684): Wrote into 
hdfs://localhost:43859/user/hbase/test-data/cefc73d9-cc37-d2a6-b92b-   
d935316c9241/.tmp/data/default/hbase_shell_tests_table/.tabledesc/.tableinfo.01
2018-10-08 02:49:09,328 INFO  
[RegionOpenAndInitThread-hbase_shell_tests_table-1] regionserver.HRegion(7004): 
creating HRegion hbase_shell_tests_table HTD == 
'hbase_shell_tests_table', {NAME => 'x', VERSIONS => '5', EVICT_BLOCKS_ON_CLOSE 
=> 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', 
  CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', 
TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 
'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', 
CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', 
COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'},  {NAME 
=> 'y', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR 
=> 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false',  
DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', 
REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 
'false', IN_MEMORY => 'false',  CACHE_BLOOMS_ON_WRITE => 'false', 
PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 
'true', BLOCKSIZE => '65536'} RootDir = hdfs://localhost:43859/
user/hbase/test-data/cefc73d9-cc37-d2a6-b92b-d935316c9241/.tmp Table name == 
hbase_shell_tests_table
^[[38;5;226mE^[[0m
===
Error: ^[[48;5;16;38;5;226;1mtest_Get_simple_status(Hbase::StatusTest)^[[0m: 
Java::JavaIo::InterruptedIOException: Interrupt while waiting on Operation: 
CREATE, Table Name:  default:hbase_shell_tests_table, procId: 871
2018-10-08 02:49:09,361 INFO  [Block report processor] 
blockmanagement.BlockManager(2645): BLOCK* addStoredBlock: blockMap updated: 
127.0.0.1:41338 is added to   
blk_1073742193_1369{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-ecc89143-e0a5-4a1c-b552-120be2561334:NORMAL:127.0.0.1:
   41338|RBW]]} size 58
> TEST TIMED OUT. PRINTING THREAD DUMP. <
{code}
We can see that procedure #871 wasn't stuck - the timeout kicked in and 
stopped the test.

We should separate the current test into two (or more) test files (with 
corresponding .rb files) so that the execution time consistently stays under 
the limit.





[jira] [Created] (HBASE-21272) Re-add assertions for RS Group admin tests

2018-10-05 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21272:
--

 Summary: Re-add assertions for RS Group admin tests
 Key: HBASE-21272
 URL: https://issues.apache.org/jira/browse/HBASE-21272
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 1.5.0


The checked-in version of HBASE-21258 for branch-1 didn't include assertions 
for adding / removing RS group coprocessor hook calls.

This issue is to add the assertions to the corresponding tests in TestRSGroupsAdmin1.





Re: [VOTE] The first HBase 1.4.8 release candidate (RC0) is available

2018-10-04 Thread Ted Yu
+1

 - verified checksums and signatures: good
 - basic checking on Web UI : good
 - ran test suite with : good

Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe;
2018-06-17T18:33:14Z)
Maven home: /apache-maven-3.5.4
Java version: 1.8.0_161, vendor: Oracle Corporation, runtime:
/mnt/disk2/a/jdk1.8.0_161/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-327.28.3.el7.x86_64", arch: "amd64",
family: "unix"

 - Ran LTT with 1M rows : good

On Wed, Oct 3, 2018 at 9:34 AM Andrew Purtell  wrote:

> RC errata:
>
> TestRSGroups will pass in isolation but may fail when run as part of the
> suite. There have been several JIRAs filed against this unit like
> HBASE-19444, HBASE-19461, HBASE-20137, and mentioned on HBASE-21187. Its
> running time is far too long. I have filed HBASE-21265 to split it up. Test
> stabilization work would be a part of that. I don't think this rises to
> the level of failing the RC vote because TestRSGroups will pass
> consistently, at least for me, when run in isolation. I do agree that
> without work to improve the test it doesn't offer the kind of functional
> assurance we'd like to derive from a unit test.
>
>
> On Tue, Oct 2, 2018 at 5:57 PM Andrew Purtell  wrote:
>
> > The first HBase 1.4.8 release candidate (RC0) is available for download
> at
> > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.8RC0/ and Maven
> > artifacts are available in the temporary repository
> > https://repository.apache.org/content/repositories/orgapachehbase-1233/
> >
> > The git tag corresponding to the candidate is '1.4.8RC0' (91118ce5f1).
> >
> > A detailed source and binary compatibility report for this release is
> > available for your review at
> >
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.4.8RC0/compat-check-report.html
> > . There are no reported compatibility issues.
> >
> > A list of the 33 issues resolved in this release can be found at
> > https://s.apache.org/xpxo .
> >
> > Please try out the candidate and vote +1/0/-1.
> >
> > The vote will be open for at least 72 hours. Unless objection I will try
> > to close it Monday October 8, 2018 if we have sufficient votes.
> >
> > Prior to making this announcement I made the following preflight checks:
> >
> > RAT check passes (7u80)
> > Unit test suite passes (7u80, 8u181)
> > Opened the UI in a browser, poked around
> > LTT load 1M rows with 100% verification and 20% updates (8u181)
> > ITBLL 500M rows with serverKilling monkey (8u181)
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >- A23, Crosstalk
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


[jira] [Reopened] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations

2018-10-01 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21221:


> Ineffective assertion in TestFromClientSide3#testMultiRowMutations
> --
>
> Key: HBASE-21221
> URL: https://issues.apache.org/jira/browse/HBASE-21221
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 21221.addendum.txt, 21221.v10.txt, 21221.v11.txt, 
> 21221.v12.txt, 21221.v7.txt, 21221.v8.txt, 21221.v9.txt
>
>
> Observed the following in 
> org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt :
> {code}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException: Timed out waiting for lock for row: ROW-1 in region 
> 089bdfa75f44d88e596479038a6da18b
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424)
>   at 
> org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463)
> ...
> Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp 
> should fail because the target lock is blocked by previous put
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> Here is related code:
> {code}
>   cpService.execute(() -> {
> ...
> if (!threw) {
>   // Can't call fail() earlier because the catch would eat it.
>   fail("This cp should fail because the target lock is blocked by 
> previous put");
> }
> {code}
> Since the fail() call is executed by the cpService, the assertion had no 
> bearing on the outcome of the test.





[jira] [Created] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-09-30 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21261:
--

 Summary: Add log4j.properties for hbase-rsgroup tests
 Key: HBASE-21261
 URL: https://issues.apache.org/jira/browse/HBASE-21261
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log.
It turns out that under hbase-rsgroup/src/test/resources there is no 
log4j.properties.

This issue adds log4j.properties for hbase-rsgroup tests.

This would be useful when finding root cause for hbase-rsgroup test failure(s).
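A minimal log4j.properties of the kind this issue adds might look like the following (the appender layout is illustrative, not the exact file committed):

```properties
# Illustrative test logging config; not the exact file from the patch.
log4j.rootLogger=DEBUG,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n
```

With this in src/test/resources, DEBUG output from the module's tests reaches the console and the surefire output files.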





[jira] [Reopened] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.

2018-09-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21207:


> Add client side sorting functionality in master web UI for table and region 
> server details.
> ---
>
> Key: HBASE-21207
> URL: https://issues.apache.org/jira/browse/HBASE-21207
> Project: HBase
>  Issue Type: Improvement
>  Components: master, monitoring, UI, Usability
>Reporter: Archana Katiyar
>Assignee: Archana Katiyar
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, 
> 21207.branch-1.addendum.patch, 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, 
> HBASE-21207-branch-1.patch, HBASE-21207-branch-1.v1.patch, 
> HBASE-21207-branch-2.v1.patch, HBASE-21207.patch, HBASE-21207.patch, 
> HBASE-21207.v1.patch, edc5c812-b928-11e8-87e2-ce6396629bbc.png
>
>
> In Master UI, we can see region server details like requests per second and 
> number of regions etc. Similarly, for tables we can see online regions and 
> offline regions.
> It will help ops people in determining hot spotting if we can provide sort 
> functionality in the UI.





[jira] [Created] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups

2018-09-29 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21258:
--

 Summary: Add resetting of flags for RS Group pre/post hooks in 
TestRSGroups
 Key: HBASE-21258
 URL: https://issues.apache.org/jira/browse/HBASE-21258
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: Ted Yu


On HBASE-20627, [~xucang] reminded me that the resetting of flags for RS 
Group pre/post hooks in TestRSGroups was absent.

This issue is to add the resetting of these flags before each subtest starts.





Re: QA run unable to process patches

2018-09-27 Thread Ted Yu
From https://builds.apache.org/job/PreCommit-HBASE-Build/14515/console :

22:32:24 [Thu Sep 27 22:32:24 UTC 2018 DEBUG]: jira_http_fetch:
https://issues.apache.org/jira/browse/HBASE-21247 returned 4xx status
code. Maybe incorrect username/password?
22:32:24 [Thu Sep 27 22:32:24 UTC 2018 DEBUG]: jira_locate_patch: not a JIRA.


FYI


On Thu, Sep 27, 2018 at 3:32 PM Sean Busbey  wrote:

> Try rebuilding with the debug flag on.
>
> On Thu, Sep 27, 2018, 14:17 Ted Yu  wrote:
>
> > Hi,
> > Starting this morning, some QA bot runs ended with something similar to
> the
> > following (
> > https://builds.apache.org/job/PreCommit-HBASE-Build/14508/console
> > ):
> >
> > 05:00:34 ERROR: Unsure how to process HBASE-21242.
> >
> >
> > I wonder if someone has an idea where I should look in order to determine
> > the root cause.
> >
> >
> > Thanks
> >
>


QA run unable to process patches

2018-09-27 Thread Ted Yu
Hi,
Starting this morning, some QA bot runs ended with something similar to the
following (https://builds.apache.org/job/PreCommit-HBASE-Build/14508/console
):

05:00:34 ERROR: Unsure how to process HBASE-21242.


I wonder if someone has an idea where I should look in order to determine
the root cause.


Thanks


[jira] [Created] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers

2018-09-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21247:
--

 Summary: Allow WAL Provider to be specified by configuration 
without explicit enum in Providers
 Key: HBASE-21247
 URL: https://issues.apache.org/jira/browse/HBASE-21247
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 21247.v1.txt

Currently all the WAL providers acceptable to HBase are specified in the 
Providers enum of WALFactory.
This restricts the ability for additional WAL providers to be supplied by 
class name.

This issue introduces an additional config which allows a new WAL provider 
to be specified through its class name.
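With such a change, a provider outside the enum could be selected along the following lines - the provider class is hypothetical, and the config key is an assumption based on the existing hbase.wal.provider convention:

```xml
<property>
  <name>hbase.wal.provider</name>
  <!-- hypothetical provider class, loaded by name instead of enum lookup -->
  <value>org.example.wal.StreamWALProvider</value>
</property>
```

WALFactory would first try the enum names and, failing that, fall back to loading the value as a class.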





[jira] [Created] (HBASE-21246) Introduce WALIdentity interface

2018-09-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21246:
--

 Summary: Introduce WALIdentity interface
 Key: HBASE-21246
 URL: https://issues.apache.org/jira/browse/HBASE-21246
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Assignee: Ted Yu


We are introducing the WALIdentity interface so that the WAL representation can 
be decoupled from the distributed filesystem.

The interface provides a getName method whose return value can represent a 
filename in a distributed filesystem environment or the name of the stream when 
the WAL is backed by a log stream.
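A hypothetical sketch of the shape described above - the implementation classes and everything beyond getName() are assumptions, not the committed design:

```java
// Proposed abstraction: a WAL is identified by a name, regardless of backend.
interface WALIdentity {
    String getName();
}

public class WALIdentityDemo {
    // A filesystem-backed WAL identifies itself by its file path.
    static class FsWALIdentity implements WALIdentity {
        private final String path;
        FsWALIdentity(String path) { this.path = path; }
        public String getName() { return path; }
    }

    // A stream-backed WAL identifies itself by its stream name.
    static class StreamWALIdentity implements WALIdentity {
        private final String stream;
        StreamWALIdentity(String stream) { this.stream = stream; }
        public String getName() { return stream; }
    }

    public static void main(String[] args) {
        WALIdentity fs = new FsWALIdentity("hdfs://ns/hbase/WALs/rs1/wal.1");
        WALIdentity stream = new StreamWALIdentity("wal-stream-rs1");
        // Callers see only WALIdentity, so replication/splitting code need not
        // assume a Path on a distributed filesystem.
        System.out.println(fs.getName());
        System.out.println(stream.getName());
    }
}
```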






[jira] [Created] (HBASE-21238) MapReduceHFileSplitterJob#run shouldn't call System.exit

2018-09-26 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21238:
--

 Summary: MapReduceHFileSplitterJob#run shouldn't call System.exit
 Key: HBASE-21238
 URL: https://issues.apache.org/jira/browse/HBASE-21238
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


{code}
if (args.length < 2) {
  usage("Wrong number of arguments: " + args.length);
  System.exit(-1);
{code}
The correct way of handling the error condition is through the return value of the run method.
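A sketch of that fix - the class and method names are stand-ins for MapReduceHFileSplitterJob, not the actual patch: errors surface through run()'s return value, and only the entry point translates that into a process exit code.

```java
public class SplitterJobSketch {
    static int run(String[] args) {
        if (args.length < 2) {
            System.err.println("Wrong number of arguments: " + args.length);
            return -1;  // report failure via the return value, not System.exit
        }
        // ... job setup and submission would go here ...
        return 0;
    }

    public static void main(String[] args) {
        // The ToolRunner-style entry point is the one place that may exit.
        System.exit(run(args));
    }
}
```

Keeping System.exit out of run() also lets tests and embedding callers invoke the job without killing the JVM.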





[jira] [Created] (HBASE-21230) BackupUtils#checkTargetDir doesn't compose error message correctly

2018-09-25 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21230:
--

 Summary: BackupUtils#checkTargetDir doesn't compose error message 
correctly
 Key: HBASE-21230
 URL: https://issues.apache.org/jira/browse/HBASE-21230
 Project: HBase
  Issue Type: Bug
  Components: backuprestore
Reporter: Ted Yu


Here is related code:
{code}
  String expMsg = e.getMessage();
  String newMsg = null;
  if (expMsg.contains("No FileSystem for scheme")) {
newMsg =
"Unsupported filesystem scheme found in the backup target url. 
Error Message: "
+ newMsg;
{code}
I think the intention was to concatenate expMsg at the end of newMsg.
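A minimal demonstration of the bug and the intended behavior (the helper methods are illustrative, not HBase code):

```java
// The snippet above concatenates the still-null newMsg instead of expMsg,
// so the original error message is lost.
public class MsgConcatDemo {
    static final String PREFIX =
        "Unsupported filesystem scheme found in the backup target url. Error Message: ";

    static String buggy(String expMsg) {
        String newMsg = null;
        newMsg = PREFIX + newMsg;  // appends "null", dropping expMsg entirely
        return newMsg;
    }

    static String fixed(String expMsg) {
        return PREFIX + expMsg;    // intended: append the original message
    }

    public static void main(String[] args) {
        String expMsg = "No FileSystem for scheme \"s3\"";
        System.out.println(buggy(expMsg));  // ends with "null"
        System.out.println(fixed(expMsg));
    }
}
```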





[jira] [Resolved] (HBASE-16627) AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table exists

2018-09-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-16627.

Resolution: Later

> AssignmentManager#isDisabledorDisablingRegionInRIT should check whether table 
> exists
> 
>
> Key: HBASE-16627
> URL: https://issues.apache.org/jira/browse/HBASE-16627
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Stephen Yuan Jiang
>Priority: Minor
>
> [~stack] first reported this issue when he played with the backup feature.
> The following exception can be observed in backup unit tests:
> {code}
> 2016-09-13 16:21:57,661 ERROR [ProcedureExecutor-3] 
> master.TableStateManager(134): Unable to get table hbase:backup state
> org.apache.hadoop.hbase.TableNotFoundException: hbase:backup
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:174)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:131)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.isDisabledorDisablingRegionInRIT(AssignmentManager.java:1221)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:739)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1567)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1546)
> at 
> org.apache.hadoop.hbase.util.ModifyRegionUtils.assignRegions(ModifyRegionUtils.java:254)
> at 
> org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.assignRegions(CreateTableProcedure.java:430)
> at 
> org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:127)
> at 
> org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:57)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:452)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1066)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:855)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:808)
> {code}
> AssignmentManager#isDisabledorDisablingRegionInRIT should take table 
> existence into account.





[jira] [Created] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations

2018-09-22 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21221:
--

 Summary: Ineffective assertion in 
TestFromClientSide3#testMultiRowMutations
 Key: HBASE-21221
 URL: https://issues.apache.org/jira/browse/HBASE-21221
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


Observed the following in 
org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt :
{code}
Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
java.io.IOException: Timed out waiting for lock for row: ROW-1 in region 
089bdfa75f44d88e596479038a6da18b
  at 
org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816)
  at 
org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424)
  at 
org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116)
  at 
org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266)
  at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182)
  at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481)
  at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463)
...
Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp 
should fail because the target lock is blocked by previous put
  at org.junit.Assert.fail(Assert.java:88)
  at 
org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
{code}
Here is related code:
{code}
  cpService.execute(() -> {
...
if (!threw) {
  // Can't call fail() earlier because the catch would eat it.
  fail("This cp should fail because the target lock is blocked by 
previous put");
}
{code}
Since the fail() call is executed by the cpService, the assertion had no 
bearing on the outcome of the test.





[jira] [Created] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky

2018-09-20 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21216:
--

 Summary: TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
 Key: HBASE-21216
 URL: https://issues.apache.org/jira/browse/HBASE-21216
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/ :
{code}
java.lang.AssertionError: Archived hfiles [] and table hfiles 
[9ca09392705f425f9c916beedc10d63c] is missing snapshot 
file:6739a09747e54189a4112a6d8f37e894
at 
org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370)
{code}
The file appeared in archive dir before hfile cleaners were run:
{code}
2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|-archive/
2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|data/
2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|---default/
2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|--test/
2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|-1237d57b63a7bdf067a930441a02514a/
2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|recovered.edits/
2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): 
|---4.seqid
2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|-29e1700e09b51223ad2f5811105a4d51/
2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|fam/
2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): 
|---2c66a18f6c1a4074b84ffbb3245268c4
2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
|---45bb396c6a5e49629e45a4d56f1e9b14
2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
|---6739a09747e54189a4112a6d8f37e894
{code}
However, the archive dir became empty after hfile cleaners were run:
{code}
2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|-archive/
2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): 
|-corrupt/
{code}
Leading to the assertion failure.





[jira] [Created] (HBASE-21198) Exclude dependency on net.minidev:json-smart

2018-09-14 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21198:
--

 Summary: Exclude dependency on net.minidev:json-smart
 Key: HBASE-21198
 URL: https://issues.apache.org/jira/browse/HBASE-21198
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


From 
https://builds.apache.org/job/PreCommit-HBASE-Build/14414/artifact/patchprocess/patch-javac-3.0.0.txt :
{code}
[ERROR] Failed to execute goal on project hbase-common: Could not resolve 
dependencies for project org.apache.hbase:hbase-common:jar:3.0.0-SNAPSHOT: 
Failed to collect dependencies at org.apache.hadoop:hadoop-common:jar:3.0.0 -> 
org.apache.hadoop:hadoop-auth:jar:3.0.0 -> 
com.nimbusds:nimbus-jose-jwt:jar:4.41.1 -> 
net.minidev:json-smart:jar:2.3-SNAPSHOT: Failed to read artifact descriptor for 
net.minidev:json-smart:jar:2.3-SNAPSHOT: Could not transfer artifact 
net.minidev:json-smart:pom:2.3-SNAPSHOT from/to dynamodb-local-oregon 
(https://s3-us-west-2.amazonaws.com/dynamodb-local/release): Access denied to: 
https://s3-us-west-2.amazonaws.com/dynamodb-local/release/net/minidev/json-smart/2.3-SNAPSHOT/json-smart-2.3-SNAPSHOT.pom
 , ReasonPhrase:Forbidden. -> [Help 1]
{code}
We should exclude the dependency on net.minidev:json-smart.

hbase-common/bin/pom.xml has done so.

The other pom.xml files should do the same.
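The exclusion would look roughly like this in the affected pom.xml files - the hadoop-common coordinates are taken from the dependency path in the error above; version management is left to the parent POM:

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <!-- keep the unreachable 2.3-SNAPSHOT artifact out of the build -->
    <exclusion>
      <groupId>net.minidev</groupId>
      <artifactId>json-smart</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```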





[jira] [Created] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-09-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21194:
--

 Summary: Add TestCopyTable which exercises MOB feature
 Key: HBASE-21194
 URL: https://issues.apache.org/jira/browse/HBASE-21194
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


Currently TestCopyTable doesn't cover table(s) with the MOB feature enabled.

We should add a variant that enables MOB on the table being copied and verifies 
that the MOB content is copied correctly.





[jira] [Created] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module

2018-09-10 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21180:
--

 Summary: findbugs incurs DataflowAnalysisException for 
hbase-server module
 Key: HBASE-21180
 URL: https://issues.apache.org/jira/browse/HBASE-21180
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


Running findbugs, I noticed the following in hbase-server module:
{code}
[INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server ---
[INFO] Fork Value is true
 [java] The following errors occurred during analysis:
 [java]   Error generating derefs for 
org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
 [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get 
position -1 of stack
 [java]   At edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250)
 [java]   At 
edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141)
 [java]   At 
edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50)
 [java]   At 
edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31)
 [java]   At 
edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369)
 [java]   At 
edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322)
 [java]   At 
edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005)
 [java]   At 
edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209)
 [java]   At 
edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76)
 [java]   At 
edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089)
 [java]   At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283)
 [java]   At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393)
 [java]   At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200)
 [java] The following classes needed for analysis were missing:
 [java]   accept
 [java]   apply
 [java]   run
 [java]   test
 [java]   call
 [java]   exec
 [java]   getAsInt
 [java]   applyAsLong
 [java]   storeFile
 [java]   get
 [java]   visit
 [java]   compare
{code}





[jira] [Created] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-09-09 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21175:
--

 Summary: Partially initialized SnapshotHFileCleaner leads to NPE 
during TestHFileArchiving
 Key: HBASE-21175
 URL: https://issues.apache.org/jira/browse/HBASE-21175
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
test.
When SnapshotHFileCleaner.init() is called, there is no master parameter passed 
in {{params}}.

When the chore runs the cleaner during the test, an NPE comes out of this line 
in getDeletableFiles():
{code}
  return cache.getUnreferencedFiles(files, master.getSnapshotManager());
{code}
since master is null.

We should either check for the null master or pass the master instance properly 
when constructing the cleaner instance.





[jira] [Resolved] (HBASE-21129) Clean up duplicate codes in #equals and #hashCode methods of Filter

2018-09-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-21129.

Resolution: Fixed

> Clean up duplicate codes in #equals and #hashCode methods of Filter
> ---
>
> Key: HBASE-21129
> URL: https://issues.apache.org/jira/browse/HBASE-21129
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21129.addendum, HBASE-21129.master.001.patch, 
> HBASE-21129.master.002.patch, HBASE-21129.master.003.patch, 
> HBASE-21129.master.004.patch, HBASE-21129.master.005.patch, 
> HBASE-21129.master.006.patch, HBASE-21129.master.007.patch, 
> HBASE-21129.master.008.patch
>
>
> It is a follow-up of HBASE-19008, aiming to clean up duplicate codes in 
> #equals and #hashCode methods. 





[jira] [Created] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-06 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21160:
--

 Summary: Assertion in 
TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels 
is ignored
 Key: HBASE-21160
 URL: https://issues.apache.org/jira/browse/HBASE-21160
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
(HBASE-21138 QA run):
{code}
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
 [AssertionFailureIgnored] This assertion throws an AssertionError if it fails, 
which will be caught by an enclosing try block.
{code}
Here is related code:
{code}
  PrivilegedExceptionAction scanAction = new 
PrivilegedExceptionAction() {
@Override
public Void run() throws Exception {
  try (Connection connection = ConnectionFactory.createConnection(conf);
...
assertEquals(1, next.length);
  } catch (Throwable t) {
throw new IOException(t);
  }
{code}
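One way to address the warning is to rethrow AssertionError before the blanket Throwable catch, so the test framework sees the assertion failure instead of a wrapped IOException. The sketch below uses an illustrative assertEquals helper, not the actual test code:

```java
import java.io.IOException;

public class AssertionRethrowSketch {
  static void assertEquals(int expected, int actual) {
    if (expected != actual) {
      throw new AssertionError("expected " + expected + " but was " + actual);
    }
  }

  // Stand-in for the PrivilegedExceptionAction body in the test.
  static void scan(int resultCount) throws IOException {
    try {
      assertEquals(1, resultCount);
    } catch (AssertionError e) {
      throw e; // let JUnit report the assertion failure directly
    } catch (Throwable t) {
      throw new IOException(t); // wrap only genuine I/O or runtime problems
    }
  }

  public static void main(String[] args) throws IOException {
    scan(1);
    System.out.println("ok");
  }
}
```

Moving the assertion outside the try block would work just as well.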





[jira] [Reopened] (HBASE-21150) Avoid delay in first flushes due to overheads in table metrics registration

2018-09-04 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21150:


I didn't open this issue for backporting.

HBASE-15728 is still in master and the delay in first flushes is still there.

> Avoid delay in first flushes due to overheads in table metrics registration
> ---
>
> Key: HBASE-21150
> URL: https://issues.apache.org/jira/browse/HBASE-21150
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21150.v1.txt, 21150.v2.txt, 21150.v3.txt
>
>
> After HBASE-15728 is integrated, the lazy table metrics registration results 
> in penalty for the first flushes.
> Excerpt from log shows delay (note the same timestamp 08:18:23,234) :
> {code:java}
> 2018-09-02 08:18:23,232 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(124): Creating new  
> MetricsTableSourceImpl for table 'testtb-1535901500805'
> 2018-09-02 08:18:23,233 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(137): registering metrics for testtb-   
> 1535901500805
> 2018-09-02 08:18:23,234 INFO  
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343,   
> heapSize ~5.16 KB/5280, currentSize=0 B/0 for 
> fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction 
> requested=false
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup]
> {code}
> This is a regression.
> When first region of the table is opened on region server, we can proactively 
> register table metrics.
> This would avoid the penalty on first flushes for the table.





[jira] [Created] (HBASE-21150) Avoid delay in first flushes due to contention in table metrics registration

2018-09-04 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21150:
--

 Summary: Avoid delay in first flushes due to contention in table 
metrics registration
 Key: HBASE-21150
 URL: https://issues.apache.org/jira/browse/HBASE-21150
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


After HBASE-15728 was integrated, the lazy table metrics registration results in 
a penalty for the first flushes.
An excerpt from the log shows the delay (note the same timestamp, 08:18:23,234):
{code}
2018-09-02 08:18:23,232 DEBUG 
[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
regionserver.MetricsTableSourceImpl(124): Creating new  
MetricsTableSourceImpl for table 'testtb-1535901500805'
2018-09-02 08:18:23,233 DEBUG 
[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
regionserver.MetricsTableSourceImpl(137): registering metrics for testtb-   
1535901500805
2018-09-02 08:18:23,234 INFO  
[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343,   
heapSize ~5.16 KB/5280, currentSize=0 B/0 for fa403f6a4fb8dbc1a1c389744fce2d58 
in 280ms, sequenceid=5, compaction requested=false
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup]
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register  
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup]
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup]
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup]
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register  
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup]
2018-09-02 08:18:23,234 DEBUG 
[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
testtb-1535901500805 
Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup]
{code}
This is a regression.

When the first region of a table is opened on a region server, we can 
proactively register the table metrics.
This would avoid the penalty on the table's first flushes.
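The proactive registration could be sketched as follows. This is an illustrative stand-in for the real MetricsTableSource machinery, assuming a hook invoked on region open:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class EagerMetricsSketch {
  // Tables whose metrics have already been registered.
  static final Set<String> registered = ConcurrentHashMap.newKeySet();

  // Hook invoked when a region opens; pays the registration cost up front
  // so the table's first flush does not block on it.
  static void onRegionOpen(String table) {
    if (registered.add(table)) {
      registerTableMetrics(table);
    }
  }

  static void registerTableMetrics(String table) {
    // stand-in for the expensive metrics source registration
  }

  public static void main(String[] args) {
    onRegionOpen("testtb-1535901500805");
    onRegionOpen("testtb-1535901500805"); // second open is a no-op
    System.out.println(registered.size()); // 1
  }
}
```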





[jira] [Created] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-09-04 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21149:
--

 Summary: TestIncrementalBackupWithBulkLoad may fail due to file 
copy failure
 Key: HBASE-21149
 URL: https://issues.apache.org/jira/browse/HBASE-21149
 Project: HBase
  Issue Type: Test
  Components: backuprestore
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
:
{code}
2018-09-03 11:54:30,526 ERROR [Time-limited test] impl.TableBackupClient(235): 
Unexpected Exception : Failed copy from 
hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
 to hdfs://localhost:53075/backupUT/backup_1535975655488
java.io.IOException: Failed copy from 
hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
 to hdfs://localhost:53075/backupUT/backup_1535975655488
at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
at 
org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
{code}
However, some part of the test output was lost:
{code}
2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
...[truncated 398396 chars]...
8)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{code}





[jira] [Created] (HBASE-21141) Enable MOB in backup / restore test involving incremental backup

2018-09-02 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21141:
--

 Summary: Enable MOB in backup / restore test involving incremental 
backup
 Key: HBASE-21141
 URL: https://issues.apache.org/jira/browse/HBASE-21141
 Project: HBase
  Issue Type: Test
  Components: backuprestore
Reporter: Ted Yu


Currently we only have one test (TestRemoteBackup) where the MOB feature is 
enabled. That test only performs a full backup.

This issue is to enable MOB in backup / restore test(s) involving incremental 
backup.





[jira] [Created] (HBASE-21139) Concurrent invocations of MetricsTableAggregateSourceImpl.getOrCreateTableSource may return unregistered MetricsTableSource

2018-09-01 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21139:
--

 Summary: Concurrent invocations of 
MetricsTableAggregateSourceImpl.getOrCreateTableSource may return unregistered 
MetricsTableSource
 Key: HBASE-21139
 URL: https://issues.apache.org/jira/browse/HBASE-21139
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From the test output of TestRestoreFlushSnapshotFromClient:
{code}
2018-09-01 21:09:38,174 WARN  [member: 
'hw13463.attlocal.net,49623,1535861370108' subprocedure-pool6-thread-1] 
snapshot.  
RegionServerSnapshotManager$SnapshotSubprocedurePool(348): Got Exception in 
SnapshotSubprocedurePool
java.util.concurrent.ExecutionException: java.lang.NullPointerException
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:324)
  at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:173)
  at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:193)
  at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:189)
  at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:53)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
  at 
org.apache.hadoop.hbase.regionserver.MetricsTableSourceImpl.updateFlushTime(MetricsTableSourceImpl.java:375)
  at 
org.apache.hadoop.hbase.regionserver.MetricsTable.updateFlushTime(MetricsTable.java:56)
  at 
org.apache.hadoop.hbase.regionserver.MetricsRegionServer.updateFlush(MetricsRegionServer.java:210)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2826)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2444)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2416)
  at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2306)
  at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2209)
  at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:115)
  at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:77)
{code}
In MetricsTableAggregateSourceImpl.getOrCreateTableSource :
{code}
MetricsTableSource prev = tableSources.putIfAbsent(table, source);

if (prev != null) {
  return prev;
} else {
  // register the new metrics now
  register(source);
{code}
Suppose threads t1 and t2 execute the above code concurrently.
t1 calls putIfAbsent first and proceeds to run {{register(source)}}.
A context switch occurs; t2 reaches putIfAbsent and retrieves the instance 
stored by t1, which is not registered yet.
We would end up with what the stack trace showed.
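One possible fix, sketched with simplified stand-in types: perform the registration inside computeIfAbsent, whose per-key locking ensures no caller can observe an unregistered source. (This does hold the map's internal lock during registration, which is a trade-off; the names below are illustrative, not the actual HBase fix.)

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class GetOrCreateSketch {
  static class MetricsTableSource {
    volatile boolean registered;
  }

  static final ConcurrentMap<String, MetricsTableSource> tableSources =
      new ConcurrentHashMap<>();

  static MetricsTableSource getOrCreateTableSource(String table) {
    // computeIfAbsent runs the mapping function at most once per key and
    // blocks concurrent callers until it finishes, so every returned
    // source has already been registered.
    return tableSources.computeIfAbsent(table, t -> {
      MetricsTableSource source = new MetricsTableSource();
      source.registered = true; // stand-in for register(source)
      return source;
    });
  }

  public static void main(String[] args) {
    System.out.println(getOrCreateTableSource("t").registered); // true
  }
}
```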





[jira] [Created] (HBASE-21138) Close HRegion instance at the end of every test in TestHRegion

2018-08-31 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21138:
--

 Summary: Close HRegion instance at the end of every test in 
TestHRegion
 Key: HBASE-21138
 URL: https://issues.apache.org/jira/browse/HBASE-21138
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


TestHRegion has over 100 tests.
The following is from one subtest:
{code}
  public void testCompactionAffectedByScanners() throws Exception {
byte[] family = Bytes.toBytes("family");
this.region = initHRegion(tableName, method, CONF, family);
{code}
this.region is not closed at the end of the subtest.

testToShowNPEOnRegionScannerReseek is another example.

Every subtest should use the following construct toward the end:
{code}
} finally {
  HBaseTestingUtility.closeRegionAndWAL(this.region);
{code}
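A self-contained sketch of the suggested construct, with a trivial stand-in Region in place of HRegion and the test utility:

```java
public class CloseRegionSketch {
  static class Region implements AutoCloseable {
    boolean closed;
    @Override
    public void close() {
      closed = true; // stand-in for HBaseTestingUtility.closeRegionAndWAL(region)
    }
  }

  static Region runSubtest() {
    Region region = new Region(); // stand-in for initHRegion(tableName, ...)
    try {
      // ... subtest body: puts, scans, assertions ...
      return region;
    } finally {
      region.close(); // always release the region, even if the body throws
    }
  }

  public static void main(String[] args) {
    System.out.println(runSubtest().closed); // true
  }
}
```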





[jira] [Resolved] (HBASE-14783) Proc-V2: Master aborts when downgrading from 1.3 to 1.1

2018-08-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-14783.

Resolution: Later

> Proc-V2: Master aborts when downgrading from 1.3 to 1.1
> ---
>
> Key: HBASE-14783
> URL: https://issues.apache.org/jira/browse/HBASE-14783
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Stephen Yuan Jiang
>Priority: Major
>
> I was running ITBLL with 1.3 deployed on a 6 node cluster.
> Then I stopped the cluster, deployed 1.1 release and tried to start cluster.
> However, master failed to start due to:
> {code}
> 2015-11-06 00:58:40,351 FATAL [eval-test-2:2.activeMasterManager] 
> master.HMaster: Failed to become active master
> java.io.IOException: The procedure class 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure must be 
> accessible and have an empty constructor
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.newInstance(Procedure.java:548)
>   at org.apache.hadoop.hbase.procedure2.Procedure.convert(Procedure.java:640)
>   at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader.read(ProcedureWALFormatReader.java:105)
>   at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.load(ProcedureWALFormat.java:82)
>   at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.load(WALProcedureStore.java:298)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:275)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:434)
>   at 
> org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1208)
>   at 
> org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1107)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:694)
>   at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:186)
>   at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1713)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:191)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.newInstance(Procedure.java:536)
>   ... 12 more
> {code}
> The cause was that ServerCrashProcedure, written to some WAL file under 
> MasterProcWALs during the first run, was absent in the 1.1 release.
> After a brief discussion with Stephen, I am logging this JIRA to solicit 
> discussion on how the customer experience can be improved if a downgrade of 
> HBase is performed.





[jira] [Resolved] (HBASE-14716) Detection of orphaned table znode should cover table in Enabled state

2018-08-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-14716.

Resolution: Later

> Detection of orphaned table znode should cover table in Enabled state
> -
>
> Key: HBASE-14716
> URL: https://issues.apache.org/jira/browse/HBASE-14716
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
>  Labels: hbck
> Attachments: 14716-branch-1-v1.txt, 14716.branch-1.v4.txt
>
>
> HBASE-12070 introduced a fix for an orphaned table znode where the table 
> doesn't have an entry in hbase:meta.
> When Stephen and I investigated rolling upgrade failure,
> {code}
> 2015-10-27 18:21:10,668 WARN  [ProcedureExecutorThread-3] 
> procedure.CreateTableProcedure: The table smoketest does not exist in meta 
> but has a znode. run hbck to fix inconsistencies.
> {code}
> we found that the orphaned table znode corresponded to table in Enabled state.
> Therefore running hbck didn't report the inconsistency.
> Detection for orphaned table znode should cover this case.





[jira] [Created] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning

2018-08-22 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21097:
--

 Summary: Flush pressure assertion may fail in 
testFlushThroughputTuning 
 Key: HBASE-21097
 URL: https://issues.apache.org/jira/browse/HBASE-21097
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt
:
{code}
[ERROR] 
testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController)
  Time elapsed: 17.446 s  <<< FAILURE!
java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6>
at 
org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185)
{code}
Here is the related assertion:
{code}
assertEquals(0.0, regionServer.getFlushPressure(), EPSILON);
{code}
where EPSILON = 1E-6

In the above case, due to a margin of 2.9E-7, the assertion didn't pass.
It seems the epsilon can be adjusted to accommodate different workload / 
hardware combinations.
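The failure mode can be reproduced with the numbers from the report (a sketch; withinEpsilon is an illustrative helper, not the JUnit implementation):

```java
public class EpsilonSketch {
  static boolean withinEpsilon(double expected, double actual, double epsilon) {
    return Math.abs(expected - actual) <= epsilon;
  }

  public static void main(String[] args) {
    double pressure = 1.2906294173808417E-6; // observed flush pressure
    System.out.println(withinEpsilon(0.0, pressure, 1E-6)); // false: assertion fails
    System.out.println(withinEpsilon(0.0, pressure, 1E-5)); // true: widened epsilon passes
  }
}
```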





Re: [DISCUSS] Minimum Maven Version

2018-08-22 Thread Ted Yu
I would choose #1

bq. if we ever try to backport the hbase-spark module.

I doubt this would ever happen for the 2.x releases.

Cheers

On Wed, Aug 22, 2018 at 9:26 AM Mike Drob  wrote:

> Hi Devs,
>
> Our current minimum maven version is 3.0.4, this is both enforced by the
> enforcer plugin and documented in the ref guide. Over on HBASE-20175, our
> Artem is suggesting to use a newer version of the scala-maven-plugin but
> the latest version requires Maven 3.5.3
>
> It looks like we have a couple of options that I want to get feedback on:
>
> 1) Bump our minimum maven version for master branch. We can leave it at
> 3.0.4 for branch-1 and branch-2, but this would come up again if we ever
> try to backport the hbase-spark module.
>
> 2) Engage with the scala plugin community to try and get the plugin to work
> with older maven versions. I haven't done any feasibility study on this
> yet, and am not even sure which community we would be talking to.
>
> 3) See if the specific issues we are running into are solved by older
> versions of the plugin that are compatible with older versions of maven.
>
> 4) Do some transitive dependency exclusion magic instead of actually
> harmonizing the versions of things that we use.
>
> I'm leaning towards 1) or 4), but would be interested to hear thoughts from
> other parties.
>
> Mike
>


[jira] [Created] (HBASE-21088) HStoreFile should be closed in HStore#hasReferences

2018-08-21 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21088:
--

 Summary: HStoreFile should be closed in HStore#hasReferences
 Key: HBASE-21088
 URL: https://issues.apache.org/jira/browse/HBASE-21088
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


{code}
  reloadedStoreFiles = loadStoreFiles();
  return StoreUtils.hasReferences(reloadedStoreFiles);
{code}
The intention of obtaining the HStoreFiles is to check for references.
The loaded HStoreFiles should be closed prior to returning, to prevent a leak.
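The proposed pattern, sketched with a stand-in StoreFile type rather than the real HStoreFile: compute the answer, then close every loaded file before returning.

```java
import java.util.List;

public class CloseStoreFilesSketch {
  static class StoreFile implements AutoCloseable {
    final boolean isReference;
    boolean closed;
    StoreFile(boolean isReference) { this.isReference = isReference; }
    @Override
    public void close() { closed = true; } // stand-in for releasing readers
  }

  // Close every loaded file in a finally block so no reader is leaked,
  // whether or not a reference is found.
  static boolean hasReferences(List<StoreFile> reloadedStoreFiles) {
    try {
      return reloadedStoreFiles.stream().anyMatch(f -> f.isReference);
    } finally {
      for (StoreFile f : reloadedStoreFiles) {
        f.close();
      }
    }
  }

  public static void main(String[] args) {
    List<StoreFile> files = List.of(new StoreFile(false), new StoreFile(true));
    System.out.println(hasReferences(files)); // true
    System.out.println(files.get(0).closed && files.get(1).closed); // true
  }
}
```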





[jira] [Created] (HBASE-21076) TestTableResource fails with NPE

2018-08-20 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21076:
--

 Summary: TestTableResource fails with NPE
 Key: HBASE-21076
 URL: https://issues.apache.org/jira/browse/HBASE-21076
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


The following can be observed in the master branch:
{code}
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.rest.TestTableResource.setUpBeforeClass(TestTableResource.java:134)
{code}
The NPE comes from the following in TestEndToEndSplitTransaction:
{code}
compactAndBlockUntilDone(TEST_UTIL.getAdmin(),
  TEST_UTIL.getMiniHBaseCluster().getRegionServer(0), 
daughterA.getRegionName());
{code}
An initial check of the code shows that TestEndToEndSplitTransaction uses a 
TEST_UTIL instance created within TestEndToEndSplitTransaction, while 
TestTableResource creates its own instance of HBaseTestingUtility.
As a result, TEST_UTIL.getMiniHBaseCluster() returns null, since the instance 
created by TestEndToEndSplitTransaction has hbaseCluster set to null.





[jira] [Created] (HBASE-21042) processor.getRowsToLock() always assumes there is some row being locked in HRegion#processRowsWithLocks

2018-08-13 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21042:
--

 Summary: processor.getRowsToLock() always assumes there is some 
row being locked in HRegion#processRowsWithLocks
 Key: HBASE-21042
 URL: https://issues.apache.org/jira/browse/HBASE-21042
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


[~tdsilva] reported at the tail of HBASE-18998 that the fix for HBASE-18998 
missed the finally block of HRegion#processRowsWithLocks.

This issue is to fix that remaining call.





[jira] [Created] (HBASE-21040) printStackTrace() is used in RestoreDriver in case Exception is caught

2018-08-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21040:
--

 Summary: printStackTrace() is used in RestoreDriver in case 
Exception is caught
 Key: HBASE-21040
 URL: https://issues.apache.org/jira/browse/HBASE-21040
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


Here is related code:
{code}
} catch (Exception e) {
  e.printStackTrace();
{code}
The correct way to log a stack trace is to use the Logger instance.
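A minimal sketch of the suggested change. It uses java.util.logging only to keep the example self-contained; the real RestoreDriver would presumably use the project's own logging facade, and the class and message names here are illustrative:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogInsteadOfPrintSketch {
  private static final Logger LOG =
      Logger.getLogger(LogInsteadOfPrintSketch.class.getName());

  // Helper that formats an exception for the log message.
  static String describe(Throwable t) {
    return t.getClass().getSimpleName() + ": " + t.getMessage();
  }

  public static void main(String[] args) {
    try {
      throw new IllegalStateException("restore failed");
    } catch (Exception e) {
      // Passing the exception as the last argument records the full stack
      // trace through the configured log handlers, instead of writing
      // directly to stderr via e.printStackTrace().
      LOG.log(Level.SEVERE, "Error while running restore: " + describe(e), e);
    }
  }
}
```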





Re: Hbase mutate is hogging my CPU

2018-08-07 Thread Ted Yu
There have been several releases of HBase 1.2.
Which release are you using?

The images you sent didn't go through the mailing list. Please consider
using a third-party site for delivery.

Have you taken a look at what the server hosting hbase:meta was doing
during this period of time?

Thanks

On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger wrote:

> Kafka Dev,
>
>
>
> I’d love some help investigating a slow Hbase mutator.
>
>
>
> The cluster is Hbase 1.2 and cluster has 22 region servers. The region
> servers are pretty big: 24 cores, 126 GB RAM.
>
>
>
> The cluster has 2 tables, each only have 1 column family. Both tables have
> the same pre splits.
>
>
>
> Each table is pre split into 400 regions. The split keys are all 2 bytes
> and evenly divide the key space.
>
>
>
> The keys are 13 bytes. The key is formed by concatenating:
>
> 1 byte kafka partition
>
> 8 byte random int
>
> 4 byte timestamp (second level granularity)
>
> The workload is 100% write for now. There are about 1M writes per second
> with a total data volume of .6GB per second.
>
>
>
> I find that my application is spending the majority of its CPU time
> (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate
> (), which is in turn spending most of its time calling
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion
> ().
>
>
>
> Attached are two images showing the performance of my application. The
> first is an overview showing that my application is spending a lot of in
> mutate. The next is a deep dive into the functions that mutate is calling
> internally.
>
> I am very surprised to see this function taking so long. My intuition is
> that all this needs to do is:
> 1) Determine which region the Mutation belongs in
> 2) Append the Mutation to a queue for async write to HBase.
>
>
> Any thoughts, comments of suggestions from the community would be much
> appreciated! I’m really hoping to improve the performance profile here so
> that my CPU can be freed up.
>
>
>
> Thanks,
>
>
>
> Mike Freyberger
>
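The 13-byte key layout described in the message above (1-byte partition, 8-byte random value, 4-byte second-granularity timestamp) could be assembled like this; the class and method names are illustrative:

```java
import java.nio.ByteBuffer;

public class RowKeySketch {
  // 1-byte partition + 8-byte random value + 4-byte timestamp = 13 bytes,
  // matching the described layout.
  static byte[] buildKey(byte partition, long random, int timestampSeconds) {
    return ByteBuffer.allocate(13)
        .put(partition)
        .putLong(random)
        .putInt(timestampSeconds)
        .array();
  }

  public static void main(String[] args) {
    byte[] key = buildKey((byte) 7, 123456789L, 1533654000);
    System.out.println(key.length); // 13
  }
}
```

Because the partition byte leads the key, each of the pre-split regions receives writes for a contiguous slice of the partition space.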


Re: [ANNOUNCE] New Committer: Toshihiro Suzuki

2018-08-01 Thread Ted Yu
Congratulations, Toshihiro !

On Wed, Aug 1, 2018 at 7:47 AM Josh Elser  wrote:

> On behalf of the HBase PMC, I'm pleased to announce that Toshihiro
> Suzuki (aka Toshi, brfn169) has accepted our invitation to become an
> HBase committer. This was extended to Toshi as a result of his
> consistent, high-quality contributions to HBase. Thanks for all of your
> hard work, and we look forward to working with you even more!
>
> Please join me in extending a hearty "congrats" to Toshi!
>
> - Josh
>


Re: May I take this issue --hbase-spark

2018-07-31 Thread Ted Yu
bq. ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'

The above implies dependency on some class from Hive.

Which Hive release would you use if you choose the above route ?

Looking forward to your demo.

On Tue, Jul 31, 2018 at 9:09 AM bill.yunfu wrote:

> hi Ted
>Thank you for replying.
> The SQL support means a user can directly use Spark SQL to create tables and
> query data from HBase. We found two forms of SQL support on HBase.
> SHC uses the following command to create a table in Spark SQL:
> CREATE TABLE spark_hbase USING
> org.apache.spark.sql.execution.datasources.hbase
>   OPTIONS ('catalog'=
>   '{"table":{"namespace":"default", "name":"test",
> "tableCoder":"PrimitiveType"},"rowkey":"key",
>   "columns":{
>   "col0":{"cf":"rowkey", "col":"key", "type":"string"},
>   "col1":{"cf":"cf", "col":"a", "type":"string"}}}'
>   )
> (SHC is a project; details are at:
> https://github.com/hortonworks-spark/shc)
> In Spark SQL one can also use a Hive command to create a table:
> create  table spark_hbase (col0 string, col1 string)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' with
> SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a")
> STORED AS
> INPUTFORMAT
> 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat'
> tblproperties ("hbase.table.name" = "test");
>
> So we want to make a similar DDL to create the table for the hbase-spark
> module and query it with Spark SQL.
>
> As for the Spark release, we suggest first targeting Spark 2.y, for
> example Spark 2.2.2, which is stable now.
>
> We will create a demo based on the hbase-spark module with SQL support
> locally, then share it here for discussion.
>
> Regards
> Bill
>
>
> Ted Yu-3 wrote
> > For SQL support, can you be more specific on how the SQL support would be
> > added ?
> >
> > Maybe you can illustrate some examples showing the enhanced SQL syntax.
> >
> > Also, which Spark release(s) would be targeted?
> >
> > Thanks
> >
> > On Mon, Jul 30, 2018 at 10:57 AM bill.yunfu 
> > wrote:
>
> --
> Sent from:
> http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html
>


Re: Question regarding hbase-shell JRuby testing workflow

2018-07-31 Thread Ted Yu
The flakiness of list_procedures_test.rb is probably related to the load on
the node running the test, or to other tests in the hbase-shell module.

I ran list_procedures_test.rb alone a few times which passed.

Jack:
You can include some other shell test(s) along with this test.

You can also retrieve test output following the test runs performed here:

https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

FYI

On Tue, Jul 31, 2018 at 7:41 AM Josh Elser  wrote:

> I haven't ever tried to de-couple from Maven. The 'lowest' I ever got
> was something like the following:
>
> 1. mvn clean install -DskipTests
> 2. cd hbase-shell
> 3. mvn package -Dtest=TestShell -Dshell.test.include=my_test_class.rb -o
>
> Hope this helps, Jack. I know it's not ideal -- if you do come up with
> something that works at a lower level, I think we'd be very supportive
> to get it doc'ed and keep it working :)
>
> On 7/30/18 11:16 PM, Jack Bearden wrote:
> > Hey all! I was hacking hbase-shell and JRuby over the weekend and wanted
> to
> > get some feedback on workflow. My objective was to execute a single Ruby
> > unit test in isolation from the TestShell.java class via the jruby
> binary.
> > I was able to accomplish this by doing the following steps:
> >
> > 1.  Pulled down branch-2
> > 2.  Installed and cleaned via maven at the base directory (mvn
> > -Dmaven.javadoc.skip -DskipTests install)
> > 3.  Changed to the hbase-shell directory and exported the classpath (mvn
> > dependency:build-classpath -Dmdep.outputFile=/path/to/cpath.txt)
> > 4.  Exported the path to that file to shell env (export
> > TEST_PATH="/path/to/cpath.txt")
> > 5.  Hacked tests_runner.rb to just load("path/to/test") for the test I
> > wanted to run
> > 6.  From the hbase-shell project directory ran the following:
> >
> > jruby \
> > -J-cp `cat $TEST_PATH` \
> > -d -w \
> > -I src/test/ruby \
> > -I src/main/ruby \
> > src/test/ruby/tests_runner.rb
> >
> > The problem is, this only worked on *most* of the hbase-shell Ruby
> > tests. The only way to get, for example, list_procedures_test.rb to work
> > completely, was to run it from the TestShell.java file. When run from the
> > jruby binary, I get a "class not found" when
> > org.apache.hadoop.hbase.client.procedure.ShellTestProcedure.new was being
> > referenced. I can't figure out how to load this class ad hoc and not
> through
> > what appears to be Maven magic.
> >
> > Any suggestions or better ideas on how to do this?
> >
>


[jira] [Created] (HBASE-20988) TestShell shouldn't be skipped for hbase-shell module test

2018-07-31 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20988:
--

 Summary: TestShell shouldn't be skipped for hbase-shell module test
 Key: HBASE-20988
 URL: https://issues.apache.org/jira/browse/HBASE-20988
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


Here is snippet for QA run 13862 for HBASE-20985 :
{code}
13:42:50 cd /testptch/hbase/hbase-shell
13:42:50 /usr/share/maven/bin/mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hbase-master-patch-1 -DHBasePatchProcess -PrunAllTests -Dtest.exclude.pattern=**/master.normalizer.TestSimpleRegionNormalizerOnCluster.java,**/replication.regionserver.TestSerialReplicationEndpoint.java,**/master.procedure.TestServerCrashProcedure.java,**/master.procedure.TestCreateTableProcedure.java,**/TestClientOperationTimeout.java,**/client.TestSnapshotFromClientWithRegionReplicas.java,**/master.TestAssignmentManagerMetrics.java,**/client.TestShell.java,**/client.TestCloneSnapshotFromClientWithRegionReplicas.java,**/master.TestDLSFSHLog.java,**/replication.TestReplicationSmallTestsSync.java,**/master.procedure.TestModifyTableProcedure.java,**/regionserver.TestCompactionInDeadRegionServer.java,**/client.TestFromClientSide3.java,**/master.procedure.TestRestoreSnapshotProcedure.java,**/client.TestRestoreSnapshotFromClient.java,**/security.access.TestCoprocessorWhitelistMasterObserver.java,**/replication.regionserver.TestDrainReplicationQueuesForStandBy.java,**/master.procedure.TestProcedurePriority.java,**/master.locking.TestLockProcedure.java,**/master.cleaner.TestSnapshotFromMaster.java,**/master.assignment.TestSplitTableRegionProcedure.java,**/client.TestMobRestoreSnapshotFromClient.java,**/replication.TestReplicationKillSlaveRS.java,**/regionserver.TestHRegion.java,**/security.access.TestAccessController.java,**/master.procedure.TestTruncateTableProcedure.java,**/client.TestAsyncReplicationAdminApiWithClusters.java,**/coprocessor.TestMetaTableMetrics.java,**/client.TestMobSnapshotCloneIndependence.java,**/namespace.TestNamespaceAuditor.java,**/master.TestMasterAbortAndRSGotKilled.java,**/client.TestAsyncTable.java,**/master.TestMasterOperationsForRegionReplicas.java,**/util.TestFromClientSide3WoUnsafe.java,**/client.TestSnapshotCloneIndependence.java,**/client.TestAsyncDecommissionAdminApi.java,**/client.TestRestoreSnapshotFromClientWithRegionReplicas.java,**/master.assignment.TestMasterAbortWhileMergingTable.java,**/client.TestFromClientSide.java,**/client.TestAdmin1.java,**/client.TestFromClientSideWithCoprocessor.java,**/replication.TestReplicationKillSlaveRSWithSeparateOldWALs.java,**/master.procedure.TestMasterFailoverWithProcedures.java,**/regionserver.TestSplitTransactionOnCluster.java clean test -fae > /testptch/patchprocess/patch-unit-hbase-shell.txt 2>&1
{code}
In this case, there was a modification to a shell script, leading to the shell
tests being run.

However, TestShell was excluded in the QA run, defeating the purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Question regarding hbase-shell JRuby testing workflow

2018-07-30 Thread Ted Yu
Have you tried sidelining other .rb files
under hbase-shell/src/test/ruby/shell/ (keeping only
hbase-shell/src/test/ruby/shell/list_procedures_test.rb) ?

Cheers

On Mon, Jul 30, 2018 at 8:29 PM Jack Bearden  wrote:

> Hey all! I was hacking hbase-shell and JRuby over the weekend and wanted to
> get some feedback on workflow. My objective was to execute a single Ruby
> unit test in isolation from the TestShell.java class via the jruby binary.
> I was able to accomplish this by doing the following steps:
>
> 1.  Pulled down branch-2
> 2.  Installed and cleaned via maven at the base directory (mvn
> -Dmaven.javadoc.skip -DskipTests install)
> 3.  Changed to the hbase-shell directory and exported the classpath (mvn
> dependency:build-classpath -Dmdep.outputFile=/path/to/cpath.txt)
> 4.  Exported the path to that file to shell env (export
> TEST_PATH="/path/to/cpath.txt")
> 5.  Hacked tests_runner.rb to just load("path/to/test") for the test I
> wanted to run
> 6.  From the hbase-shell project directory ran the following:
>
> jruby \
> -J-cp `cat $TEST_PATH` \
> -d -w \
> -I src/test/ruby \
> -I src/main/ruby \
> src/test/ruby/tests_runner.rb
>
> The problem is, this only worked on *most* of the hbase-shell Ruby
> tests. The only way to get, for example, list_procedures_test.rb to work
> completely, was to run it from the TestShell.java file. When run from the
> jruby binary, I get a "class not found" when
> org.apache.hadoop.hbase.client.procedure.ShellTestProcedure.new was being
> referenced. I can't figure out how to load this class ad hoc and not through
> what appears to be Maven magic.
>
> Any suggestions or better ideas on how to do this?
>


Re: May I take this issue --hbase-spark

2018-07-30 Thread Ted Yu
For SQL support, can you be more specific on how the SQL support would be
added ?

Maybe you can illustrate some examples showing the enhanced SQL syntax.

Also, which Spark release(s) would be targeted?

Thanks

On Mon, Jul 30, 2018 at 10:57 AM bill.yunfu 
wrote:

> May I take this issue --hbase-spark
>
> Hi community
> I am working in an HBase team which serves hundreds of customers. We find
> that, with the increasing amount of data in HBase, many customers have
> analysis requirements for their data on HBase. For example, they want to use
> Spark to do some analysis which may query a lot of data from HBase and may
> also join with other tables; the tables may be in HBase or Spark.
> But HBase cannot support this scenario very well, so we plan to use Spark
> to support this.
> We found that Apache HBase already has one module called hbase-spark, but
> this module has not been updated recently and is not formally released.
> Besides, we found there are other projects supporting SQL on HBase, for
> example Hive on HBase, which gives good SQL syntax support.
> Even though there are many projects for Spark on HBase, I think none of them
> is publicly well known to users. Because our customers have more and more
> requirements for Spark on HBase, we want to take this issue. The initial
> goal is to make a standard and publicly known Spark on HBase in the Apache
> HBase community.
> Our initial ideas are:
> SQL support: currently the hbase-spark module does not support the spark-sql
> command for creating tables. We want to make it support SQL commands, which
> may be like the SQL syntax from Hive on HBase or the SQL syntax from SHC.
> Performance improvements: this part is not very clear yet; the goal is for
> Spark SQL queries over HBase data to have good performance.
>
> We want to get some suggestions from the community. Then I will raise a JIRA
> to track it and put up a design document.
>
> Best Regards
> Bill
>
>
>
>
> --
> Sent from:
> http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html
>


Re: I am a subscribe please add me thanks

2018-07-30 Thread Ted Yu
Can you take a look at http://hbase.apache.org/mail-lists.html ?

The first column gives you the email address for subscription.

Cheers

On Mon, Jul 30, 2018 at 7:51 AM 周广成(云覆) 
wrote:

> hi
> I sent this mail yesterday but still cannot post the topic on the
> hbase dev mailing list. Can you please help check that? Thank you very much.
>
> Regards
> Bill
>
>
> --
> From: 周广成(云覆) 
> Sent: Sunday, July 29, 2018 15:50
> To: hbase-dev 
> Subject: I am a subscribe please add me thanks
>
> I am a subscribe please add me thanks


[jira] [Created] (HBASE-20968) list_procedures_test fails due to no matching regex

2018-07-28 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20968:
--

 Summary: list_procedures_test fails due to no matching regex
 Key: HBASE-20968
 URL: https://issues.apache.org/jira/browse/HBASE-20968
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


>From test output against hadoop3:
{code}
2018-07-28 12:04:24,838 DEBUG [Time-limited test] 
procedure2.ProcedureExecutor(948): Stored pid=12, state=RUNNABLE, 
hasLock=false; org.apache.hadoop.hbase.client.procedure.ShellTestProcedure
2018-07-28 12:04:24,864 INFO  [RS-EventLoopGroup-1-3] 
ipc.ServerRpcConnection(556): Connection from 172.18.128.12:46918, 
version=3.0.0-SNAPSHOT, sasl=false, ugi=hbase (auth: SIMPLE), 
service=MasterService
2018-07-28 12:04:24,900 DEBUG [Thread-114] master.MasterRpcServices(1157): 
Checking to see if procedure is done pid=11
F
===
Failure: 
test_list_procedures(Hbase::ListProceduresTest)
src/test/ruby/shell/list_procedures_test.rb:65:in `block in 
test_list_procedures'
 62: end
 63:   end
 64:
  => 65:   assert_equal(1, matching_lines)
 66: end
 67:   end
 68: end
<1> expected but was
<0>
===
...
2018-07-28 12:04:25,374 INFO  [PEWorker-9] procedure2.ProcedureExecutor(1316): 
Finished pid=12, state=SUCCESS, hasLock=false; 
org.apache.hadoop.hbase.client.procedure.ShellTestProcedure in 
336msec
{code}
The ShellTestProcedure completed only after the assertion had already failed.
{code}
def create_procedure_regexp(table_name)
  regexp_string = '[0-9]+ .*ShellTestProcedure SUCCESS.*' \
{code}
The regex used by the test isn't found in test output either.
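As a stand-alone illustration of the race (in Java rather than the test's JRuby, with an assumed helper name `matchesProcLine`): the SUCCESS line can only match the pattern once the procedure has finished, so a count taken before completion is 0.

```java
import java.util.regex.Pattern;

public class ProcRegexDemo {
    // Same shape as the pattern the shell test builds in
    // create_procedure_regexp (without the optional table-name portion).
    static final Pattern PROC_LINE =
        Pattern.compile("[0-9]+ .*ShellTestProcedure SUCCESS.*");

    static boolean matchesProcLine(String line) {
        return PROC_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        // While the procedure is still RUNNABLE the line cannot match, so
        // assert_equal(1, matching_lines) fails if run at that moment.
        System.out.println(matchesProcLine("12 ShellTestProcedure RUNNABLE"));            // false
        System.out.println(matchesProcLine("12 x.y.ShellTestProcedure SUCCESS 336msec")); // true
    }
}
```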



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20966:
--

 Summary: RestoreTool#getTableInfoPath should look for completed 
snapshot only
 Key: HBASE-20966
 URL: https://issues.apache.org/jira/browse/HBASE-20966
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


[~gubjanos] reported seeing the following error when running backup / restore 
test on Azure:
{code}
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException:
 Couldn't read snapshot info 
from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - 
run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - 
run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
{code}
Here is related code in master branch:
{code}
  Path getTableInfoPath(TableName tableName) throws IOException {
Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, 
backupId);
Path tableInfoPath = null;

// can't build the path directly as the timestamp values are different
FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
{code}
In the above code, we don't exclude incomplete snapshot, leading to exception 
later when reading snapshot info.
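A minimal sketch of the idea behind the fix, assuming the only requirement is "treat a snapshot directory as usable iff it contains the completed-snapshot marker". The marker name matches HBase's `.snapshotinfo`; the helper and standalone structure are illustrative, not the actual RestoreTool code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class SnapshotFilterDemo {
    // Marker written only once a snapshot completes (".snapshotinfo" in HBase).
    static final String SNAPSHOTINFO_FILE = ".snapshotinfo";

    // Keep only directories holding the marker, skipping the in-progress
    // ".tmp" area and any half-written snapshot directories.
    static List<File> completedSnapshots(File[] candidates) {
        List<File> completed = new ArrayList<>();
        if (candidates != null) {
            for (File dir : candidates) {
                if (dir.isDirectory() && new File(dir, SNAPSHOTINFO_FILE).isFile()) {
                    completed.add(dir);
                }
            }
        }
        return completed;
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("hbase-snapshot").toFile();
        File done = new File(root, "snapshot-1");
        File inProgress = new File(root, ".tmp");
        done.mkdirs();
        inProgress.mkdirs();                                // no marker inside
        new File(done, SNAPSHOTINFO_FILE).createNewFile();  // completed snapshot
        System.out.println(completedSnapshots(root.listFiles()).size()); // prints 1
    }
}
```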



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20917) MetaTableMetrics#stop references uninitialized requestsMap for non-meta region

2018-07-21 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20917:
--

 Summary: MetaTableMetrics#stop references uninitialized 
requestsMap for non-meta region
 Key: HBASE-20917
 URL: https://issues.apache.org/jira/browse/HBASE-20917
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


I noticed the following in test output:
{code}
2018-07-21 15:54:43,181 ERROR [RS_CLOSE_REGION-regionserver/172.17.5.4:0-1] 
executor.EventHandler(186): Caught throwable while processing event 
M_RS_CLOSE_REGION
java.lang.NullPointerException
  at 
org.apache.hadoop.hbase.coprocessor.MetaTableMetrics.stop(MetaTableMetrics.java:329)
  at 
org.apache.hadoop.hbase.coprocessor.BaseEnvironment.shutdown(BaseEnvironment.java:91)
  at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionEnvironment.shutdown(RegionCoprocessorHost.java:165)
  at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost.shutdown(CoprocessorHost.java:290)
  at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.postEnvCall(RegionCoprocessorHost.java:559)
  at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:622)
  at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postClose(RegionCoprocessorHost.java:551)
  at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1678)
  at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1484)
  at 
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
{code}
{{requestsMap}} is only initialized for the meta region.
However, check for meta region is absent in the stop method:
{code}
  public void stop(CoprocessorEnvironment e) throws IOException {
// since meta region can move around, clear stale metrics when stop.
for (String meterName : requestsMap.keySet()) {
{code}
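A minimal sketch of the kind of guard that avoids the NPE; the field and method shapes below are simplified stand-ins for MetaTableMetrics, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

public class MetaMetricsStopGuard {
    // In MetaTableMetrics, requestsMap is populated in start() only when the
    // coprocessor is attached to the meta region; otherwise it stays null.
    private Map<String, Long> requestsMap;

    MetaMetricsStopGuard(boolean isMetaRegion) {
        if (isMetaRegion) {
            requestsMap = new HashMap<>();
        }
    }

    // Sketch of the fix: check the map (or re-check "is meta region") before
    // iterating, instead of dereferencing it unconditionally as in the report.
    int clearStaleMetrics() {
        int cleared = 0;
        if (requestsMap != null) {
            for (String meterName : requestsMap.keySet()) {
                cleared++; // stand-in for removing the meter from the registry
            }
            requestsMap.clear();
        }
        return cleared;
    }

    public static void main(String[] args) {
        // Non-meta region: previously an NPE on close, now a no-op.
        System.out.println(new MetaMetricsStopGuard(false).clearStaleMetrics()); // prints 0
    }
}
```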



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20892) [UI] Start / End keys are empty on table.jsp

2018-07-15 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20892:
--

 Summary: [UI] Start / End keys are empty on table.jsp
 Key: HBASE-20892
 URL: https://issues.apache.org/jira/browse/HBASE-20892
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.1
Reporter: Ted Yu


When viewing table.jsp?name=TestTable , I found that the Start / End keys for 
all the regions were simply dashes without real value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: issue while reading data from hbase

2018-07-13 Thread Ted Yu
Putting dev@ to bcc.

Which hbase-spark connector are you using ?
What's the hbase release in your deployment ?

bq. some of the columns in dataframe becomes null

Is it possible to characterize what type of columns become null ? Earlier
you said one column has xml data. Did you mean this column from some rows
returned null ?

Have you checked region server logs where the corresponding regions reside ?

Thanks

On Fri, Jul 13, 2018 at 4:08 AM hnk45  wrote:

> I am reading data from hbase using spark sql. one column has xml data. when
> xml size is small , I am able to read correct data. but as soon as size
> increases too much, some of the columns in dataframe becomes null. xml is
> still coming correctly.
> while reading data from sql to hbase I have used this constraint:
> hbase.client.keyvalue.maxsize=0 in my sqoop.
>
>
>
> --
> Sent from:
> http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html
>


[jira] [Created] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20879:
--

 Summary: Compacting memstore config should handle lower case
 Key: HBASE-20879
 URL: https://issues.apache.org/jira/browse/HBASE-20879
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.1
Reporter: Tushar Sharma
Assignee: Ted Yu


Tushar reported seeing the following in region server log when entering 'basic' 
for compacting memstore type:
{code}
2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] 
handler.OpenRegionHandler: Failed open of 
region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., 
starting to roll back the global memstore size.
java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at java.lang.Enum.valueOf(Enum.java:238)
at 
org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
at 
org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331)
at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:271)
at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] 
handler.OpenRegionHandler: Failed open of 
region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to roll 
back the global memstore size.
java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at java.lang.Enum.valueOf(Enum.java:238)
at 
org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26
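The failure reduces to Enum.valueOf being case-sensitive. A small stand-alone sketch (the enum here is a stand-in mirroring org.apache.hadoop.hbase.MemoryCompactionPolicy; uppercasing before valueOf is one way to accept either case):

```java
import java.util.Locale;

public class CompactionPolicyParseDemo {
    // Stand-in for org.apache.hadoop.hbase.MemoryCompactionPolicy.
    enum MemoryCompactionPolicy { NONE, BASIC, EAGER }

    // Enum.valueOf("basic") throws IllegalArgumentException ("No enum
    // constant ... basic"); normalizing case first accepts 'basic' or 'BASIC'.
    static MemoryCompactionPolicy parse(String configured) {
        return MemoryCompactionPolicy.valueOf(
            configured.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parse("basic")); // prints BASIC
        System.out.println(parse("EAGER")); // prints EAGER
    }
}
```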

Re: The flakey test dashboard is broken

2018-07-02 Thread Ted Yu
Please log an INFRA ticket.

On Mon, Jul 2, 2018 at 3:12 AM, 张铎(Duo Zhang)  wrote:

> The console output is
>
>
> + docker run -v
> > /home/jenkins/jenkins-slave/workspace/HBase-Find-Flaky-Tests:/hbase
> > --workdir=/hbase hbase-dev-support python dev-support/report-flakies.py
> > --mvn -v --urls=https://builds.apache.org/job/HBASE-Flaky-Tests/
> > --max-builds=30 --is-yetus=False --urls=
> > https://builds.apache.org/job/HBase%20Nightly/job/master/ --max-builds=6
> > --is-yetus=True --urls=
> > https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/
> > --max-builds=30 --is-yetus=False --urls=
> > https://builds.apache.org/job/HBase%20Nightly/job/branch-2/
> > --max-builds=6 --is-yetus=True --urls=
> > http://104.198.223.121:8080/job/HBASE-Flaky-Tests/ --max-builds=30
> > --is-yetus=False
> > Traceback (most recent call last):
> > File "dev-support/report-flakies.py", line 151, in 
> > expanded_urls = expand_multi_config_projects(args)
> > File "dev-support/report-flakies.py", line 128, in
> > expand_multi_config_projects
> > response = requests.get(job_url + "/api/json").json()
> > File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
> > return request('get', url, **kwargs)
> > File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in
> > request
> > return session.request(method=method, url=url, **kwargs)
> > File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455,
> in
> > request
> > resp = self.send(prep, **send_kwargs)
> > File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558,
> in
> > send
> > r = adapter.send(request, **kwargs)
> > File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378,
> in
> > send
> > raise ConnectionError(e)
> > requests.exceptions.ConnectionError:
> > HTTPConnectionPool(host='104.198.223.121', port=8080): Max retries
> exceeded
> > with url: /job/HBASE-Flaky-Tests//api/json (Caused by <class 'socket.error'>: [Errno 111] Connection refused)
> > Build step 'Execute shell' marked build as failure
>
>
> I think the problem is that, the jenkins instance is broken
>
> http://104.198.223.121:8080/job/HBASE-Flaky-Tests/
>
> I temporarily removed this url from the build. Does any one know who is the
> maintainer of this machine?
>
> Thanks.
>


Re: [ANNOUNCE] New HBase committer Reid Chan

2018-06-25 Thread Ted Yu
Congratulations, Reid !

On Mon, Jun 25, 2018 at 6:59 PM, Chia-Ping Tsai  wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that Reid Chan
> has accepted the PMC's invitation to become a committer on the project. We
> appreciate all of Reid’s generous contributions thus far and look forward
> to his continued involvement.
>
> Congratulations and welcome, Reid!
>
> --
> Chia-Ping
>


Re: Template problem native client c++ with new folly

2018-06-21 Thread Ted Yu
Can you take a look at :
HBASE-18901 [C++] Provide CMAKE infrastructure

There hasn't been effort to support newer folly.

FYI

On Wed, Jun 20, 2018 at 1:42 PM, Andrzej  wrote:

> I have installed new (17 days ago) folly and wangle from sources.
> I try compile sources of native client from HBASE-14850 branch.
> These sources are old.
> I have problem:
> ```
> template <typename F, typename R = futures::detail::callableResult<T, F>>
>   typename R::Return then(F&& func) {
> return this->template thenImplementation<F, R>(
> std::forward<F>(func), typename R::Arg());
>   }
> ```
> from /usr/local/include/folly/futures/Future.h
>
> Is template called from
> ```
> //  mimic: std::invoke_result_t, C++17
> template <typename F, typename... Args>
> using invoke_result_t = typename invoke_result<F, Args...>::type;
> ```
> from /usr/local/include/folly/functional/Invoke.h
>
> but is called from
> ```
>  GetRegionLocations(actions, locate_timeout_ns)
>   .then([=](std::vector>> ) {
> std::lock_guard lck(multi_mutex_);
> ActionsByServer actions_by_server;
> std::vector> locate_failed;
> ```
> from 
> /home/andrzej/projects/simple-hbase2/src/hbase/client/async-batch-rpc-retrying-caller.cc
> - my project
>
> I have turn on -std=gnu++17
>
> There are error and notes:
> 
> /home/andrzej/projects/simple-hbase2/src/hbase/client/async-
> batch-rpc-retrying-caller.cc|259|error: no matching function for call to
> ‘folly::Future
> > > >::then(hbase::AsyncBatchRpcRetryingCaller RESP>::GroupAndSend(const std::vector >&,
> int32_t) [with REQ = std::shared_ptr; RESP =
> std::shared_ptr; int32_t = int]:: y::Try > >&)>)’|
>
> /usr/local/include/folly/futures/Future.h|737|note: candidate:
> template typename R::Return folly::Future::then(F&&)
> [with F = F; R = R; T = std::vector ared_ptr > >]|
>
> /usr/local/include/folly/futures/Future.h|737|note:   substitution of
> deduced template arguments resulted in errors seen above|
>
> /usr/local/include/folly/futures/Future.h|753|note: candidate:
> template folly::Future folly::isFuture::Inner> folly::Future::then(R (Caller::*)(Args ...),
> Caller*) [with R = R; Caller = Caller; Args = {Args ...}; T =
> std::vector > >]|
>
> /usr/local/include/folly/futures/Future.h|753|note:   template argument
> deduction/substitution failed:|
>
> /home/andrzej/projects/simple-hbase2/src/hbase/client/async-
> batch-rpc-retrying-caller.cc|259|note:   mismatched types ‘R
> (Caller::*)(Args ...)’ and ‘hbase::AsyncBatchRpcRetryingCaller RESP>::GroupAndSend(const std::vector >&,
> int32_t) [with REQ = std::shared_ptr; RESP =
> std::shared_ptr; int32_t = int]:: y::Try > >&)>’|
>
> /usr/local/include/folly/futures/Future.h|770|note: candidate:
> template auto
> folly::Future::then(Executor*, Arg&&, Args&& ...) [with Executor =
> Executor; Arg = Arg; Args = {Args ...}; T = std::vector ared_ptr > >]|
>
> /usr/local/include/folly/futures/Future.h|770|note:   template argument
> deduction/substitution failed:|
>
> /home/andrzej/projects/simple-hbase2/src/hbase/client/async-
> batch-rpc-retrying-caller.cc|259|note:   mismatched types ‘Executor*’ and
> ‘hbase::AsyncBatchRpcRetryingCaller::GroupAndSend(const
> std::vector >&, int32_t) [with REQ =
> std::shared_ptr; RESP = std::shared_ptr;
> int32_t = 
> int]::
> > >&)>’|
>
> /usr/local/include/folly/futures/Future-inl.h|975|note: candidate:
> folly::Future folly::Future::then() [with T =
> std::vector > >]|
>
> /usr/local/include/folly/futures/Future-inl.h|975|note:   candidate
> expects 0 arguments, 1 provided|
> 
>
> How I can change this piece of sources to fit new folly?
>


[jira] [Created] (HBASE-20744) Address FindBugs warnings in branch-1

2018-06-16 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20744:
--

 Summary: Address FindBugs warnings in branch-1
 Key: HBASE-20744
 URL: https://issues.apache.org/jira/browse/HBASE-20744
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350//JDK8_Nightly_Build_Report_(Hadoop2)/ :
{code}
FindBugs module:hbase-common
Inconsistent synchronization of 
org.apache.hadoop.hbase.io.encoding.EncodedDataBlock$BufferGrabbingByteArrayOutputStream.ourBytes;
 locked 50% of time Unsynchronized access at EncodedDataBlock.java:50% of time 
Unsynchronized access at EncodedDataBlock.java:[line 258]
{code}
{code}
FindBugs module:hbase-hadoop2-compat
java.util.concurrent.ScheduledThreadPoolExecutor stored into non-transient 
field MetricsExecutorImpl$ExecutorSingleton.scheduler At 
MetricsExecutorImpl.java:MetricsExecutorImpl$ExecutorSingleton.scheduler At 
MetricsExecutorImpl.java:[line 51]
{code}
{code}
FindBugs module:hbase-server
instanceof will always return false in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since a org.apache.hadoop.hbase.quotas.RpcThrottlingException can't 
be a org.apache.hadoop.hbase.quotas.ThrottlingException At 
RegionServerQuotaManager.java:in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since a org.apache.hadoop.hbase.quotas.RpcThrottlingException can't 
be a org.apache.hadoop.hbase.quotas.ThrottlingException At 
RegionServerQuotaManager.java:[line 193]
instanceof will always return true for all non-null values in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since all org.apache.hadoop.hbase.quotas.RpcThrottlingException are 
instances of org.apache.hadoop.hbase.quotas.RpcThrottlingException At 
RegionServerQuotaManager.java:for all non-null values in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since all org.apache.hadoop.hbase.quotas.RpcThrottlingException are 
instances of org.apache.hadoop.hbase.quotas.RpcThrottlingException At 
RegionServerQuotaManager.java:[line 199]
{code}
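The pair of instanceof warnings can be reproduced with two unrelated exception types. Class names below are stand-ins for the HBase quota exceptions, not the real hierarchy:

```java
public class InstanceofDemo {
    // Stand-ins: in branch-1, RpcThrottlingException does not extend
    // ThrottlingException, which is what makes the first check dead code.
    static class ThrottlingException extends Exception {}
    static class RpcThrottlingException extends Exception {}

    static String classify(Exception e) {
        // FindBugs infers the value flowing here is always an
        // RpcThrottlingException, so this test is provably false --
        // its "instanceof will always return false" warning.
        if (e instanceof ThrottlingException) {
            return "throttling";
        }
        // ...and a check against the value's own type "always returns true".
        if (e instanceof RpcThrottlingException) {
            return "rpc-throttling";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify(new RpcThrottlingException())); // prints rpc-throttling
    }
}
```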



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20743) ASF License warnings for branch-1

2018-06-16 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20743:
--

 Summary: ASF License warnings for branch-1
 Key: HBASE-20743
 URL: https://issues.apache.org/jira/browse/HBASE-20743
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt :
{code}
Lines that start with ? in the ASF License  report indicate files that do 
not have an Apache license header:
 !? hbase-error-prone/target/checkstyle-result.xml
 !? 
hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}
Looks like they should be excluded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20734:
--

 Summary: Colocate recovered edits directory with hbase.wal.dir
 Key: HBASE-20734
 URL: https://issues.apache.org/jira/browse/HBASE-20734
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


During investigation of HBASE-20723, I realized that we wouldn't get the best 
performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
different (fast) media than the hbase rootdir, since the recovered edits 
directory currently lives under rootdir.

Such setup may not result in fast recovery when there is region server failover.

This issue is to find a proper (hopefully backward compatible) way of colocating 
the recovered edits directory with hbase.wal.dir .
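One backward-compatible direction is to resolve the recovered-edits base directory from hbase.wal.dir when it is set, falling back to rootdir otherwise. A minimal sketch of that lookup — the path layout and method name here are hypothetical illustrations, not HBase's actual code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RecoveredEditsDirSketch {
    /**
     * Picks the base directory for recovered.edits: the configured
     * hbase.wal.dir when present, else hbase.rootdir (legacy location).
     */
    static Path recoveredEditsDir(String rootDir, String walDir, String regionEncodedName) {
        String base = (walDir != null && !walDir.isEmpty()) ? walDir : rootDir;
        return Paths.get(base, "data", regionEncodedName, "recovered.edits");
    }
}
```

On failover, the recovery path would then land on the same (fast) media as the WALs whenever hbase.wal.dir is configured, while clusters without that setting keep today's rootdir layout.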



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache HBase 1.4.5 RC1

2018-06-13 Thread Ted Yu
+1

Checked signatures
Ran test suite with JDK 8


On Wed, Jun 13, 2018 at 12:53 PM, Josh Elser  wrote:

> Hi,
>
> Please vote to approve the following as Apache HBase 1.4.5. The only
> change in RC1 over RC0 is the updated CHANGES.txt.
>
> https://dist.apache.org/repos/dist/dev/hbase/1.4.5RC1/
>
> Per usual, there is a source release as well as a convenience binary
>
> This is built with JDK7 from the commit: https://git-wip-us.apache.org/
> repos/asf?p=hbase.git;a=commit;h=ca99a9466415dc4cfc095df33efb45cb82fe5480
> (there is a corresponding tag "1.4.5RC1" for convenience). Please ignore
> the incorrect commit message (forgot to update that text after rc0).
>
> hbase-1.4.5-bin.tar.gz: 1C34A448 DF4E102E 7964C6D8 84C2B1E5 35DD0CAA
> E67E6B15
> 92DB2B10 8A6D3A0F B2D841F7 677EB5C3 8EB6D78F
> 342CB42F
> 8CE90AA6 D62A5362 3C727023 4E8C95EE
> hbase-1.4.5-src.tar.gz: 61B83966 952D334A FDAE7E3A 11FD8529 90583302
> 4AC186C4
> 2B7BDA11 5CB472A4 34C71466 4DCF90BB 8735F658
> 20975292
> 35319D5C 2287B0DE 64F484C7 42F635D6
>
> There is also a Maven staging repository for this release:
> https://repository.apache.org/content/repositories/orgapachehbase-1221
>
> This vote will be open for at least 72 hours (2018/06/16 2000 UTC).
>
> - Josh (on behalf of the HBase PMC)
>
>


[jira] [Reopened] (HBASE-20672) Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at every monitoring interval

2018-06-11 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-20672:


> Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at 
> every monitoring interval
> -
>
> Key: HBASE-20672
> URL: https://issues.apache.org/jira/browse/HBASE-20672
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Ankit Jain
>Assignee: Ankit Jain
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20672.branch-1.001.patch, 
> HBASE-20672.master.001.patch, HBASE-20672.master.002.patch, 
> HBASE-20672.master.003.patch, hits1vs2.4.40.400.png
>
>
> HBase currently provides read/write request counters (ReadRequestCount, 
> WriteRequestCount). Since counters that reset only after a restart of the 
> service are not easy to use, we would like to expose 2 new metrics in 
> HBase to provide ReadRequestRate and WriteRequestRate at region server level.
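An interval-scoped rate like the one proposed can be computed as a delta over the monitoring period. This is a minimal illustration with hypothetical names, not HBase's actual metrics classes:

```java
/** Minimal sketch of a rate gauge that resets each monitoring interval. */
public class RequestRateGauge {
    private long lastCount;   // counter value at the previous interval boundary
    private double rate;      // requests per second over the last interval

    /** Called once per monitoring interval with the monotonically increasing counter. */
    public void update(long currentCount, double intervalSeconds) {
        rate = (currentCount - lastCount) / intervalSeconds;
        lastCount = currentCount;
    }

    public double getRate() {
        return rate;
    }

    public static void main(String[] args) {
        RequestRateGauge g = new RequestRateGauge();
        g.update(1000, 10.0);             // 1000 requests in the first 10s window
        System.out.println(g.getRate());  // 100.0
        g.update(1500, 10.0);             // 500 more requests in the next window
        System.out.println(g.getRate());  // 50.0
    }
}
```

Unlike the lifetime counters, the gauge value is directly comparable across intervals and servers without the consumer tracking restarts.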



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Failure: HBase Generate Website

2018-06-11 Thread Ted Yu
Build #1372 passed.

On Mon, Jun 11, 2018 at 7:51 AM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build status: Failure
>
> The HBase website has not been updated to incorporate HBase commit
> ${CURRENT_HBASE_COMMIT}.
>
> See https://builds.apache.org/job/hbase_generate_website/1371/console


Re: [VOTE] Apache HBase 1.4.5 rc0

2018-06-08 Thread Ted Yu
+1

Checked signatures
Ran test suite (with JDK 8)

On Thu, Jun 7, 2018 at 4:35 PM, Josh Elser  wrote:

> Hi,
>
> Please vote to approve the following as Apache HBase 1.4.5
>
> https://dist.apache.org/repos/dist/dev/hbase/1.4.5rc0/
>
> Per usual, there is a source release as well as a convenience binary
>
> This is built with JDK7 from the commit: https://git-wip-us.apache.org/
> repos/asf?p=hbase.git;a=commit;h=74596816c85f1256ec8a302efecc0144f2ea76fa
> (there is a corresponding tag "1.4.5rc0" for convenience)
>
> hbase-1.4.5-bin.tar.gz: 7C8EFD79 CD5EAEFF 92F2E093 8AC8448C ED5717BD
> 4C8D2C43
> B95F804B 003E2126 9235EFE0 ABE61302 B81B30B1
> F9F4A785
> 17191950 2F436F64 19F50E53 999B5272
> hbase-1.4.5-src.tar.gz: FED89273 FFA746DA D868DF79 7E46DB75 D0908419
> F3D418FF
> 73068583 A6F1DCB2 61BD2389 12DCE920 F8800CAE
> 23631343
> DB7601F4 F43331A4 678135E5 E5C566C4
>
> There is also a Maven staging repository for this release:
> https://repository.apache.org/content/repositories/orgapachehbase-1219
>
> This vote will be open for at least 72 hours (2018/06/11  UTC).
>
> - Josh (on behalf of the HBase PMC)
>
>
>


[jira] [Resolved] (HBASE-20577) Make Log Level page design consistent with the design of other pages in UI

2018-06-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-20577.

Resolution: Fixed

Thanks for the addendum

> Make Log Level page design consistent with the design of other pages in UI
> --
>
> Key: HBASE-20577
> URL: https://issues.apache.org/jira/browse/HBASE-20577
> Project: HBase
>  Issue Type: Improvement
>  Components: UI, Usability
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20577.master.001.patch, 
> HBASE-20577.master.002.patch, HBASE-20577.master.ADDENDUM.patch, 
> after_patch_LogLevel_CLI.png, after_patch_get_log_level.png, 
> after_patch_require_field_validation.png, after_patch_set_log_level_bad.png, 
> after_patch_set_log_level_success.png, 
> before_patch_no_validation_required_field.png, rest_after_addendum_patch.png
>
>
> The Log Level page in web UI seems out of the place. I think we should make 
> it look consistent with design of other pages in HBase web UI.
> Also, validation of required fields should be done, otherwise user should not 
> be allowed to click submit button.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException

2018-06-06 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20690:
--

 Summary: Moving table to target rsgroup needs to handle 
TableStateNotFoundException
 Key: HBASE-20690
 URL: https://issues.apache.org/jira/browse/HBASE-20690
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


This is related code:
{code}
  if (targetGroup != null) {
    for (TableName table: tables) {
      if (master.getAssignmentManager().isTableDisabled(table)) {
        LOG.debug("Skipping move regions because the table" + table + " is disabled.");
        continue;
      }
{code}
In a stack trace [~rmani] showed me:
{code}
2018-06-06 07:10:44,893 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] 
master.TableStateManager: Unable to get table demo:tbl1 state
org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: 
demo:tbl1
at 
org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193)
at 
org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143)
at 
org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470)
at 
org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334)
at 
org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331)
at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
at 
org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331)
at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768)
at 
org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
at 
org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}
The logic should take a potential TableStateNotFoundException into account.
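A minimal sketch of the suggested handling — treating a missing table state as "skip this table" instead of letting the exception escape. The exception class and state lookup below are simplified stand-ins for HBase's TableStateManager API, not the real implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MoveTablesSketch {
    /** Stand-in for HBase's TableStateManager.TableStateNotFoundException. */
    static class TableStateNotFoundException extends RuntimeException {
        TableStateNotFoundException(String table) { super(table); }
    }

    /** Stand-in for AssignmentManager#isTableDisabled backed by a state map. */
    static boolean isTableDisabled(Map<String, Boolean> states, String table) {
        Boolean disabled = states.get(table);
        if (disabled == null) {
            throw new TableStateNotFoundException(table);
        }
        return disabled;
    }

    /** Returns the tables whose regions should be moved, skipping ones without a known state. */
    static List<String> tablesToMove(Map<String, Boolean> states, List<String> tables) {
        List<String> result = new ArrayList<>();
        for (String table : tables) {
            try {
                if (isTableDisabled(states, table)) {
                    continue;  // skip disabled tables, as the original code does
                }
            } catch (TableStateNotFoundException e) {
                continue;  // state not yet published (e.g. table being created): skip, don't fail
            }
            result.add(table);
        }
        return result;
    }
}
```

In the stack trace above the table was mid-creation, so its state row had not been written yet; catching the exception per table lets the remaining tables still be moved.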



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20680) Master hung during initialization waiting on hbase:meta to be assigned which never does

2018-06-04 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20680:
--

 Summary: Master hung during initialization waiting on hbase:meta 
to be assigned which never does
 Key: HBASE-20680
 URL: https://issues.apache.org/jira/browse/HBASE-20680
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser


When running IntegrationTestRSGroups, the test hung waiting for the master to 
finish initializing.

The hbase cluster was launched without RSGroup config. The test script adds 
required RSGroup configs to hbase-site.xml and restarts the cluster.
It seems that, at one point while the master was trying to assign meta, the 
destination regionserver was in the middle of going down. This has now left 
HBase in a state where it starts the regionserver recovery procedures, but 
never actually gets hbase:meta assigned.

{code}

2018-06-01 10:47:50,024 INFO  [PEWorker-5] procedure2.ProcedureExecutor: 
Initialized subprocedures=[{pid=41, ppid=40, 
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
region=1588230740}]

2018-06-01 10:47:50,026 DEBUG [WALProcedureStoreSyncThread] 
wal.WALProcedureStore: hsync completed for 
hdfs://ctr-e138-1518143905142-340983-03-14.hwx.site:8020/apps/hbase/data/MasterProcWALs/pv2-0002.log

2018-06-01 10:47:50,026 INFO  [PEWorker-3] procedure.MasterProcedureScheduler: 
pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=hbase:meta, region=1588230740 checking lock on 1588230740

2018-06-01 10:47:50,026 DEBUG [PEWorker-3] assignment.RegionStates: setting 
location=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190 
for rit=OFFLINE, location=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190, 
table=hbase:meta, region=1588230740 last loc=null

2018-06-01 10:47:50,026 INFO  [PEWorker-3] assignment.AssignProcedure: Starting 
pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=hbase:meta,region=1588230740; rit=OFFLINE, 
location=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190; 
forceNewPlan=false, retain=true target svr=null

{code}

At Fri Jun  1 10:48:04, master was restarted.

The new master picked up pid=41:

{code}

2018-06-01 10:48:47,971 INFO  [PEWorker-1] assignment.AssignProcedure: Starting 
pid=41, ppid=40, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=hbase:meta,region=1588230740; rit=OFFLINE, location=null; 
forceNewPlan=false, retain=false target svr=null
{code}

There was no further log for pid=41 after above.

Later, when the master initiated another meta recovery procedure (pid=42), the 
second procedure seemed to be locked out by the former:

{code}
2018-06-01 10:49:34,292 INFO  [PEWorker-2] procedure.MasterProcedureScheduler: 
pid=43, ppid=42, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=hbase:meta, region=1588230740, 
target=ctr-e138-1518143905142-340983-03-14.hwx.site,16020,1527849994190 
checking lock on 1588230740

2018-06-01 10:49:34,293 DEBUG [PEWorker-2] 
assignment.RegionTransitionProcedure: LOCK_EVENT_WAIT pid=43 serverLocks={}, 
namespaceLocks={}, tableLocks={}, 
regionLocks={{1588230740=exclusiveLockOwner=41, sharedLockCount=0, 
waitingProcCount=1}}, peerLocks={}
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] New HBase committer Guangxu Cheng

2018-06-04 Thread Ted Yu
Congratulations, Guangxu!
-------- Original message --------
From: "张铎 (Duo Zhang)"
Date: 6/4/18 12:00 AM (GMT-08:00)
To: HBase Dev List, hbase-user
Subject: [ANNOUNCE] New HBase committer Guangxu Cheng
On behalf of the Apache HBase PMC, I am pleased to announce that Guangxu
Cheng has accepted the PMC's invitation to become a committer on the
project. We appreciate all of Guangxu's generous contributions thus far and
look forward to his continued involvement.

Congratulations and welcome, Guangxu!


[jira] [Created] (HBASE-20677) Backport HBASE-20566 'Creating a system table after enabling rsgroup feature puts region into RIT ' to branch-2

2018-06-03 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20677:
--

 Summary: Backport HBASE-20566 'Creating a system table after 
enabling rsgroup feature puts region into RIT ' to branch-2
 Key: HBASE-20677
 URL: https://issues.apache.org/jira/browse/HBASE-20677
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


After HBASE-20566 was integrated into master, HBASE-20595 removed the concept 
of 'special tables' from rsgroups.

This task is to backport the fix to branch-2.

TestRSGroups#testRSGroupsWithHBaseQuota would be added.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache HBase 1.3.2.1RC0

2018-06-03 Thread Ted Yu
+1

Built from source
Ran test suite using Jdk 8 which passed.

On Sat, Jun 2, 2018 at 2:26 PM, Josh Elser  wrote:

> Hi,
>
> Please vote to approve the following as Apache HBase 1.3.2.1
>
> https://dist.apache.org/repos/dist/dev/hbase/1.3.2.1RC0/
>
> Per usual, there is a source release as well as a convenience binary
>
> This is built with JDK7 from the commit: https://git-wip-us.apache.org/
> repos/asf?p=hbase.git;a=commit;h=bf25c1cb7221178388baaa58f0b16a408e151a69
> (there is a corresponding tag "1.3.2.1RC0" for convenience)
>
> hbase-1.3.2.1-bin.tar.gz: 1D CB 27 E0 B0 56 28 B8  BE C7 41 03 2E B5 D3 31
> hbase-1.3.2.1-src.tar.gz: 47 99 46 3C 2B E2 59 9B  5B 8B 2F 16 81 53 6B FE
> hbase-1.3.2.1-bin.tar.gz: 16EB62DA D4EA40F6 DD8747CF 6A49678E D1A4A53E
> B3A9E67D
>   C53A89F1 471D1DC5 5147E5CA D1AED8B0 B22A01F5
> C1F6F6CA
>   4B4E9562 61CDA9B6 91D94C16 26593AFB
> hbase-1.3.2.1-src.tar.gz: 63C55C02 DB27461E 2C006758 329EC21E E14823E3
> 9080105B
>   43FA6EF2 05BD81A3 D526E2AC 6EAE0FE9 1C3103F4
> 20B8457F
>   3C94EF73 5B3CB18C 85B7E0AB 4311CAA4
>
> This vote will be open for at least 72 hours (2018/06/05 2130 UTC).
>
> - Josh (on behalf of the HBase PMC)
>


[jira] [Created] (HBASE-20676) Give .hbase-snapshot proper ownership upon directory creation

2018-06-02 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20676:
--

 Summary: Give .hbase-snapshot proper ownership upon directory 
creation
 Key: HBASE-20676
 URL: https://issues.apache.org/jira/browse/HBASE-20676
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


This is continuation of the discussion over HBASE-20668.

The .hbase-snapshot directory is not created at cluster startup. Normally it 
is created when a snapshot operation is initiated.

However, if before any snapshot operation is performed, some non-super user 
from another cluster conducts ExportSnapshot to this cluster, the 
.hbase-snapshot directory would be created as that user.
(This is just one scenario that can lead to wrong ownership)

This JIRA is to seek proper way(s) to ensure that the .hbase-snapshot directory 
always carries proper ownership and permissions upon creation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20668:
--

 Summary: Exception from FileSystem operation in finally block of 
ExportSnapshot#doWork may hide exception from FileUtil.copy call
 Key: HBASE-20668
 URL: https://issues.apache.org/jira/browse/HBASE-20668
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


I was debugging the following error [~romil.choksi] saw during testing 
ExportSnapshot :
{code}
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR 
[main] util.AbstractHBaseTool: Error  running command-line tool
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: 
Directory/File does not exist /apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1777)
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:82)
{code}
Here is corresponding code (with extra log added):
{code}
try {
  LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
initialOutputSnapshotDir);
  boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
initialOutputSnapshotDir, false,
  false, conf);
  LOG.info("return val = " + ret);
} catch (IOException e) {
  LOG.warn("Failed to copy the snapshot directory: from=" +
  snapshotDir + " to=" + initialOutputSnapshotDir, e);
  throw new ExportSnapshotException("Failed to copy the snapshot directory: 
from=" +
snapshotDir + " to=" + initialOutputSnapshotDir, e);
} finally {
  if (filesUser != null || filesGroup != null) {
LOG.warn((filesUser == null ? "" : "Change the owner of " + 
needSetOwnerDir + " to "
+ filesUser)
+ (filesGroup == null ? "" : ", Change the group of " + 
needSetOwnerDir + " to "
+ filesGroup));
setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
  }
{code}
"return val = " was not seen in rerun of the test.
This is what the additional log revealed:
{code}
2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN  
[main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
Permission denied: user=hbase, access=WRITE, 
inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
{code}
It turned out that the exception from {{setOwner}} call in the finally block 
eclipsed the real exception.
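The masking behavior observed here is the standard try/finally pitfall: an exception thrown inside the finally block replaces the one already propagating from the try block. A minimal illustration of both the problem and one way to preserve the original error via suppressed exceptions — simplified stand-ins, not the actual ExportSnapshot code:

```java
public class FinallyMasking {
    /** The pitfall: the finally-block exception eclipses the copy error. */
    static String failingCopyNaive() {
        try {
            try {
                throw new RuntimeException("copy failed");       // the real error
            } finally {
                throw new RuntimeException("setOwner failed");   // eclipses the copy error
            }
        } catch (RuntimeException e) {
            return e.getMessage();  // "setOwner failed" -- the copy error is lost
        }
    }

    /** One fix: attach the cleanup failure as a suppressed exception. */
    static String failingCopyPreserving() {
        try {
            RuntimeException primary = null;
            try {
                throw new RuntimeException("copy failed");
            } catch (RuntimeException e) {
                primary = e;
                throw e;
            } finally {
                try {
                    throw new RuntimeException("setOwner failed");
                } catch (RuntimeException cleanup) {
                    if (primary != null) {
                        primary.addSuppressed(cleanup);  // keep both; original still propagates
                    } else {
                        throw cleanup;
                    }
                }
            }
        } catch (RuntimeException e) {
            return e.getMessage();  // "copy failed", with "setOwner failed" suppressed
        }
    }
}
```

Because the finally block in the second method completes normally after catching its own failure, the in-flight copy exception keeps propagating, and the setOwner failure remains visible via getSuppressed().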



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


retrieving leaderAndIsr

2018-05-31 Thread Ted Yu
Hi,
For the following code which works against 0.10 :

def isPropagated = server.apis.metadataCache.getPartitionInfo(topic, partition) match {
  case Some(partitionState) =>
    val leaderAndInSyncReplicas = partitionState.leaderIsrAndControllerEpoch.leaderAndIsr

I got this error compiling against 2.0:

value leaderIsrAndControllerEpoch is not a member of
org.apache.kafka.common.requests.UpdateMetadataRequest.PartitionState
[ERROR] val leaderAndInSyncReplicas = partitionState.leaderIsrAndControllerEpoch.leaderAndIsr

Please comment on the replacement API.

Thanks


[jira] [Reopened] (HBASE-20639) Implement permission checking through AccessController instead of RSGroupAdminEndpoint

2018-05-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-20639:


> Implement permission checking through AccessController instead of 
> RSGroupAdminEndpoint
> --
>
> Key: HBASE-20639
> URL: https://issues.apache.org/jira/browse/HBASE-20639
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Nihal Jain
>Priority: Major
> Attachments: HBASE-20639.master.001.patch, 
> HBASE-20639.master.002.patch, HBASE-20639.master.002.patch
>
>
> Currently permission checking for various RS group operations is done via 
> RSGroupAdminEndpoint.
> e.g. in RSGroupAdminServiceImpl#moveServers() :
> {code}
> checkPermission("moveServers");
> groupAdminServer.moveServers(hostPorts, request.getTargetGroup());
> {code}
> The practice in remaining parts of hbase is to perform permission checking 
> within AccessController.
> Now that observer hooks for RS group operations are in right place, we should 
> follow best practice and move permission checking to AccessController.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20654) Expose regions in transition thru JMX

2018-05-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20654:
--

 Summary: Expose regions in transition thru JMX
 Key: HBASE-20654
 URL: https://issues.apache.org/jira/browse/HBASE-20654
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


Currently only the count of regions in transition is exposed thru JMX.
Here is a sample snippet of the /jmx output:
{code}
{
  "beans" : [ {
...
  }, {
"name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManager",
"modelerType" : "Master,sub=AssignmentManager",
"tag.Context" : "master",
...
"ritCount" : 3
{code}
It would be desirable to expose the region name and state for the regions in 
transition as well.
We can place a configurable upper bound on the number of entries returned in 
case there are a lot of regions in transition.
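The bounded export could amount to truncating the in-transition map when rendering the metric value. A small sketch with illustrative names — not HBase's actual JMX source classes:

```java
import java.util.Map;
import java.util.StringJoiner;

public class RitMetricSketch {
    /**
     * Renders up to maxEntries "region=state" pairs for a JMX string attribute,
     * noting how many entries were omitted when the bound is hit.
     */
    static String ritSummary(Map<String, String> regionsInTransition, int maxEntries) {
        StringJoiner sj = new StringJoiner("; ");
        int emitted = 0;
        for (Map.Entry<String, String> e : regionsInTransition.entrySet()) {
            if (emitted++ >= maxEntries) {
                sj.add("... (" + (regionsInTransition.size() - maxEntries) + " more)");
                break;
            }
            sj.add(e.getKey() + "=" + e.getValue());
        }
        return sj.toString();
    }
}
```

With the bound in place, a cluster with thousands of regions in transition still produces a fixed-size attribute value rather than an unbounded JMX payload.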



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20653) Add missing observer hooks for region server group to MasterObserver

2018-05-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20653:
--

 Summary: Add missing observer hooks for region server group to 
MasterObserver
 Key: HBASE-20653
 URL: https://issues.apache.org/jira/browse/HBASE-20653
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


Currently the following region server group operations don't have a 
corresponding hook in MasterObserver:

* getRSGroupInfo
* getRSGroupInfoOfServer
* getRSGroupInfoOfTable
* listRSGroup

This JIRA is to 

* add them to MasterObserver
* add corresponding permission check in AccessController
* move the {{checkPermission}} out of RSGroupAdminEndpoint
* add corresponding tests to TestRSGroupsWithACL



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20079) Report all the new test classes missing HBaseClassTestRule in one patch

2018-05-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-20079.

Resolution: Later

> Report all the new test classes missing HBaseClassTestRule in one patch
> ---
>
> Key: HBASE-20079
> URL: https://issues.apache.org/jira/browse/HBASE-20079
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Priority: Trivial
>
> Currently if there are both new small and large tests without 
> HBaseClassTestRule in a single patch, the QA bot would report the small test 
> class as missing HBaseClassTestRule but not the large test.
> All new test classes missing HBaseClassTestRule should be reported in the 
> same QA run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20081) TestDisableTableProcedure sometimes hung in MiniHBaseCluster#waitUntilShutDown

2018-05-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-20081.

Resolution: Cannot Reproduce

> TestDisableTableProcedure sometimes hung in MiniHBaseCluster#waitUntilShutDown
> --
>
> Key: HBASE-20081
> URL: https://issues.apache.org/jira/browse/HBASE-20081
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Priority: Major
>
> https://builds.apache.org/job/HBase-2.0-hadoop3-tests/lastCompletedBuild/org.apache.hbase$hbase-server/testReport/org.apache.hadoop.hbase.master.procedure/TestDisableTableProcedure/org_apache_hadoop_hbase_master_procedure_TestDisableTableProcedure/
>  was one recent occurrence.
> I noticed two things in test output:
> {code}
> 2018-02-25 18:12:45,053 WARN  [Time-limited test-EventThread] 
> master.RegionServerTracker(136): asf912.gq1.ygridcore.net,45649,1519582305777 
> is not online or isn't known to the master.The latter could be caused by a 
> DNS misconfiguration.
> {code}
> Since DNS misconfiguration was very unlikely on Apache Jenkins nodes, the 
> above should not have been logged.
> {code}
> 2018-02-25 18:16:51,531 WARN  [master/asf912:0.Chore.1] 
> master.CatalogJanitor(127): Failed scan of catalog table
> java.io.IOException: connection is closed
>   at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:263)
>   at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:761)
>   at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:680)
>   at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:675)
>   at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:188)
>   at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:140)
>   at 
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:246)
>   at 
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:119)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> {code}
> The above was possibly related to the lost region server.
> I searched test output of successful run where none of the above two can be 
> seen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20644) Master shutdown due to service ClusterSchemaServiceImpl failing to start

2018-05-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20644:
--

 Summary: Master shutdown due to service ClusterSchemaServiceImpl 
failing to start
 Key: HBASE-20644
 URL: https://issues.apache.org/jira/browse/HBASE-20644
 Project: HBase
  Issue Type: Bug
Reporter: Romil Choksi


From hbase-hbase-master-ctr-e138-1518143905142-329221-01-03.hwx.site.log :
{code}
2018-05-23 22:14:29,750 ERROR 
[master/ctr-e138-1518143905142-329221-01-03:2] master.HMaster: Failed 
to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl 
[FAILED] to be RUNNING, but the service has FAILED
at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1054)
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:918)
at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2023)
{code}
Earlier in the log, the namespace region was deemed OPEN on 
01-07.hwx.site,16020,1527112194788, which was declared not online:
{code}
2018-05-23 21:54:34,786 INFO  
[master/ctr-e138-1518143905142-329221-01-03:2] 
assignment.RegionStateStore: Load hbase:meta entry region=01a7f9ba9fffd691f261d3fbc620da06, regionState=OPEN, 
lastHost=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112194788, 
regionLocation=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112194788,
 seqnum=43
2018-05-23 21:54:34,787 INFO  
[master/ctr-e138-1518143905142-329221-01-03:2] 
assignment.AssignmentManager: Number of RegionServers=1
2018-05-23 21:54:34,788 INFO  
[master/ctr-e138-1518143905142-329221-01-03:2] 
assignment.AssignmentManager: KILL 
RegionServer=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112194788 hosting regions but not online.
{code}
Later, even though a different instance on 007 registered with master:
{code}
2018-05-23 21:55:13,541 INFO  
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] 
master.ServerManager: Registering 
regionserver=ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112506002
...
2018-05-23 21:55:43,881 INFO  
[master/ctr-e138-1518143905142-329221-01-03:2] 
client.RpcRetryingCallerImpl: Call exception, tries=12, retries=12, 
started=69001 ms ago,cancelled=false, 
msg=org.apache.hadoop.hbase.NotServingRegionException: 
hbase:namespace,,1527099443383.01a7f9ba9fffd691f261d3fbc620da06. is not online 
on ctr-e138-1518143905142-329221-01-07.hwx.site,16020,1527112506002
  at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3273)
  at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3250)
  at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
  at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2446)
  at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
{code}
There was no OPEN request sent to that instance.

From hbase-hbase-regionserver-ctr-e138-1518143905142-329221-01-07.hwx.site.log :
{code}
2018-05-23 21:52:27,414 INFO  
[RS_CLOSE_REGION-regionserver/ctr-e138-1518143905142-329221-01-07:16020-1] 
regionserver.HRegion: Closed hbase:namespace,,1527099443383.01a7f9ba9fffd691f261d3fbc620da06.
{code}
Then region server 007 restarted:
{code}
Wed May 23 21:55:03 UTC 2018 Starting regionserver on 
ctr-e138-1518143905142-329221-01-07.hwx.site
{code}
After that, the region 01a7f9ba9fffd691f261d3fbc620da06 never showed up again 
in the log on 007.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20639) Implement permission checking through AccessController instead of RSGroupAdminEndpoint

2018-05-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20639:
--

 Summary: Implement permission checking through AccessController 
instead of RSGroupAdminEndpoint
 Key: HBASE-20639
 URL: https://issues.apache.org/jira/browse/HBASE-20639
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


Currently permission checking for various RS group operations is done via 
RSGroupAdminEndpoint.
e.g. in RSGroupAdminServiceImpl#moveServers() :
{code}
checkPermission("moveServers");
groupAdminServer.moveServers(hostPorts, request.getTargetGroup());
{code}
The practice in the rest of HBase is to perform permission checking within 
AccessController.

Now that the observer hooks for RS group operations are in the right place, we 
should follow best practice and move permission checking to AccessController.
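The shape of that change can be sketched as follows. This is a minimal, self-contained simulation, not HBase code: `MasterObserver`, `AccessController`, and `RSGroupAdminService` here are simplified stand-ins for illustration, showing the permission check living in an observer chain rather than inline in the endpoint method.

```java
import java.util.ArrayList;
import java.util.List;

class ObserverPermissionSketch {
  interface MasterObserver {
    // An AccessController-style observer can veto the operation here.
    void preMoveServers(String user) throws SecurityException;
  }

  static class AccessController implements MasterObserver {
    @Override
    public void preMoveServers(String user) {
      if (!"admin".equals(user)) {
        throw new SecurityException("insufficient permissions for moveServers");
      }
    }
  }

  static class RSGroupAdminService {
    private final List<MasterObserver> observers = new ArrayList<>();
    final List<String> audit = new ArrayList<>();

    void addObserver(MasterObserver o) { observers.add(o); }

    void moveServers(String user) {
      // Permission checking happens in the observer chain (AccessController),
      // not as an inline checkPermission() call in the endpoint itself.
      for (MasterObserver o : observers) {
        o.preMoveServers(user);
      }
      audit.add("moveServers by " + user);
    }
  }

  public static void main(String[] args) {
    RSGroupAdminService svc = new RSGroupAdminService();
    svc.addObserver(new AccessController());
    svc.moveServers("admin"); // allowed by the observer
    try {
      svc.moveServers("guest"); // vetoed before any state change
    } catch (SecurityException expected) {
      System.out.println("denied: " + expected.getMessage());
    }
  }
}
```

With this arrangement, every caller of the service gets the same permission check for free, which is the consolidation the issue is after.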





[jira] [Reopened] (HBASE-20627) Relocate RS Group pre/post hooks from RSGroupAdminServer to RSGroupAdminEndpoint

2018-05-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-20627:


> Relocate RS Group pre/post hooks from RSGroupAdminServer to 
> RSGroupAdminEndpoint
> 
>
> Key: HBASE-20627
> URL: https://issues.apache.org/jira/browse/HBASE-20627
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20627.branch-1.txt, 20627.v1.txt, 20627.v2.txt, 
> 20627.v3.txt
>
>
> Currently RS Group pre/post hooks are called from RSGroupAdminServer.
> e.g. RSGroupAdminServer#removeRSGroup :
> {code}
>   if (master.getMasterCoprocessorHost() != null) {
> master.getMasterCoprocessorHost().preRemoveRSGroup(name);
>   }
> {code}
> RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint :
> {code}
> checkPermission("removeRSGroup");
> groupAdminServer.removeRSGroup(request.getRSGroupName());
> {code}
> If the permission check fails, the pre hook wouldn't be called.





[jira] [Created] (HBASE-20627) Relocate RS Group pre/post hooks from RSGroupAdminServer to RSGroupAdminEndpoint

2018-05-23 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20627:
--

 Summary: Relocate RS Group pre/post hooks from RSGroupAdminServer 
to RSGroupAdminEndpoint
 Key: HBASE-20627
 URL: https://issues.apache.org/jira/browse/HBASE-20627
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 20627.v1.txt

Currently RS Group pre/post hooks are called from RSGroupAdminServer.
e.g. RSGroupAdminServer#removeRSGroup :
{code}
  if (master.getMasterCoprocessorHost() != null) {
master.getMasterCoprocessorHost().preRemoveRSGroup(name);
  }
{code}
RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint :
{code}
checkPermission("removeRSGroup");
groupAdminServer.removeRSGroup(request.getRSGroupName());
{code}
If the permission check fails, the pre hook wouldn't be called.
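The ordering problem above can be demonstrated with a small simulation. This is a hypothetical sketch, not HBase code: the two methods mimic the call shape of RSGroupAdminEndpoint delegating to RSGroupAdminServer, with the permission check happening before the pre hook is ever reached.

```java
import java.util.ArrayList;
import java.util.List;

class HookOrderingSketch {
  static final List<String> events = new ArrayList<>();

  // Stand-in for RSGroupAdminServer#removeRSGroup: fires the coprocessor
  // pre hook, then performs the operation.
  static void serverRemoveRSGroup(String name) {
    events.add("preRemoveRSGroup");
    events.add("removeRSGroup " + name);
  }

  // Stand-in for the endpoint: the permission check runs BEFORE delegating,
  // so a denial means the pre hook above never fires.
  static void endpointRemoveRSGroup(String user, String name) {
    if (!"admin".equals(user)) {
      throw new SecurityException("access denied for removeRSGroup");
    }
    serverRemoveRSGroup(name);
  }

  public static void main(String[] args) {
    try {
      endpointRemoveRSGroup("guest", "g1");
    } catch (SecurityException ignored) {
      // denied before the server method ran
    }
    System.out.println(events); // prints [] -- no pre hook fired for the denied call
  }
}
```

Relocating the hooks into the endpoint (or invoking them before the permission check) is what restores the expected "pre hook always fires" contract.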





RS group related hooks

2018-05-23 Thread Ted Yu
Hi,
I was looking at RSGroupAdminServer#removeRSGroup :

  if (master.getMasterCoprocessorHost() != null) {
    master.getMasterCoprocessorHost().preRemoveRSGroup(name);
  }
However, RSGroupAdminServer#removeRSGroup is called by RSGroupAdminEndpoint
:

checkPermission("removeRSGroup");
groupAdminServer.removeRSGroup(request.getRSGroupName());

Meaning, if the permission check fails, the pre hook wouldn't be called.

I wonder if the call to preRemoveRSGroup() should be lifted to before
calling checkPermission().

Cheers


[jira] [Created] (HBASE-20609) SnapshotHFileCleaner#init should check that params is not null

2018-05-21 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20609:
--

 Summary: SnapshotHFileCleaner#init should check that params is not 
null
 Key: HBASE-20609
 URL: https://issues.apache.org/jira/browse/HBASE-20609
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


Noticed the following in the test output of TestHFileArchiving :
{code}
SnapshotHFileCleaner.init(Map<String,Object>) line: 79
HFileCleaner(CleanerChore).newFileCleaner(String, Configuration) line: 260
HFileCleaner(CleanerChore).initCleanerChain(String) line: 232
HFileCleaner(CleanerChore).<init>(String, int, Stoppable, Configuration, 
FileSystem, Path, String, Map<String,Object>) line: 182
HFileCleaner.<init>(int, Stoppable, Configuration, FileSystem, Path, 
Map<String,Object>) line: 104
HFileCleaner.<init>(int, Stoppable, Configuration, FileSystem, Path) line: 51
TestHFileArchiving.testCleaningRace() line: 377
{code}
This was due to SnapshotHFileCleaner#init dereferencing the {{params}} argument 
without first checking that it is non-null.
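The fix amounts to a defensive null check before touching the map. The sketch below is a simplified stand-in (the `Cleaner` class and the "master" key are illustrative, not the real SnapshotHFileCleaner API), showing the guard the issue asks for.

```java
import java.util.HashMap;
import java.util.Map;

class NullSafeInitSketch {
  static class Cleaner {
    private Object master;

    // Guard against a null params map before dereferencing it, mirroring
    // the null check proposed for SnapshotHFileCleaner#init.
    void init(Map<String, Object> params) {
      if (params != null && params.containsKey("master")) {
        this.master = params.get("master");
      }
    }

    boolean hasMaster() { return master != null; }
  }

  public static void main(String[] args) {
    Cleaner c = new Cleaner();
    c.init(null); // no NullPointerException with the guard in place
    System.out.println(c.hasMaster()); // prints false

    Map<String, Object> params = new HashMap<>();
    params.put("master", new Object());
    c.init(params);
    System.out.println(c.hasMaster()); // prints true
  }
}
```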




