[jira] [Commented] (HBASE-20001) cleanIfNoMetaEntry() uses encoded instead of region name to lookup region

2018-02-17 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368455#comment-16368455
 ] 

Chia-Ping Tsai commented on HBASE-20001:


TestCatalogJanitorInMemoryStates is tracked by HBASE-20016.

> cleanIfNoMetaEntry() uses encoded instead of region name to lookup region
> -
>
> Key: HBASE-20001
> URL: https://issues.apache.org/jira/browse/HBASE-20001
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.1.7
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>
> Attachments: HBASE-20001.branch-1.4.001.patch
>
>
> In RegionStates.cleanIfNoMetaEntry():
> {code}
> if (MetaTableAccessor.getRegion(server.getConnection(), hri.getEncodedNameAsBytes()) == null) {
>   regionOffline(hri);
>   FSUtils.deleteRegionDir(server.getConfiguration(), hri);
> }
> {code}
> But the API expects a region name:
> {code}
> public static Pair<HRegionInfo, ServerName> getRegion(Connection connection, byte[] regionName)
> {code}
> So we might end up cleaning up good regions.
>  
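For reference, the direction the summary points at is to look the region up in hbase:meta by its full region name rather than the encoded name. A minimal sketch of that change (not the attached patch; variable names are taken from the snippet above):
{code}
// Look up the region by its full region name; only clean up if meta has no entry.
if (MetaTableAccessor.getRegion(server.getConnection(), hri.getRegionName()) == null) {
  regionOffline(hri);
  FSUtils.deleteRegionDir(server.getConfiguration(), hri);
}
{code}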



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20016) TestCatalogJanitorInMemoryStates#testInMemoryForReplicaParentCleanup is flaky

2018-02-17 Thread Chia-Ping Tsai (JIRA)
Chia-Ping Tsai created HBASE-20016:
--

 Summary: 
TestCatalogJanitorInMemoryStates#testInMemoryForReplicaParentCleanup is flaky
 Key: HBASE-20016
 URL: https://issues.apache.org/jira/browse/HBASE-20016
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai
 Fix For: 1.5.0, 1.4.3


It is a time-based test. RegionStates#isRegionOnline will return false if the 
target region is in transition. The list of region assignments may not be 
updated yet.
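One common way to harden such a check is to wait, with a bound, for the assignment to settle instead of asserting immediately. A sketch only (TEST_UTIL, master and hri stand for the usual test fixtures and are assumptions here, not the actual fix):
{code}
// Retry until the region is reported online, up to 60 seconds.
TEST_UTIL.waitFor(60000, new Waiter.Predicate<Exception>() {
  @Override
  public boolean evaluate() throws Exception {
    return master.getAssignmentManager().getRegionStates().isRegionOnline(hri);
  }
});
{code}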



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19400:
-
Attachment: HBASE-19400.master.006.patch

> Add missing security hooks for MasterService RPCs
> -
>
> Key: HBASE-19400
> URL: https://issues.apache.org/jira/browse/HBASE-19400
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-beta-1
>Reporter: Balazs Meszaros
>Assignee: Appy
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19400.master.001.patch, 
> HBASE-19400.master.002.patch, HBASE-19400.master.003.patch, 
> HBASE-19400.master.004.patch, HBASE-19400.master.004.patch, 
> HBASE-19400.master.005.patch, HBASE-19400.master.006.patch
>
>
> The following RPC methods do not call the observers and are therefore not 
> guarded by AccessController:
> - normalize
> - setNormalizerRunning
> - runCatalogScan
> - enableCatalogJanitor
> - runCleanerChore
> - setCleanerChoreRunning
> - execMasterService
> - execProcedure
> - execProcedureWithRet



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368449#comment-16368449
 ] 

Appy commented on HBASE-19400:
--

Debugged both tests.
TestAccessController was failing because new references to TableAuthManager 
from RSRpcServices weren't being released.
Added a stop() method to AccessChecker to release the reference, since it's the 
one that "owns" the reference.

For TestPriorityRpc, it was because zk was null for those tests. Updated the patch 
to handle that in a way that also makes sense for prod: create the AccessChecker 
instance only when authorization is enabled.
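In outline, that pattern looks roughly like the sketch below; the constructor shape and the zkWatcher handle are assumptions for illustration, not the actual patch:
{code}
// Build the checker only when authorization is switched on and a ZooKeeper watcher
// exists; this also sidesteps the zk == null situation hit by TestPriorityRpc.
private AccessChecker createAccessChecker(Configuration conf, ZKWatcher zkWatcher) {
  if (zkWatcher == null || !conf.getBoolean("hbase.security.authorization", false)) {
    return null;
  }
  return new AccessChecker(conf, zkWatcher);
}

// On shutdown, let the checker release the TableAuthManager reference it owns.
private void stopAccessChecker(AccessChecker accessChecker) {
  if (accessChecker != null) {
    accessChecker.stop();
  }
}
{code}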

> Add missing security hooks for MasterService RPCs
> -
>
> Key: HBASE-19400
> URL: https://issues.apache.org/jira/browse/HBASE-19400
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-beta-1
>Reporter: Balazs Meszaros
>Assignee: Appy
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19400.master.001.patch, 
> HBASE-19400.master.002.patch, HBASE-19400.master.003.patch, 
> HBASE-19400.master.004.patch, HBASE-19400.master.004.patch, 
> HBASE-19400.master.005.patch
>
>
> The following RPC methods do not call the observers and are therefore not 
> guarded by AccessController:
> - normalize
> - setNormalizerRunning
> - runCatalogScan
> - enableCatalogJanitor
> - runCleanerChore
> - setCleanerChoreRunning
> - execMasterService
> - execProcedure
> - execProcedureWithRet



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368443#comment-16368443
 ] 

Hadoop QA commented on HBASE-19400:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
7s{color} | {color:red} hbase-server: The patch generated 5 new + 311 unchanged 
- 6 fixed = 316 total (was 317) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hbase-rsgroup: The patch generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m  9s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hbase-rsgroup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestPriorityRpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19400 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911045/HBASE-19400.master.005.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux be78956b9a88 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-20001) cleanIfNoMetaEntry() uses encoded instead of region name to lookup region

2018-02-17 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-20001:
---
Fix Version/s: 1.2.7

> cleanIfNoMetaEntry() uses encoded instead of region name to lookup region
> -
>
> Key: HBASE-20001
> URL: https://issues.apache.org/jira/browse/HBASE-20001
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.1.7
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>
> Attachments: HBASE-20001.branch-1.4.001.patch
>
>
> In RegionStates.cleanIfNoMetaEntry():
> {code}
> if (MetaTableAccessor.getRegion(server.getConnection(), hri.getEncodedNameAsBytes()) == null) {
>   regionOffline(hri);
>   FSUtils.deleteRegionDir(server.getConfiguration(), hri);
> }
> {code}
> But the API expects a region name:
> {code}
> public static Pair<HRegionInfo, ServerName> getRegion(Connection connection, byte[] regionName)
> {code}
> So we might end up cleaning up good regions.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-19954) ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager

2018-02-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368420#comment-16368420
 ] 

Ted Yu edited comment on HBASE-19954 at 2/18/18 1:54 AM:
-

Did some debugging by installing hadoop-common from hadoop3, with additional 
logging, into the local maven repo.
{code}
2018-02-17 16:14:14,573 INFO  [Time-limited test] 
util.ShutdownHookManager(286): clearing hooks
2018-02-17 16:14:14,588 INFO  [Time-limited test] 
hbase.HBaseTestingUtility(1114): Minicluster is down
2018-02-17 16:14:14,627 INFO  [Time-limited test] hbase.ResourceChecker(172): 
after: fs.TestBlockReorder#testBlockLocationReorder Thread=110 (was 8)
{code}
Note that the above was the first test in TestBlockReorder, where the {{hooks}} Set 
of Hadoop's ShutdownHookManager was cleared (first line).
The 'Failed suppression' exception happened in the second subtest, where the 
FileSystem$Cache$ClientFinalizer instance was no longer in the Set.
I dumped the contents of the {{hooks}} Set at the time of the exception and saw 
fsdataset.impl.BlockPoolSlice instances but no ClientFinalizer instance. 

After poking around Hadoop's ShutdownHookManager, I don't see a bug.


was (Author: yuzhih...@gmail.com):
Did some debugging by installing hadoop-common, with additional logging, into 
the local maven repo.
{code}
2018-02-17 16:14:14,573 INFO  [Time-limited test] 
util.ShutdownHookManager(286): clearing hooks
2018-02-17 16:14:14,588 INFO  [Time-limited test] 
hbase.HBaseTestingUtility(1114): Minicluster is down
2018-02-17 16:14:14,627 INFO  [Time-limited test] hbase.ResourceChecker(172): 
after: fs.TestBlockReorder#testBlockLocationReorder Thread=110 (was 8)
{code}
Note that the above was the first test in TestBlockReorder, where the {{hooks}} Set 
of Hadoop's ShutdownHookManager was cleared (first line).
The 'Failed suppression' exception happened in the second subtest, where the 
FileSystem$Cache$ClientFinalizer instance was no longer in the Set.
I dumped the contents of the {{hooks}} Set at the time of the exception and saw 
fsdataset.impl.BlockPoolSlice instances but no ClientFinalizer instance. 

After poking around Hadoop's ShutdownHookManager, I don't see a bug.

> ShutdownHook should check whether shutdown hook is tracked by 
> ShutdownHookManager
> -
>
> Key: HBASE-19954
> URL: https://issues.apache.org/jira/browse/HBASE-19954
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 19954.v1.txt
>
>
> Currently ShutdownHook#suppressHdfsShutdownHook() does the following:
> {code}
>   synchronized (fsShutdownHooks) {
>     boolean isFSCacheDisabled = fs.getConf().getBoolean("fs.hdfs.impl.disable.cache", false);
>     if (!isFSCacheDisabled && !fsShutdownHooks.containsKey(hdfsClientFinalizer)
>         && !ShutdownHookManager.deleteShutdownHook(hdfsClientFinalizer)) {
> {code}
> There is no check that ShutdownHookManager still tracks the shutdown hook, 
> leading to a potential RuntimeException (as can be observed in the hadoop3 
> Jenkins job).
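One way to add the missing guard is to ask Hadoop's hook registry whether the hook is still tracked before attempting the removal. A sketch along those lines, reusing the names from the snippet above (the exception message is illustrative; the unqualified ShutdownHookManager is HBase's wrapper, as in the original code):
{code}
synchronized (fsShutdownHooks) {
  boolean isFSCacheDisabled = fs.getConf().getBoolean("fs.hdfs.impl.disable.cache", false);
  // Only attempt removal if hadoop's ShutdownHookManager still tracks the hook.
  boolean stillTracked =
      org.apache.hadoop.util.ShutdownHookManager.get().hasShutdownHook(hdfsClientFinalizer);
  if (!isFSCacheDisabled
      && !fsShutdownHooks.containsKey(hdfsClientFinalizer)
      && stillTracked
      && !ShutdownHookManager.deleteShutdownHook(hdfsClientFinalizer)) {
    throw new RuntimeException("Failed suppression of fs shutdown hook: " + hdfsClientFinalizer);
  }
}
{code}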



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19954) ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager

2018-02-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368420#comment-16368420
 ] 

Ted Yu commented on HBASE-19954:


Did some debugging by installing hadoop-common, with additional logging, into 
the local maven repo.
{code}
2018-02-17 16:14:14,573 INFO  [Time-limited test] 
util.ShutdownHookManager(286): clearing hooks
2018-02-17 16:14:14,588 INFO  [Time-limited test] 
hbase.HBaseTestingUtility(1114): Minicluster is down
2018-02-17 16:14:14,627 INFO  [Time-limited test] hbase.ResourceChecker(172): 
after: fs.TestBlockReorder#testBlockLocationReorder Thread=110 (was 8)
{code}
Note that the above was the first test in TestBlockReorder, where the {{hooks}} Set 
of Hadoop's ShutdownHookManager was cleared (first line).
The 'Failed suppression' exception happened in the second subtest, where the 
FileSystem$Cache$ClientFinalizer instance was no longer in the Set.
I dumped the contents of the {{hooks}} Set at the time of the exception and saw 
fsdataset.impl.BlockPoolSlice instances but no ClientFinalizer instance. 

After poking around Hadoop's ShutdownHookManager, I don't see a bug.

> ShutdownHook should check whether shutdown hook is tracked by 
> ShutdownHookManager
> -
>
> Key: HBASE-19954
> URL: https://issues.apache.org/jira/browse/HBASE-19954
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 19954.v1.txt
>
>
> Currently ShutdownHook#suppressHdfsShutdownHook() does the following:
> {code}
>   synchronized (fsShutdownHooks) {
>     boolean isFSCacheDisabled = fs.getConf().getBoolean("fs.hdfs.impl.disable.cache", false);
>     if (!isFSCacheDisabled && !fsShutdownHooks.containsKey(hdfsClientFinalizer)
>         && !ShutdownHookManager.deleteShutdownHook(hdfsClientFinalizer)) {
> {code}
> There is no check that ShutdownHookManager still tracks the shutdown hook, 
> leading to a potential RuntimeException (as can be observed in the hadoop3 
> Jenkins job).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19400:
-
Attachment: HBASE-19400.master.005.patch

> Add missing security hooks for MasterService RPCs
> -
>
> Key: HBASE-19400
> URL: https://issues.apache.org/jira/browse/HBASE-19400
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-beta-1
>Reporter: Balazs Meszaros
>Assignee: Appy
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19400.master.001.patch, 
> HBASE-19400.master.002.patch, HBASE-19400.master.003.patch, 
> HBASE-19400.master.004.patch, HBASE-19400.master.004.patch, 
> HBASE-19400.master.005.patch
>
>
> The following RPC methods do not call the observers and are therefore not 
> guarded by AccessController:
> - normalize
> - setNormalizerRunning
> - runCatalogScan
> - enableCatalogJanitor
> - runCleanerChore
> - setCleanerChoreRunning
> - execMasterService
> - execProcedure
> - execProcedureWithRet



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-15806) An endpoint-based export tool

2018-02-17 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-15806:
-
Release Note: 
org.apache.hadoop.hbase.coprocessor.Export
Instructs HBase to dump the contents of a table to HDFS in a sequence file
+ replaces MR with an endpoint (see org.apache.hadoop.hbase.mapreduce.Export)
+ no large data to be transferred between the hbase server and the client
+ same command line as org.apache.hadoop.hbase.mapreduce.Export
- user needs to alter the table to deploy the Export endpoint
- user needs to adjust the endpoint timeout for dumping large data
- user needs to get the EXECUTE permission

  was:
org.apache.hadoop.hbase.coprocessor.ExportEndpoint
Instructs HBase to dump the contents of a table to HDFS in a sequence file
+ replaces MR with an endpoint (see org.apache.hadoop.hbase.mapreduce.Export)
+ no large data to be transferred between the hbase server and the client
+ same command line as org.apache.hadoop.hbase.mapreduce.Export
- user needs to alter the table to deploy ExportEndpoint
- user needs to adjust the endpoint timeout for dumping large data
- user needs to get the EXECUTE permission


> An endpoint-based export tool
> -
>
> Key: HBASE-15806
> URL: https://issues.apache.org/jira/browse/HBASE-15806
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors, tooling
>Affects Versions: 2.0.0
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: Experiment.png, HBASE-15806-v1.patch, 
> HBASE-15806-v2.patch, HBASE-15806-v3.patch, HBASE-15806.patch, 
> HBASE-15806.v10.patch, HBASE-15806.v10.patch, HBASE-15806.v11.patch, 
> HBASE-15806.v4.patch, HBASE-15806.v5.patch, HBASE-15806.v6.patch, 
> HBASE-15806.v7.patch, HBASE-15806.v8.patch, HBASE-15806.v9.patch
>
>
> The time for exporting a table can be reduced if we use the endpoint technique 
> to write the HDFS files from the region server rather than from the hbase client.
> In my experiments, the elapsed time of the endpoint-based export can be less than 
> half that of the current export tool (with HDFS compression enabled).
> But the shortcoming is that we need to alter the table to deploy the endpoint.
> Any comments about this? Thanks.
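As a rough illustration of the "alter table" step mentioned above, here is a sketch using the HBase 2.0 client API; the table name is a placeholder and the endpoint class name is the one from the release note:
{code}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class AttachExportEndpoint {
  public static void main(String[] args) throws Exception {
    // Attach the Export endpoint to an existing table so that the region servers
    // can write the dump files themselves ("mytable" is a placeholder).
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("mytable");
      TableDescriptor withEndpoint = TableDescriptorBuilder
          .newBuilder(admin.getDescriptor(table))
          .setCoprocessor("org.apache.hadoop.hbase.coprocessor.Export")
          .build();
      admin.modifyTable(withEndpoint);
    }
  }
}
{code}
Per the release note, the dump itself then uses the same command line as org.apache.hadoop.hbase.mapreduce.Export.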



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey

2018-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368395#comment-16368395
 ] 

Hudson commented on HBASE-20015:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4603 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4603/])
HBASE-20015 TestMergeTableRegionsProcedure and (stack: rev 
f3ff55a2b4bb7a8b4980fdbb5b1f7a8d033631f3)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java


> TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
> -
>
> Key: HBASE-20015
> URL: https://issues.apache.org/jira/browse/HBASE-20015
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20015.branch-2.001.patch
>
>
> MergeRegionProcedure seems incomplete. The ProcedureExecutor framework can 
> run in a test mode such that it kills the Procedure before it can persist 
> state, and it does this repeatedly to shake out areas where Procedures may not 
> be preserving all needed state at each procedural step. The kill will cause 
> the Procedure to 'fail'. It'll then run the rollback procedure. The 
> MergeRegionProcedure is not able to roll back the last few steps of Merge; 
> it throws an UnsupportedOperationException (the hope was that the missing steps 
> would be filled in ... but they are hard to complete in that they themselves are 
> stepped).
> Well, it turns out that Split has a mechanism where it will not fail the 
> Procedure if it gets to a stage from which it cannot roll back. Instead, it will 
> just retry and keep retrying until it eventually succeeds. Merge has this 
> facility half-implemented. Merge tests are therefore flakey. They do stuff 
> like this:
> {code}
> 2018-02-17 04:04:02,999 WARN  [PEWorker-1] 
> assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step 
> MERGE_TABLE_REGIONS_UPDATE_META for merging the regions 
> [485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table 
> testRollbackAndDoubleExecution
> java.lang.UnsupportedOperationException: pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
> forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>   at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>   at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> 2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
> forcibly=false
> java.lang.UnsupportedOperationException: pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 

[jira] [Commented] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368394#comment-16368394
 ] 

Andrew Purtell commented on HBASE-20003:


It would also be interesting to hear, for the case where all replicas 
go down and only a subset are recovered, how we know the recovered subset 
actually has the latest data. :) This will depend on the guarantees made by the 
means of synchronous replication. A typical majority/consensus approach would 
mean that out of 3 replicas, 1 might be stale, and if you only bring that 1 back 
into service, you'd lose data. If 2 of the replicas are brought back, we could 
say the most recent view wins? Need to think about that.

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower MTTR: It becomes easier/faster to switch over to the replicas on a 
> primary region failure as there is no WAL replay involved. Building the 
> memstore map data is also much faster than reading and replaying the WAL.
> - Possibility of bigger memstores: These pmems are designed to have more 
> memory than DRAM, so they would also enable us to have bigger memstores, 
> which leads to less flush/compaction IO. 
> - Removes the dependency on HDFS in the write path.
> An initial PoC has been designed and developed. Testing is underway and we will 
> publish the PoC results along with the design doc soon. The PoC doc will 
> discuss the design decisions, the libraries considered for working with these 
> pmem devices, the pros and cons of those libraries, and the performance results.
> Note: Next-gen memory technologies using 3DXPoint provide persistent memory. 
> Such memory DIMMs are soon to appear in the market. The PoC is done 
> around Intel's ApachePass (AEP).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368381#comment-16368381
 ] 

Andrew Purtell edited comment on HBASE-20003 at 2/17/18 10:26 PM:
--

{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we have lost the 
ability to resume service for the affected region(s) on the other available 
hosts, because nobody beyond those replicas has any of the memstore data that 
was not flushed prior to loss of the hosts. On a 1000+ node cluster, if you 
happen to lose 3 of the servers at once (which is more likely than you'd like, 
but the reality of scale ops) there is a good chance some regions have become 
completely unavailable, except perhaps a "timeline consistent" view of some 
past point in time, until one of those servers can be brought back online. That 
is different from today, where every single server in the cluster has access to 
region data and WAL data in HDFS and can host the affected region(s).

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look. I suppose a case could be made that in some ways this matches 
the availability model of HDFS's default block placement policy, although HDFS 
does active mitigation of replica loss via re-replication and blocks are more 
dispersed than region replicas so an analysis is nontrivial.


was (Author: apurtell):
{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we have lost the 
ability to resume service for the affected region(s) on the other available 
hosts, because nobody beyond those replicas has any of the memstore data that 
was not flushed prior to loss of the hosts. On a 1000+ node cluster, if you 
happen to lose 3 of the servers at once (which is more likely than you'd like, 
but the reality of scale ops) there is a good chance some regions have become 
completely unavailable until one of those servers can be brought back online. 
That is different from today, where every single server in the cluster has 
access to region data and WAL data in HDFS and can host the affected region(s).

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look. I suppose a case could be made that in some ways this matches 
the availability model of HDFS's default block placement policy, although HDFS 
does active mitigation of replica loss via re-replication and blocks are more 
dispersed than region replicas so an analysis is nontrivial.

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower 

[jira] [Comment Edited] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368381#comment-16368381
 ] 

Andrew Purtell edited comment on HBASE-20003 at 2/17/18 10:25 PM:
--

{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we have lost the 
ability to resume service for the affected region(s) on the other available 
hosts, because nobody beyond those replicas has any of the memstore data that 
was not flushed prior to loss of the hosts. On a 1000+ node cluster, if you 
happen to lose 3 of the servers at once (which is more likely than you'd like, 
but the reality of scale ops) there is a good chance some regions have become 
completely unavailable until one of those servers can be brought back online. 
That is different from today, where every single server in the cluster has 
access to region data and WAL data in HDFS and can host the affected region(s).

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look. I suppose a case could be made that in some ways this matches 
the availability model of HDFS's default block placement policy, although HDFS 
does active mitigation of replica loss via re-replication and blocks are more 
dispersed than region replicas so an analysis is nontrivial.


was (Author: apurtell):
{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we have lost the 
ability to resume service for the affected region(s) on the other available 
hosts, because nobody beyond those replicas has any of the data. On a 1000+ 
node cluster, if you happen to lose 3 of the servers at once (which is more 
likely than you'd like, but the reality of scale ops) there is a good chance 
some regions have become completely unavailable until one of those servers can 
be brought back online. That is different from today, where every single server 
in the cluster has access to region data and WAL data in HDFS and can host the 
affected region(s).

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look. I suppose a case could be made that in some ways this matches 
the availability model of HDFS's default block placement policy, although HDFS 
does active mitigation of replica loss via re-replication and blocks are more 
dispersed than region replicas so an analysis is nontrivial.

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower MTTR: It becomes easier/faster to switch over to the replicas on a 
> primary region failure as there is no WAL replay involved. 

[jira] [Commented] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368388#comment-16368388
 ] 

Hadoop QA commented on HBASE-19400:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
 3s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
7s{color} | {color:red} hbase-server: The patch generated 5 new + 284 unchanged 
- 6 fixed = 289 total (was 290) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 56s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestPriorityRpc |
|   | hadoop.hbase.security.access.TestAccessController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19400 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911031/HBASE-19400.master.004.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux fb51227900b5 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f3ff55a2b4 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11558/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11558/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11558/testReport/ |
| Max. 

[jira] [Comment Edited] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368381#comment-16368381
 ] 

Andrew Purtell edited comment on HBASE-20003 at 2/17/18 10:09 PM:
--

{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we have lost the 
ability to resume service for the affected region(s) on the other available 
hosts, because nobody beyond those replicas has any of the data. On a 1000+ 
node cluster, if you happen to lose 3 of the servers at once (which is more 
likely than you'd like, but the reality of scale ops) there is a good chance 
some regions have become completely unavailable until one of those servers can 
be brought back online. That is different from today, where every single server 
in the cluster has access to region data and WAL data in HDFS and can host the 
affected region(s).

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look. I suppose a case could be made that in some ways this matches 
the availability model of HDFS's default block placement policy, although HDFS 
does active mitigation of replica loss via re-replication and blocks are more 
dispersed than region replicas so an analysis is nontrivial.


was (Author: apurtell):
{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we lose the 
ability to resume service for the affected region(s) on the other available 
hosts.

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look.

 

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower MTTR: It becomes easier/faster to switch over to the replicas on a 
> primary region failure as there is no WAL replay involved. Building the 
> memstore map data is also much faster than reading and replaying the WAL.
> - Possibility of bigger memstores: These pmems are designed to have more 
> memory than DRAM, so they would also enable us to have bigger memstores, 
> which leads to less flush/compaction IO. 
> - Removes the dependency on HDFS in the write path.
> An initial PoC has been designed and developed. Testing is underway and we will 
> publish the PoC results along with the design doc soon. The PoC doc will 
> discuss the design decisions, the libraries considered for working with these 
> pmem devices, the pros and cons of those libraries, and the performance results.
> Note: Next-gen memory technologies using 3DXPoint provide persistent memory. 
> Such memory DIMMs are soon to appear in the market. The PoC is done 

[jira] [Commented] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368381#comment-16368381
 ] 

Andrew Purtell commented on HBASE-20003:


{quote}The existing region replica feature could be used here and ensure the 
data written to memstores are synchronously replicated to the replicas and 
ensure strong consistency of the data. (pipeline model)
{quote}
Let me be more precise about what I meant by "pmem doesn't obviate the need for 
a WAL unless it is replicated itself among multiple servers." I mean the 
availability of the data in pmem needs to match today's data availability with 
the WAL or there is an overall availability loss.

Synchronous replication of edits from one region replica to another is a WAL by 
another name, but instead of the edit stream being available to the entire 
cluster in a replayable form it is limited to the three servers participating 
in the region replication. When all replicas go down at once, we lose the 
ability to resume service for the affected region(s) on the other available 
hosts.

Perhaps the PoC doc quantifies the availability loss? I'd be interested in 
taking a look.

 

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower MTTR: It becomes easier/faster to switch over to the replicas on a 
> primary region failure as there is no WAL replay involved. Building the 
> memstore map data is also much faster than reading and replaying the WAL.
> - Possibility of bigger memstores: These pmems are designed to have more 
> memory than DRAM, so they would also enable us to have bigger memstores, 
> which leads to less flush/compaction IO. 
> - Removes the dependency on HDFS in the write path.
> An initial PoC has been designed and developed. Testing is underway and we will 
> publish the PoC results along with the design doc soon. The PoC doc will 
> discuss the design decisions, the libraries considered for working with these 
> pmem devices, the pros and cons of those libraries, and the performance results.
> Note: Next-gen memory technologies using 3DXPoint provide persistent memory. 
> Such memory DIMMs are soon to appear in the market. The PoC is done 
> around Intel's ApachePass (AEP).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20003) WALLess HBase on Persistent Memory

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368378#comment-16368378
 ] 

Andrew Purtell commented on HBASE-20003:


So Intel is getting to this again. :)

 

My feedback remains the same: pmem is pretty useless when the host goes down, 
and so what's the point. We need the WAL or whatever replaces it to be 
distributed, not just durable. pmem doesn't obviate the need for a WAL unless 
it is replicated itself among multiple servers.

> WALLess HBase on Persistent Memory
> --
>
> Key: HBASE-20003
> URL: https://issues.apache.org/jira/browse/HBASE-20003
> Project: HBase
>  Issue Type: New Feature
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
>
> This JIRA aims to make use of persistent memory (pmem) technologies in HBase. 
> One such usage is to make the Memstore reside on pmem. Making a persistent 
> memstore would remove the need for a WAL and pave the way for a WALLess HBase. 
> The existing region replica feature could be used here to ensure the data 
> written to memstores is synchronously replicated to the replicas, ensuring 
> strong consistency of the data (pipeline model).
> Advantages:
> - Data Availability: Since the data across replicas is consistent 
> (synchronously written), our data is always 100% available.
> - Lower MTTR: It becomes easier/faster to switch over to the replicas on a 
> primary region failure as there is no WAL replay involved. Building the 
> memstore map data is also much faster than reading and replaying the WAL.
> - Possibility of bigger memstores: These pmems are designed to have more 
> memory than DRAM, so they would also enable us to have bigger memstores, 
> which leads to less flush/compaction IO. 
> - Removes the dependency on HDFS in the write path.
> An initial PoC has been designed and developed. Testing is underway and we will 
> publish the PoC results along with the design doc soon. The PoC doc will 
> discuss the design decisions, the libraries considered for working with these 
> pmem devices, the pros and cons of those libraries, and the performance results.
> Note: Next-gen memory technologies using 3DXPoint provide persistent memory. 
> Such memory DIMMs are soon to appear in the market. The PoC is done 
> around Intel's ApachePass (AEP).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20004) Client is not able to execute REST queries through browser in a secure cluster

2018-02-17 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20004:
---
Priority: Minor  (was: Critical)

> Client is not able to execute REST queries through browser in a secure cluster
> --
>
> Key: HBASE-20004
> URL: https://issues.apache.org/jira/browse/HBASE-20004
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Minor
> Attachments: HBASE-20004.branch-1.patch, HBASE-20004.patch
>
>
> The Firefox browser is not able to negotiate REST queries with the server in 
> secure mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20004) Client is not able to execute REST queries through browser in a secure cluster

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368377#comment-16368377
 ] 

Andrew Purtell commented on HBASE-20004:


Downgrading this from Critical to Minor.

> Client is not able to execute REST queries through browser in a secure cluster
> --
>
> Key: HBASE-20004
> URL: https://issues.apache.org/jira/browse/HBASE-20004
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Minor
> Attachments: HBASE-20004.branch-1.patch, HBASE-20004.patch
>
>
> The Firefox browser is not able to negotiate REST queries with the server in 
> secure mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20004) Client is not able to execute REST queries through browser in a secure cluster

2018-02-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368376#comment-16368376
 ] 

Andrew Purtell commented on HBASE-20004:


Are we committing to supporting browser-based query of the REST API? That seems 
like something clearly out of scope for us. Typically, when one fronts a 
headless service like a datastore with a RESTful API, it is designed for 
use-case-specific clients, or perhaps a generic client like curl, but certainly 
not a web browser. It can be accidentally convenient for developers if a web 
browser works against the API, perhaps, but that should be low priority for us, 
especially lower priority than a fix for a security problem.

> Client is not able to execute REST queries through browser in a secure cluster
> --
>
> Key: HBASE-20004
> URL: https://issues.apache.org/jira/browse/HBASE-20004
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Critical
> Attachments: HBASE-20004.branch-1.patch, HBASE-20004.patch
>
>
> The Firefox browser is not able to negotiate REST queries with the server in 
> secure mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19400) Add missing security hooks for MasterService RPCs

2018-02-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19400:
--
Attachment: HBASE-19400.master.004.patch

> Add missing security hooks for MasterService RPCs
> -
>
> Key: HBASE-19400
> URL: https://issues.apache.org/jira/browse/HBASE-19400
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-beta-1
>Reporter: Balazs Meszaros
>Assignee: Appy
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19400.master.001.patch, 
> HBASE-19400.master.002.patch, HBASE-19400.master.003.patch, 
> HBASE-19400.master.004.patch, HBASE-19400.master.004.patch
>
>
> The following RPC methods do not call the observers, therefore they are not 
> guarded by AccessController:
> - normalize
> - setNormalizerRunning
> - runCatalogScan
> - enableCatalogJanitor
> - runCleanerChore
> - setCleanerChoreRunning
> - execMasterService
> - execProcedure
> - execProcedureWithRet



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey

2018-02-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368335#comment-16368335
 ] 

stack commented on HBASE-20015:
---

Pushed to master and branch-2 after fixing checkstyle. Leaving open to see if 
this makes a difference in our test runs.
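
The description below spells out the gap: Merge can reach states from which it 
cannot roll back, while Split declares a rollback boundary and lets the executor 
keep retrying forward past it. A rough, self-contained sketch of that pattern 
follows; the state names and helper are illustrative, not the actual patch, 
which would wire this into the procedure's rollback-support hook.

{code}
// Illustration of the "retry forward instead of rolling back" pattern.
// State names are hypothetical stand-ins for the merge procedure's states.
public class RetryOverRollbackSketch {

  enum MergeState {
    PREPARE, PRE_OPERATION, CLOSE_REGIONS, CREATE_MERGED_REGION,
    UPDATE_META, OPEN_MERGED_REGION, POST_OPERATION
  }

  // Only the early states are safe to roll back; once META has been updated,
  // the only way out is forward.
  static boolean isRollbackSupported(MergeState state) {
    switch (state) {
      case UPDATE_META:
      case OPEN_MERGED_REGION:
      case POST_OPERATION:
        return false;
      default:
        return true;
    }
  }

  public static void main(String[] args) {
    for (MergeState state : MergeState.values()) {
      String onFailure = isRollbackSupported(state)
          ? "fail the procedure and roll back"
          : "log, keep the procedure alive, and retry the step";
      System.out.println(state + " -> " + onFailure);
    }
  }
}
{code}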

> TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
> -
>
> Key: HBASE-20015
> URL: https://issues.apache.org/jira/browse/HBASE-20015
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20015.branch-2.001.patch
>
>
> MergeTableRegionsProcedure seems incomplete. The ProcedureExecutor framework 
> can run in a test mode such that it kills the Procedure before it can persist 
> state, and it does this repeatedly to shake out areas where Procedures may not 
> be preserving all needed state at each procedural step. The kill causes the 
> Procedure to 'fail'. It then runs the rollback procedure. The 
> MergeTableRegionsProcedure is not able to roll back the last few steps of 
> Merge. It throws an UnsupportedOperationException (the hope was that the 
> missing steps would be filled in ... but they are hard to complete in that 
> they themselves are stepped).
> It turns out that Split has a mechanism where it will not fail the Procedure 
> if it gets to a stage from which it cannot roll back. Instead, it will just 
> retry and keep retrying until it eventually succeeds. Merge has this facility 
> only half-implemented. Merge tests are therefore flakey. They do stuff like 
> this:
> {code}
> 2018-02-17 04:04:02,999 WARN  [PEWorker-1] 
> assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step 
> MERGE_TABLE_REGIONS_UPDATE_META for merging the regions 
> [485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table 
> testRollbackAndDoubleExecution
> java.lang.UnsupportedOperationException: pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
> forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>   at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>   at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> 2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
> forcibly=false
> java.lang.UnsupportedOperationException: pid=44, 
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
>  abort requested; MergeTableRegionsProcedure 
> table=testRollbackAndDoubleExecution, 
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
> forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>   at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>   at 
> 

[jira] [Commented] (HBASE-19950) Introduce a ColumnValueFilter

2018-02-17 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368272#comment-16368272
 ] 

Chia-Ping Tsai commented on HBASE-19950:


ping [~anoop.hbase]. Any comment? Does [~reidchan]'s reply answer your 
question?

> Introduce a ColumnValueFilter
> -
>
> Key: HBASE-19950
> URL: https://issues.apache.org/jira/browse/HBASE-19950
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-19950.master.001.patch, 
> HBASE-19950.master.002.patch, HBASE-19950.master.003.patch, 
> HBASE-19950.master.004.patch, HBASE-19950.master.005.patch, 
> HBASE-19950.master.006.patch, HBASE-19950.master.007.patch, 
> HBASE-19950.master.008.patch, HBASE-19950.master.009.patch
>
>
> Different from {{SingleColumnValueFilter}}, which returns the entire row when 
> the specified condition is matched, this new filter will return only the 
> matched cell. There are already some discussions in HBASE-19824.
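
For illustration, a minimal usage sketch of the two filters side by side 
(column names are made up, and the ColumnValueFilter constructor follows the 
shape in the patch under review, so it may still change):

{code}
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnValueFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnValueFilterSketch {
  public static void main(String[] args) {
    byte[] family = Bytes.toBytes("cf");
    byte[] qualifier = Bytes.toBytes("q");
    byte[] value = Bytes.toBytes("v");

    // Existing behaviour: when cf:q matches, the entire row is returned.
    Scan wholeRow = new Scan().setFilter(
        new SingleColumnValueFilter(family, qualifier, CompareOperator.EQUAL, value));

    // Proposed behaviour: when cf:q matches, only that cell is returned.
    Scan matchedCellOnly = new Scan().setFilter(
        new ColumnValueFilter(family, qualifier, CompareOperator.EQUAL, value));

    System.out.println(wholeRow);
    System.out.println(matchedCellOnly);
  }
}
{code}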



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20013) TestZKPermissionWatcher is flakey

2018-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368164#comment-16368164
 ] 

Hudson commented on HBASE-20013:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4599 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4599/])
HBASE-20013 TestZKPermissionWatcher is flakey (stack: rev 
68d509bc1ff7a3bf69a596aed49f238b42ee0679)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> TestZKPermissionWatcher is flakey
> -
>
> Key: HBASE-20013
> URL: https://issues.apache.org/jira/browse/HBASE-20013
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20013.branch-2.001.patch
>
>
> The last two nightlies failed on this test in here on shutdown:
> {code}
> 2018-02-16 20:49:35,132 DEBUG [M:0;881c50037eea:35808] 
> master.MasterRpcServices(1153): Checking to see if procedure is done pid=7
> 2018-02-16 20:49:35,133 DEBUG [M:0;881c50037eea:35808] 
> client.RpcRetryingCallerImpl(132): Call exception, tries=7, retries=7, 
> started=8122 ms ago, cancelled=false, msg=null, details=, 
> exception=org.apache.hadoop.hbase.MasterNotRunningException
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2736)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getProcedureResult(MasterRpcServices.java:1155)
>   at 
> org.apache.hadoop.hbase.client.ShortCircuitMasterConnection.getProcedureResult(ShortCircuitMasterConnection.java:423)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture$2.rpcCall(HBaseAdmin.java:3490)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture$2.rpcCall(HBaseAdmin.java:3487)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:100)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3055)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3047)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.access$700(HBaseAdmin.java:224)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.getProcedureResult(HBaseAdmin.java:3486)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:3438)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:3394)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.get(HBaseAdmin.java:2123)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:612)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:586)
>   at 
> org.apache.hadoop.hbase.security.access.AccessController.createACLTable(AccessController.java:1130)
>   at 
> org.apache.hadoop.hbase.security.access.AccessController.postStartMaster(AccessController.java:1107)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$71.call(MasterCoprocessorHost.java:994)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$71.call(MasterCoprocessorHost.java:991)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost.postStartMaster(MasterCoprocessorHost.java:991)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:969)
>   at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> We get stuck retrying.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20014) TestAdmin1 Times out

2018-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368163#comment-16368163
 ] 

Hudson commented on HBASE-20014:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4599 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4599/])
HBASE-20014 TestAdmin1 Times out (stack: rev 
969895105c77aaa34f4678d1924e83ebda7edb0f)
* (edit) 
hbase-common/src/test/java/org/apache/hadoop/hbase/HBaseClassTestRule.java
* (edit) src/main/asciidoc/_chapters/developer.adoc


> TestAdmin1 Times out
> 
>
> Key: HBASE-20014
> URL: https://issues.apache.org/jira/browse/HBASE-20014
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20014.branch-2.001.patch
>
>
> TestAdmin1 has 28+ tests.  Nightly #336 shows this:
> {code}
> [ERROR] Tests run: 27, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
> 575.102 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAdmin1
> [ERROR] org.apache.hadoop.hbase.client.TestAdmin1  Time elapsed: 1.134 s  <<< 
> ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 600 
> seconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin1.testTableExist(TestAdmin1.java:899)
> [ERROR] testTableExist(org.apache.hadoop.hbase.client.TestAdmin1)  Time 
> elapsed: 1.134 s  <<< ERROR!
> java.io.InterruptedIOException: Interrupt while waiting on Operation: CREATE, 
> Table Name: default:testTableExist
>   at 
> org.apache.hadoop.hbase.client.TestAdmin1.testTableExist(TestAdmin1.java:899)
> [ERROR] org.apache.hadoop.hbase.client.TestAdmin1  Time elapsed: 0.022 s  <<< 
> ERROR!
> java.lang.Exception: Appears to be stuck in thread RS-EventLoopGroup-5-4
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 576.783 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAdmin1
> [ERROR] org.apache.hadoop.hbase.client.TestAdmin1  Time elapsed: 1.704 s  <<< 
> ERROR!
> java.lang.IllegalStateException: A mini-cluster is already running
>   at 
> org.apache.hadoop.hbase.client.TestAdmin1.setUpBeforeClass(TestAdmin1.java:97)
> {code}
> So we time out after running a test for only 1.3 seconds... but in total it 
> is > 600 seconds.
> Then we are on to interesting stuff, like the minicluster already running. 
> I've seen this in the past.
> I could refactor TestAdmin1... let me up the general timeout instead.
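
For context, the class-level timeout comes from HBaseClassTestRule, which 
derives it from the test's size category, so "upping the general timeout" means 
changing what the rule hands out rather than annotating individual tests. A 
sketch of how a test class wires up the rule (the rule and annotations are 
real; the surrounding class is illustrative):

{code}
import org.apache.hadoop.hbase.HBaseClassTestRule;
import org.apache.hadoop.hbase.testclassification.LargeTests;
import org.junit.ClassRule;
import org.junit.Test;
import org.junit.experimental.categories.Category;

@Category(LargeTests.class)
public class TestTimeoutExample {

  // Applies a whole-class timeout derived from the @Category size.
  @ClassRule
  public static final HBaseClassTestRule CLASS_RULE =
      HBaseClassTestRule.forClass(TestTimeoutExample.class);

  @Test
  public void testSomething() {
    // test body elided
  }
}
{code}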



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20015) TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey

2018-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368160#comment-16368160
 ] 

Hadoop QA commented on HBASE-20015:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
16s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 2s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
9s{color} | {color:red} hbase-server: The patch generated 1 new + 148 unchanged 
- 0 fixed = 149 total (was 148) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 8s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
14m 57s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}103m 
29s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-20015 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911002/HBASE-20015.branch-2.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7722d4c1beb3 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 8be0696320 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11557/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11557/testReport/ |
| Max. process+thread count | 5423 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output |