[jira] [Commented] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262117#comment-16262117
 ] 

Chia-Ping Tsai commented on HBASE-19092:


{quote}
The alternatives are taking us away from where we want to be – Tags integrated 
into Cell – or they require a bunch of retrofit for which we do not have time.
Tag-use by CPs is rare in the scheme of things. An ugly typecast is ok for now 
I'd say.
{quote}
Currently my only concern is that encouraging CP users to use 
{{ExtendedCell}} may hurt the philosophy we try to teach CP users - DON'T touch 
the -red button- internal methods which may corrupt your data.

Perhaps it is time to introduce {{RawCell}}. All new tag-related methods would 
be added to {{RawCell}}. If we are later ready to put the tag methods in 
{{Cell}}, we can just remove them from {{RawCell}}, as it is a subtype of 
{{Cell}} - no compatibility will be broken.
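For concreteness, a minimal sketch of what such an interface could look like - 
the method names here are illustrative assumptions, not the committed API:
{code}
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the proposal above; names are assumptions.
public interface RawCell extends Cell {
  // Tag accessors live here rather than on Cell, so plain Cell users never
  // see them. If the methods later graduate to Cell, removing them from
  // RawCell breaks no caller, since RawCell extends Cell.
  List<Tag> getTags();
  Optional<Tag> getTag(byte type);
}
{code}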


> Make Tag IA.LimitedPrivate and expose for CPs
> -
>
> Key: HBASE-19092
> URL: https://issues.apache.org/jira/browse/HBASE-19092
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19092-branch-2.patch, 
> HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, 
> HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, 
> HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch, 
> HBASE-19092_3.patch
>
>
> We need to make tags LimitedPrivate, as some use cases, such as the timeline 
> server, are trying to use tags. The same topic was discussed on dev@ and also 
> in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> Once we do this, all related Util methods and APIs should also move to 
> LimitedPrivate Util classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HBASE-19200) make hbase-client only depend on ZKAsyncRegistry and ZNodePaths

2017-11-21 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19200.
---
Resolution: Fixed

HBASE-19321 has been resolved by using the timed wait method, so re-resolving this.

> make hbase-client only depend on ZKAsyncRegistry and ZNodePaths
> ---
>
> Key: HBASE-19200
> URL: https://issues.apache.org/jira/browse/HBASE-19200
> Project: HBase
>  Issue Type: Task
>  Components: Client, Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19200-v1.patch, HBASE-19200-v2.patch, 
> HBASE-19200-v3.patch, HBASE-19200-v4.patch, HBASE-19200-v5.patch, 
> HBASE-19200.patch
>
>
> So that we can move most of the zookeeper related code out of hbase-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19321:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-1
   Status: Resolved  (was: Patch Available)

Pushed to master and branch-2.

Thanks [~wuguoquan] for the contribution.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster

2017-11-21 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19311:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-1
 Release Note: Introduce an AcidGuaranteesTestTool and expose it as a tool 
instead of TestAcidGuarantees. Now TestAcidGuarantees is just a UT.
   Status: Resolved  (was: Patch Available)

Pushed to master and branch-2. Thanks all for reviewing. Will open another 
issue for backporting.

> Promote TestAcidGuarantees to LargeTests and start mini cluster once to make 
> it faster
> --
>
> Key: HBASE-19311
> URL: https://issues.apache.org/jira/browse/HBASE-19311
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19311-v1.patch, HBASE-19311.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19324) Backport HBASE-19311 to branch-1.x

2017-11-21 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-19324:
-

 Summary: Backport HBASE-19311 to branch-1.x
 Key: HBASE-19324
 URL: https://issues.apache.org/jira/browse/HBASE-19324
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Duo Zhang






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262094#comment-16262094
 ] 

Hadoop QA commented on HBASE-19321:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
29s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 2s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
55m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
37s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19321 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898802/HBASE-19321.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux aaf85642733d 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3b2b22b5fa |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9961/testReport/ |
| modules | C: hbase-client U: hbase-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9961/console |
| Powered by | Apache Yetus 0.6.0   http://yetus.apache.org |


This message was automatically generated.



> ZKAsyncRegistry ctor would hang when zookeeper 

[jira] [Commented] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262083#comment-16262083
 ] 

Hadoop QA commented on HBASE-19092:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 18 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  8m 
 4s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
27s{color} | {color:red} hbase-common: The patch generated 16 new + 288 
unchanged - 17 fixed = 304 total (was 305) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
33s{color} | {color:red} hbase-client: The patch generated 1 new + 322 
unchanged - 4 fixed = 323 total (was 326) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
12s{color} | {color:red} hbase-server: The patch generated 7 new + 480 
unchanged - 36 fixed = 487 total (was 516) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} The patch hbase-mapreduce passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} hbase-thrift: The patch generated 0 new + 33 
unchanged - 2 fixed = 33 total (was 35) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 3s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
53m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m  0s{color} 
| {color:red} hbase-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
34s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 38s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 30s{color} 
| {color:red} hbase-mapreduce in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} 

[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-21 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262050#comment-16262050
 ] 

Guanghao Zhang commented on HBASE-19301:


Checked the code for AccessController. Even if we use createConnection, it 
still creates a short-circuited connection. The short-circuited connection 
bypasses the RPC layer, so the RPC context doesn't change. Thus it still uses 
the old RPC user to write the ACL table, and User.runAsLoginUser does not 
work. How about we provide two methods on RegionCoprocessorEnvironment - one 
to create a short-circuited connection and another to create a normal 
connection (see the sketch below)? [~anoop.hbase] Any ideas?
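For illustration, the two proposed methods might look roughly like this - the 
names and javadoc are assumptions from this discussion, not the committed API:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Connection;

// Hypothetical additions to RegionCoprocessorEnvironment; not the final API.
public interface RegionCoprocessorEnvironment extends CoprocessorEnvironment {
  /**
   * Short-circuited connection: skips the RPC layer for local regions, so the
   * RPC context (and hence the effective user) is unchanged.
   */
  Connection createConnection(Configuration conf) throws IOException;

  /**
   * Normal connection: goes through the RPC layer, so User.runAsLoginUser
   * takes effect (e.g. for AccessController writing the ACL table).
   */
  Connection createNormalConnection(Configuration conf) throws IOException;
}
{code}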


> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301.patch, HBASE-19301_V2.patch, 
> HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions for this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc., 
> but this returns a pre-created connection (per server) which uses the configs 
> from hbase-site.xml on that server.
> Phoenix needs to create a connection in a CP with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful, as that will affect all 
> connections created on that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262040#comment-16262040
 ] 

Duo Zhang commented on HBASE-19321:
---

{quote}
A temporary workaround is to call blockUntilConnected when constructing 
ZKAsyncRegistry but this is not a production level code. 
{quote}
I already said this is not production-level code, so why do you keep asking 
'what's the implication in production scenario' and 'Isn't this short for real 
scenario'? I cannot get your point.

Thanks.
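For reference, here is the timed-wait idea sketched with Curator's 
blockUntilConnected(timeout, unit) API - an illustration of the workaround 
quoted above, not the committed patch; the 2-second value echoes the interval 
questioned later in this thread:
{code}
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryNTimes;

public class TimedWaitSketch {
  static CuratorFramework connect(String quorum)
      throws IOException, InterruptedException {
    CuratorFramework client =
        CuratorFrameworkFactory.newClient(quorum, new RetryNTimes(3, 1000));
    client.start();
    // Bounded wait instead of hanging forever when the ZK cluster is down.
    if (!client.blockUntilConnected(2, TimeUnit.SECONDS)) {
      client.close();
      throw new IOException("Can not connect to zookeeper cluster");
    }
    return client;
  }
}
{code}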

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262034#comment-16262034
 ] 

stack commented on HBASE-19092:
---

I'd be +1 on 003. What do you think, [~anoop.hbase] (or [~chia7712] -- are you 
still on vacation?)

> Make Tag IA.LimitedPrivate and expose for CPs
> -
>
> Key: HBASE-19092
> URL: https://issues.apache.org/jira/browse/HBASE-19092
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19092-branch-2.patch, 
> HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, 
> HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, 
> HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch, 
> HBASE-19092_3.patch
>
>
> We need to make tags LimitedPrivate, as some use cases, such as the timeline 
> server, are trying to use tags. The same topic was discussed on dev@ and also 
> in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> Once we do this, all related Util methods and APIs should also move to 
> LimitedPrivate Util classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262029#comment-16262029
 ] 

Ted Yu commented on HBASE-19321:


Where is the Thread.interrupt() call?

How is the 2 second interval determined? Isn't this too short for a real 
scenario?

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Guoquan Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262028#comment-16262028
 ] 

Guoquan Wu commented on HBASE-19321:


ok, thanks

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16868) Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer

2017-11-21 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262026#comment-16262026
 ] 

Guanghao Zhang commented on HBASE-16868:


Thanks for reviewing. :-) Let me rebase the code and upload a new patch. 
Meanwhile, I will open a new issue for EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS. 
Thanks.

> Add a replicate_all flag to avoid misuse the namespaces and table-cfs config 
> of replication peer
> 
>
> Key: HBASE-16868
> URL: https://issues.apache.org/jira/browse/HBASE-16868
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-16868.master.001.patch, 
> HBASE-16868.master.002.patch, HBASE-16868.master.003.patch, 
> HBASE-16868.master.004.patch, HBASE-16868.master.005.patch, 
> HBASE-16868.master.006.patch, HBASE-16868.master.007.patch, 
> HBASE-16868.master.008.patch
>
>
> First, add a new peer by shell cmd.
> {code}
> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
> {code}
> If we don't set namespaces and table-cfs in the peer config, it means all 
> tables are replicated to the peer cluster.
> Then append a table to the peer config.
> {code}
> append_peer_tableCFs '1', {"table1" => []}
> {code}
> Now this peer will only replicate table1 to the peer cluster: it changes from 
> replicating all tables in the cluster to replicating only one table. This is 
> very easy to misuse in a production cluster, so we should avoid appending a 
> table to a peer which replicates all tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262024#comment-16262024
 ] 

Duo Zhang commented on HBASE-19321:
---

+1. Let's wait for the pre-commit result.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262018#comment-16262018
 ] 

stack commented on HBASE-19122:
---

.004 addresses checkstyle (except the import order, which I have trouble making 
sense of) and the failing unit test; I removed the unit test since it checks 
mem sizing when a flush is cancelled, and flushes can't be cancelled by CPs any 
more... 
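For context, the bypass pattern being shut down looks roughly like this 
(signature abbreviated from the earlier RegionObserver API for illustration; 
assumed imports: ObserverContext, RegionCoprocessorEnvironment, Store, 
InternalScanner):
{code}
// Illustrative only: a CP cancelling a flush by returning a null scanner.
@Override
public InternalScanner preFlush(ObserverContext<RegionCoprocessorEnvironment> c,
    Store store, InternalScanner scanner) throws IOException {
  return null; // returning null used to cancel the flush; no longer allowed
}
{code}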

> preCompact and preFlush can bypass by returning null scanner; shut it down
> --
>
> Key: HBASE-19122
> URL: https://issues.apache.org/jira/browse/HBASE-19122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, Scanners
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19122.master.001.patch, 
> HBASE-19122.master.002.patch, HBASE-19122.master.003.patch, 
> HBASE-19122.master.004.patch
>
>
> As noticed by [~anoop.hbase] during review of HBASE-18770, preCompact and 
> preFlush can bypass normal processing by returning null. They are not 
> bypassable by the ordained route. We should shut down this avenue.
> The preCompact behavior at least may be new, coming in with:
> {code}
> tree dbf13093842f85a713f023d7219caccf8f4eb05f
> parent a4dcf51415616772e462091ce93622f070ea8810
> author zhangduo  Sat Apr 9 16:18:08 2016 +0800
> committer zhangduo  Sun Apr 10 09:26:28 2016 +0800
> HBASE-15527 Refactor Compactor related classes
> {code}
> Would have to dig in more to figure for sure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-11-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19122:
--
Attachment: HBASE-19122.master.004.patch

> preCompact and preFlush can bypass by returning null scanner; shut it down
> --
>
> Key: HBASE-19122
> URL: https://issues.apache.org/jira/browse/HBASE-19122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, Scanners
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19122.master.001.patch, 
> HBASE-19122.master.002.patch, HBASE-19122.master.003.patch, 
> HBASE-19122.master.004.patch
>
>
> As noticed by [~anoop.hbase] during review of HBASE-18770, preCompact and 
> preFlush can bypass normal processing by returning null. They are not 
> bypassable by the ordained route. We should shut down this avenue.
> The preCompact behavior at least may be new, coming in with:
> {code}
> tree dbf13093842f85a713f023d7219caccf8f4eb05f
> parent a4dcf51415616772e462091ce93622f070ea8810
> author zhangduo  Sat Apr 9 16:18:08 2016 +0800
> committer zhangduo  Sun Apr 10 09:26:28 2016 +0800
> HBASE-15527 Refactor Compactor related classes
> {code}
> Would have to dig in more to figure for sure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Guoquan Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guoquan Wu updated HBASE-19321:
---
Attachment: HBASE-19321.master.001.patch

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Guoquan Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guoquan Wu updated HBASE-19321:
---
Status: Patch Available  (was: Open)

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guoquan Wu
> Attachments: HBASE-19321.master.001.patch
>
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-11-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19122:
--
Attachment: HBASE-19122.master.003.patch

> preCompact and preFlush can bypass by returning null scanner; shut it down
> --
>
> Key: HBASE-19122
> URL: https://issues.apache.org/jira/browse/HBASE-19122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, Scanners
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19122.master.001.patch, 
> HBASE-19122.master.002.patch, HBASE-19122.master.003.patch
>
>
> As noticed by [~anoop.hbase] during review of HBASE-18770, preCompact and 
> preFlush can bypass normal processing by returning null. They are not 
> bypassable by the ordained route. We should shut down this avenue.
> The preCompact behavior at least may be new, coming in with:
> {code}
> tree dbf13093842f85a713f023d7219caccf8f4eb05f
> parent a4dcf51415616772e462091ce93622f070ea8810
> author zhangduo  Sat Apr 9 16:18:08 2016 +0800
> committer zhangduo  Sun Apr 10 09:26:28 2016 +0800
> HBASE-15527 Refactor Compactor related classes
> {code}
> Would have to dig in more to figure for sure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-11-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19122:
--
Attachment: HBASE-19122.master.002.patch

> preCompact and preFlush can bypass by returning null scanner; shut it down
> --
>
> Key: HBASE-19122
> URL: https://issues.apache.org/jira/browse/HBASE-19122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, Scanners
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19122.master.001.patch, 
> HBASE-19122.master.002.patch, HBASE-19122.master.003.patch
>
>
> As noticed by [~anoop.hbase] during review of HBASE-18770, preCompact and 
> preFlush can bypass normal processing by returning null. They are not 
> bypassable by the ordained route. We should shut down this avenue.
> The preCompact behavior at least may be new, coming in with:
> {code}
> tree dbf13093842f85a713f023d7219caccf8f4eb05f
> parent a4dcf51415616772e462091ce93622f070ea8810
> author zhangduo  Sat Apr 9 16:18:08 2016 +0800
> committer zhangduo  Sun Apr 10 09:26:28 2016 +0800
> HBASE-15527 Refactor Compactor related classes
> {code}
> Would have to dig in more to figure for sure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262008#comment-16262008
 ] 

Duo Zhang commented on HBASE-19321:
---

Assigned to you [~wuguoquan]. Please upload the patch. Thanks.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: wuguoquan
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16868) Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262011#comment-16262011
 ] 

stack commented on HBASE-16868:
---

Thanks for chiming in [~ashish singhi]. Good to commit I'd say, [~zghaobac] 
(with the suggested change, sir!). Your release note is very nice. Do you want 
to add the 'why' you gave me above? I think I get it now (smile).

> Add a replicate_all flag to avoid misuse the namespaces and table-cfs config 
> of replication peer
> 
>
> Key: HBASE-16868
> URL: https://issues.apache.org/jira/browse/HBASE-16868
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-16868.master.001.patch, 
> HBASE-16868.master.002.patch, HBASE-16868.master.003.patch, 
> HBASE-16868.master.004.patch, HBASE-16868.master.005.patch, 
> HBASE-16868.master.006.patch, HBASE-16868.master.007.patch, 
> HBASE-16868.master.008.patch
>
>
> First, add a new peer by shell cmd.
> {code}
> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
> {code}
> If we don't set namespaces and table-cfs in the peer config, it means all 
> tables are replicated to the peer cluster.
> Then append a table to the peer config.
> {code}
> append_peer_tableCFs '1', {"table1" => []}
> {code}
> Now this peer will only replicate table1 to the peer cluster: it changes from 
> replicating all tables in the cluster to replicating only one table. This is 
> very easy to misuse in a production cluster, so we should avoid appending a 
> table to a peer which replicates all tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reassigned HBASE-19321:
-

Assignee: wuguoquan

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: wuguoquan
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262006#comment-16262006
 ] 

Anoop Sam John commented on HBASE-19320:


In 2.0 we have a direct BB pool into which we read requests. These DBBs are 
passed to NIO for reading the requests, so the NIO layer's own DBBs are not 
used then. JFYI.
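For readers following along, here is a minimal illustration (not HBase code) of 
the leak mechanism described in the article cited in the issue description 
below: writing a heap ByteBuffer to a channel makes the JDK copy it through a 
per-thread cached DirectByteBuffer sized to the largest write that thread has 
ever issued.
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class DirectBufferLeakDemo {
  // Each thread that calls this pins a ~64MB direct buffer in the JDK's
  // internal per-thread temporary-buffer cache.
  static void leakyWrite(SocketChannel channel) throws IOException {
    ByteBuffer heap = ByteBuffer.allocate(64 * 1024 * 1024); // on-heap buffer
    channel.write(heap); // JDK copies via a cached direct buffer of this size
  }
}
{code}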

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. After discussing internally with [~saint@gmail.com], we 
> thought we had some findings and wanted to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19323) Make netty engine default in hbase2

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262003#comment-16262003
 ] 

Hadoop QA commented on HBASE-19323:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
49s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
52m 21s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 15s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.ipc.TestNettyIPC |
|   | hadoop.hbase.ipc.TestBlockingIPC |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19323 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898789/0001-HBASE-19323-Make-netty-engine-default-in-hbase2.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7df4f80eebc2 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3b2b22b5fa |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9958/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9958/testReport/ |
| modules | C: 

[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread wuguoquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262001#comment-16262001
 ] 

wuguoquan commented on HBASE-19321:
---

I have fixed this issue, but I cannot assign it to myself. Could someone 
assign it to me?

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19322) System tables hbase:quota|hbase:acl will be in offline state when cluster startup first time with rsgroup feature

2017-11-21 Thread xinxin fan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261992#comment-16261992
 ] 

xinxin fan commented on HBASE-19322:


Thanks [~ted_yu]

hbase:quota and hbase:acl fail to be assigned to the default group only when 
the cluster starts up for the first time; it is fine after restarting the 
master.

Let me check whether the RSGroupStartupWorker can be started after hbase:quota 
and hbase:acl are created.

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261986#comment-16261986
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Thanks for the detailed review.
bq.We only do this when it is a region with replicas or do we do it always 
(would be good if former, we want assignment to run fast).
Yes, only for the replica regions.
bq.Please remind me what is the rule for replica assign? Just that they need to 
be on different servers? Nothing about ordering? (Hmm... seems like replica has 
to go out first). How does the patch to the balancer ensure this ordering?
Our initial requirement is that replicas should for sure be on different 
servers if there is a sufficient number of servers. Ordering is not important. 
Coming to the balancer, in our code base only the StochasticLB knows about 
replicas while actually balancing the cluster. We have tried the 
FavoredStochasticLB and it does not know about replicas; in fact it messes with 
the replica assignment itself (by corrupting the META entries for replicas). 
That is a big change which we need to do later. We confirmed this with [~enis] 
offline some time back.
Also, as said in a previous comment, the balancer does not come into the 
picture while doing round-robin assignment of a new table's regions. It just 
tries to do round robin based on the available servers.
bq.is there a hole where you can't see an ongoing Assignment? It has been 
queue'd and is being worked on but you have no means of querying where a region 
is being assigned
Yes, exactly. We don't know about it. This applies not only to replica regions; 
any newly created table's regions have the same issue. The assignment queue 
just uses the current regions in the queue to do the assignments. For those 
regions it is ok - we don't mind how they are distributed - but for replicas it 
is very important: when we have enough servers, if the replicas are not 
distributed then we don't serve the purpose of replicas. If there are fewer 
servers than replicas, it is ok to assign the replicas to the same RS (see the 
toy sketch below). In future we plan to avoid even this and fail the 
assignments instead.
bq.If round robin, are we not moving through the list of servers? Is the issue 
only when cluster is small - three servers or so?
I hope you mean before this patch, right? We are moving through the list of 
servers, but all the regions (including replicas) do not go to the assignment 
queue together. So whatever is getting processed from the assignment queue does 
round robin there, but the next set of regions that is processed does round 
robin again, and we can end up on the same RS.
bq.On patch, don't renumber protobuf fields.
Oh yes. I did that so that the steps are in order. Will change it and will try 
to remove some duplicate code.
bq.If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?
Yes. If it is a normal region we just go with the old code, and if a replica 
server is not available, the existing code has a way to assign all such regions 
that don't find a suitable server to some servers randomly. That is fine for us 
too, because then there are more replicas than available servers.
Actually there is more to do with the AM and replicas. We know the issues but 
are not yet ready with patches. For example, in a rolling-restart-like case the 
AM will keep moving the replicas to the RSs that are still running. So finally, 
when the last one is closed, all the regions will have moved there and META 
will only have that entry. Now when new RSs are started it will try to do 
retain assignment and replica regions may again get colocated, and only a 
balancer can solve that. We need to see how best we can handle these cases. But 
all that comes later (out of scope here).
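To make the intended placement rule concrete, here is a toy sketch - not the 
balancer code; the helper and its signature are invented for illustration. Each 
replica goes to a distinct server when there are enough servers; otherwise 
co-location is tolerated:
{code}
import java.util.List;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.hadoop.hbase.ServerName;

class ReplicaPlacementSketch {
  static ServerName pick(List<ServerName> servers,
      Set<ServerName> usedByOtherReplicas) {
    for (ServerName s : servers) {
      if (!usedByOtherReplicas.contains(s)) {
        return s; // enough servers: keep replicas apart
      }
    }
    // fewer servers than replicas: co-locating on the same RS is acceptable
    return servers.get(ThreadLocalRandom.current().nextInt(servers.size()));
  }
}
{code}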


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the 
> default LB, the Stochastic load balancer, assigns replica regions to the same 
> RS. This happens when we have 3 RSs checked in and a table with 3 replicas. 
> When an RS goes down, the replicas being assigned to the same RS is 
> acceptable, but when we have enough RSs to assign, this behaviour is 
> undesirable and does not serve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-19092:
---
Attachment: HBASE-19092_3.patch

Updated patch for master; it addresses most of the review comments. 
We have APIs in ExtendedCell to read the Tags. 
The Tag.asList(Cell) that I added by mistake in the last patch has been removed.
Now if Tags try to setSeqId we will throw an Exception.
I will rebase the patch to branch-2 once this is committed, since it is 
difficult to port the patch directly. 

> Make Tag IA.LimitedPrivate and expose for CPs
> -
>
> Key: HBASE-19092
> URL: https://issues.apache.org/jira/browse/HBASE-19092
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19092-branch-2.patch, 
> HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, 
> HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, 
> HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch, 
> HBASE-19092_3.patch
>
>
> We need to make tags LimitedPrivate, as some use cases, such as the timeline 
> server, are trying to use tags. The same topic was discussed on dev@ and also 
> in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> Once we do this, all related Util methods and APIs should also move to 
> LimitedPrivate Util classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-19092:
---
Status: Patch Available  (was: Open)

> Make Tag IA.LimitedPrivate and expose for CPs
> -
>
> Key: HBASE-19092
> URL: https://issues.apache.org/jira/browse/HBASE-19092
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19092-branch-2.patch, 
> HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, 
> HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, 
> HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch, 
> HBASE-19092_3.patch
>
>
> We need to make tags LimitedPrivate, as some use cases, such as the timeline 
> server, are trying to use tags. The same topic was discussed on dev@ and also 
> in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> Once we do this, all related Util methods and APIs should also move to 
> LimitedPrivate Util classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19310) Verify IntegrationTests don't rely on Rules outside of JUnit context

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261977#comment-16261977
 ] 

Ted Yu commented on HBASE-19310:


+1 on 002

> Verify IntegrationTests don't rely on Rules outside of JUnit context
> 
>
> Key: HBASE-19310
> URL: https://issues.apache.org/jira/browse/HBASE-19310
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Reporter: Romil Choksi
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19310.001.branch-2.patch, 
> HBASE-19310.002.branch-2.patch
>
>
> {noformat}
> 2017-11-16 00:43:41,204 INFO  [main] mapreduce.IntegrationTestImportTsv: 
> Running test testGenerateAndLoad.
> Exception in thread "main" java.lang.NullPointerException
>   at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:461)
>   at 
> org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.testGenerateAndLoad(IntegrationTestImportTsv.java:189)
>   at 
> org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.run(IntegrationTestImportTsv.java:229)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.main(IntegrationTestImportTsv.java:239)
> {noformat}
> (Potential line-number skew)
> {code}
>   @Test
>   public void testGenerateAndLoad() throws Exception {
> LOG.info("Running test testGenerateAndLoad.");
> final TableName table = TableName.valueOf(name.getMethodName());
> {code}
> The JUnit framework sets the test method name inside of the JUnit {{Rule}}. 
> When we invoke the test directly (a la {{hbase 
> org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv}}), this 
> {{getMethodName()}} returns {{null}} and we get the above stacktrace.
> We should make a pass over the ITs with main methods and {{Rule}}s to make 
> sure we don't have this lurking. Another alternative is to just remove the 
> main methods and force use of {{IntegrationTestsDriver}} instead.
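One possible guard, sketched under the assumption that the fix keeps the main 
methods - the fallback name here is hypothetical, and the fragment assumes the 
usual test-class imports (org.junit.Rule, org.junit.rules.TestName, 
org.apache.hadoop.hbase.TableName):
{code}
@Rule
public TestName name = new TestName();

// Fall back to a fixed table name when the test is driven from main() and
// the JUnit TestName rule never fired.
private TableName tableNameForTest() {
  String method = name.getMethodName();
  return TableName.valueOf(method != null ? method : "testGenerateAndLoad");
}
{code}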



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16868) Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer

2017-11-21 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261976#comment-16261976
 ] 

Ashish Singhi commented on HBASE-16868:
---

+1 on the patch.
Nit (can be addressed on commit):
{code}
If the replicate_all flag is false, then no user tables will be replicated to
the peer cluster. But you can set NAMESPACES or TABLECFS to include some
tables which will be replicated.
{code}
It would be good if we explained to the user here how to set NAMESPACES or 
TABLECFS. We can just mention which command to use to do it.
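For instance, the help text could point at something like this (a sketch; the 
exact shell command names on this branch are assumed):

{code}
# When replicate_all is false, include what should be replicated:
set_peer_namespaces '1', ["ns1", "ns2"]
set_peer_tableCFs '1', { "ns2:table2" => ["cf1"] }
{code}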

> Add a replicate_all flag to avoid misuse the namespaces and table-cfs config 
> of replication peer
> 
>
> Key: HBASE-16868
> URL: https://issues.apache.org/jira/browse/HBASE-16868
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-16868.master.001.patch, 
> HBASE-16868.master.002.patch, HBASE-16868.master.003.patch, 
> HBASE-16868.master.004.patch, HBASE-16868.master.005.patch, 
> HBASE-16868.master.006.patch, HBASE-16868.master.007.patch, 
> HBASE-16868.master.008.patch
>
>
> First, add a new peer by shell cmd.
> {code}
> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
> {code}
> If we don't set namespaces and table cfs in the peer config, it means all 
> tables are replicated to the peer cluster.
> Then append a table to the peer config.
> {code}
> append_peer_tableCFs '1', {"table1" => []}
> {code}
> Then this peer will only replicate table1 to the peer cluster. It changes 
> from replicating all tables in the cluster to replicating only one table. It 
> is very easy to misuse this in a production cluster. So we should avoid 
> appending a table to a peer which replicates all tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261972#comment-16261972
 ] 

ramkrishna.s.vasudevan commented on HBASE-19092:


bq. An ugly typecast is ok for now I'd say.
Thanks [~saint@gmail.com] for the confirmation.

> Make Tag IA.LimitedPrivate and expose for CPs
> -
>
> Key: HBASE-19092
> URL: https://issues.apache.org/jira/browse/HBASE-19092
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19092-branch-2.patch, 
> HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, 
> HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, 
> HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch
>
>
> We need to make tags LimitedPrivate as some use cases, such as the timeline 
> server, are trying to use tags. The same topic was discussed in dev@ and also 
> in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> So once we do this, all related Util methods and APIs should also move to 
> LimitedPrivate Util classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19221) NoClassDefFoundError: org/hamcrest/SelfDescribing while running IT tests in 2.0-alpha4

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261968#comment-16261968
 ] 

ramkrishna.s.vasudevan commented on HBASE-19221:


Has any recent fix solved this issue? Just wanted to know. 

> NoClassDefFoundError: org/hamcrest/SelfDescribing while running IT tests in 
> 2.0-alpha4
> --
>
> Key: HBASE-19221
> URL: https://issues.apache.org/jira/browse/HBASE-19221
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
>
> Copying the mail from the dev@
> {code}
> I tried running some IT test cases using the alpha-4 RC. I found this issue
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/hamcrest/SelfDescribing
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
> at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
> ...
>at 
> org.apache.hadoop.hbase.IntegrationTestsDriver.doWork(IntegrationTestsDriver.java:111)
> at 
> org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:154)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at 
> org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:47)
> The same test, when run against latest master, runs without any issues
> {code}
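If it bites again before a dependency fix lands, one possible workaround 
sketch (the jar path and version are assumptions) is to put a hamcrest jar on 
the classpath explicitly:

{code}
export HBASE_CLASSPATH=/path/to/hamcrest-core-1.3.jar
hbase org.apache.hadoop.hbase.IntegrationTestsDriver
{code}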



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19323) Make netty engine default in hbase2

2017-11-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19323:
--
Status: Patch Available  (was: Open)

> Make netty engine default in hbase2
> ---
>
> Key: HBASE-19323
> URL: https://issues.apache.org/jira/browse/HBASE-19323
> Project: HBase
>  Issue Type: Task
>  Components: rpc
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: 
> 0001-HBASE-19323-Make-netty-engine-default-in-hbase2.patch
>
>
> HBASE-17263 added the netty rpc server. This issue is about making it the 
> default, given it has seen good service across two Singles' Days at scale. 
> Netty handles the scenario seen in HBASE-19320 (see the tail of HBASE-19320 
> for the suggestion to make netty the default).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261955#comment-16261955
 ] 

stack commented on HBASE-19320:
---

bq. Yes, I thought this was the plan...

It's not the default [1]. Filed HBASE-19323 "Make netty engine default in hbase2"

1. 
https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit#heading=h.hpbw1fac8ixd



> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share them 
> with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19323) Make netty engine default in hbase2

2017-11-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19323:
--
Attachment: 0001-HBASE-19323-Make-netty-engine-default-in-hbase2.patch

One-line change. (There are a few references to the old SimpleRpcServer in a 
few tests, but it doesn't seem to matter which transport is used..)
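For reference, the explicit setting in hbase-site.xml would look something 
like this (key and class name assumed from HBASE-17263; with the patch netty 
becomes the default, so this would only be needed to select it explicitly):

{code}
<property>
  <name>hbase.rpc.server.impl</name>
  <value>org.apache.hadoop.hbase.ipc.NettyRpcServer</value>
</property>
{code}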

> Make netty engine default in hbase2
> ---
>
> Key: HBASE-19323
> URL: https://issues.apache.org/jira/browse/HBASE-19323
> Project: HBase
>  Issue Type: Task
>  Components: rpc
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: 
> 0001-HBASE-19323-Make-netty-engine-default-in-hbase2.patch
>
>
> HBASE-17263 added the netty rpc server. This issue is about making it the 
> default, given it has seen good service across two Singles' Days at scale. 
> Netty handles the scenario seen in HBASE-19320 (see the tail of HBASE-19320 
> for the suggestion to make netty the default).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19323) Make netty engine default in hbase2

2017-11-21 Thread stack (JIRA)
stack created HBASE-19323:
-

 Summary: Make netty engine default in hbase2
 Key: HBASE-19323
 URL: https://issues.apache.org/jira/browse/HBASE-19323
 Project: HBase
  Issue Type: Task
  Components: rpc
Reporter: stack
 Fix For: 2.0.0-beta-1


HBASE-17263 added the netty rpc server. This issue is about making it the 
default, given it has seen good service across two Singles' Days at scale. 
Netty handles the scenario seen in HBASE-19320 (see the tail of HBASE-19320 
for the suggestion to make netty the default).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19322) System tables hbase:quota|hbase:acl will be in offline state when cluster startup first time with rsgroup feature

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261950#comment-16261950
 ] 

Ted Yu commented on HBASE-19322:


Thanks for reporting.

Sounds like hbase:quota and hbase:acl should be assigned to the default group.

Do you want to come up with a patch?
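One possible direction, as a sketch (hypothetical, not a patch): instead of 
snapshotting the system tables that exist when the rsgroup worker starts, 
treat every system-namespace table as belonging to the default group:

{code:java}
// Hypothetical sketch: decide membership by namespace rather than a
// startup-time list, so hbase:quota and hbase:acl created later are covered.
boolean belongsToDefaultGroup(TableName tn) {
  return tn.isSystemTable();
}
{code}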

> System tables hbase:quota|hbase:acl will be in offline state when cluster 
> startup first time with rsgroup feature
> -
>
> Key: HBASE-19322
> URL: https://issues.apache.org/jira/browse/HBASE-19322
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 2.0.0-alpha-4
>Reporter: xinxin fan
>
> When the cluster starts up for the first time with the rsgroup feature, the 
> system tables hbase:quota and hbase:acl will be in OFFLINE state:
> {code:java}
> hbase:quota,,1511254877213.0627adae8630c21f4456984713cdffc8. state=OFFLINE, 
> ts=Tue Nov 21 17:03:37 CST 2017 (0s ago), server=localhost,1,1
> {code}
> It seems that the balancer doesn't know which server to assign the regions 
> to, since the rsgroup information of the two system tables is found to be 
> null.
> I read the code and found an issue in the rsgroup startup procedure: the 
> rsgroup feature starts up before the creation of the two system tables 
> (hbase:quota, hbase:acl), so rsgroupStartupWorker only adds hbase:meta and 
> hbase:namespace into the default group via the following function:
> {code:java}
> specialTables = 
> masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16868) Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer

2017-11-21 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261946#comment-16261946
 ] 

Guanghao Zhang commented on HBASE-16868:


This replicate_all flag is useful to avoid misuse of the replication peer 
config. And on our clusters we have more config options: EXCLUDE_NAMESPACE and 
EXCLUDE_TABLECFS for replication peers. Let me tell more about our use case. 
We have two online serving clusters and one offline cluster for MR/Spark jobs. 
For the online clusters, all tables replicate to each other. But not all 
tables replicate to the offline cluster, because not all tables need OLAP 
jobs. We have hundreds of tables, and if only one table doesn't need to 
replicate to the offline cluster, you would have to configure a lot of tables 
in the replication peer config. So we added a new config option, 
EXCLUDE_TABLECFS. Then you only need to configure the one table (which doesn't 
need replication) in EXCLUDE_TABLECFS.

When the replicate_all flag is false, you can configure NAMESPACES or TABLECFS 
to say which namespaces/tables should replicate to the peer cluster. When the 
replicate_all flag is true, you can configure EXCLUDE_NAMESPACE or 
EXCLUDE_TABLECFS to say which namespaces/tables must not replicate to the peer 
cluster (a hypothetical shell sketch follows below). I plan to contribute this 
to 2.0, too. Any comments on this? [~ashish singhi]
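A hypothetical shell sketch of the proposed semantics (this command name is 
made up for illustration; it does not exist yet):

{code}
# replicate_all stays true; exclude the one table that must not reach the offline cluster:
set_peer_exclude_tableCFs 'offline_peer', { "big_table_no_olap" => [] }
{code}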





> Add a replicate_all flag to avoid misuse the namespaces and table-cfs config 
> of replication peer
> 
>
> Key: HBASE-16868
> URL: https://issues.apache.org/jira/browse/HBASE-16868
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-16868.master.001.patch, 
> HBASE-16868.master.002.patch, HBASE-16868.master.003.patch, 
> HBASE-16868.master.004.patch, HBASE-16868.master.005.patch, 
> HBASE-16868.master.006.patch, HBASE-16868.master.007.patch, 
> HBASE-16868.master.008.patch
>
>
> First, add a new peer by shell cmd.
> {code}
> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
> {code}
> If we don't set namespaces and table cfs in the peer config, it means all 
> tables are replicated to the peer cluster.
> Then append a table to the peer config.
> {code}
> append_peer_tableCFs '1', {"table1" => []}
> {code}
> Then this peer will only replicate table1 to the peer cluster. It changes 
> from replicating all tables in the cluster to replicating only one table. It 
> is very easy to misuse this in a production cluster. So we should avoid 
> appending a table to a peer which replicates all tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261945#comment-16261945
 ] 

Hadoop QA commented on HBASE-19319:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
12s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hbase-procedure: The patch generated 1 new + 14 
unchanged - 2 fixed = 15 total (was 16) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
53s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
51m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
58s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 56s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster |
|   | 
hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint |
|   | hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster |
|   | hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19319 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898763/HBASE-19319.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux bb1821d0ea3c 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 

[jira] [Commented] (HBASE-19291) Use common header and footer for JSP pages

2017-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261940#comment-16261940
 ] 

Hudson commented on HBASE-19291:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4095 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4095/])
HBASE-19291 Use common header and footer for JSP pages (appy: rev 
3b2b22b5fac1175302b320b7ca1ed766326924cc)
* (add) hbase-server/src/main/resources/hbase-webapps/regionserver/footer.jsp
* (edit) 
hbase-server/src/main/resources/hbase-webapps/regionserver/storeFile.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/snapshot.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/procedures.jsp
* (edit) 
hbase-server/src/main/resources/hbase-webapps/regionserver/processRS.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/master/footer.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/regionserver/header.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/master/header.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/regionserver/region.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/processMaster.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/snapshotsStats.jsp


> Use common header and footer for JSP pages
> --
>
> Key: HBASE-19291
> URL: https://issues.apache.org/jira/browse/HBASE-19291
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19291.master.001.patch
>
>
> Use header and footer in our *.jsp pages to avoid unnecessary redundancy 
> (copy-paste of code)
> (Been sitting in my local repo for a long time; best to get the following 
> pesky user-facing things fixed before the next major release)
> Misc edits:
> - Due to redundancy, new additions make it to some places but not others. 
> For example, there are missing links to "/logLevel", "/processRS.jsp" in a 
> few places.
> - Fix processMaster.jsp wrongly pointing to rs-status instead of 
> master-status (probably due to copy paste from processRS.jsp)
> - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp
> - Added missing  tag in snapshot.jsp
> - Deleted fossils of html5shiv.js. Its uses and the js itself were deleted 
> in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f"
> - Fixed wrongly matched heading tags
> - Deleted some unused variables
> Tested:
> Ran standalone cluster and opened each page to make sure it looked right.
> Sidenote:
> Looks like HBASE-3835 started the work of converting from jsp to jamon, but 
> the work didn't finish. Now we have a mix of jsp and jamon. Needs 
> reconciling, but later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19200) make hbase-client only depend on ZKAsyncRegistry and ZNodePaths

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261937#comment-16261937
 ] 

Ted Yu commented on HBASE-19200:


First, TestAcidGuarantees hung.
With the workaround, TestAdmin2.testCheckHBaseAvailableWithoutCluster hangs.

I think this should be reverted. Once CURATOR-443 is fixed and we upgrade to 
the next curator release, this, along with a proper fix for the (currently) 
hanging tests, can go in again.

> make hbase-client only depend on ZKAsyncRegistry and ZNodePaths
> ---
>
> Key: HBASE-19200
> URL: https://issues.apache.org/jira/browse/HBASE-19200
> Project: HBase
>  Issue Type: Task
>  Components: Client, Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19200-v1.patch, HBASE-19200-v2.patch, 
> HBASE-19200-v3.patch, HBASE-19200-v4.patch, HBASE-19200-v5.patch, 
> HBASE-19200.patch
>
>
> So that we can move most of the zookeeper related code out of hbase-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (HBASE-19200) make hbase-client only depend on ZKAsyncRegistry and ZNodePaths

2017-11-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-19200:


> make hbase-client only depend on ZKAsyncRegistry and ZNodePaths
> ---
>
> Key: HBASE-19200
> URL: https://issues.apache.org/jira/browse/HBASE-19200
> Project: HBase
>  Issue Type: Task
>  Components: Client, Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19200-v1.patch, HBASE-19200-v2.patch, 
> HBASE-19200-v3.patch, HBASE-19200-v4.patch, HBASE-19200-v5.patch, 
> HBASE-19200.patch
>
>
> So that we can move most of the zookeeper related code out of hbase-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log

2017-11-21 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261933#comment-16261933
 ] 

binlijin commented on HBASE-19290:
--

bq. Since expectedTasksPerRS only relies on availableRSs and numTasks, I think 
it is better to move them to getAvailableRSs() and rename it?
I do not think so; I think getAvailableRSs and calculateAvailableSplitters are 
clear as they are.

> Reduce zk request when doing split log
> --
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observe that once the cluster has 1000+ nodes, when hundreds of nodes 
> abort and start splitting logs, the split is very slow; we find the 
> regionservers and master waiting on zookeeper responses, so we need to 
> reduce zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time, calculateAvailableSplitters gets 
> rsZNode's children from zookeeper; when the cluster is huge, this is heavy. 
> This patch reduces those requests.
> (2) When the regionserver already has the max number of split tasks running, 
> it may still try to grab tasks and issue zookeeper requests; we should sleep 
> and wait until we can grab tasks again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261932#comment-16261932
 ] 

Duo Zhang commented on HBASE-19321:
---

I always say that you can do what you think is right; if you want to revert, 
then just revert. A broken test is reason enough to revert a commit. But I 
also have my own work plan, and my own judgement on which issue is more 
important. If you think this one is high priority, then just do it.

Thanks.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19322) System tables hbase:quota|hbase:acl will be in offline state when cluster startup first time with rsgroup feature

2017-11-21 Thread xinxin fan (JIRA)
xinxin fan created HBASE-19322:
--

 Summary: System tables hbase:quota|hbase:acl will be in offline 
state when cluster startup first time with rsgroup feature
 Key: HBASE-19322
 URL: https://issues.apache.org/jira/browse/HBASE-19322
 Project: HBase
  Issue Type: Bug
  Components: rsgroup
Affects Versions: 2.0.0-alpha-4
Reporter: xinxin fan


When the cluster starts up for the first time with the rsgroup feature, the 
system tables hbase:quota and hbase:acl will be in OFFLINE state:

{code:java}
hbase:quota,,1511254877213.0627adae8630c21f4456984713cdffc8. state=OFFLINE, 
ts=Tue Nov 21 17:03:37 CST 2017 (0s ago), server=localhost,1,1
{code}

It seems that the balancer doesn't know which server to assign the regions to, 
since the rsgroup information of the two system tables is found to be null.

I read the code and found an issue in the rsgroup startup procedure: the 
rsgroup feature starts up before the creation of the two system tables 
(hbase:quota, hbase:acl), so rsgroupStartupWorker only adds hbase:meta and 
hbase:namespace into the default group via the following function:

{code:java}
specialTables = 
masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261928#comment-16261928
 ] 

Ted Yu commented on HBASE-19321:


Sounds like you're a manager now.

This is a high priority issue: without the fix, HBASE-19200 should be reverted.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log

2017-11-21 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261925#comment-16261925
 ] 

binlijin commented on HBASE-19290:
--

bq. I can see it's doing that. The question really meant - why such an 
interesting choice? It's not a usual thing to do, i.e. throttle the first 
request and start hammering servers after that. If it was something you chose 
by design, please add a comment about the behavior explaining the reasoning; 
if not by design, then probably remove the 'if condition' and always sleep.
The design was not chosen by me, and I do not see any problem with it: there 
are 2 available splitters, so we grab one task, then try to grab the second 
task, and do the split tasks as fast as possible. My patch does not change the 
design; it just tries to issue fewer zk requests (see the sketch below).
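A minimal sketch of point (2) of the description, with assumed names 
(running, tasksInProgress, maxConcurrentSplitTasks, taskReadyLock and 
grabTask() are illustrative, not the actual patch):

{code:java}
// Hypothetical sketch: back off instead of polling zookeeper while this
// regionserver is already running the maximum number of split tasks.
void taskLoop() throws InterruptedException {
  while (running) {
    if (tasksInProgress.get() >= maxConcurrentSplitTasks) {
      synchronized (taskReadyLock) {
        taskReadyLock.wait(1000); // woken when a running split task completes
      }
      continue;
    }
    grabTask(); // only now touch zookeeper to claim a task znode
  }
}
{code}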

> Reduce zk request when doing split log
> --
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observe that once the cluster has 1000+ nodes, when hundreds of nodes 
> abort and start splitting logs, the split is very slow; we find the 
> regionservers and master waiting on zookeeper responses, so we need to 
> reduce zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time, calculateAvailableSplitters gets 
> rsZNode's children from zookeeper; when the cluster is huge, this is heavy. 
> This patch reduces those requests.
> (2) When the regionserver already has the max number of split tasks running, 
> it may still try to grab tasks and issue zookeeper requests; we should sleep 
> and wait until we can grab tasks again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261922#comment-16261922
 ] 

Duo Zhang commented on HBASE-19321:
---

If you too do not have much time, I could find another person to fix this. What 
do you think? Thanks.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261920#comment-16261920
 ] 

Duo Zhang commented on HBASE-19321:
---

I do not have much time. Thanks.

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19321) ZKAsyncRegistry ctor would hang when zookeeper cluster is not available

2017-11-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19321:
---
Issue Type: Bug  (was: Test)
   Summary: ZKAsyncRegistry ctor would hang when zookeeper cluster is not 
available  (was: TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs)

> ZKAsyncRegistry ctor would hang when zookeeper cluster is not available
> ---
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261918#comment-16261918
 ] 

Duo Zhang commented on HBASE-19311:
---

Let me commit.

> Promote TestAcidGuarantees to LargeTests and start mini cluster once to make 
> it faster
> --
>
> Key: HBASE-19311
> URL: https://issues.apache.org/jira/browse/HBASE-19311
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Attachments: HBASE-19311-v1.patch, HBASE-19311.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261917#comment-16261917
 ] 

Ted Yu commented on HBASE-19321:


You made all the previous changes.

It would be better if you fixed it.

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261915#comment-16261915
 ] 

Duo Zhang commented on HBASE-19321:
---

Yeah, there is a method:

{code}
public boolean blockUntilConnected(int maxWaitTime, TimeUnit units) throws 
InterruptedException;
{code}

I think 1 or 2 seconds is enough? You can provide a patch, and also add the 
Thread.interrupt handling you want.

Thanks.
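As a sketch, the ctor could then do something like this (the timeout value and 
the exception wrapping are just one option, not the committed patch; `zk` is 
the CuratorFramework instance):

{code:java}
zk.start();
try {
  if (!zk.blockUntilConnected(2, TimeUnit.SECONDS)) {
    throw new IOException("Can not connect to zookeeper cluster within timeout");
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt(); // restore interrupt status
  throw (IOException) new InterruptedIOException().initCause(e);
}
{code}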

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261914#comment-16261914
 ] 

Ted Yu commented on HBASE-19321:


curator provides: blockUntilConnected(int, TimeUnit)

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19291) Use common header and footer for JSP pages

2017-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261912#comment-16261912
 ] 

Hudson commented on HBASE-19291:


FAILURE: Integrated in Jenkins build HBase-2.0 #893 (See 
[https://builds.apache.org/job/HBase-2.0/893/])
HBASE-19291 Use common header and footer for JSP pages (appy: rev 
8f0f820f22c2cdf715ae5c8acfe4753bf8a6350b)
* (edit) hbase-server/src/main/resources/hbase-webapps/master/snapshot.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/regionserver/region.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/master/footer.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/regionserver/footer.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/regionserver/header.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* (edit) 
hbase-server/src/main/resources/hbase-webapps/regionserver/processRS.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/procedures.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* (add) hbase-server/src/main/resources/hbase-webapps/master/header.jsp
* (edit) 
hbase-server/src/main/resources/hbase-webapps/regionserver/storeFile.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/processMaster.jsp
* (edit) hbase-server/src/main/resources/hbase-webapps/master/snapshotsStats.jsp


> Use common header and footer for JSP pages
> --
>
> Key: HBASE-19291
> URL: https://issues.apache.org/jira/browse/HBASE-19291
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19291.master.001.patch
>
>
> Use header and footer in our *.jsp pages to avoid unnecessary redundancy 
> (copy-paste of code)
> (Been sitting in my local repo for a long time; best to get the following 
> pesky user-facing things fixed before the next major release)
> Misc edits:
> - Due to redundancy, new additions make it to some places but not others. 
> For example, there are missing links to "/logLevel", "/processRS.jsp" in a 
> few places.
> - Fix processMaster.jsp wrongly pointing to rs-status instead of 
> master-status (probably due to copy paste from processRS.jsp)
> - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp
> - Added missing  tag in snapshot.jsp
> - Deleted fossils of html5shiv.js. Its uses and the js itself were deleted 
> in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f"
> - Fixed wrongly matched heading tags
> - Deleted some unused variables
> Tested:
> Ran standalone cluster and opened each page to make sure it looked right.
> Sidenote:
> Looks like HBASE-3835 started the work of converting from jsp to jamon, but 
> the work didn't finish. Now we have a mix of jsp and jamon. Needs 
> reconciling, but later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261909#comment-16261909
 ] 

Ted Yu edited comment on HBASE-19321 at 11/22/17 3:16 AM:
--

Can a variant of blockUntilConnected be used / added which takes into account 
zookeeper connection issues?


was (Author: yuzhih...@gmail.com):
Can a variant of blockUntilConnected be used / added which takes into account 
a timeout?

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261909#comment-16261909
 ] 

Ted Yu commented on HBASE-19321:


Can a variant of blockUntilConnected be used / added which takes into account 
a timeout?

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261902#comment-16261902
 ] 

Duo Zhang commented on HBASE-19321:
---

OK, the test is used to verify that we fail when there is no cluster.

Mark it as Ignored, I'd say. It will be fixed automatically after we remove 
the blockUntilConnected call.

Thanks.

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261899#comment-16261899
 ] 

Duo Zhang commented on HBASE-19321:
---

Let me take a look. If there is no cluster then no doubt we cannot connect to 
zk... It does not make sense to create a zk registry.

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261896#comment-16261896
 ] 

Ted Yu commented on HBASE-19321:


Looking at the stack trace in hung state:
{code}
"Time-limited test" #814 daemon prio=5 os_prio=31 tid=0x7fa39d5d7000 
nid=0x47a13 in Object.wait() [0x700022aa9000]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at java.lang.Object.wait(Object.java:502)
  at 
org.apache.curator.framework.state.ConnectionStateManager.blockUntilConnected(ConnectionStateManager.java:224)
  - locked <0x000791798dd8> (a 
org.apache.curator.framework.state.ConnectionStateManager)
  at 
org.apache.curator.framework.imps.CuratorFrameworkImpl.blockUntilConnected(CuratorFrameworkImpl.java:266)
  at 
org.apache.curator.framework.imps.CuratorFrameworkImpl.blockUntilConnected(CuratorFrameworkImpl.java:272)
  at 
org.apache.hadoop.hbase.client.ZKAsyncRegistry.<init>(ZKAsyncRegistry.java:84)
{code}
It seems that blockUntilConnected() never finished.


> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261897#comment-16261897
 ] 

Ted Yu commented on HBASE-19321:


[~Apache9]:
Can you take a look ?

> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19321:
---
Description: 
From 
https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
 :
{code}
org.junit.runners.model.TestTimedOutException: test timed out after 30 
milliseconds
at 
org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
{code}
It seems this started hanging after HBASE-19313

  was:
From 
https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
 :
{code}
org.junit.runners.model.TestTimedOutException: test timed out after 30 
milliseconds
at 
org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
{code}
It seems this started hanging after HBASE-19301


> TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
> --
>
> Key: HBASE-19321
> URL: https://issues.apache.org/jira/browse/HBASE-19321
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
>  :
> {code}
> org.junit.runners.model.TestTimedOutException: test timed out after 30 
> milliseconds
>   at 
> org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
> {code}
> It seems this started hanging after HBASE-19313



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19321) TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs

2017-11-21 Thread Ted Yu (JIRA)
Ted Yu created HBASE-19321:
--

 Summary: TestAdmin2#testCheckHBaseAvailableWithoutCluster hangs
 Key: HBASE-19321
 URL: https://issues.apache.org/jira/browse/HBASE-19321
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBASE-Flaky-Tests/23477/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCheckHBaseAvailableWithoutCluster/
 :
{code}
org.junit.runners.model.TestTimedOutException: test timed out after 30 
milliseconds
at 
org.apache.hadoop.hbase.client.TestAdmin2.testCheckHBaseAvailableWithoutCluster(TestAdmin2.java:573)
{code}
It seems this started hanging after HBASE-19301



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261878#comment-16261878
 ] 

Yu Li commented on HBASE-19320:
---

bq. Can we also make the netty rpc server default for 2.0?
Yes, I thought this was the plan...

We've been running the netty rpc server in production for more than a year and 
went through two big sales with it; everything looks good (although we still 
use the netty3 version for some historical reasons, I believe netty4 would be 
better).

Just let me know what we could help with / need to do to make it the default, 
thanks.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share them 
> with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261862#comment-16261862
 ] 

Duo Zhang commented on HBASE-19320:
---

OK, a config to limit the size is better than nothing...

And for the rpc client, netty is the default for 2.0. But for the server the 
default is still the old NIO one, I think. [~carp84] Can we also make the 
netty rpc server the default for 2.0? Thanks.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share them 
> with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261852#comment-16261852
 ] 

huaxiang sun commented on HBASE-19320:
--

The fix is in jdk8u102 and jdk9, FYI. 
http://www.oracle.com/technetwork/java/javase/8u102-relnotes-3021767.html?printOnly=1
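For anyone stuck on an older JDK, a sketch of the knob those release notes 
describe (the value is only an example) is to cap the per-thread temporary 
direct-buffer cache in hbase-env.sh:

{code}
export HBASE_OPTS="$HBASE_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}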

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share them 
> with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261849#comment-16261849
 ] 

huaxiang sun commented on HBASE-19320:
--

That is great to learn [~Apache9]! Netty by default for 2.0?

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19313) Call blockUntilConnected when constructing ZKAsyncRegistry(temporary workaround)

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261840#comment-16261840
 ] 

Duo Zhang commented on HBASE-19313:
---

{quote}
Why is interrupt status not restored ?
{quote}
It is just a workaround, so do not think too much about it.

{quote}
Since this is workaround, what's the implication in production scenario ?
{quote}
The construction of ZKAsyncRegistry will no longer be non-blocking; it becomes 
a synchronous operation.
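
Roughly, the workaround amounts to the following sketch (blockUntilConnected 
is real Curator API; the surrounding variable names and the timeout are 
illustrative, not the actual patch):
{code}
// Block in the ZKAsyncRegistry constructor until Curator reports a
// connection, turning construction into a synchronous operation.
CuratorFramework zk = CuratorFrameworkFactory.newClient(quorum, retryPolicy);
zk.start();
try {
  if (!zk.blockUntilConnected(30, TimeUnit.SECONDS)) { // timeout illustrative
    throw new IOException("Can not connect to zookeeper: " + quorum);
  }
} catch (InterruptedException e) {
  // the actual workaround swallows this; restoring interrupt status shown here
  Thread.currentThread().interrupt();
  throw new InterruptedIOException("Interrupted connecting to zookeeper");
}
{code}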

> Call blockUntilConnected when constructing ZKAsyncRegistry(temporary 
> workaround)
> 
>
> Key: HBASE-19313
> URL: https://issues.apache.org/jira/browse/HBASE-19313
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client, Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19313.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19312) Find out why sometimes we need to spend more than one second to get the cluster id

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261835#comment-16261835
 ] 

Duo Zhang commented on HBASE-19312:
---

{quote}
How much work would this be Duo Zhang (This would be preferred all-around I'd 
say especially after the work you've done cleaning up zk in client...)
{quote}

Not too much, especially since we do not need a watcher and it is read-only. 
The difficulty is that the construction of ZooKeeper is synchronous, and the 
connection state management is a bit complicated.

The curator team has replied on CURATOR-443 and set the fix version to 4.0.1. 
Let's wait a while to see if they can come up with a solution.

Thanks.
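
For a sense of scale, a read-only, watcher-less fetch on the raw ZooKeeper 
client could look roughly like this sketch (standard org.apache.zookeeper API; 
the znode path is the usual default, and error handling is elided):
{code}
// Asynchronously fetch the cluster id without Curator. The Watcher is a
// no-op since we do not need notifications for this one-shot read.
ZooKeeper zk = new ZooKeeper(quorum, sessionTimeoutMs, event -> {});
zk.getData("/hbase/hbaseid", false, (rc, path, ctx, data, stat) -> {
  if (rc == KeeperException.Code.OK.intValue()) {
    // parse the cluster id from 'data' and complete the pending future
  } else {
    // complete the pending future exceptionally
  }
}, null);
{code}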

> Find out why sometimes we need to spend more than one second to get the 
> cluster id
> --
>
> Key: HBASE-19312
> URL: https://issues.apache.org/jira/browse/HBASE-19312
> Project: HBase
>  Issue Type: Bug
>  Components: asyncclient, Client, Zookeeper
>Reporter: Duo Zhang
>Priority: Blocker
> Fix For: 2.0.0
>
>
> See the discussion in HBASE-19266.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261796#comment-16261796
 ] 

Duo Zhang commented on HBASE-19320:
---

I observed this problem years ago on Java 6, so even now the Java team still 
hasn't fixed it? The thread-local cache is just a nightmare for an RPC server 
with hundreds of handler threads.

I used to maintain a fixed-size DBB pool, and if there was a large message I 
would write it out chunk by chunk. After I started using netty the problem was 
gone. Netty has its own buffer pool which works like jemalloc. So, let's 
start using netty by default?

Thanks.
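
The mechanism from the linked article can be reproduced in a few lines 
(sketch; 'remote' is a placeholder SocketAddress):
{code}
// Writing a large *heap* ByteBuffer through an NIO channel makes the JDK
// copy it into a temporary *direct* buffer, which sun.nio.ch.Util then
// caches per thread. With hundreds of handler threads each seeing one big
// message, the cached direct buffers can add up to gigabytes.
ByteBuffer heap = ByteBuffer.allocate(64 * 1024 * 1024); // 64MB on-heap
SocketChannel ch = SocketChannel.open(remote);
while (heap.hasRemaining()) {
  ch.write(heap); // silently allocates (and caches) a >= 64MB direct buffer
}
{code}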

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16137) Fix findbugs warning introduced by hbase-14730

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261793#comment-16261793
 ] 

huaxiang sun commented on HBASE-16137:
--

Hi [~psomogyi], I assigned this jira to you, thanks!

> Fix findbugs warning introduced by hbase-14730
> --
>
> Key: HBASE-16137
> URL: https://issues.apache.org/jira/browse/HBASE-16137
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> From stack:
> "Lads. This patch makes for a new findbugs warning: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/2390/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
> If you are good w/ the code, i can fix the findbugs warning... just say."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-16137) Fix findbugs warning introduced by hbase-14730

2017-11-21 Thread huaxiang sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun reassigned HBASE-16137:


Assignee: Peter Somogyi  (was: huaxiang sun)

> Fix findbugs warning introduced by hbase-14730
> --
>
> Key: HBASE-16137
> URL: https://issues.apache.org/jira/browse/HBASE-16137
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: huaxiang sun
>Assignee: Peter Somogyi
>Priority: Minor
>
> From stack:
> "Lads. This patch makes for a new findbugs warning: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/2390/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
> If you are good w/ the code, i can fix the findbugs warning... just say."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19319:
-
Status: Patch Available  (was: Open)

> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19319.master.001.patch
>
>
> The following synchronizes over a local variable rather than the original 
> ProcedureEvent object. Clearly a bug, since this code block won't maintain 
> exclusion with many of the synchronized methods in the ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19319:
-
Attachment: HBASE-19319.master.001.patch

> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19319.master.001.patch
>
>
> The following synchronizes over a local variable rather than the original 
> ProcedureEvent object. Clearly a bug, since this code block won't maintain 
> exclusion with many of the synchronized methods in the ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261781#comment-16261781
 ] 

Appy commented on HBASE-19319:
--

Ping [~stack]/[~uagashe] for review.

> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19319.master.001.patch
>
>
> The following synchronizes over a local variable rather than the original 
> ProcedureEvent object. Clearly a bug, since this code block won't maintain 
> exclusion with many of the synchronized methods in the ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261779#comment-16261779
 ] 

huaxiang sun edited comment on HBASE-19320 at 11/22/17 12:52 AM:
-

Attach two screenshots for this issue.

The first one is a query for DirectByteBuffers whose capacity > 4M. The second 
one is the GC root path, to illustrate who holds this direct memory. I checked 
all of these objects; they all have the same GC root path.

Adding these objects' capacities together comes to more than 2GB.


was (Author: huaxiang):
Attach two screenshots for this issue.

The first one is a query for DirectByteBuffers whose capacity > 5M. The second 
one is the GC root path, to illustrate who holds this direct memory. I checked 
all of these objects; they all have the same GC root path.

Adding these objects' capacities together comes to more than 2GB.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261779#comment-16261779
 ] 

huaxiang sun edited comment on HBASE-19320 at 11/22/17 12:51 AM:
-

Attach two screenshots for this issue.

The first one is a query for DirectByteBuffers whose capacity > 5M. The second 
one is the GC root path, to illustrate who holds this direct memory. I checked 
all of these objects; they all have the same GC root path.

Adding these objects' capacities together comes to more than 2GB.


was (Author: huaxiang):
Attach two screenshots for this issue.

The first one is a query for DirectByteBuffers whose capacity > 5M. The second 
one is the GC root path, to illustrate who holds this direct memory. I checked 
all of these objects; they all have the same GC root path.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261779#comment-16261779
 ] 

huaxiang sun commented on HBASE-19320:
--

Attach two screenshots for this issue.

The first one is a query for DirectByteBuffers whose capacity > 5M. The second 
one is the GC root path, to illustrate who holds this direct memory. I checked 
all of these objects; they all have the same GC root path.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun updated HBASE-19320:
-
Attachment: Screen Shot 2017-11-21 at 4.43.36 PM.png
Screen Shot 2017-11-21 at 4.44.22 PM.png

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261769#comment-16261769
 ] 

stack commented on HBASE-19320:
---

@fyi [~anoop.hbase] This is a good one sir.


> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun updated HBASE-19320:
-
Affects Version/s: 1.2.6

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>
> Recently we ran into a direct memory leak case which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> thought we had some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261768#comment-16261768
 ] 

stack commented on HBASE-18946:
---

bq. While doing roundrobinAssignment contact the AM to know the current state 
of replica regions and choose a server accordingly. 

Do we only do this when it is a region with replicas, or do we do it always? 
(Would be good if the former; we want assignment to run fast.)

Yeah, if round robin, it's round robin (smile).

Please remind me, what is the rule for replica assignment? Just that they need 
to be on different servers? Nothing about ordering? (Hmm... seems like the 
replica has to go out first.) How does the patch to the balancer ensure this 
ordering?

Is there a hole where you can't see an ongoing Assignment? It has been queued 
and is being worked on, but you have no means of querying where a region is 
being assigned (i.e. we are about to assign a replica and we want to avoid 
assigning to the same location as where we just assigned)?

If round robin, are we not moving through the list of servers? Is the issue 
only when the cluster is small -- three servers or so?


On patch, don't renumber protobuf fields.

What is happening here (BTW, this repeats code):
{code}
List serverRegions =
    assignments.computeIfAbsent(serverName, k -> new ArrayList<>());
if (!RegionReplicaUtil.isDefaultReplica(region)) {
  if (!replicaAvailable(region, serverName)) {
    assignRegionToServer(cluster, serverName, serverRegions, region);
    serverIdx = (j + serverIdx + 1) % numServers;
    assigned = true;
    break;
  }
} else if (!cluster.wouldLowerAvailability(region, serverName)) {
  assignRegionToServer(cluster, serverName, serverRegions, region);
  serverIdx = (j + serverIdx + 1) % numServers; // remain from next server
...
{code}

If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?


Good stuff.




> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the 
> default Stochastic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When an RS goes down, replicas being assigned to the same RS is 
> acceptable, but when we have enough RS to assign, this behaviour is 
> undesirable and defeats the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-21 Thread huaxiang sun (JIRA)
huaxiang sun created HBASE-19320:


 Summary: document the mysterious direct memory leak in hbase 
 Key: HBASE-19320
 URL: https://issues.apache.org/jira/browse/HBASE-19320
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: huaxiang sun
Assignee: huaxiang sun


Recently we ran into a direct memory leak case which took some time to trace 
and debug. After discussing internally with our [~saint@gmail.com], we thought 
we had some findings and want to share them with the community.

Basically, it is the issue described in 
http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of our 
hbase clusters.

Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261740#comment-16261740
 ] 

Hadoop QA commented on HBASE-19122:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
4s{color} | {color:red} hbase-server: The patch generated 4 new + 102 unchanged 
- 0 fixed = 106 total (was 102) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 0s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
53m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m  1s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestHRegionWithInMemoryFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19122 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898731/HBASE-19122.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 1a1485ef09a4 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 984e0ecfc4 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9953/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9953/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9953/testReport/ |
| 

[jira] [Commented] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261693#comment-16261693
 ] 

Appy commented on HBASE-19319:
--

Two of the three ProcedureEvent functions in ProcedureScheduler don't even use 
the scheduler in any way - waitEvent and suspendEvent. They only work on the 
event's internals. There's no reason to put them in the scheduler, so I am 
moving them to the ProcedureEvent class.
Along the same lines, I am making the ProcedureEvent class the entry point for 
waking up events (just like the other ProcedureEvent-related functions); a 
rough sketch follows below.

ProcedureScheduler doesn't "own" events, i.e. it doesn't keep track of events 
or manage them in any way. So we shouldn't put ProcedureEvent "utility" 
functions in ProcedureScheduler.
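
A rough sketch of the shape this could take (illustrative only, not the actual 
patch; the scheduler callback name is an assumption):
{code}
public class ProcedureEvent<T> {
  private boolean ready;

  // All state changes synchronize on the event instance itself, so they
  // exclude each other and any external synchronized(event) blocks.
  public synchronized void suspend() {
    ready = false;
  }

  public synchronized void wake(AbstractProcedureScheduler scheduler) {
    if (!ready) {
      ready = true;
      scheduler.wakeWaitingProcedures(this); // callback name assumed
    }
  }

  public synchronized boolean isReady() {
    return ready;
  }
}
{code}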


> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>
> The following synchronizes over a local variable rather than the original 
> ProcedureEvent object. Clearly a bug, since this code block won't maintain 
> exclusion with many of the synchronized methods in the ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2017-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261687#comment-16261687
 ] 

Hadoop QA commented on HBASE-17852:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 19 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} hbase-backup: The patch generated 3 new + 179 
unchanged - 18 fixed = 182 total (was 197) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
26s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
47m  8s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
28s{color} | {color:green} hbase-backup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
23s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-17852 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898743/HBASE-17852-v8.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux bd11d8af32a0 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3b2b22b5fa |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 

[jira] [Created] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-21 Thread Appy (JIRA)
Appy created HBASE-19319:


 Summary: Fix bug in synchronizing over ProcedureEvent
 Key: HBASE-19319
 URL: https://issues.apache.org/jira/browse/HBASE-19319
 Project: HBase
  Issue Type: Bug
Reporter: Appy
Assignee: Appy


The following synchronizes over a local variable rather than the original 
ProcedureEvent object. Clearly a bug, since this code block won't maintain 
exclusion with many of the synchronized methods in the ProcedureEvent class.
{code}
 @Override
  public void wakeEvents(final int count, final ProcedureEvent... events) {
final boolean traceEnabled = LOG.isTraceEnabled();
schedLock();
try {
  int waitingCount = 0;
  for (int i = 0; i < count; ++i) {
final ProcedureEvent event = events[i];
synchronized (event) {
  if (!event.isReady()) {
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16137) Fix findbugs warning introduced by hbase-14730

2017-11-21 Thread Peter Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261659#comment-16261659
 ] 

Peter Somogyi commented on HBASE-16137:
---

I missed Anoop's comment about HBASE-16180. Actually that ignores the warning.

> Fix findbugs warning introduced by hbase-14730
> --
>
> Key: HBASE-16137
> URL: https://issues.apache.org/jira/browse/HBASE-16137
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> From stack:
> "Lads. This patch makes for a new findbugs warning: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/2390/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
> If you are good w/ the code, i can fix the findbugs warning... just say."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-10925) Do not OOME, throw RowTooBigException instead

2017-11-21 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261635#comment-16261635
 ] 

churro morales commented on HBASE-10925:


What do you guys think about adding another configurable option to the client 
where you can potentially skip large rows if the configuration is set?

A nice example: if you have jobs running and don't want them to fail because 
of a large row, you can simply catch the exception and then resume the scan at 
(row_key + a padded empty byte), so your jobs/scans won't fail. I think it 
would be best to have this as a parameter on the Scan, but an API change would 
not be ideal; a sketch of the idea is below.
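
A hedged sketch of what the proposed client behavior could look like (no such 
option exists today; 'table', 'scan' and 'process' are placeholders, and the 
exception's package should be checked against the branch):
{code}
byte[] lastRow = null;
try (ResultScanner scanner = table.getScanner(scan)) {
  Result r;
  while ((r = scanner.next()) != null) { // next() may throw RowTooBigException
    lastRow = r.getRow();
    process(r);
  }
} catch (RowTooBigException e) {
  // Skip the big row and resume: ideally the exception would carry the
  // offending row key; here we approximate with last-seen row + one zero byte.
  byte[] resumeFrom = Bytes.add(lastRow != null ? lastRow : new byte[0],
      new byte[] { 0x00 });
  scan.withStartRow(resumeFrom);
  // ... reopen the scanner from 'resumeFrom' and continue the job
}
{code}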



> Do not OOME, throw RowTooBigException instead
> -
>
> Key: HBASE-10925
> URL: https://issues.apache.org/jira/browse/HBASE-10925
> Project: HBase
>  Issue Type: Improvement
>  Components: Usability
>Affects Versions: 0.99.0
>Reporter: stack
>Assignee: Mikhail Antonov
> Fix For: 0.99.0
>
> Attachments: HBASE-10925.patch, HBASE-10925.patch, HBASE-10925.patch
>
>
> If 10M columns in a row, throw a RowTooBigException rather than OOME when 
> Get'ing or Scanning w/o in-row scan flag set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16137) Fix findbugs warning introduced by hbase-14730

2017-11-21 Thread Peter Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261628#comment-16261628
 ] 

Peter Somogyi commented on HBASE-16137:
---

Is this still valid? The linked report is dead.
Looks like HBASE-15118 touched the area where HBASE-14730 made its 
modifications, so this ticket could be closed.

> Fix findbugs warning introduced by hbase-14730
> --
>
> Key: HBASE-16137
> URL: https://issues.apache.org/jira/browse/HBASE-16137
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> From stack:
> "Lads. This patch makes for a new findbugs warning: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/2390/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
> If you are good w/ the code, i can fix the findbugs warning... just say."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log

2017-11-21 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261597#comment-16261597
 ] 

Appy commented on HBASE-19290:
--

{quote}
bq. So there are 2 available splitters, and one grabbed task, we don't stop 
here and keep hammering zk?
Yes.
{quote}
I can see it's doing that. The question really meant - why such an interesting 
choice? It's not a usual thing to do, i.e. throttle the first request and then 
start hammering servers after that. If it was something you chose by design - 
please add a comment about the behavior explaining the reasoning.

bq. The while loop will enter only if when seq_start == taskReadySeq.get(), and 
when every splitLogZNode's children changed the taskReadySeq will increment, so 
it will not enter the while (seq_start == taskReadySeq.get()) {} and kill 
trying to grab task and issue zk request.
Makes sense. thanks.
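
In code form, the intended throttle is roughly the following (names are 
illustrative, not the actual patch; InterruptedException handling elided):
{code}
// Instead of re-querying zookeeper while all splitter slots are busy, park
// until either a slot frees up or the split task list changes.
synchronized (grabTaskLock) {
  while (tasksInProgress.get() >= maxConcurrentTasks
      && seq_start == taskReadySeq.get()) {
    grabTaskLock.wait(checkInterval); // woken on task completion/znode change
  }
}
{code}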


> Reduce zk request when doing split log
> --
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observe that once the cluster has 1000+ nodes, when hundreds of nodes 
> abort and log splitting starts, the split is very, very slow, and we find the 
> regionserver and master waiting on zookeeper responses, so we need to reduce 
> zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time, calculateAvailableSplitters will 
> get rsZNode's children from zookeeper; when the cluster is huge, this is 
> heavy. This patch reduces those requests. 
> (2) When the regionserver has the max split tasks running, it may still try 
> to grab tasks and issue zookeeper requests; we should sleep and wait until we 
> can grab tasks again.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19290) Reduce zk request when doing split log

2017-11-21 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261597#comment-16261597
 ] 

Appy edited comment on HBASE-19290 at 11/21/17 10:29 PM:
-

{quote}
bq. So there are 2 available splitters, and one grabbed task, we don't stop 
here and keep hammering zk?
Yes.
{quote}
I can see it's doing that. The question really meant - why such an interesting 
choice? It's not a usual thing to do, i.e. throttle the first request and then 
start hammering servers after that. If it was something you chose by design - 
please add a comment about the behavior explaining the reasoning; if not by 
design, then probably remove the 'if condition' and always sleep.

bq. The while loop will enter only if when seq_start == taskReadySeq.get(), and 
when every splitLogZNode's children changed the taskReadySeq will increment, so 
it will not enter the while (seq_start == taskReadySeq.get()) {} and kill 
trying to grab task and issue zk request.
Makes sense. thanks.



was (Author: appy):
{quote}
bq. So there are 2 available splitters, and one grabbed task, we don't stop 
here and keep hammering zk?
Yes.
{quote}
I can see it's doing that. The question really meant - why such an interesting 
choice? It's not a usual thing to do, i.e. throttle the first request and then 
start hammering servers after that. If it was something you chose by design - 
please add a comment about the behavior explaining the reasoning.

bq. The while loop will enter only if when seq_start == taskReadySeq.get(), and 
when every splitLogZNode's children changed the taskReadySeq will increment, so 
it will not enter the while (seq_start == taskReadySeq.get()) {} and kill 
trying to grab task and issue zk request.
Makes sense. thanks.


> Reduce zk request when doing split log
> --
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observe that once the cluster has 1000+ nodes, when hundreds of nodes 
> abort and log splitting starts, the split is very, very slow, and we find the 
> regionserver and master waiting on zookeeper responses, so we need to reduce 
> zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time, calculateAvailableSplitters will 
> get rsZNode's children from zookeeper; when the cluster is huge, this is 
> heavy. This patch reduces those requests. 
> (2) When the regionserver has the max split tasks running, it may still try 
> to grab tasks and issue zookeeper requests; we should sleep and wait until we 
> can grab tasks again.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2017-11-21 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-17852:
--
Attachment: HBASE-17852-v8.patch

v8: some checkstyle fixes

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, 
> HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, 
> HBASE-17852-v6.patch, HBASE-17852-v7.patch, HBASE-17852-v8.patch
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot. 
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs an automatic rollback, some data written by backup 
> observers during the failed operation may be lost. This is what we try to 
> avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system 
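
A hedged sketch of the flow in steps 1-3 above (the Admin snapshot/restore 
calls are real API; BACKUP_META_TABLE, runBackupOperation() and 
cleanupPartialData() are placeholders, not names from the patch):
{code}
String snapshotName = "backup_meta_" + System.currentTimeMillis();
admin.snapshot(snapshotName, BACKUP_META_TABLE); // lightweight: table is small
try {
  runBackupOperation();                          // backup create/delete/merge
} catch (IOException serverSideFailure) {
  cleanupPartialData();                          // clean the backup destination
  admin.disableTable(BACKUP_META_TABLE);         // restoreSnapshot needs this
  admin.restoreSnapshot(snapshotName);           // roll the meta-table back
  admin.enableTable(BACKUP_META_TABLE);
  throw serverSideFailure;
}
{code}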



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-15816) Provide client with ability to set priority on Operations

2017-11-21 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261581#comment-16261581
 ] 

churro morales commented on HBASE-15816:


[~anoopamz] I totally forgot about this. The release note you created is 
great; sorry again, it must have slipped through the cracks. Thank you for 
putting up the release notes. 

> Provide client with ability to set priority on Operations 
> --
>
> Key: HBASE-15816
> URL: https://issues.apache.org/jira/browse/HBASE-15816
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: churro morales
>Assignee: churro morales
> Fix For: 2.0.0, 3.0.0, 1.4.0
>
> Attachments: HBASE-15816-v1.patch, HBASE-15816.patch, 
> HBASE-15816.v1.branch-1.patch, HBASE-15816.v2.patch
>
>
> First round will just be to expose the ability to set priorities for client 
> operations.  For more background: 
> http://mail-archives.apache.org/mod_mbox/hbase-dev/201604.mbox/%3CCA+RK=_BG_o=q8HMptcP2WauAinmEsL+15f3YEJuz=qbpcya...@mail.gmail.com%3E
> Next step would be to remove AnnotationReadingPriorityFunction and have the 
> client send priorities explicitly.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19318) MasterRpcServices#getSecurityCapabilities explicitly checks for the HBase AccessController implementation

2017-11-21 Thread Josh Elser (JIRA)
Josh Elser created HBASE-19318:
--

 Summary: MasterRpcServices#getSecurityCapabilities explicitly 
checks for the HBase AccessController implementation
 Key: HBASE-19318
 URL: https://issues.apache.org/jira/browse/HBASE-19318
 Project: HBase
  Issue Type: Bug
  Components: master, security
Reporter: Sharmadha Sainath
Assignee: Josh Elser
Priority: Critical
 Fix For: 1.4.0, 1.3.2, 1.2.7, 2.0.0-beta-1


Sharmadha brought a failure to my attention trying to use Ranger with HBase 2.0 
where the {{grant}} command was erroring out unexpectedly. The cluster had the 
Ranger-specific coprocessors deployed, per what was previously working on the 
HBase 1.1 line.

After some digging, I found that the Master is actually making a check 
explicitly for a Coprocessor that has the name 
{{org.apache.hadoop.hbase.security.access.AccessController}} (short name or 
full name), instead of looking for a deployed coprocessor which can be assigned 
to {{AccessController}} (which is what Ranger does). We have the 
CoprocessorHost methods to do the latter already implemented; it strikes me 
that we just accidentally used the wrong method in MasterRpcServices.
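
The distinction in sketch form (lookup method names recalled from 
CoprocessorHost; verify against the branch):
{code}
// Name-based check: only matches the stock class name, so Ranger's
// coprocessor (a different class assignable to AccessController) is missed.
cpHost.findCoprocessor(AccessController.class.getName());

// Assignability-based lookup: finds any deployed coprocessor that can be
// assigned to AccessController, which is what Ranger provides.
cpHost.findCoprocessors(AccessController.class);
{code}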



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16868) Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261570#comment-16261570
 ] 

stack commented on HBASE-16868:
---

Thanks [~zghaobac] for the explanation. Is this good by you, [~ashish singhi]? 
I don't know this area well enough.

> Add a replicate_all flag to avoid misuse the namespaces and table-cfs config 
> of replication peer
> 
>
> Key: HBASE-16868
> URL: https://issues.apache.org/jira/browse/HBASE-16868
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-16868.master.001.patch, 
> HBASE-16868.master.002.patch, HBASE-16868.master.003.patch, 
> HBASE-16868.master.004.patch, HBASE-16868.master.005.patch, 
> HBASE-16868.master.006.patch, HBASE-16868.master.007.patch, 
> HBASE-16868.master.008.patch
>
>
> First add a new peer by shell cmd.
> {code}
> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase".
> {code}
> If we don't set namespaces and table cfs in the peer config, it means we 
> replicate all tables to the peer cluster.
> Then append a table to the peer config.
> {code}
> append_peer_tableCFs '1', {"table1" => []}
> {code}
> Then this peer will only replicate table1 to the peer cluster. It changes 
> from replicating all tables in the cluster to replicating only one table. It 
> is very easy to misuse in a production cluster, so we should avoid appending 
> a table to a peer which replicates all tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19315) Incorrect snapshot version is used for 2.0.0-beta-1

2017-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261558#comment-16261558
 ] 

Hudson commented on HBASE-19315:


FAILURE: Integrated in Jenkins build HBase-2.0 #892 (See 
[https://builds.apache.org/job/HBase-2.0/892/])
HBASE-19315 Incorrect snapshot version is used for 2.0.0-beta-1 (stack: rev 
bcd367e2935c6b2f85dcc16c63dbc1675cd0d51c)
* (edit) hbase-assembly/pom.xml
* (edit) hbase-rsgroup/pom.xml
* (edit) hbase-shaded/hbase-shaded-client/pom.xml
* (edit) hbase-annotations/pom.xml
* (edit) hbase-examples/pom.xml
* (edit) hbase-archetypes/hbase-client-project/pom.xml
* (edit) hbase-archetypes/pom.xml
* (edit) hbase-build-support/pom.xml
* (edit) hbase-replication/pom.xml
* (edit) hbase-metrics-api/pom.xml
* (edit) hbase-procedure/pom.xml
* (edit) hbase-backup/pom.xml
* (edit) hbase-checkstyle/pom.xml
* (edit) hbase-mapreduce/pom.xml
* (edit) hbase-testing-util/pom.xml
* (edit) hbase-protocol-shaded/pom.xml
* (edit) hbase-thrift/pom.xml
* (edit) hbase-client/pom.xml
* (edit) hbase-build-configuration/pom.xml
* (edit) hbase-http/pom.xml
* (edit) hbase-metrics/pom.xml
* (edit) hbase-shell/pom.xml
* (edit) hbase-archetypes/hbase-archetype-builder/pom.xml
* (edit) hbase-shaded/pom.xml
* (edit) hbase-hadoop-compat/pom.xml
* (edit) hbase-it/pom.xml
* (edit) hbase-rest/pom.xml
* (edit) hbase-archetypes/hbase-shaded-client-project/pom.xml
* (edit) hbase-shaded/hbase-shaded-check-invariants/pom.xml
* (edit) hbase-shaded/hbase-shaded-mapreduce/pom.xml
* (edit) hbase-common/pom.xml
* (edit) hbase-external-blockcache/pom.xml
* (edit) hbase-hadoop2-compat/pom.xml
* (edit) hbase-server/pom.xml
* (edit) hbase-endpoint/pom.xml
* (edit) hbase-protocol/pom.xml
* (edit) hbase-zookeeper/pom.xml
* (edit) pom.xml
* (edit) hbase-resource-bundle/pom.xml
* (edit) hbase-build-support/hbase-error-prone/pom.xml


> Incorrect snapshot version is used for 2.0.0-beta-1 
> 
>
> Key: HBASE-19315
> URL: https://issues.apache.org/jira/browse/HBASE-19315
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19315.branch-2.001.patch
>
>
> Maven complains that the snapshot version used is incorrect. A dot is used 
> instead of a hyphen.
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hbase:hbase-error-prone:jar:2.0.0-beta-1.SNAPSHOT
> [WARNING] 'version' uses an unsupported snapshot version format, should be 
> '*-SNAPSHOT' instead. @ line 30, column 12



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19291) Use common header and footer for JSP pages

2017-11-21 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19291:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Use common header and footer for JSP pages
> --
>
> Key: HBASE-19291
> URL: https://issues.apache.org/jira/browse/HBASE-19291
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19291.master.001.patch
>
>
> Use header and footer in our *.jsp pages to avoid unnecessary redundancy 
> (copy-paste of code)
> (Been sitting in my local repo for long, best to get following pesky 
> user-facing things fixed before the next major release)
> Misc edits:
> - Due to redundancy, new additions make it to some places but not others. 
> E.g. there are missing links to "/logLevel", "/processRS.jsp" in a few places.
> - Fix processMaster.jsp wrongly pointing to rs-status instead of 
> master-status (probably due to copy paste from processRS.jsp)
> - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp
> - Added missing  tag in snapshot.jsp
> - Deleted fossils of html5shiv.js. Its uses and the js itself were deleted 
> in commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f"
> - Fixed wrongly matched heading tags
> - Deleted some unused variables
> Tested:
> Ran standalone cluster and opened each page to make sure it looked right.
> Sidenote:
> Looks like HBASE-3835 started the work of converting from jsp to jamon, but 
> the work didn't finish. Now we have a mix of jsp and jamon. Needs 
> reconciling, but later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

